Lie algebras

Shlomo Sternberg

April 23, 2004

Contents

1 The Campbell Baker Hausdorff Formula
  1.1 The problem.
  1.2 The geometric version of the CBH formula.
  1.3 The Maurer-Cartan equations.
  1.4 Proof of CBH from Maurer-Cartan.
  1.5 The differential of the exponential and its inverse.
  1.6 The averaging method.
  1.7 The Euler-MacLaurin Formula.
  1.8 The universal enveloping algebra.
    1.8.1 Tensor product of vector spaces.
    1.8.2 The tensor product of two algebras.
    1.8.3 The tensor algebra of a vector space.
    1.8.4 Construction of the universal enveloping algebra.
    1.8.5 Extension of a Lie algebra homomorphism to its universal enveloping algebra.
    1.8.6 Universal enveloping algebra of a direct sum.
    1.8.7 Bialgebra structure.
  1.9 The Poincaré-Birkhoff-Witt Theorem.
  1.10 Primitives.
  1.11 Free Lie algebras.
    1.11.1 Magmas and free magmas on a set.
    1.11.2 The free Lie algebra L_X.
    1.11.3 The free associative algebra Ass(X).
  1.12 Algebraic proof of CBH and explicit formulas.
    1.12.1 Abstract version of CBH and its algebraic proof.
    1.12.2 Explicit formula for CBH.

2 sl(2) and its Representations
  2.1 Low dimensional Lie algebras.
  2.2 sl(2) and its irreducible representations.
  2.3 The Casimir element.
  2.4 sl(2) is simple.
  2.5 Complete reducibility.
  2.6 The Weyl group.

3 The classical simple algebras
  3.1 Graded simplicity.
  3.2 sl(n+1).
  3.3 The orthogonal algebras.
  3.4 The symplectic algebras.
  3.5 The root structures.
    3.5.1 A_n = sl(n+1).
    3.5.2 C_n = sp(2n), n ≥ 2.
    3.5.3 D_n = o(2n), n ≥ 3.
    3.5.4 B_n = o(2n+1), n ≥ 2.
    3.5.5 Diagrammatic presentation.
  3.6 Low dimensional coincidences.
  3.7 Extended diagrams.

4 Engel-Lie-Cartan-Weyl
  4.1 Engel's theorem.
  4.2 Solvable Lie algebras.
  4.3 Linear algebra.
  4.4 Cartan's criterion.
  4.5 Radical.
  4.6 The Killing form.
  4.7 Complete reducibility.

5 Conjugacy of Cartan subalgebras
  5.1 Derivations.
  5.2 Cartan subalgebras.
  5.3 Solvable case.
  5.4 Toral subalgebras and Cartan subalgebras.
  5.5 Roots.
  5.6 Bases.
  5.7 Weyl chambers.
  5.8 Length.
  5.9 Conjugacy of Borel subalgebras.

6 The simple finite dimensional algebras
  6.1 Simple Lie algebras and irreducible root systems.
  6.2 The maximal root and the minimal root.
  6.3 Graphs.
  6.4 Perron-Frobenius.
  6.5 Classification of the irreducible Δ.
  6.6 Classification of the irreducible root systems.
  6.7 The classification of the possible simple Lie algebras.

7 Cyclic highest weight modules
  7.1 Verma modules.
  7.2 When is dim Irr(λ) < ∞?
  7.3 The value of the Casimir.
  7.4 The Weyl character formula.
  7.5 The Weyl dimension formula.
  7.6 The Kostant multiplicity formula.
  7.7 Steinberg's formula.
  7.8 The Freudenthal - de Vries formula.
  7.9 Fundamental representations.
  7.10 Equal rank subgroups.

8 Serre's theorem
  8.1 The Serre relations.
  8.2 The first five relations.
  8.3 Proof of Serre's theorem.
  8.4 The existence of the exceptional root systems.

9 Clifford algebras and spin representations
  9.1 Definition and basic properties.
    9.1.1 Definition.
    9.1.2 Gradation.
    9.1.3 ∧p as a C(p) module.
    9.1.4 Chevalley's linear identification of C(p) with ∧p.
    9.1.5 The canonical antiautomorphism.
    9.1.6 Commutator by an element of p.
    9.1.7 Commutator by an element of ∧²p.
  9.2 Orthogonal action of a Lie algebra.
    9.2.1 Expression for v in terms of dual bases.
    9.2.2 The adjoint action of a reductive Lie algebra.
  9.3 The spin representations.
    9.3.1 The even dimensional case.
    9.3.2 The odd dimensional case.
    9.3.3 Spin ad and V.

10 The Kostant Dirac operator
  10.1 Antisymmetric trilinear forms.
  10.2 Jacobi and Clifford.
  10.3 Orthogonal extension of a Lie algebra.
  10.4 The value of [v² + v(Cas_r)]_0.
  10.5 Kostant's Dirac Operator.
  10.6 Eigenvalues of the Dirac operator.
  10.7 The geometric index theorem.
    10.7.1 The index of equivariant Fredholm maps.
    10.7.2 Induced representations and Bott's theorem.
    10.7.3 Landweber's index theorem.

11 The center of U(g)
  11.1 The Harish-Chandra isomorphism.
    11.1.1 Statement.
    11.1.2 Example of sl(2).
    11.1.3 Using Verma modules to prove that γ_H : Z(g) → U(h)^W.
    11.1.4 Outline of proof of bijectivity.
    11.1.5 Restriction from S(g*)^g to S(h*)^W.
    11.1.6 From S(g)^g to S(h)^W.
    11.1.7 Completion of the proof.
  11.2 Chevalley's theorem.
    11.2.1 Transcendence degrees.
    11.2.2 Symmetric polynomials.
    11.2.3 Fixed fields.
    11.2.4 Invariants of finite groups.
    11.2.5 The Hilbert basis theorem.
    11.2.6 Proof of Chevalley's theorem.

Chapter 1

The Campbell Baker Hausdorff Formula

1.1 The problem.

Recall the power series:

    exp X = 1 + X + (1/2!)X² + (1/3!)X³ + ⋯ ,    log(1 + X) = X - (1/2)X² + (1/3)X³ - ⋯ .

We want to study these series in a ring where convergence makes sense; for example in the ring of n x n matrices. The exponential series converges everywhere, and the series for the logarithm converges in a small enough neighborhood of the origin.
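As a quick sanity check, one can confirm numerically that these two series are mutually inverse near the identity in the ring of matrices (a sketch using SciPy's matrix functions; the sample matrix is an arbitrary small example, not taken from the text):

```python
import numpy as np
from scipy.linalg import expm, logm

# A small matrix close to 0, so exp(X) is close to the identity
X = np.array([[0.1, 0.2],
              [0.0, -0.1]])

# log(exp X) should recover X ...
assert np.allclose(logm(expm(X)), X)

# ... and exp(log Y) should recover Y for Y close to the identity
Y = np.eye(2) + X
assert np.allclose(expm(logm(Y)), Y)
```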
Of course, log(exp X) = X and exp(log(1 + X)) = 1 + X where these series converge, or as formal power series. In particular, if A and B are two elements which are close enough to 0, we can study the convergent series log[(exp A)(exp B)], which will yield an element C such that exp C = (exp A)(exp B). The problem is that A and B need not commute. For example, if we retain only the linear and constant terms in the series we find

    log[(1 + A + ⋯)(1 + B + ⋯)] = log(1 + A + B + ⋯) = A + B + ⋯ .

On the other hand, if we go out to terms of second order, the non-commutativity begins to enter:

    log[(exp A)(exp B)] = A + B + (1/2)A² + AB + (1/2)B² - (1/2)(A + B)² + ⋯
                        = A + B + (1/2)[A, B] + ⋯

where

    [A, B] := AB - BA    (1.1)

is the commutator of A and B, also known as the Lie bracket of A and B. Collecting the terms of degree three we get, after some computation,

    (1/12)(A²B + AB² + B²A + BA² - 2ABA - 2BAB) = (1/12)[A, [A, B]] + (1/12)[B, [B, A]].

This suggests that the series for log[(exp A)(exp B)] can be expressed entirely in terms of successive Lie brackets of A and B. This is so, and is the content of the Campbell-Baker-Hausdorff formula.

One of the important consequences of the mere existence of this formula is the following. Suppose that g is the Lie algebra of a Lie group G. Then the local structure of G near the identity, i.e. the rule for the product of two elements of G sufficiently close to the identity, is determined by its Lie algebra g. Indeed, the exponential map is locally a diffeomorphism from a neighborhood of the origin in g onto a neighborhood W of the identity, and if U ⊂ W is a (possibly smaller) neighborhood of the identity such that U·U ⊂ W, then the product of a = exp ξ and b = exp η, with a ∈ U and b ∈ U, is completely expressed in terms of successive Lie brackets of ξ and η. We will give two proofs of this important theorem.
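Before turning to the proofs, the low-order expansion just computed can be checked numerically (a sketch using SciPy; the sample matrices, the scale eps, and the tolerance are arbitrary choices made so that the omitted degree-four terms are negligible):

```python
import numpy as np
from scipy.linalg import expm, logm

def bracket(X, Y):
    """Lie bracket (commutator) [X, Y] = XY - YX."""
    return X @ Y - Y @ X

eps = 1e-2
A = eps * np.array([[0.0, 1.0], [0.0, 0.0]])
B = eps * np.array([[0.0, 0.0], [1.0, 0.0]])

exact = logm(expm(A) @ expm(B))

# CBH through degree three:
approx = (A + B + bracket(A, B) / 2
          + bracket(A, bracket(A, B)) / 12
          - bracket(B, bracket(A, B)) / 12)

# The first omitted term is of degree four, so the error is O(eps^4)
assert np.linalg.norm(exact - approx) < 10 * eps**4
```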
One will be geometric: the explicit formula for the series for log[(exp A)(exp B)] will involve integration, and so makes sense over the real or complex numbers. We will derive the formula from the "Maurer-Cartan equations" which we will explain in the course of our discussion. Our second version will be more algebraic. It will involve such ideas as the universal enveloping algebra, comultiplication and the Poincaré-Birkhoff-Witt theorem. In both proofs, many of the key ideas are at least as important as the theorem itself.

1.2 The geometric version of the CBH formula.

To state this formula we introduce some notation. Let ad A denote the operation of bracketing on the left by A, so

    ad A(B) := [A, B].

Define the function ψ by

    ψ(z) = z log z / (z - 1),

which is defined as a convergent power series around the point z = 1:

    ψ(1 + u) = (1 + u) log(1 + u)/u = (1 + u)(1 - u/2 + u²/3 - ⋯) = 1 + u/2 - u²/6 + ⋯ .

In fact, we will also take this as a definition of the formal power series for ψ in terms of u. The Campbell-Baker-Hausdorff formula says that

    log((exp A)(exp B)) = A + ∫₀¹ ψ((exp ad A)(exp t ad B)) B dt.    (1.2)

Remarks.

1. The formula says that we are to substitute

    u = (exp ad A)(exp t ad B) - 1

into the definition of ψ, apply this operator to the element B and then integrate. In carrying out this computation we can ignore all terms in the expansion of ψ in terms of ad A and ad B where a factor of ad B occurs on the right, since (ad B)B = 0. For example, to obtain the expansion through terms of degree three in the Campbell-Baker-Hausdorff formula, we need only retain quadratic and lower order terms in u, and so

    u = ad A + (1/2)(ad A)² + t ad B + (t²/2)(ad B)² + ⋯
    u² = (ad A)² + t(ad B)(ad A) + ⋯

    ∫₀¹ (1 + u/2 - u²/6) dt = 1 + (1/2) ad A + (1/12)(ad A)² - (1/12)(ad B)(ad A) + ⋯ ,

where the dots indicate either higher order terms or terms with ad B occurring on the right. So up through degree three (1.2) gives

    log((exp A)(exp B)) = A + B + (1/2)[A, B] + (1/12)[A, [A, B]] - (1/12)[B, [A, B]] + ⋯ ,

agreeing with our preceding computation.

2.
The meaning of the exponential function on the left hand side of the Campbell-Baker-Hausdorff formula differs from its meaning on the right. On the right hand side, exponentiation takes place in the algebra of endomorphisms of the ring in question. In fact, we will want to make a fundamental reinterpretation of the formula. We want to think of A, B, etc. as elements of a Lie algebra, g. Then the exponentiations on the right hand side of (1.2) are still taking place in End(g). On the other hand, if g is the Lie algebra of a Lie group G, then there is an exponential map exp : g → G, and this is what is meant by the exponentials on the left of (1.2). This exponential map is a diffeomorphism on some neighborhood of the origin in g, and its inverse, log, is defined in some neighborhood of the identity in G. This is the meaning we will attach to the logarithm occurring on the left in (1.2).

3. The most crucial consequence of the Campbell-Baker-Hausdorff formula is that it shows that the local structure of the Lie group G (the multiplication law for elements near the identity) is completely determined by its Lie algebra.

4. For example, we see from the right hand side of (1.2) that group multiplication and group inverse are analytic if we use exponential coordinates.

5. Consider the function τ defined by

    τ(w) := w / (1 - e^(-w)).    (1.3)

This is a familiar function from analysis, as it enters into the Euler-Maclaurin formula, see below. (It is the exponential generating function of (-1)^k b_k where the b_k are the Bernoulli numbers.) Then

    ψ(z) = τ(log z).

6. The formula is named after three mathematicians, Campbell, Baker, and Hausdorff. But this is a misnomer. Substantially earlier than the works of any of these three, there appeared a paper by Friedrich Schur, "Neue Begruendung der Theorie der endlichen Transformationsgruppen," Mathematische Annalen 35 (1890), 161-197.
Schur writes down, as convergent power series, the composition law for a Lie group in terms of "canonical coordinates", i.e., in terms of linear coordinates on the Lie algebra. He writes down recursive relations for the coefficients, obtaining a version of the formulas we will give below. I am indebted to Prof. Schmid for this reference.

Our strategy for the proof of (1.2) will be to prove a differential version of it:

    (d/dt) log((exp A)(exp tB)) = ψ((exp ad A)(exp t ad B)) B.    (1.4)

Since log((exp A)(exp tB)) = A when t = 0, integrating (1.4) from 0 to 1 will prove (1.2). Let us define F = F(t) = F(t, A, B) by

    F = log((exp A)(exp tB)).    (1.5)

Then exp F = (exp A)(exp tB) and so

    (d/dt) exp F(t) = (exp A)(d/dt) exp tB = (exp A)(exp tB)B = (exp F(t))B

so

    (exp -F(t)) (d/dt) exp F(t) = B.

We will prove (1.4) by finding a general expression for

    (exp -C(t)) (d/dt) exp C(t)

where C = C(t) is a curve in the Lie algebra, g; see (1.11) below.

In our derivation of (1.4) from (1.11) we will make use of an important property of the adjoint representation which we might as well state now: For any g ∈ G, define the linear transformation

    Ad g : g → g : X ↦ gXg⁻¹.

(In geometrical terms, this can be thought of as follows: the differential of left multiplication by g carries g = T_I(G) into the tangent space T_g(G) to G at the point g. Right multiplication by g⁻¹ carries this tangent space back to g, and so the combined operation is a linear map of g into itself which we call Ad g.) Notice that Ad is a representation in the sense that

    Ad(gh) = (Ad g)(Ad h)  for all g, h ∈ G.

In particular, for any A ∈ g, we have the one parameter family of linear transformations Ad(exp tA) and

    (d/dt) Ad(exp tA)X = (exp tA)AX(exp -tA) + (exp tA)X(-A)(exp -tA) = (exp tA)[A, X](exp -tA)

so

    (d/dt) Ad(exp tA) = Ad(exp tA) ∘ ad A.

But ad A is a linear transformation acting on g, and the solution to the differential equation

    M'(t) = M(t) ad A,  M(0) = I

(in the space of linear transformations of g) is exp(t ad A). Thus Ad(exp tA) = exp(t ad A).
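For matrix groups this identity is easy to test numerically (a sketch; the matrices are arbitrary examples, and ad A is realized as an n² x n² matrix acting on the row-major vectorization of X):

```python
import numpy as np
from scipy.linalg import expm

rng = np.random.default_rng(0)
n = 3
A = rng.standard_normal((n, n))
X = rng.standard_normal((n, n))
t = 0.7

# ad A as an n^2 x n^2 matrix on the row-major vectorization of X:
# vec(AX - XA) = (A ⊗ I - I ⊗ A^T) vec(X)
I = np.eye(n)
adA = np.kron(A, I) - np.kron(I, A.T)

lhs = expm(t * A) @ X @ expm(-t * A)                 # Ad(exp tA) X
rhs = (expm(t * adA) @ X.reshape(-1)).reshape(n, n)  # exp(t ad A) X

assert np.allclose(lhs, rhs)
```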
Setting t = 1 gives the important formula

    Ad(exp A) = exp(ad A).    (1.6)

As an application, consider the F introduced above. We have

    exp(ad F) = Ad(exp F) = Ad((exp A)(exp tB)) = (Ad exp A)(Ad exp tB) = (exp ad A)(exp t ad B)

hence

    ad F = log((exp ad A)(exp t ad B)).    (1.7)

1.3 The Maurer-Cartan equations.

If G is a Lie group and γ = γ(t) is a curve on G with γ(0) = A ∈ G, then A⁻¹γ is a curve which passes through the identity at t = 0. Hence A⁻¹γ'(0) is a tangent vector at the identity, i.e. an element of g, the Lie algebra of G. In this way, we have defined a linear differential form θ on G with values in g. In case G is a subgroup of the group of all invertible n x n matrices (say over the real numbers), we can write this form as

    θ = A⁻¹ dA.

We can then think of the A occurring above as a collection of n² real valued functions on G (the matrix entries considered as functions on the group) and dA as the matrix of differentials of these functions. The above equation giving θ is then just matrix multiplication. For simplicity, we will work in this case, although the main theorem, equation (1.8) below, works for any Lie group and is quite standard.

The definitions of the groups we are considering amount to constraints on A, and then differentiating these constraints shows that A⁻¹dA takes values in g, and gives a description of g. It is best to explain this by examples:

- O(n): AAᵗ = I, so dA Aᵗ + A dAᵗ = 0, or A⁻¹dA + (A⁻¹dA)ᵗ = 0. Thus o(n) consists of antisymmetric matrices.

- Sp(n): Let

      J := [ 0   I ]
           [ -I  0 ]

  and let Sp(n) consist of all matrices satisfying AJAᵗ = J. Then dA J Aᵗ + A J dAᵗ = 0, or (A⁻¹dA)J + J(A⁻¹dA)ᵗ = 0. The equation BJ + JBᵗ = 0 defines the Lie algebra sp(n).

- Let J be as above and define Gl(n, C) to consist of all invertible matrices satisfying AJ = JA. Then dA J = J dA and so A⁻¹dA J = A⁻¹J dA = J A⁻¹dA.
We return to general considerations: Let us take the exterior derivative of the defining equation θ = A⁻¹dA. For this we need to compute d(A⁻¹): Since d(AA⁻¹) = 0 we have

    dA · A⁻¹ + A d(A⁻¹) = 0  or  d(A⁻¹) = -A⁻¹ dA A⁻¹.

This is the generalization to matrices of the formula in elementary calculus for the derivative of 1/x. Using this formula we get

    dθ = d(A⁻¹dA) = -(A⁻¹ dA A⁻¹) ∧ dA = -A⁻¹dA ∧ A⁻¹dA

or the Maurer-Cartan equation

    dθ + θ ∧ θ = 0.    (1.8)

If we use commutator instead of multiplication we would write this as

    dθ + (1/2)[θ, θ] = 0.    (1.9)

The Maurer-Cartan equation is of central importance in geometry and physics, far more important than the Campbell-Baker-Hausdorff formula itself.

Suppose we have a map g : R² → G, with s, t coordinates on the plane. Pull θ back to the plane, so

    g*θ = g⁻¹(∂g/∂s) ds + g⁻¹(∂g/∂t) dt.

Define

    α = α(s, t) := g⁻¹ ∂g/∂s  and  β = β(s, t) := g⁻¹ ∂g/∂t

so that g*θ = α ds + β dt. Then collecting the coefficient of ds ∧ dt in the Maurer-Cartan equation gives

    ∂β/∂s - ∂α/∂t + [α, β] = 0.    (1.10)

This is the version of the Maurer-Cartan equation we shall use in our proof of the Campbell-Baker-Hausdorff formula. Of course this version is completely equivalent to the general version, since a two form is determined by its restriction to all two dimensional surfaces.

1.4 Proof of CBH from Maurer-Cartan.

Let C(t) be a curve in the Lie algebra g and let us apply (1.10) to

    g(s, t) := exp[sC(t)]

so that

    α(s, t) = g⁻¹ ∂g/∂s = exp[-sC(t)] exp[sC(t)] C(t) = C(t),
    β(s, t) = g⁻¹ ∂g/∂t = exp[-sC(t)] (∂/∂t) exp[sC(t)],

so by (1.10)

    ∂β/∂s = C'(t) - [C(t), β].

For fixed t consider the last equation as the differential equation (in s)

    dβ/ds = -(ad C)β + C',  β(0) = 0,

where C := C(t), C' := C'(t). If we expand β(s, t) as a formal power series in s (for fixed t):

    β(s, t) = a₁s + a₂s² + a₃s³ + ⋯

and compare coefficients in the differential equation we obtain

    a₁ = C',  and  n aₙ = -(ad C)aₙ₋₁,

or

    β(s, t) = sC'(t) + (s²/2!)(-ad C(t))C'(t) + ⋯ + (sⁿ/n!)(-ad C(t))ⁿ⁻¹C'(t) + ⋯ .
If we define

    φ(z) := (e^z - 1)/z = 1 + (1/2!)z + (1/3!)z² + ⋯

and set s = 1 in the expression we derived above for β(s, t), we get

    exp(-C(t)) (d/dt) exp(C(t)) = φ(-ad C(t)) C'(t).    (1.11)

Now to the proof of the Campbell-Baker-Hausdorff formula. Suppose that A and B are chosen sufficiently near the origin so that

    F = F(t) = F(t, A, B) := log((exp A)(exp tB))

is defined for all |t| ≤ 1. Then, as we remarked,

    exp F = (exp A)(exp tB)

so

    exp ad F = (exp ad A)(exp t ad B)

and hence

    ad F = log((exp ad A)(exp t ad B)).

We have

    (d/dt) exp F(t) = (exp A)(d/dt) exp tB = (exp A)(exp tB)B = (exp F(t))B

so, by (1.11),

    B = (exp -F(t)) (d/dt) exp F(t) = φ(-ad F(t)) F'(t) = φ(-log((exp ad A)(exp t ad B))) F'(t).

Now for |z - 1| < 1

    φ(-log z) = (e^(-log z) - 1)/(-log z) = (z⁻¹ - 1)/(-log z) = (z - 1)/(z log z)

so

    ψ(z) φ(-log z) = 1  where  ψ(z) := z log z/(z - 1),

and therefore

    F'(t) = ψ((exp ad A)(exp t ad B)) B.

This proves (1.4), and integrating from 0 to 1 proves (1.2).

1.5 The differential of the exponential and its inverse.

Once again, equation (1.11), which we derived from the Maurer-Cartan equation, is of significant importance in its own right, perhaps more than the use we made of it, namely to prove the Campbell-Baker-Hausdorff theorem. We will rewrite this equation in terms of more familiar geometric operations, but first some preliminaries: The exponential map exp sends the Lie algebra g into the corresponding Lie group, and is a differentiable map. If ξ ∈ g we can consider the differential of exp at the point ξ:

    d(exp)_ξ : g = T_ξ g → T_(exp ξ) G

where we have identified g with its tangent space at ξ, which is possible since g is a vector space. In other words, d(exp)_ξ maps the tangent space to g at the point ξ into the tangent space to G at the point exp(ξ). At ξ = 0 we have

    d(exp)₀ = id

and hence, by the implicit function theorem, d(exp)_ξ is invertible for sufficiently small ξ. Now the Maurer-Cartan form, evaluated at the point exp ξ, sends T_(exp ξ) G back to g:

    θ_(exp ξ) : T_(exp ξ) G → g.
Hence

    θ_(exp ξ) ∘ d(exp)_ξ : g → g

is invertible for sufficiently small ξ. We claim that

    τ(ad ξ) ∘ (θ_(exp ξ) ∘ d(exp)_ξ) = id    (1.12)

where τ is as defined above in (1.3). Indeed, we claim that (1.12) is an immediate consequence of (1.11). Recall the definition (1.3) of the function τ as τ(z) = 1/φ(-z). Multiply both sides of (1.11) by τ(ad C(t)) to obtain

    τ(ad C(t)) [exp(-C(t)) (d/dt) exp(C(t))] = C'(t).    (1.13)

Choose the curve C so that ξ = C(0) and η = C'(0). Then the chain rule says that

    [exp(-C(t)) (d/dt) exp(C(t))]|ₜ₌₀ = θ_(exp ξ)(d(exp)_ξ η),

the result of applying the Maurer-Cartan form θ (at the point exp(ξ)) to the image of η under the differential of the exponential map at ξ ∈ g. Then (1.13) at t = 0 translates into (1.12). QED

1.6 The averaging method.

In this section we will give another important application of (1.10): For fixed ξ ∈ g, the differential of the exponential map is a linear map from g = T_ξ(g) to T_(exp ξ) G. The (differential of) left translation by exp ξ carries T_(exp ξ)(G) back to T_e G = g. Let us denote this composite by exp(ξ)⁻¹ d(exp)_ξ. So

    exp(ξ)⁻¹ d(exp)_ξ : g → g

is a linear map. We claim that for any η ∈ g

    exp(ξ)⁻¹ d(exp)_ξ(η) = ∫₀¹ Ad(exp(-sξ)) η ds.    (1.14)

We will prove this by applying (1.10) to

    g(s, t) = exp(t(ξ + sη)).

Indeed,

    β(s, t) := g(s, t)⁻¹ ∂g/∂t = ξ + sη,

so

    ∂β/∂s ≡ η  and  β(0, t) ≡ ξ.

The left hand side of (1.14) is α(0, 1), where

    α(s, t) := g(s, t)⁻¹ ∂g/∂s,

so we may use (1.10) to get an ordinary differential equation for α(0, t). Defining γ(t) := α(0, t), (1.10) becomes

    dγ/dt = η - [ξ, γ].    (1.15)

For any ζ ∈ g,

    (d/dt) Ad(exp -tξ)ζ = -Ad(exp -tξ)[ξ, ζ] = -[ξ, Ad(exp -tξ)ζ].

So for constant ζ ∈ g, Ad(exp -tξ)ζ is a solution of the homogeneous equation corresponding to (1.15). So, by Lagrange's method of variation of constants, we look for a solution of (1.15) of the form γ(t) = Ad(exp -tξ)δ(t), and (1.15) becomes

    δ'(t) = Ad(exp tξ)η,

so

    γ(t) = Ad(exp -tξ) ∫₀ᵗ Ad(exp sξ) η ds

is the solution of (1.15) with γ(0) = 0. Setting t = 1 gives

    γ(1) = Ad(exp -ξ) ∫₀¹ Ad(exp sξ) η ds,

and replacing s by 1 - s in the integral gives (1.14).
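The identity (1.14) just proved can be illustrated numerically for matrix groups (a sketch; the finite-difference step, the quadrature rule, and the sample matrices are arbitrary choices). The left side is computed by a central-difference approximation of d(exp), the right side by the midpoint rule:

```python
import numpy as np
from scipy.linalg import expm

rng = np.random.default_rng(1)
n = 3
xi = 0.3 * rng.standard_normal((n, n))
eta = rng.standard_normal((n, n))

# Left side of (1.14): exp(-xi) * d(exp)_xi(eta), with the differential
# approximated by a central difference in the direction eta
h = 1e-5
d_exp = (expm(xi + h * eta) - expm(xi - h * eta)) / (2 * h)
lhs = np.linalg.inv(expm(xi)) @ d_exp

# Right side: integral of Ad(exp(-s xi)) eta over s in [0, 1], midpoint rule
N = 2000
rhs = np.zeros_like(eta)
for k in range(N):
    s = (k + 0.5) / N
    E = expm(-s * xi)
    rhs += E @ eta @ np.linalg.inv(E)
rhs /= N

assert np.allclose(lhs, rhs, atol=1e-4)
```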
1.7 The Euler-MacLaurin Formula.

We pause to remind the reader of a different role that the τ function plays in mathematics. We have seen in (1.12) that τ enters into the inverse of the exponential map. In a sense, this formula is taking into account the non-commutativity of the group multiplication, so τ is helping to relate the non-commutative to the commutative. But much earlier in mathematical history, τ was introduced to relate the discrete to the continuous: Let D denote the differentiation operator in one variable. Then if we think of D as the one dimensional vector field d/dh, it generates the one parameter group exp hD which consists of translation by h. In particular, taking h = 1, we have

    (e^D f)(x) = f(x + 1).

This equation is equally valid in a purely algebraic sense, taking f to be a polynomial and

    e^D = 1 + D + (1/2!)D² + (1/3!)D³ + ⋯ .

This series is infinite. But if p is a polynomial of degree d, then D^k p = 0 for k > d, so when applied to any polynomial, the above sum is really finite. Since

    D^k e^(ah) = a^k e^(ah)

it follows that if F is any formal power series in one variable, we have

    F(D) e^(ah) = F(a) e^(ah)    (1.16)

in the ring of power series in two variables. Of course, under suitable convergence conditions this is an equality of functions of h. For example, the function τ(z) = z/(1 - e^(-z)) converges for |z| < 2π since ±2πi are the closest zeros of the denominator (other than 0) to the origin. Hence

    τ(d/dh) e^(zh) = τ(z) e^(zh) = (z/(1 - e^(-z))) e^(zh)    (1.17)

holds for 0 < |z| < 2π. Here the infinite order differential operator on the left is regarded as the limit of the finite order differential operators obtained by truncating the power series for τ at higher and higher orders.

Let a < b be integers. Then for any non-negative values of h₁ and h₂ we have

    ∫ from a-h₁ to b+h₂ of e^(zx) dx = e^(h₂z) e^(bz)/z - e^(-h₁z) e^(az)/z

for z ≠ 0. So if we set

    D₁ := ∂/∂h₁,  D₂ := ∂/∂h₂,
then for 0 < |z| < 2π we have

    τ(D₁)τ(D₂) ∫ from a-h₁ to b+h₂ of e^(zx) dx = τ(z) e^(h₂z) e^(bz)/z - τ(-z) e^(-h₁z) e^(az)/z

because τ(D₁)f(h₂) = f(h₂) when applied to any function of h₂, since the constant term in τ is one and all of the differentiations with respect to h₁ give zero. Setting h₁ = h₂ = 0 gives

    τ(D₁)τ(D₂) ∫ from a-h₁ to b+h₂ of e^(zx) dx |_(h₁=h₂=0) = e^(az)/(1 - e^z) + e^(bz)/(1 - e^(-z)),  0 < |z| < 2π.

On the other hand, the geometric sum gives

    Σ from k=a to b of e^(kz) = e^(az)(1 + e^z + e^(2z) + ⋯ + e^((b-a)z)) = e^(az)(1 - e^((b-a+1)z))/(1 - e^z) = e^(az)/(1 - e^z) + e^(bz)/(1 - e^(-z)).

We have thus proved the following exact Euler-MacLaurin formula:

    τ(D₁)τ(D₂) ∫ from a-h₁ to b+h₂ of f(x) dx |_(h₁=h₂=0) = Σ from k=a to b of f(k),    (1.18)

where the sum on the right is over integer values of k, and we have proved this formula for functions of the form f(x) = e^(zx), 0 < |z| < 2π. It is also true when z = 0 by passing to the limit or by direct evaluation. Repeatedly differentiating (1.18) (with f(x) = e^(zx)) with respect to z gives the corresponding formula with f(x) = xⁿe^(zx), and hence for all functions of the form x ↦ p(x)e^(zx) where p is a polynomial and |z| < 2π. There is a corresponding formula with remainder for C^k functions.

1.8 The universal enveloping algebra.

We will now give an alternative (algebraic) version of the Campbell-Baker-Hausdorff theorem. It depends on several notions which are extremely important in their own right, so we pause to develop them.

A universal algebra of a Lie algebra L is a map ε : L → UL, where UL is an associative algebra with unit, such that

1. ε is a Lie algebra homomorphism, i.e. it is linear and

    ε[x, y] = ε(x)ε(y) - ε(y)ε(x);

2. if A is any associative algebra with unit and α : L → A is any Lie algebra homomorphism, then there exists a unique homomorphism φ of associative algebras such that α = φ ∘ ε.

It is clear that if UL exists, it is unique up to a unique isomorphism. So we may then talk of the universal algebra of L. We will call it the universal enveloping algebra and sometimes put in parentheses, i.e. write U(L).
In case L = g is the Lie algebra of left invariant vector fields on a group G, we may think of L as consisting of left invariant first order homogeneous differential operators on G. Then we may take UL to consist of all left invariant differential operators on G. In this case the construction of UL is intuitive and obvious. The ring of differential operators D on any manifold is filtered by degree: D_n consists of those differential operators with total degree at most n. The quotient D_n/D_(n-1) consists of the homogeneous differential operators of degree n, i.e. homogeneous polynomials in the vector fields with function coefficients. For the case of left invariant differential operators on a group, these vector fields may be taken to be left invariant, and the function coefficients to be constant. In other words, (UL)ⁿ/(UL)ⁿ⁻¹ consists of all symmetric polynomial expressions, homogeneous of degree n, in L. This is the content of the Poincaré-Birkhoff-Witt theorem. In the algebraic case we have to do some work to get all of this. We first must construct U(L).

1.8.1 Tensor product of vector spaces.

Let E₁, ..., E_m be vector spaces and (f, F) a multilinear map f : E₁ x ⋯ x E_m → F. Similarly (g, G). If ℓ is a linear map ℓ : F → G and g = ℓ ∘ f, then we say that ℓ is a morphism of (f, F) to (g, G). In this way we make the set of all (f, F) into a category. We want a universal object in this category; that is, an object with a unique morphism into every other object. So we want a pair (t, T) where T is a vector space, t : E₁ x ⋯ x E_m → T is a multilinear map, and for every (f, F) there is a unique linear map ℓ_f : T → F with

    f = ℓ_f ∘ t.

Uniqueness. Suppose (t, T) and (t', T') are both universal. By the universal property t = ℓ ∘ t' and t' = ℓ' ∘ t for unique morphisms ℓ, ℓ'. So t = (ℓ ∘ ℓ') ∘ t; but also t = id ∘ t, so ℓ ∘ ℓ' = id. Similarly the other way. Thus (t, T), if it exists, is unique up to a unique morphism. This is a standard argument, valid in any category, proving the uniqueness of "initial elements".

Existence.
Let M be the free vector space on the symbols (x₁, ..., x_m), xᵢ ∈ Eᵢ. Let N be the subspace generated by all the

    (x₁, ..., xᵢ + xᵢ', ..., x_m) - (x₁, ..., xᵢ, ..., x_m) - (x₁, ..., xᵢ', ..., x_m)

and all the

    (x₁, ..., a xᵢ, ..., x_m) - a(x₁, ..., xᵢ, ..., x_m)

for all i = 1, ..., m, xᵢ, xᵢ' ∈ Eᵢ, a ∈ k. Let T = M/N and

    t((x₁, ..., x_m)) = (x₁, ..., x_m)/N.

This is universal by its very construction. QED

We introduce the notation

    T = T(E₁ x ⋯ x E_m) =: E₁ ⊗ ⋯ ⊗ E_m.

The universality implies an isomorphism

    (E₁ ⊗ ⋯ ⊗ E_m) ⊗ (E_(m+1) ⊗ ⋯ ⊗ E_(m+n)) ≅ E₁ ⊗ ⋯ ⊗ E_(m+n).

1.8.2 The tensor product of two algebras.

If A and B are algebras, they are vector spaces, so we can form their tensor product as vector spaces. We define a product structure on A ⊗ B by defining

    (a₁ ⊗ b₁) · (a₂ ⊗ b₂) := a₁a₂ ⊗ b₁b₂.

It is easy to check that this extends to give an algebra structure on A ⊗ B. In case A and B are associative algebras so is A ⊗ B, and if in addition both A and B have unit elements, then 1_A ⊗ 1_B is a unit element for A ⊗ B. We will frequently drop the subscripts on the unit elements, for it is easy to see from the position relative to the tensor product sign the algebra to which the unit belongs. In other words, we will write the unit for A ⊗ B as 1 ⊗ 1. We have an isomorphism of A into A ⊗ B given by

    a ↦ a ⊗ 1

when both A and B are associative algebras with units. Similarly for B. Notice that

    (a ⊗ 1) · (1 ⊗ b) = a ⊗ b = (1 ⊗ b) · (a ⊗ 1).

In particular, an element of the form a ⊗ 1 commutes with an element of the form 1 ⊗ b.

1.8.3 The tensor algebra of a vector space.

Let V be a vector space. The tensor algebra of a vector space is the solution of the universal problem for maps α of V into an associative algebra: it consists of an algebra TV and a map ι : V → TV such that ι is linear, and for any linear map α : V → A, where A is an associative algebra, there exists a unique algebra homomorphism ψ : TV → A such that α = ψ ∘ ι. We set

    TⁿV := V ⊗ ⋯ ⊗ V  (n factors).

We define the multiplication to be the isomorphism

    TⁿV ⊗ TᵐV → Tⁿ⁺ᵐV
obtained by "dropping the parentheses," i.e. the isomorphism given at the end of the last subsection. Then

TV := ⊕ₙ TⁿV (with T⁰V the ground field)

is a solution to this universal problem, and hence the unique solution.

1.8.4 Construction of the universal enveloping algebra.

If we take V = L to be a Lie algebra, and let I be the two sided ideal in TL generated by the elements

[x, y] − x ⊗ y + y ⊗ x,

then

UL := TL/I

is a universal algebra for L. Indeed, any homomorphism α of L into an associative algebra A extends to a unique algebra homomorphism α̃ : TL → A, which must vanish on I if it is to be a Lie algebra homomorphism.

1.8.5 Extension of a Lie algebra homomorphism to its universal enveloping algebra.

If h : L → M is a Lie algebra homomorphism, then the composition

ε_M ∘ h : L → UM

induces a homomorphism

UL → UM,

and this assignment, sending Lie algebra homomorphisms into associative algebra homomorphisms, is functorial.

1.8.6 Universal enveloping algebra of a direct sum.

Suppose that L = L1 ⊕ L2, with ε_i : L_i → U(L_i) and ε : L → U(L) the canonical homomorphisms. Define

f : L → U(L1) ⊗ U(L2), f(x1 + x2) = ε1(x1) ⊗ 1 + 1 ⊗ ε2(x2).

This is a homomorphism because x1 and x2 commute. It thus extends to a homomorphism

φ : U(L) → U(L1) ⊗ U(L2).

Also, x1 ↦ ε(x1) is a Lie algebra homomorphism of L1 → U(L), which thus extends to a unique algebra homomorphism

ψ1 : U(L1) → U(L),

and similarly ψ2 : U(L2) → U(L). We have

ψ1(x1)ψ2(x2) = ψ2(x2)ψ1(x1), x1 ∈ L1, x2 ∈ L2,

since [x1, x2] = 0. As the ε_i(x_i) generate U(L_i), the above equation holds with x_i replaced by arbitrary elements u_i ∈ U(L_i), i = 1, 2. So we have a homomorphism

ψ : U(L1) ⊗ U(L2) → U(L), ψ(u1 ⊗ u2) := ψ1(u1)ψ2(u2).

We have

ψ ∘ φ(x1 + x2) = ψ(x1 ⊗ 1) + ψ(1 ⊗ x2) = x1 + x2,

so ψ ∘ φ = id on L and hence on U(L), and

φ ∘ ψ(x1 ⊗ 1 + 1 ⊗ x2) = x1 ⊗ 1 + 1 ⊗ x2,

so φ ∘ ψ = id on L1 ⊗ 1 + 1 ⊗ L2 and hence on U(L1) ⊗ U(L2). Thus

U(L1 ⊕ L2) ≅ U(L1) ⊗ U(L2).

1.8.7 Bialgebra structure.
Consider the map L → U(L) ⊗ U(L), x ↦ x ⊗ 1 + 1 ⊗ x. Then

(x ⊗ 1 + 1 ⊗ x)(y ⊗ 1 + 1 ⊗ y) = xy ⊗ 1 + x ⊗ y + y ⊗ x + 1 ⊗ xy,

and multiplying in the reverse order and subtracting gives

[x ⊗ 1 + 1 ⊗ x, y ⊗ 1 + 1 ⊗ y] = [x, y] ⊗ 1 + 1 ⊗ [x, y].

Thus the map x ↦ x ⊗ 1 + 1 ⊗ x determines an algebra homomorphism

Δ : U(L) → U(L) ⊗ U(L).

Define

ε : U(L) → k, ε(1) = 1, ε(x) = 0, x ∈ L,

and extend as an algebra homomorphism. Then

(ε ⊗ id)(x ⊗ 1 + 1 ⊗ x) = 1 ⊗ x, x ∈ L.

We identify k ⊗ L with L and so can write the above equation as

(ε ⊗ id)(x ⊗ 1 + 1 ⊗ x) = x, x ∈ L.

The algebra homomorphism

(ε ⊗ id) ∘ Δ : U(L) → U(L)

is the identity (on 1 and on L) and hence is the identity. Similarly (id ⊗ ε) ∘ Δ = id. A vector space C with a map Δ : C → C ⊗ C (called a comultiplication) and a map ε : C → k (called a co-unit) satisfying

(ε ⊗ id) ∘ Δ = id and (id ⊗ ε) ∘ Δ = id

is called a co-algebra. If C is an algebra and both Δ and ε are algebra homomorphisms, we say that C is a bi-algebra (sometimes shortened to "bigebra"). So we have proved that (U(L), Δ, ε) is a bialgebra. Also

[(Δ ⊗ id) ∘ Δ](x) = x ⊗ 1 ⊗ 1 + 1 ⊗ x ⊗ 1 + 1 ⊗ 1 ⊗ x = [(id ⊗ Δ) ∘ Δ](x)

for x ∈ L and hence for all elements of U(L). Hence the comultiplication is coassociative. (It is also co-commutative.)

1.9 The Poincaré-Birkhoff-Witt Theorem.

Suppose that V is a vector space made into a Lie algebra by declaring that all brackets are zero. Then the ideal I in TV defining U(V) is generated by x ⊗ y − y ⊗ x, and the quotient TV/I is just the symmetric algebra, SV. So the universal enveloping algebra of the trivial Lie algebra is the symmetric algebra. For any Lie algebra L define UnL to be the subspace of UL generated by products of at most n elements of L, i.e. by all products ε(x1) ··· ε(xm), m ≤ n. For example, U0L = k, the ground field, and U1L = k ⊕ ε(L). We have

U0L ⊂ U1L ⊂ ··· ⊂ UnL ⊂ Un+1L ⊂ ··· and UmL · UnL ⊂ Um+nL.
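The bracket computation above can be modeled concretely with matrices. The following is a minimal numerical sketch (not from the text): Kronecker products stand in for ⊗, and we check that x ↦ x ⊗ 1 + 1 ⊗ x takes brackets to brackets, which is what makes Δ an algebra homomorphism on U(L).

```python
import numpy as np

def delta(x):
    """Model x |-> x (x) 1 + 1 (x) x via Kronecker products."""
    n = x.shape[0]
    i = np.eye(n)
    return np.kron(x, i) + np.kron(i, x)

def bracket(a, b):
    return a @ b - b @ a

rng = np.random.default_rng(0)
x = rng.standard_normal((3, 3))
y = rng.standard_normal((3, 3))

# [Delta(x), Delta(y)] = Delta([x, y]): the cross terms cancel because
# x (x) 1 commutes with 1 (x) y.
assert np.allclose(bracket(delta(x), delta(y)), delta(bracket(x, y)))
```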
We define

grₙ UL := UnL/Un−1L and gr UL := ⊕ₙ grₙ UL

with the multiplication

grₘ UL × grₙ UL → grₘ₊ₙ UL

induced by the multiplication on UL. If a ∈ UnL, we let ā ∈ grₙ UL denote its image under the projection UnL → UnL/Un−1L = grₙ UL. We may write a as a sum of products of at most n elements of L.

To prove the theorem, choose a basis {x_i} of L indexed by a totally ordered set. For each nondecreasing finite sequence M = (i1 ≤ i2 ≤ ··· ≤ iₙ) of indices, let z_M denote a corresponding basis symbol, and let ℓ(M) := n denote the length of M. Write (i, M) for the sequence obtained by prepending i to M, and write i ≤ M if i precedes (or equals) the first entry of M. The strategy is to make the vector space V with basis {z_M} into an L module in such a way that

x_i z_M = z_{(i,M)} whenever i ≤ M, (1.20)

and then show that it satisfies the equation

x y v − y x v = [x, y] v, x, y ∈ L, v ∈ V, (1.21)

which is the condition that makes V into an L module. Our definition will be such that (1.20) holds. In fact, we will define x_i z_M inductively on ℓ(M) and on i. So we start by defining

x_i z_∅ := z_(i),

which is in accordance with (1.20). This defines x_i z_M for ℓ(M) = 0. For ℓ(M) = 1 we define

x_i z_(j) := z_(i,j) if i ≤ j,

while if i > j we set

x_i z_(j) := x_j z_(i) + [x_i, x_j] z_∅ = z_(j,i) + Σ_k c^k_{ij} z_(k),

where

[x_i, x_j] = Σ_k c^k_{ij} x_k

is the expression for the Lie bracket of x_i with x_j in terms of our basis. These c^k_{ij} are known as the structure constants of the Lie algebra L in terms of the given basis. Notice that the first of these two cases is consistent with (and forced on us by) (1.20), while the second is forced on us by (1.21). We now have defined x_i z_M for all i and all M with ℓ(M) ≤ 1, and we have done so in such a way that (1.20) holds, and (1.21) holds where it makes sense (i.e. for ℓ(M) = 0). So suppose that we have defined x_j z_N for all j if ℓ(N) < ℓ(M), and for all j < i if ℓ(N) = ℓ(M), in such a way that x_j z_N is a linear combination of z_L's with ℓ(L) ≤ ℓ(N) + 1 (*). Writing M = (j, N), we then define

x_i z_M := z_{(i,M)} if i ≤ j, and x_i z_M := x_j (x_i z_N) + [x_i, x_j] z_N if i > j.

This makes sense since x_i z_N is already defined as a linear combination of z_L's with ℓ(L) ≤ ℓ(N) + 1 = ℓ(M), and because [x_i, x_j] can be written as a linear combination of the x_k as above. Furthermore (*) holds with j and N replaced by M. Furthermore, (1.20) holds by construction. We must check (1.21). By linearity, this means that we must show that

x_i x_j z_N − x_j x_i z_N = [x_i, x_j] z_N.

If i = j both sides are zero.
Also, since both sides are anti-symmetric in i and j, we may assume that i > j. If j ≤ N (so that the left hand side is given by the inductive definition) then this equation holds by definition. So we need only deal with the case where N = (k, P) with k ≤ P and i > j > k. So we have, by definition,

x_j z_N = x_j z_(k,P) = x_j x_k z_P = x_k x_j z_P + [x_j, x_k] z_P.

Now if j ≤ P then x_j z_P = z_(j,P) and k ≤ (j, P). If not, then x_j z_P = z_Q + w, where still k ≤ Q and w is a linear combination of elements of length < ℓ(N). So we know that (1.21) holds for x = x_i, y = x_k and v = z_(j,P) (if j ≤ P) or v = z_Q (otherwise). Also, by induction, we may assume that we have verified (1.21) for all N′ of length < ℓ(N).

1.11 Free Lie algebras

1.11.1 Magmas and free magmas on a set

Let X be a set. Set X_1 := X and define, inductively, the disjoint unions

X_n := ∐_{p+q=n} X_p × X_q, n ≥ 2.

Thus X_2 consists of all expressions ab, where a and b are elements of X. (We write ab instead of (a, b).) An element of X_3 is either an expression of the form (ab)c or an expression of the form a(bc). An element of X_4 has one out of five forms: a((bc)d), a(b(cd)), (ab)(cd), ((ab)c)d or (a(bc))d. Set

M_X := ∐_{n=1}^∞ X_n.

An element w ∈ M_X is called a non-associative word, and its length ℓ(w) is the unique n such that w ∈ X_n. We have a "multiplication" map M_X × M_X → M_X given by the inclusions

X_p × X_q → X_{p+q}.

Thus the multiplication on M_X is concatenation of non-associative words. (A magma is a set with a multiplication map, no conditions imposed; so M_X is a magma.) If N is any magma, and f : X → N is any map, we define F : M_X → N by F = f on X_1, by F : X_2 → N, F(ab) = f(a)f(b), and inductively F : X_p × X_q → N, F(uv) = F(u)F(v). Any element of X_n has a unique expression as uv, where u ∈ X_p and v ∈ X_q for a unique (p, q) with p + q = n, so this inductive definition is valid. It is clear that F is a magma homomorphism and is uniquely determined by the original map f. Thus M_X is the "free magma on X" or the "universal magma on X", in the sense that it is the solution to the universal problem associated to a map from X to any magma. Let A_X be the vector space of finite formal linear combinations of elements of M_X. So an element of A_X is a finite sum Σ c_m m, with m ∈ M_X and c_m in the ground field.
The multiplication in M_X extends by bi-linearity to make A_X into an algebra. If we are given a map X → B, where B is any algebra, we get a unique magma homomorphism M_X → B extending this map (where we think of B as a magma), and then a unique algebra map A_X → B extending this map by linearity. Notice that the algebra A_X is graded, since every element of M_X has a length and the multiplication on M_X is graded. Hence A_X is the free algebra on X, in the sense that it solves the universal problem associated with maps of X to algebras.

1.11.2 The Free Lie Algebra L_X.

In A_X let I be the two-sided ideal generated by all elements of the form aa, a ∈ A_X, and (ab)c + (bc)a + (ca)b, a, b, c ∈ A_X. We set

L_X := A_X/I

and call L_X the free Lie algebra on X. Any map from X to a Lie algebra L extends to a unique Lie algebra homomorphism from L_X to L. We claim that the ideal I defining L_X is graded. This means that if a = Σ a_n is a decomposition of an element of I into its homogeneous components, then each of the a_n also belongs to I. To prove this, let J ⊂ I denote the set of all a = Σ a_n with the property that all the homogeneous components a_n belong to I. Clearly J is a two sided ideal. We must show that I ⊂ J. For this it is enough to prove the corresponding fact for the generating elements. Clearly if a = Σ a_p, b = Σ b_q, c = Σ c_r, then

(ab)c + (bc)a + (ca)b = Σ_{p,q,r} ( (a_p b_q)c_r + (b_q c_r)a_p + (c_r a_p)b_q ),

and each summand is a generator of I. But also if a = Σ a_m then

aa = Σ_m a_m a_m + Σ_{m<n} (a_m a_n + a_n a_m),

and each summand belongs to I, since a_m a_m does and a_m a_n + a_n a_m = (a_m + a_n)(a_m + a_n) − a_m a_m − a_n a_n. So I is graded, and hence L_X = A_X/I inherits a grading.

1.11.3 The free associative algebra Ass(X).

Let V_X denote the free vector space on the set X, and let Ass_X := T V_X be its tensor algebra: the free associative algebra on X. The map X → V_X ⊂ Ass_X is a map of X into Ass_X regarded as a Lie algebra under the commutator bracket, and so extends to a Lie algebra homomorphism L_X → Ass_X, hence to an algebra homomorphism U(L_X) → Ass_X; in the other direction, the map X → L_X → U(L_X) extends to an algebra homomorphism Ass_X → U(L_X). These are mutually inverse, giving an isomorphism

U(L_X) ≅ Ass_X. (1.23)

By the Poincaré-Birkhoff-Witt theorem, the map L_X → U(L_X) is injective. So under the above isomorphism, the map L_X → Ass_X is injective. On the other hand, by construction, the map X → V_X induces a surjective Lie algebra homomorphism from L_X onto the Lie subalgebra of Ass_X generated by X. So we see that under the isomorphism (1.23), L_X ⊂ U(L_X) is mapped isomorphically onto the Lie subalgebra of Ass_X generated by X. Now the map

X → Ass_X ⊗ Ass_X, x ↦ x ⊗ 1 + 1 ⊗ x

extends to a unique algebra homomorphism

Δ : Ass_X → Ass_X ⊗ Ass_X.
Under the identification (1.23) this is none other than the map

Δ : U(L_X) → U(L_X) ⊗ U(L_X),

and hence we conclude that L_X is the set of primitive elements of Ass_X:

L_X = {w ∈ Ass_X : Δ(w) = w ⊗ 1 + 1 ⊗ w} (1.24)

under the identification (1.23).

1.12 Algebraic proof of CBH and explicit formulas.

We recall our constructs of the past few sections: X denotes a set, L_X the free Lie algebra on X, and Ass_X the free associative algebra on X, so that Ass_X may be identified with the universal enveloping algebra of L_X. Since Ass_X may be identified with the non-commutative polynomials indexed by X, we may consider its completion, F_X, the algebra of formal power series indexed by X. Since the free Lie algebra L_X is graded, we may also consider its completion, which we shall denote by L̂_X. Finally, let m denote the ideal in F_X generated by X. The maps

exp : m → 1 + m, log : 1 + m → m

are well defined by their formal power series and are mutual inverses. (There is no convergence issue since everything is within the realm of formal power series.) Furthermore, exp is a bijection of the set of α ∈ m satisfying Δα = α ⊗ 1 + 1 ⊗ α to the set of all β ∈ 1 + m satisfying Δβ = β ⊗ β.

1.12.1 Abstract version of CBH and its algebraic proof.

In particular, since the set {β ∈ 1 + m : Δβ = β ⊗ β} forms a group, we conclude that for any A, B ∈ L̂_X there exists a C ∈ L̂_X such that

exp C = (exp A)(exp B).

This is the abstract version of the Campbell-Baker-Hausdorff formula. It depends basically on two algebraic facts: that the universal enveloping algebra of the free Lie algebra is the free associative algebra, and that the set of primitive elements in the universal enveloping algebra (those satisfying Δα = α ⊗ 1 + 1 ⊗ α) is precisely the original Lie algebra.

1.12.2 Explicit formula for CBH.

Define the map

Φ : m ∩ Ass_X → L_X, Φ(x1 x2 ⋯ xn) := [x1, [x2, ..., [x_{n−1}, x_n] ··· ]] =
ad(x1) ad(x2) ⋯ ad(x_{n−1})(x_n), and let

θ : Ass_X → End(L_X)

be the algebra homomorphism extending the Lie algebra homomorphism ad : L_X → End(L_X). We claim that

Φ(uv) = θ(u)Φ(v), ∀ u ∈ Ass_X, v ∈ m ∩ Ass_X. (1.25)

Proof. It is enough to prove this formula when u is a monomial, u = x1 ⋯ xn. We do this by induction on n. For n = 0 the assertion is obvious, and for n = 1 it follows from the definition of Φ. Suppose n > 1. Then

Φ(x1 x2 ⋯ xn v) = θ(x1)Φ(x2 ⋯ xn v) = θ(x1)θ(x2 ⋯ xn)Φ(v) = θ(x1 ⋯ xn)Φ(v)

by induction. QED

Let Lⁿ_X denote the n-th graded component of L_X. So L¹_X consists of linear combinations of elements of X, L²_X is spanned by all brackets of pairs of elements of X, and in general Lⁿ_X is spanned by elements of the form [u, v], u ∈ Lᵖ_X, v ∈ L^q_X, p + q = n. We claim that

Φ(u) = nu ∀ u ∈ Lⁿ_X. (1.26)

For n = 1 this is immediate from the definition of Φ. So by induction it is enough to verify this on elements of the form [u, v] as above. We have

Φ([u, v]) = Φ(uv − vu)
= θ(u)Φ(v) − θ(v)Φ(u)
= q θ(u)v − p θ(v)u by induction
= q[u, v] − p[v, u] since θ(w) = ad(w) for w ∈ L_X
= (p + q)[u, v]. QED

We can now write down an explicit formula for the n-th term in the Campbell-Baker-Hausdorff expansion. Consider the case where X consists of two elements, X = {x, y}, x ≠ y. Let us write

z = log((exp x)(exp y)), z ∈ L̂_X, z = Σ z_n(x, y).

We want an explicit expression for z_n(x, y). We know that

z_n = (1/n) Φ(z_n),

and z_n is a sum of non-commutative monomials of degree n in x and y. Now

(exp x)(exp y) = Σ_{p,q≥0} (xᵖ y^q)/(p! q!) = 1 + Σ_{p+q≥1} (xᵖ y^q)/(p! q!),

so

z = log((exp x)(exp y)) = Σ_{m≥1} ((−1)^{m+1}/m) Σ (x^{p1} y^{q1} x^{p2} y^{q2} ⋯ x^{pm} y^{qm}) / (p1! q1! ⋯ pm! qm!),

the inner sum being over all p_i, q_i ≥ 0 with p_i + q_i ≥ 1. We want to apply (1/n)Φ to the terms of total degree n. By (1.25), a monomial ending in y contributes

((−1)^{m+1}/m) · ad(x)^{p1} ad(y)^{q1} ⋯ ad(x)^{pm} ad(y)^{qm−1}(y) / (p1! q1! ⋯ pm! qm!),

while a monomial ending in x (i.e. with q_m = 0, p_m = 1) contributes

((−1)^{m+1}/m) · ad(x)^{p1} ad(y)^{q1} ⋯ ad(y)^{q_{m−1}}(x) / (p1! q1! ⋯ p_{m−1}! q_{m−1}!),

summed over p1 + ··· + p_{m−1} = p − 1, q1 + ··· + q_{m−1} = q, p_i + q_i ≥ 1 (i = 1, ..., m − 1), q_{m−1} ≥ 1. (Monomials ending in x^{pm} with p_m ≥ 2 are killed by Φ, since ad(x)(x) = 0; similarly for repeated trailing y's.) The first four terms are:

z1(x, y) = x + y
z2(x, y) = (1/2)[x, y]
z3(x, y) = (1/12)[x, [x, y]] + (1/12)[y, [y, x]]
z4(x, y) = −(1/24)[y, [x, [x, y]]].

Chapter 2

sl(2) and its Representations.
In this chapter (and in most of the succeeding chapters) all Lie algebras and vector spaces are over the complex numbers.

2.1 Low dimensional Lie algebras.

Any one dimensional Lie algebra must be commutative, since [X, X] = 0 in any Lie algebra. If g is a two dimensional Lie algebra, say with basis X, Y, then

[aX + bY, cX + dY] = (ad − bc)[X, Y],

so that there are two possibilities: [X, Y] = 0, in which case g is commutative, or [X, Y] ≠ 0, call it B, and the Lie bracket of any two elements of g is a multiple of B. So if C is not a multiple of B, we have [C, B] = cB for some c ≠ 0, and setting A = c⁻¹C we get a basis A, B of g with the bracket relations

[A, B] = B.

This is an interesting Lie algebra; it is the Lie algebra of the group of all affine transformations of the line, i.e. all transformations of the form

x ↦ ax + b, a ≠ 0.

For this reason it is sometimes called the "ax + b group". Since

( a b ; 0 1 ) ( x ; 1 ) = ( ax + b ; 1 )

(writing matrices row by row), we can realize the group of affine transformations of the line as a group of two by two matrices. Writing

A = ( 1 0 ; 0 0 ), B = ( 0 1 ; 0 0 ),

so that

exp tA = ( eᵗ 0 ; 0 1 ), exp tB = ( 1 t ; 0 1 ),

we see that our algebra g with basis A, B and [A, B] = B is indeed the Lie algebra of the ax + b group.

In a similar way, we could list all possible three dimensional Lie algebras, by first classifying them according to dim[g, g] and then analyzing the possibilities for each value of this dimension. Rather than going through all the details, we list the most important examples of each type. If dim[g, g] = 0 the algebra is commutative, so there is only one possibility. A very important example arises when dim[g, g] = 1, and that is the Heisenberg algebra, with basis P, Q, Z and bracket relations

[P, Q] = Z, [Z, P] = [Z, Q] = 0.

Up to constants (such as Planck's constant and i) these are the famous Heisenberg commutation relations.
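The Heisenberg relations give an exact instance of the CBH formula of Chapter 1: when [x, y] is central, all brackets beyond z2 vanish, so log((exp x)(exp y)) = x + y + (1/2)[x, y] exactly. Here is a minimal numerical sketch (not from the text), using strictly upper triangular 3 × 3 matrices, for which all products of three factors vanish, so the exponential and logarithm series terminate and the check is exact.

```python
import numpy as np

def expm_nilpotent(n):
    """exp for a matrix n with n^3 = 0: the series terminates."""
    return np.eye(3) + n + n @ n / 2

def logm_unipotent(u):
    """log(1 + m) for m = u - I with m^3 = 0."""
    m = u - np.eye(3)
    return m - m @ m / 2

x = np.array([[0., 1, 0], [0, 0, 0], [0, 0, 0]])   # plays the role of P
y = np.array([[0., 0, 0], [0, 0, 1], [0, 0, 0]])   # plays the role of Q
bracket = x @ y - y @ x                             # plays the role of Z; central here

z = logm_unipotent(expm_nilpotent(x) @ expm_nilpotent(y))
# CBH truncates: z3 = z4 = ... = 0 since all higher brackets vanish.
assert np.allclose(z, x + y + bracket / 2)
```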
Indeed, we can realize this algebra as an algebra of operators on functions of one variable x: let P = D = d/dx, and let Q be multiplication by x. Since, for any function f = f(x), we have D(xf) = f + xf′, we see that [P, Q] = id, so setting Z = id we obtain the Heisenberg algebra.

As an example with dim[g, g] = 2 we have (the complexification of) the Lie algebra of the group of Euclidean motions in the plane. Here we can find a basis h, x, y of g with brackets given by

[h, x] = y, [h, y] = −x, [x, y] = 0.

More generally, we could start with a commutative two dimensional algebra and adjoin an element h, with ad h acting as an arbitrary linear transformation A of our two dimensional space. The object of study of this chapter is the algebra sl(2) of all two by two matrices of trace zero, where [g, g] = g.

2.2 sl(2) and its irreducible representations.

Indeed, sl(2) is spanned by the matrices

h = ( 1 0 ; 0 −1 ), e = ( 0 1 ; 0 0 ), f = ( 0 0 ; 1 0 ).

They satisfy

[h, e] = 2e, [h, f] = −2f, [e, f] = h.

Thus every element of sl(2) can be expressed as a sum of brackets of elements of sl(2); in other words, [sl(2), sl(2)] = sl(2). The bracket relations above are also satisfied by the matrices

ρ2(h) := ( 2 0 0 ; 0 0 0 ; 0 0 −2 ), ρ2(e) := ( 0 2 0 ; 0 0 1 ; 0 0 0 ), ρ2(f) := ( 0 0 0 ; 1 0 0 ; 0 2 0 ),

the matrices

ρ3(h) := diag(3, 1, −1, −3), ρ3(e) := the 4 × 4 matrix with superdiagonal (3, 2, 1), ρ3(f) := the 4 × 4 matrix with subdiagonal (1, 2, 3),

and, more generally, the (n + 1) × (n + 1) matrices

ρn(h) := diag(n, n − 2, ..., −n + 2, −n),
ρn(e) := the matrix with superdiagonal (n, n − 1, ..., 2, 1) and all other entries zero,
ρn(f) := the matrix with subdiagonal (1, 2, ..., n − 1, n) and all other entries zero.

These representations of sl(2) are all irreducible, as is seen by successively applying ρn(e) to any non-zero vector until a vector with non-zero entry in the first position and all other entries zero is obtained. Then keep applying ρn(f) to fill up the entire space.
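The matrices ρn(h), ρn(e), ρn(f) just described can be built and checked mechanically. A short numerical sketch (not from the text) verifying the sl(2) relations for several n:

```python
import numpy as np

def rho(n):
    """The (n+1)x(n+1) matrices rho_n(h), rho_n(e), rho_n(f)."""
    h = np.diag([n - 2 * k for k in range(n + 1)]).astype(float)
    e = np.diag([n - k for k in range(n)], k=1).astype(float)   # superdiag n, n-1, ..., 1
    f = np.diag([k + 1 for k in range(n)], k=-1).astype(float)  # subdiag 1, 2, ..., n
    return h, e, f

def br(a, b):
    return a @ b - b @ a

for n in (1, 2, 3, 7):
    h, e, f = rho(n)
    assert np.allclose(br(h, e), 2 * e)
    assert np.allclose(br(h, f), -2 * f)
    assert np.allclose(br(e, f), h)
```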
These are all the finite dimensional irreducible representations of sl(2), as can be seen as follows: In U(sl(2)) we have

[h, fᵏ] = −2k fᵏ, [h, eᵏ] = 2k eᵏ, (2.1)
[e, fᵏ] = −k(k − 1) fᵏ⁻¹ + k fᵏ⁻¹ h. (2.2)

Equation (2.1) follows from the fact that bracketing by any element is a derivation, together with the fundamental relations in sl(2). Equation (2.2) is proved by induction: for k = 1 it is true from the defining relations of sl(2). Assuming it for k, we have

[e, fᵏ⁺¹] = [e, f] fᵏ + f [e, fᵏ]
= h fᵏ − k(k − 1) fᵏ + k fᵏ h
= [h, fᵏ] + fᵏ h − k(k − 1) fᵏ + k fᵏ h
= −2k fᵏ − k(k − 1) fᵏ + (k + 1) fᵏ h
= −(k + 1)k fᵏ + (k + 1) fᵏ h.

We may rewrite (2.2) as

[e, fᵏ/k!] = (−k + 1) fᵏ⁻¹/(k − 1)! + (fᵏ⁻¹/(k − 1)!) h. (2.3)

In any finite dimensional module V, the element h has at least one eigenvector. This follows from the fundamental theorem of algebra, which asserts that any polynomial has at least one root; in particular the characteristic polynomial of any linear transformation on a finite dimensional space has a root. So there is a vector w such that hw = μw for some complex number μ. Then

h(ew) = [h, e]w + ehw = 2ew + μew = (μ + 2)(ew).

Thus ew is again an eigenvector of h, this time with eigenvalue μ + 2. Since eigenvectors with distinct eigenvalues are linearly independent and V is finite dimensional, successively applying e yields a vector v_λ such that

h v_λ = λ v_λ, e v_λ = 0. (2.4)

Then U(sl(2))v_λ is an invariant subspace, hence all of V. We say that v is a cyclic vector for the action of g on V if U(g)v = V. We are thus led to study all modules for sl(2) with a cyclic vector v_λ satisfying (2.4). In any such space the elements

(1/k!) fᵏ v_λ, k = 0, 1, 2, ...

span, and are eigenvectors of h of weight λ − 2k. For any λ ∈ C we can construct such a module as follows: Let b⁺ denote the subalgebra of sl(2) generated by h and e. Then U(b⁺), the universal enveloping algebra of b⁺, can be regarded as a subalgebra of U(sl(2)). We can make C into a b⁺ module, and hence a U(b⁺) module, by

h · 1 := λ, e · 1 := 0.
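Relation (2.2) holds in U(sl(2)), hence in every representation, so it can be spot-checked numerically. A small sketch (not from the text) in the (n + 1)-dimensional representation built earlier:

```python
import numpy as np

# Check [e, f^k] = -k(k-1) f^(k-1) + k f^(k-1) h, relation (2.2),
# in the (n+1)-dimensional representation of sl(2).
n = 6
h = np.diag([n - 2 * k for k in range(n + 1)]).astype(float)
e = np.diag([n - k for k in range(n)], k=1).astype(float)
f = np.diag([k + 1 for k in range(n)], k=-1).astype(float)

def mpow(a, k):
    return np.linalg.matrix_power(a, k)

for k in range(1, n + 1):
    lhs = e @ mpow(f, k) - mpow(f, k) @ e
    rhs = -k * (k - 1) * mpow(f, k - 1) + k * mpow(f, k - 1) @ h
    assert np.allclose(lhs, rhs)
```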
Then the space

U(sl(2)) ⊗_{U(b⁺)} C,

with e acting on C as 0 and h acting as multiplication by λ, is a cyclic module with cyclic vector v_λ = 1 ⊗ 1 which satisfies (2.4). It is a "universal" such module, in the sense that any other cyclic module with cyclic vector satisfying (2.4) is a homomorphic image of the one we just constructed. This space U(sl(2)) ⊗_{U(b⁺)} C is infinite dimensional. It is irreducible unless

e fᵏ v_λ = 0

for some integer k ≥ 1. Indeed, any non-zero vector w in the space is a finite linear combination of the basis elements (1/k!) fᵏ v_λ; choose k to be the largest integer so that the coefficient of the corresponding element does not vanish. Then successive application of the element e (k times) will yield a multiple of v_λ, and if this multiple is non-zero, then U(sl(2))w = U(sl(2))v_λ is the whole space. But by (2.3),

e ((1/k!) fᵏ v_λ) = [e, fᵏ/k!] v_λ = (λ − k + 1) (fᵏ⁻¹/(k − 1)!) v_λ.

This vanishes only if λ is an integer and k = λ + 1, in which case there is a unique finite dimensional quotient, of dimension k = λ + 1. QED

The finite dimensional irreducible representations having zero as a weight are all odd dimensional and have only even weights. We will call them "even". They are called "integer spin" representations by the physicists. The others are "odd" or "half spin" representations.

2.3 The Casimir element.

In U(sl(2)) consider the element

C := (1/2) h² + ef + fe, (2.5)

called the Casimir element, or simply the "Casimir", of sl(2). Since ef = fe + [e, f] = fe + h in U(sl(2)), we can also write

C = (1/2) h² + h + 2fe. (2.6)

This implies that if v is a "highest weight vector" in an sl(2) module, satisfying

ev = 0, hv = λv,

then

Cv = (1/2) λ(λ + 2) v. (2.7)

Now in U(sl(2)) we have

[h, C] = 2[h, fe] = 2([h, f]e + f[h, e]) = 2(−2fe + 2fe) = 0

and

[C, e] = [(1/2)h², e] + [h, e] + 2[fe, e] = (he + eh) + 2e − 2he = eh − he + 2e = −[h, e] + 2e = 0.

Similarly [C, f] = 0. In other words, C lies in the center of the universal enveloping algebra of sl(2), i.e. it commutes with all elements.
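In the (n + 1)-dimensional irreducible representation the Casimir must, being central, act as the scalar (1/2)n(n + 2) given by (2.7). A short numerical sketch (not from the text) confirming this:

```python
import numpy as np

def rep(n):
    """The (n+1)-dimensional representation of h, e, f described earlier."""
    h = np.diag([n - 2 * k for k in range(n + 1)]).astype(float)
    e = np.diag([n - k for k in range(n)], k=1).astype(float)
    f = np.diag([k + 1 for k in range(n)], k=-1).astype(float)
    return h, e, f

for n in (1, 2, 5):
    h, e, f = rep(n)
    C = h @ h / 2 + e @ f + f @ e        # the Casimir (2.5)
    # C acts as the scalar n(n+2)/2, in agreement with (2.7).
    assert np.allclose(C, n * (n + 2) / 2 * np.eye(n + 1))
```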
If V is a module which possesses a "highest weight vector" v_λ as above, and if V has the property that v_λ is a cyclic vector, meaning that V = U(L)v_λ, then C takes on the constant value

C = (1/2) λ(λ + 2) I,

since C is central and v_λ is cyclic.

2.4 sl(2) is simple.

An ideal I in a Lie algebra g is a subspace of g which is invariant under the adjoint representation. In other words, I is an ideal if [g, I] ⊂ I. If a Lie algebra g has the property that its only ideals are 0 and g itself, and if g is not commutative, we say that g is simple. Let us prove that sl(2) is simple. Since sl(2) is not commutative, we must prove that the only ideals are 0 and sl(2) itself. We do this by introducing some notation which will allow us to generalize the proof in the next chapter. Let g = sl(2) and set

g−1 := Cf, g0 := Ch, g1 := Ce,

so that g, as a vector space, is the direct sum of the three one dimensional spaces

g = g−1 ⊕ g0 ⊕ g1.

Correspondingly, write any x ∈ g as x = x−1 + x0 + x1. If we let

d := (1/2) h,

then we have

x = x−1 + x0 + x1,
[d, x] = −x−1 + 0 + x1,
[d, [d, x]] = x−1 + 0 + x1.

Since the matrix

( 1 1 1 ; −1 0 1 ; 1 0 1 )

is invertible, we see that we can solve for the "components" x−1, x0 and x1 in terms of x, [d, x], [d, [d, x]]. This means that if I is an ideal, then

I = I−1 ⊕ I0 ⊕ I1,

where I−1 := I ∩ g−1, I0 := I ∩ g0, I1 := I ∩ g1. Now if I0 ≠ 0, then d = (1/2)h ∈ I, and hence e = [d, e] and f = −[d, f] also belong to I, so I = sl(2). If I−1 ≠ 0, so that f ∈ I, then h = [e, f] ∈ I, so I = sl(2). Similarly, if I1 ≠ 0, so that e ∈ I, then h = [e, f] ∈ I, so I = sl(2). Thus if I ≠ 0 then I = sl(2), and we have proved that sl(2) is simple.

2.5 Complete reducibility.

We will use the Casimir element C to prove that every finite dimensional representation W of sl(2) is completely reducible, which means that if W′ is an invariant subspace there exists a complementary invariant subspace W″ so that W = W′ ⊕ W″. Indeed we will prove:

Theorem 2
1. Every finite dimensional representation of sl(2) is completely reducible.
2. Each irreducible subspace is a cyclic highest weight module with highest weight n, where n is a non-negative integer.
3. When the representation is decomposed into a direct sum of irreducible components, the number of components with even highest weight is the multiplicity of 0 as an eigenvalue of h, and
4. the number of components with odd highest weight is the multiplicity of 1 as an eigenvalue of h.

Proof. We know that every irreducible finite dimensional representation is a cyclic module with integer highest weight, that those with even highest weight contain 0 as an eigenvalue of h with multiplicity one and do not contain 1 as an eigenvalue of h, and that those with odd highest weight contain 1 as an eigenvalue of h with multiplicity one and do not contain 0 as an eigenvalue. So 2), 3) and 4) follow from 1). We must prove 1). We first prove

Proposition 1 Let 0 → V → W → k → 0 be an exact sequence of sl(2) modules such that the action of sl(2) on k is trivial (as it must be, since sl(2) has no non-trivial one dimensional modules). Then this sequence splits, i.e. there is a line in W supplementary to V on which sl(2) acts trivially.

This proposition is, of course, a special case of the theorem we want to prove. But we shall see that it is sufficient to prove the theorem.

Proof of proposition. It is enough to prove the proposition for the case that V is an irreducible module. Indeed, if V1 is a proper submodule, then by induction on dim V we may assume the theorem is known for

0 → V/V1 → W/V1 → k → 0,

so that there is a one dimensional invariant subspace M in W/V1, supplementary to V/V1, on which the action is trivial. Let N be the inverse image of M in W. By another application of the proposition, this time to the sequence

0 → V1 → N → M → 0,

we find an invariant line, P, in N complementary to V1. So N = V1 ⊕ P.
Since (W/V1) = (V/V1) ⊕ M, we must have P ∩ V = {0}. But since dim W = dim V + 1, we must have W = V ⊕ P. In other words, P is a one dimensional invariant subspace of W which is complementary to V.

Next we are reduced to proving the proposition for the case that sl(2) acts faithfully on V. Indeed, let I = the kernel of the action on V. Since sl(2) is simple, either I = sl(2) or I = 0. Suppose that I = sl(2). For all x ∈ sl(2) we have, by hypothesis, xW ⊂ V, and for x ∈ I = sl(2) we have xV = 0. Hence [sl(2), sl(2)] = sl(2) acts trivially on all of W, and the proposition is obvious.

So we are reduced to the case that V is irreducible and the action, ρ, of sl(2) on V is injective. We have our Casimir element C, whose image in End W must map W → V, since every element of sl(2) does. On the other hand, on V we have C = (1/2)n(n + 2) Id ≠ 0, since we are assuming that the action of sl(2) on the irreducible module V is not trivial. In particular, the restriction of C to V is an isomorphism. Hence ker C ⊂ W is an invariant line supplementary to V. We have proved the proposition.

Proof of theorem from proposition. Let E′ ⊂ E be a submodule, and we may assume that E′ ≠ 0. We want to find an invariant complement to E′ in E. Define W to be the subspace of Hom_k(E, E′) consisting of those linear maps whose restriction to E′ is a scalar times the identity, and let V ⊂ W be the subspace consisting of those linear transformations whose restriction to E′ is zero. Each of these is a submodule of End(E). We get a sequence

0 → V → W → k → 0,

and hence a complementary line of invariant elements in W. In particular, we can find an element T which is invariant, maps E → E′, and whose restriction to E′ is non-zero. Then ker T is an invariant complementary subspace. QED

2.6 The Weyl group.

We have

exp e = ( 1 1 ; 0 1 ) and exp(−f) = ( 1 0 ; −1 1 ),

so

(exp e)(exp(−f))(exp e) = ( 1 1 ; 0 1 )( 1 0 ; −1 1 )( 1 1 ; 0 1 ) = ( 0 1 ; −1 0 ).

Since Ad(exp a) = exp(ad a),
we see that

τ := (exp ad e)(exp ad(−f))(exp ad e)

consists of conjugation by the matrix

( 0 1 ; −1 0 ).

Thus

τ(h) = ( 0 1 ; −1 0 ) ( 1 0 ; 0 −1 ) ( 0 −1 ; 1 0 ) = −h,
τ(e) = ( 0 1 ; −1 0 ) ( 0 1 ; 0 0 ) ( 0 −1 ; 1 0 ) = −f,

and similarly τ(f) = −e. In short,

τ : e ↦ −f, f ↦ −e, h ↦ −h.

In particular, τ induces the "reflection" h ↦ −h on Ch, and hence the reflection μ ↦ −μ (which we shall also denote by s) on the (one dimensional) dual space. In any finite dimensional module V of sl(2) the action of the element

τ := (exp e)(exp(−f))(exp e)

is defined, and

τ⁻¹ h τ = Ad(τ⁻¹)(h) = s(h) = −h,

so if hu = μu then

h(τu) = τ(τ⁻¹ h τ)u = −μ(τu) = (sμ)(τu).

So if

V_μ := {u ∈ V : hu = μu},

then

τ(V_μ) = V_{sμ}. (2.8)

The two element group consisting of the identity and the element s (acting as a reflection as above) is called the Weyl group of sl(2). Its generalization to an arbitrary simple Lie algebra, together with the generalization of formula (2.8), will play a key role in what follows.

Chapter 3

The classical simple algebras.

In this chapter we introduce the "classical" finite dimensional simple Lie algebras, which come in four families: the algebras sl(n + 1), consisting of all traceless (n + 1) × (n + 1) matrices, the orthogonal algebras on even and odd dimensional spaces (the structures for the even and odd cases are different), and the symplectic algebras (whose definition we will give below). We will prove that they are indeed simple by a uniform method: the method that we used in the preceding chapter to prove that sl(2) is simple. So we axiomatize this method.

3.1 Graded simplicity.

We introduce the following conditions on the Lie algebra g:

g = g−1 ⊕ g0 ⊕ g1 ⊕ g2 ⊕ ··· (3.1)
[g_i, g_j] ⊂ g_{i+j} (3.2)
[g−1, g1] = g0 (3.3)
[g−1, z] = 0 ⟹ z = 0, ∀ z ∈ g_i, ∀ i ≥ 0 (3.4)
There exists a d ∈ g0 satisfying [d, x] = kx, ∀ x ∈ g_k, ∀ k, (3.5)

and

g−1 is irreducible under the (adjoint) action of g0. (3.6)

Condition (3.4) means that if x ∈ g_i, i ≥ 0, is such that [y, x] = 0 for all y ∈ g−1, then x = 0.
We wish to show that any non-zero g satisfying these six conditions is simple. We know that g−1, g0 and g1 are all non-zero, since 0 ≠ d ∈ g0 by (3.5) and [g−1, g1] = g0 by (3.3). So g can not be the one dimensional commutative algebra, and hence what we must show is that any non-zero ideal I of g must be all of g.

We first show that any ideal I must be a graded ideal, i.e. that

I = I−1 ⊕ I0 ⊕ I1 ⊕ ···, where I_j := I ∩ g_j.

Indeed, write any x ∈ I as x = x−1 + x0 + x1 + ··· + x_k and successively bracket by d to obtain

x = x−1 + x0 + x1 + ··· + x_k
[d, x] = −x−1 + 0 + x1 + ··· + k x_k
[d, [d, x]] = x−1 + 0 + x1 + ··· + k² x_k
···
(ad d)ᵏ x = (−1)ᵏ x−1 + 0 + x1 + ··· + kᵏ x_k
(ad d)ᵏ⁺¹ x = (−1)ᵏ⁺¹ x−1 + 0 + x1 + ··· + kᵏ⁺¹ x_k.

The matrix

( 1 1 1 ··· 1 ; −1 0 1 ··· k ; 1 0 1 ··· k² ; ··· ; (−1)ᵏ⁺¹ 0 1 ··· kᵏ⁺¹ )

is non singular. Indeed, it is a van der Monde matrix, that is, a matrix of the form

( 1 1 ··· 1 ; t1 t2 ··· t_{k+2} ; t1² t2² ··· t_{k+2}² ; ··· ; t1ᵏ⁺¹ t2ᵏ⁺¹ ··· t_{k+2}ᵏ⁺¹ ),

whose determinant is Π_{i<j} (t_j − t_i), non-zero when the t_i are distinct; here (t1, ..., t_{k+2}) = (−1, 0, 1, ..., k). So we can solve for the components x_j as linear combinations of x, [d, x], [d, [d, x]], ..., all of which lie in I. Thus I = ⊕ I_j.

Since g−1 is irreducible under the adjoint action of g0 and I−1 is a g0-invariant subspace of g−1, either I−1 = 0 or I−1 = g−1.

Suppose I−1 = 0, and suppose 0 ≠ y ∈ I0. Since every element of [g−1, y] belongs to I and to g−1, we conclude that [g−1, y] = 0, and hence that y = 0 by (3.4). Thus I0 = 0. Suppose that we know that I_{j−1} = 0 for some j ≥ 1. Then the same argument shows that any y ∈ I_j satisfies [g−1, y] = 0 and hence y = 0. So I_j = 0 for all j, and since I is the sum of all the I_j, we conclude that I = 0.

Now suppose that I−1 = g−1. Then g0 = [g−1, g1] ⊂ I. Furthermore, since d ∈ g0 ⊂ I, we conclude that g_k ⊂ I for all k ≠ 0, since every element y of such a g_k can be written as y = (1/k)[d, y] ∈ I. Hence I = g. QED

For example, the Lie algebra of all polynomial vector fields, where

g_k = { Σ f_i ∂/∂x_i : the f_i homogeneous polynomials of degree k + 1 },

is a simple Lie algebra. Here d is the Euler vector field

d = x1 ∂/∂x1 + ··· + x_n ∂/∂x_n.

This algebra is infinite dimensional. We are primarily interested in the finite dimensional Lie algebras.
3.2 sl(n + 1)

Write the most general matrix in sl(n + 1) as

( −tr A w* ; v A ),

where A is an arbitrary n × n matrix, v is a column vector, and w* = (w1, ..., wn) is a row vector. Let g−1 consist of matrices with just the top row, i.e. with v = A = 0. Let g1 consist of matrices with just the left column, i.e. with A = w* = 0. Let g0 consist of matrices with just the central block, i.e. with v = w* = 0. Let

d := (1/(n + 1)) ( −n 0 ; 0 I ),

where I is the n × n identity matrix. Thus g0 acts on g−1 as the algebra of all endomorphisms, and so g−1 is irreducible. We have

[ ( 0 0 ; v 0 ), ( 0 w* ; 0 0 ) ] = ( −(w*, v) 0 ; 0 v ⊗ w* ),

where (w*, v) denotes the value of the linear function w* on the vector v, and this is precisely the trace of the rank one linear transformation v ⊗ w*. Thus all our axioms are satisfied. The algebra sl(n + 1) is simple.

3.3 The orthogonal algebras.

The algebra o(2) is one dimensional and (hence) commutative. In our (real) Euclidean three dimensional space, the algebra o(3) has a basis X, Y, Z (infinitesimal rotations about each of the axes) with bracket relations

[X, Y] = Z, [Y, Z] = X, [Z, X] = Y

(the usual formulae for the "vector product" in three dimensions). But we are over the complex numbers, so we can consider the basis X + iY, −X + iY, 2iZ and find that

[2iZ, X + iY] = 2(X + iY), [2iZ, −X + iY] = −2(−X + iY), [X + iY, −X + iY] = 2iZ.

These are the bracket relations for sl(2) with

e = X + iY, f = −X + iY, h = 2iZ.

In other words, the complexification of our three dimensional world is the irreducible three dimensional representation of sl(2), so o(3) ≅ sl(2), which is simple.

To study the higher dimensional orthogonal algebras it is useful to make two remarks: If V is a vector space with a non-degenerate symmetric bilinear form ( , ), we get an isomorphism of V with its dual space V*, sending every u ∈ V to the linear function ℓ_u, where ℓ_u(v) := (v, u). This gives an identification of

End(V) = V ⊗ V* with V ⊗ V.
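The change of basis exhibiting o(3) ≅ sl(2) can be checked directly. A small numerical sketch (not from the text; the normalization h = 2iZ is chosen so that [e, f] = h comes out exactly):

```python
import numpy as np

# Infinitesimal rotations of R^3 about the three axes.
X = np.array([[0, 0, 0], [0, 0, -1], [0, 1, 0]], dtype=complex)
Y = np.array([[0, 0, 1], [0, 0, 0], [-1, 0, 0]], dtype=complex)
Z = np.array([[0, -1, 0], [1, 0, 0], [0, 0, 0]], dtype=complex)

def br(a, b):
    return a @ b - b @ a

# The o(3) relations [X,Y] = Z, [Y,Z] = X, [Z,X] = Y.
assert np.allclose(br(X, Y), Z)
assert np.allclose(br(Y, Z), X)
assert np.allclose(br(Z, X), Y)

# Over C, the combinations below satisfy the sl(2) relations.
e, f, h = X + 1j * Y, -X + 1j * Y, 2j * Z
assert np.allclose(br(h, e), 2 * e)
assert np.allclose(br(h, f), -2 * f)
assert np.allclose(br(e, f), h)
```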
Under this identification, the elements of o(V) become identified with the antisymmetric two tensors, that is with elements of Λ²(V). (In terms of an orthonormal basis, a matrix A belongs to o(V) if and only if it is anti-symmetric.) Explicitly, an element u ∧ v becomes identified with the linear transformation A_{u∧v} where

A_{u∧v} x = (x, v)u - (x, u)v.

This has the following consequence. Suppose that z ∈ V with (z, z) ≠ 0, and let w be any element of V. Then

A_{w∧z} z = (z, z)w - (z, w)z

and so U(o(V))z = V. On the other hand, suppose that u ∈ V with (u, u) = 0. We can find v ∈ V with (v, v) = 0 and (v, u) = 1. Now suppose in addition that dim V ≥ 3. We can then find a z ∈ V orthogonal to the plane spanned by u and v and with (z, z) = 1. Then A_{z∧v} u = z, so z ∈ U(o(V))u and hence U(o(V))u = V. We have proved:

1. If dim V ≥ 3 then every non-zero vector in V is cyclic, i.e. the representation of o(V) on V is irreducible.

(In two dimensions this is false - the line spanned by a vector e with (e, e) = 0 is a one dimensional invariant subspace.) We now show that

2. o(V) is simple for dim V ≥ 5.

For this, begin by writing down the bracket relations for elements of o(V) in terms of their parametrization by elements of Λ²V. Direct computation shows that

[A_{u∧v}, A_{x∧y}] = (v, x)A_{u∧y} - (u, x)A_{v∧y} - (v, y)A_{u∧x} + (u, y)A_{v∧x}.   (3.7)

Now let n = dim V - 2 and choose a basis u, v, x_1, ..., x_n of V where

(u, u) = (u, x_i) = (v, v) = (v, x_i) = 0 ∀ i, (u, v) = 1, (x_i, x_j) = δ_ij.

Let g := o(V) and write W for the subspace spanned by the x_i. Set d := A_{u∧v} and

g_{-1} := { A_{v∧x}, x ∈ W }, g_0 := o(W) ⊕ ℂd, g_1 := { A_{u∧x}, x ∈ W }.

It then follows from (3.7) that d satisfies (3.5). The spaces g_{-1} and g_1 look like copies of W with the o(W) part of g_0 acting as o(W), hence irreducibly since dim W ≥ 3. All our remaining axioms are easily verified. Hence o(V) is simple for dim V ≥ 5. We have seen that o(3) = sl(2) is simple.
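Formula (3.7) can be checked by machine. The sketch below uses the standard inner product on R^5, for which A_{u∧v} is the matrix u vᵀ - v uᵀ, and randomly chosen vectors (the dimension 5 and the random seed are arbitrary choices):

```python
import numpy as np

rng = np.random.default_rng(1)
dim = 5

def A(a, b):                        # A_{a∧b} x = (x, b)a - (x, a)b for the standard form
    return np.outer(a, b) - np.outer(b, a)

u, v, x, y = (rng.standard_normal(dim) for _ in range(4))

lhs = A(u, v) @ A(x, y) - A(x, y) @ A(u, v)
rhs = ((v @ x) * A(u, y) - (u @ x) * A(v, y)
       - (v @ y) * A(u, x) + (u @ y) * A(v, x))

assert np.allclose(lhs, rhs)
print("bracket relation (3.7) verified")
```

Since both sides are bilinear in each of u, v, x, y, checking the identity on a basis (or, as here, on generic vectors) suffices in principle; the random check is a sanity test, not a proof.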
However o(4) is not simple, being isomorphic to sl(2) ⊕ sl(2): Indeed, if Z_1 and Z_2 are vector spaces equipped with non-degenerate anti-symmetric bilinear forms ( , )_1 and ( , )_2, then Z_1 ⊗ Z_2 has a non-degenerate symmetric bilinear form ( , ) determined by

(u_1 ⊗ u_2, v_1 ⊗ v_2) = (u_1, v_1)_1 (u_2, v_2)_2.

The algebra sl(2) acting on its basic two dimensional representation infinitesimally preserves the antisymmetric form given by

( (x_1, x_2), (y_1, y_2) ) = x_1 y_2 - x_2 y_1.

Hence, if we take Z = Z_1 = Z_2 to be this two dimensional space, we see that sl(2) ⊕ sl(2) acts as infinitesimal orthogonal transformations on Z ⊗ Z, which is four dimensional. But o(4) is six dimensional, so the embedding of sl(2) ⊕ sl(2) in o(4) is in fact an isomorphism since 3 + 3 = 6.

3.4 The symplectic algebras.

We consider an even dimensional space with coordinates q_1, ..., q_n, p_1, ..., p_n. The polynomials have a Poisson bracket

{f, g} := Σ_i ( ∂f/∂q_i ∂g/∂p_i - ∂f/∂p_i ∂g/∂q_i ).   (3.8)

This is clearly anti-symmetric, and direct computation will show that the Jacobi identity is satisfied. Here is a more interesting proof of Jacobi's identity: Notice that if f is a constant, then {f, g} = 0 for all g. So in doing bracket computations we can ignore constants. On the other hand, if we take g to be successively q_1, ..., q_n, p_1, ..., p_n in (3.8), we see that the partial derivatives of f are completely determined by how it brackets with all g, in fact with all linear g. If we fix f, the map h ↦ {f, h} is a derivation, i.e. it is linear and satisfies

{f, h_1 h_2} = {f, h_1} h_2 + h_1 {f, h_2}.

This follows immediately from the definition (3.8). Now Jacobi's identity amounts to the assertion that

{{f, g}, h} = {f, {g, h}} - {g, {f, h}},

i.e. that the derivation h ↦ {{f, g}, h} is the commutator of the derivations h ↦ {f, h} and h ↦ {g, h}. It is enough to check this on linear polynomials h, and hence on the polynomials q_j and p_k.
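The "direct computation" establishing Jacobi's identity for (3.8) can also be delegated to a machine. The following sketch implements the Poisson bracket for polynomials in q_1, q_2, p_1, p_2, stored as dictionaries from exponent tuples to coefficients; the three test polynomials are arbitrary choices:

```python
from collections import defaultdict

N = 2                                          # two q's and two p's

def mul(f, g):
    h = defaultdict(float)
    for m1, c1 in f.items():
        for m2, c2 in g.items():
            h[tuple(a + b for a, b in zip(m1, m2))] += c1 * c2
    return dict(h)

def diff(f, i):                                # d/dx_i with x = (q1, q2, p1, p2)
    h = defaultdict(float)
    for m, c in f.items():
        if m[i] > 0:
            h[m[:i] + (m[i] - 1,) + m[i + 1:]] += c * m[i]
    return dict(h)

def add(f, g, s=1.0):
    h = defaultdict(float, f)
    for m, c in g.items():
        h[m] += s * c
    return {m: c for m, c in h.items() if c != 0}

def pb(f, g):                                  # the Poisson bracket (3.8)
    out = {}
    for i in range(N):
        out = add(out, mul(diff(f, i), diff(g, N + i)))
        out = add(out, mul(diff(f, N + i), diff(g, i)), -1.0)
    return out

f = {(2, 0, 0, 1): 1.0, (0, 1, 1, 0): 1.0}     # q1^2 p2 + q2 p1
g = {(0, 0, 1, 1): 1.0, (1, 1, 0, 0): 1.0}     # p1 p2 + q1 q2
h = {(1, 0, 2, 0): 1.0}                        # q1 p1^2

jacobi = add(add(pb(pb(f, g), h), pb(pb(g, h), f)), pb(pb(h, f), g))
assert jacobi == {}
print("Jacobi identity holds on the test polynomials")
```

Since the Jacobi expression is multilinear in the coefficients of f, g, h, checking it on spanning sets of monomials would constitute a proof; the check above is only a sanity test.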
If we take h = q_j then

{f, q_j} = -∂f/∂p_j, {g, q_j} = -∂g/∂p_j,

so

{f, {g, q_j}} - {g, {f, q_j}} = Σ_i ( -∂f/∂q_i ∂²g/∂p_i∂p_j + ∂f/∂p_i ∂²g/∂q_i∂p_j + ∂g/∂q_i ∂²f/∂p_i∂p_j - ∂g/∂p_i ∂²f/∂q_i∂p_j ),

while

{{f, g}, q_j} = -∂{f, g}/∂p_j = Σ_i ( -∂²f/∂p_j∂q_i ∂g/∂p_i - ∂f/∂q_i ∂²g/∂p_j∂p_i + ∂²f/∂p_j∂p_i ∂g/∂q_i + ∂f/∂p_i ∂²g/∂p_j∂q_i ),

and these agree, as desired, with a similar computation for p_k.

The symplectic algebra sp(2n) is defined to be the subalgebra consisting of all homogeneous quadratic polynomials. We divide these polynomials into three groups as follows: Let g_1 consist of homogeneous polynomials in the q's alone, so g_1 is spanned by the q_i q_j. Let g_{-1} be the quadratic polynomials in the p's alone, and let g_0 be the mixed terms, so spanned by the q_i p_j. It is easy to see that g_0 ≅ gl(n) and that [g_{-1}, g_1] = g_0. To check that g_{-1} is irreducible under g_0, observe that [p_i q_j, p_k p_ℓ] = 0 if j ≠ k and j ≠ ℓ, and [p_i q_j, p_j p_ℓ] is a multiple of p_i p_ℓ. So we can by one or two brackets carry any non-zero element of g_{-1} into a non-zero multiple of p_1², and then get any monomial from p_1² by bracketing with the p_i q_1 appropriately. The element d is given by ½(p_1 q_1 + ··· + p_n q_n). We have shown that the symplectic algebra is simple, but we haven't really explained what it is.

Consider the space V of homogeneous linear polynomials, i.e. all polynomials of the form

f = a_1 q_1 + ··· + a_n q_n + b_1 p_1 + ··· + b_n p_n.

Define an anti-symmetric bilinear form ω on V by setting

ω(ℓ, ℓ') := {ℓ, ℓ'}.

From the formula (3.8) it follows that the Poisson bracket of two linear functions is a constant, so ω does indeed define an antisymmetric bilinear form on V, and we know that this bilinear form is non-degenerate. Furthermore, if f is a homogeneous quadratic polynomial, and ℓ is linear, then {f, ℓ} is again linear, and if we denote the map ℓ ↦ {f, ℓ} by A = A_f, then Jacobi's identity translates into

ω(Aℓ, ℓ') + ω(ℓ, Aℓ') = 0   (3.9)

since {ℓ, ℓ'} is a constant. Condition (3.9) can be interpreted as saying that A belongs to the Lie algebra of the group of all linear transformations R on V which preserve ω, i.e.
which satisfy ω(Rℓ, Rℓ') = ω(ℓ, ℓ'). This group is known as the symplectic group. The form ω induces an isomorphism of V with V*, and hence of Hom(V, V) = V ⊗ V* with V ⊗ V, and this time the image of the set of A satisfying (3.9) consists of all symmetric tensors of degree two, i.e. of S²(V). (Just as in the orthogonal case we got the anti-symmetric tensors.) But the space S²(V) is the same as the space of homogeneous polynomials of degree two. In other words, the symplectic algebra as defined above is the same as the Lie algebra of the symplectic group.

It is an easy theorem in linear algebra that if V is a vector space which carries a non-degenerate anti-symmetric bilinear form, then V must be even dimensional, and if dim V = 2n then it is isomorphic to the space constructed above. We will not pause to prove this theorem.

3.5 The root structures.

We are going to choose a basis for each of the classical simple algebras which generalizes the basis e, f, h that we chose for sl(2). Indeed, for each classical simple algebra g we will first choose a maximal commutative subalgebra h all of whose elements are semi-simple (= diagonalizable) in the adjoint representation. Since the adjoint actions of all the elements of h commute, this means that they can be simultaneously diagonalized. Thus we can decompose g into a direct sum of simultaneous eigenspaces

g = h ⊕ ⊕_α g_α,   (3.10)

where 0 ≠ α ∈ h* and

g_α := { x ∈ g | [h, x] = α(h)x ∀ h ∈ h }.

The linear functions α are called roots (originally because the α(h) are roots of the characteristic polynomial of ad(h)). The simultaneous eigenspace g_α is called the root space corresponding to α. The collection of all roots will usually be denoted by Σ.

For sp(2n), let h consist of all linear combinations of p_1 q_1, ..., p_n q_n, and let L_i be defined by

L_i(a_1 p_1 q_1 + ··· + a_n p_n q_n) = a_i,

so L_1, ..., L_n is the basis of h* dual to the basis p_1 q_1, ..., p_n q_n of h.
If h = a_1 p_1 q_1 + ··· + a_n p_n q_n then

[h, q_i q_j] = (a_i + a_j) q_i q_j
[h, q_i p_j] = (a_i - a_j) q_i p_j
[h, p_i p_j] = -(a_i + a_j) p_i p_j

so the roots are ±(L_i + L_j) (all i, j) and L_i - L_j (i ≠ j).

3.5.3 D_n = o(2n).

We choose a basis u_1, ..., u_n, v_1, ..., v_n of our orthogonal vector space V such that

(u_i, u_j) = (v_i, v_j) = 0 ∀ i, j, (u_i, v_j) = δ_ij.

We let h be the subalgebra of o(V) spanned by the A_{u_i v_i}, i = 1, ..., n. Here we have written A_{uv} instead of A_{u∧v} in order to save space. We take A_{u_1 v_1}, ..., A_{u_n v_n} as a basis of h and let L_1, ..., L_n be the dual basis. Then

±L_k ± L_ℓ, k ≠ ℓ,

are the roots, since from (3.7) we have

[A_{u_i v_i}, A_{u_k u_ℓ}] = (δ_ik + δ_iℓ) A_{u_k u_ℓ}
[A_{u_i v_i}, A_{u_k v_ℓ}] = (δ_ik - δ_iℓ) A_{u_k v_ℓ}
[A_{u_i v_i}, A_{v_k v_ℓ}] = -(δ_ik + δ_iℓ) A_{v_k v_ℓ}.

We can choose as positive roots the L_k + L_ℓ and the L_k - L_ℓ, k < ℓ, and set

α_i := L_i - L_{i+1}, i = 1, ..., n - 1, α_n := L_{n-1} + L_n.

Every positive root is a sum of these simple roots. If we set

h_i := A_{u_i v_i} - A_{u_{i+1} v_{i+1}}, i = 1, ..., n - 1, and h_n := A_{u_{n-1} v_{n-1}} + A_{u_n v_n},

then α_i(h_i) = 2 and for i ≠ j

α_i(h_j) = 0, j ≠ i ± 1, i = 1, ..., n - 2
α_i(h_{i±1}) = -1, i = 1, ..., n - 2
α_{n-1}(h_{n-2}) = -1   (3.13)
α_n(h_{n-2}) = -1 = α_{n-2}(h_n)
α_n(h_{n-1}) = 0 = α_{n-1}(h_n).

For i = 1, ..., n - 1 the elements h_i, A_{u_i v_{i+1}}, A_{u_{i+1} v_i} form a subalgebra isomorphic to sl(2), as do h_n, A_{u_{n-1} u_n}, A_{v_{n-1} v_n}.

3.5.4 B_n = o(2n + 1), n ≥ 2.

We choose a basis u_1, ..., u_n, v_1, ..., v_n, x of our orthogonal vector space V such that

(u_i, u_j) = (v_i, v_j) = 0 ∀ i, j, (u_i, v_j) = δ_ij

and

(x, u_i) = (x, v_i) = 0 ∀ i, (x, x) = 1.

As in the even dimensional case we let h be the subalgebra of o(V) spanned by the A_{u_i v_i}, i = 1, ..., n, take A_{u_1 v_1}, ..., A_{u_n v_n} as a basis of h, and let L_1, ..., L_n be the dual basis. Then

±L_i ± L_j, i ≠ j, and ±L_i

are roots. We take the L_k + L_ℓ and L_k - L_ℓ, k < ℓ, together with the L_k, to be the positive roots, and

α_i := L_i - L_{i+1}, i = 1, ..., n - 1, α_n := L_n

to be the simple roots. We let h_i := A_{u_i v_i} - A_{u_{i+1} v_{i+1}}, i = 1, ..., n - 1, as in the even case, but set h_n := 2 A_{u_n v_n}.
Then every positive root can be written as a sum of the simple roots, α_i(h_i) = 2, i = 1, ..., n, and for i ≠ j

α_i(h_j) = 0, j ≠ i ± 1
α_i(h_{i±1}) = -1, i = 1, ..., n - 2, and α_n(h_{n-1}) = -1   (3.14)
α_{n-1}(h_n) = -2.

Notice that in this case α_{n-1} + 2α_n = L_{n-1} + L_n is a root. Finally we can construct subalgebras isomorphic to sl(2), with the first n - 1 as in the even orthogonal case and the last sl(2) spanned by h_n, A_{u_n x}, A_{v_n x}.

3.5.5 Diagrammatic presentation.

The information of the last four subsections can be summarized in each of the following four diagrams:

[Figure 3.1: Dynkin diagrams of the classical simple algebras A_ℓ, B_ℓ, C_ℓ, D_ℓ.]

The way to read these diagrams is as follows: each node in the diagram stands for a simple root, reading from left to right, starting with α_1 at the left. (In the diagram D_ℓ the two rightmost nodes are α_{ℓ-1} and α_ℓ, say the top α_{ℓ-1} and the bottom α_ℓ.) Two nodes α_i and α_j are connected by (one or more) edges if and only if α_i(h_j) ≠ 0. In all cases, the difference α_i - α_j is never a root, and, for i ≠ j, α_i(h_j) ≤ 0 and is an integer. If, for i ≠ j, α_i(h_j) < 0, then α_i + α_j is a root. In two of the cases (B_ℓ and C_ℓ) it happens that α_i(h_j) = -2. Then α_i + α_j and α_i + 2α_j are roots, and we draw a double bond with an arrow pointing towards α_j. In this case 2 is the maximum integer k such that α_i + kα_j is a root. In all other cases, this maximum integer k is one if the nodes are connected (and zero if they are not).

3.6 Low dimensional coincidences.

We have already seen that o(4) ≅ sl(2) ⊕ sl(2). We also have

o(6) ≅ sl(4).

Both algebras are fifteen dimensional and both are simple. So to realize this isomorphism we need only find an orthogonal representation of sl(4) on a six dimensional space. If we let V = ℂ⁴ with the standard representation of sl(4), we get a representation of sl(4) on Λ²(V) which is six dimensional.
So we must describe a non-degenerate bilinear form on Λ²V which is invariant under the action of sl(4). We have a map, wedge product,

Λ²V × Λ²V → Λ⁴V.

Furthermore this map is symmetric, and invariant under the action of gl(4). Now sl(4) preserves a basis (a non-zero element) of Λ⁴V, and so we may identify Λ⁴V with ℂ. It is easy to check that the bilinear form so obtained is non-degenerate.

We also have the identification

sp(4) ≅ o(5),

both algebras being ten dimensional. To see this let V = ℂ⁴ with an antisymmetric form ω preserved by sp(4). Then ω ⊗ ω induces a symmetric bilinear form on V ⊗ V, as we have seen. Sitting inside V ⊗ V as an invariant subspace is Λ²V, which is six dimensional. But Λ²V is not irreducible as a representation of sp(4). Indeed, ω ∈ Λ²V* is invariant, and hence its kernel is a five dimensional subspace of Λ²V which is invariant under sp(4). We thus get a non-zero homomorphism sp(4) → o(5) which must be an isomorphism since sp(4) is simple.

These coincidences can be seen in the diagrams. If we were to allow ℓ = 2 in the diagram for B_ℓ it would be indistinguishable from C_2. If we were to allow ℓ = 3 in the diagram for D_ℓ it would be indistinguishable from A_3.

3.7 Extended diagrams.

It follows from Jacobi's identity that in the decomposition (3.10) we have

[g_α, g_{α'}] ⊂ g_{α+α'}   (3.15)

with the understanding that the right hand side is zero if α + α' is not a root. In each of the cases examined above, every positive root is a linear combination of the simple roots with non-negative integer coefficients. Since the algebra is finite dimensional, there must be a maximal positive root β, in the sense that β + α_i is not a root for any simple root α_i. For example, in the case of A_n = sl(n + 1), the root β := L_1 - L_{n+1} is maximal. The corresponding g_β consists of all (n + 1) × (n + 1) matrices with zeros everywhere except in the upper right hand corner.
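For a small concrete case one can verify that β is maximal by checking that its root vector E_{1,n+1} commutes with every simple root vector E_{i,i+1}, so that β + α_i is never a root. A sketch for the hypothetical case sl(4), i.e. n = 3:

```python
import numpy as np

n = 3                                    # sl(4) = A_3

def E(i, j):                             # elementary matrix E_{ij}, 1-indexed
    m = np.zeros((n + 1, n + 1))
    m[i - 1, j - 1] = 1.0
    return m

beta = E(1, n + 1)                       # root vector for the maximal root L_1 - L_{n+1}
for i in range(1, n + 1):                # simple root vectors E_{i,i+1}
    ei = E(i, i + 1)
    # [E_{1,n+1}, E_{i,i+1}] = 0, so beta + alpha_i is not a root
    assert np.allclose(beta @ ei - ei @ beta, 0)

print("L_1 - L_{n+1} is a maximal root")
```

The same elementary-matrix computation gives the general statement: [E_{1,n+1}, E_{i,i+1}] = δ_{n+1,i} E_{1,i+1} - δ_{i+1,1} E_{i,n+1} = 0 for 1 ≤ i ≤ n.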
We can also consider the minimal root, which is the negative of the maximal root, so

α_0 := -β = L_{n+1} - L_1

in the case of A_n. Continuing to study this case, let h_0 be the coroot corresponding to α_0, i.e. the diagonal matrix with -1 in the first position, +1 in the last position, and zeros elsewhere. Then we have

α_i(h_i) = 2, i = 0, ..., n,

and

α_0(h_1) = α_0(h_n) = -1, α_0(h_i) = 0, i ≠ 0, 1, n.

This means that if we write out the (n + 1) × (n + 1) matrix whose entries are α_i(h_j), i, j = 0, ..., n, we obtain a matrix of the form 2I - M, where M_{ij} = 1 if and only if j = i ± 1, with the understanding that n + 1 = 0 and -1 = n, i.e. we do the subscript arithmetic mod n + 1. In other words, M is the adjacency matrix of the cyclic graph with n + 1 vertices labeled 0, ..., n. Also, we have

h_0 + h_1 + ··· + h_n = 0.

If we apply α_i to this equation for i = 0, ..., n we obtain

(2I - M)1 = 0,

where 1 is the column vector all of whose entries are 1. We can write this equation as

M1 = 2·1.

In other words, 1 is an eigenvector of M with eigenvalue 2. In the chapters that follow we shall see that any finite dimensional simple Lie algebra has roots, simple roots, maximal roots etc. giving rise to a matrix M with integer entries which is irreducible (in the sense of non-negative matrices - definition later on) and which has an eigenvector with positive (integer) entries with eigenvalue 2. This will allow us to classify the simple (finite dimensional) Lie algebras.

Chapter 4

Engel-Lie-Cartan-Weyl

We return to the general theory of Lie algebras. Many of the results in this chapter are valid over arbitrary fields; indeed, if we use the axioms to define a Lie algebra over a ring, many of the results are valid in this generality. But some of the results depend heavily on the ground ring being an algebraically closed field of characteristic zero. As a compromise, throughout this chapter we deal with fields, and will assume that all vector spaces and all Lie algebras which appear are finite dimensional. We will indicate the necessary additional assumptions on the ground field as they occur.
The treatment here follows Serre pretty closely.

4.1 Engel's theorem

Define a Lie algebra g to be nilpotent if there is an n such that

[x_1, [x_2, ..., [x_n, x_{n+1}] ··· ]] = 0 ∀ x_1, ..., x_{n+1} ∈ g.

Example: n_+ := n_+(gl(d)) := all strictly upper triangular matrices. Notice that the product of any d such matrices is zero. The claim is that all nilpotent Lie algebras are essentially like n_+. We can reformulate the definition of nilpotent as saying that the product of any n operators ad x_i vanishes. One version of Engel's theorem is

Theorem 3. g is nilpotent if and only if ad x is a nilpotent operator for each x ∈ g.

This follows (taking V = g and the adjoint representation) from

Theorem 4 (Engel). Let ρ : g → End(V) be a representation such that ρ(x) is nilpotent for each x ∈ g. Then there exists a basis in terms of which ρ(g) ⊂ n_+(gl(d)), i.e. becomes strictly upper triangular. Here d = dim V.

Given a single nilpotent operator, we can always find a non-zero vector v which it sends into zero. Then on V/{v} there is a non-zero vector which the induced map sends into zero, etc. So in terms of such a flag, the corresponding matrix is strictly upper triangular. The theorem asserts that we can find a single flag which works for all ρ(x). In view of the above proof for a single operator, Engel's theorem follows from the following simpler looking statement:

Theorem 5. Under the hypotheses of Engel's theorem, if V ≠ 0, there exists a non-zero vector v ∈ V such that ρ(x)v = 0 ∀ x ∈ g.

Proof of Theorem 5 in seven easy steps.

• Replace g by its image, i.e. assume that g ⊂ End V.

• Then (ad x)y = L_x y - R_x y, where L_x is the linear map of End V into itself given by left multiplication by x, and R_x is given by right multiplication by x. Both L_x and R_x are nilpotent as operators since x is nilpotent. Also they commute. Hence by the binomial formula (ad x)^n = (L_x - R_x)^n vanishes for sufficiently large n.
• We may assume (by induction) that for any Lie algebra m of smaller dimension than that of g (and any representation) there exists a v ∈ V such that xv = 0 ∀ x ∈ m.

• Let k ⊂ g be a subalgebra, k ≠ g, and let

N = N(k) := { x ∈ g | (ad x)k ⊂ k }

be its normalizer. The claim is that N(k) is strictly larger than k. To see this, observe that each x ∈ k acts on k and on g/k by nilpotent maps, and hence there is a 0 ≠ ȳ ∈ g/k killed by all x ∈ k. But then y ∉ k, and [y, x] = -[x, y] ∈ k for all x ∈ k. So y ∈ N(k), y ∉ k.

• If g ≠ 0, there is an ideal i ⊂ g such that dim g/i = 1. Indeed, let i be a maximal proper subalgebra of g. Its normalizer is strictly larger, hence all of g, so i is an ideal. The inverse image in g of a line in g/i is a subalgebra, and is strictly larger than i. Hence it must be all of g, so dim g/i = 1.

• Choose such an ideal i. The subspace

W ⊂ V, W = { v | xv = 0 ∀ x ∈ i },

is invariant under g. Indeed, if y ∈ g, w ∈ W and x ∈ i, then

xyw = yxw + [x, y]w = 0.

• W ≠ 0 by induction. Take y ∈ g, y ∉ i. It preserves W and is nilpotent. Hence there is a non-zero v ∈ W with yv = 0. Since y and i span g, we have xv = 0 ∀ x ∈ g. QED

No assumptions about the ground field went into this.

4.2 Solvable Lie algebras.

Let g be a Lie algebra. D^n g is defined inductively by

D^0 g := g, D^1 g := [g, g], ..., D^{n+1} g := [D^n g, D^n g].

If we take b to consist of all upper triangular n × n matrices, then D^1 b = n_+ consists of all strictly triangular matrices, and then successive brackets eventually lead to zero. We claim that the following conditions are equivalent, and any Lie algebra satisfying them is called solvable:

1. ∃ n such that D^n g = 0.

2. ∃ n such that for every family of 2^n elements of g the successive brackets of brackets vanish; e.g. for n = 4 this says

[[[[x_1, x_2], [x_3, x_4]], [[x_5, x_6], [x_7, x_8]]], [[[x_9, x_10], [x_11, x_12]], [[x_13, x_14], [x_15, x_16]]]] = 0.

3. There exists a sequence of subspaces g := i_1 ⊃ i_2 ⊃ ···
⊃ i_n = 0 such that each is an ideal in the preceding and such that the quotient i_j / i_{j+1} is abelian, i.e. [i_j, i_j] ⊂ i_{j+1}.

Proof of the equivalence of these conditions: [g, g] is always an ideal in g, so the D^i g form a sequence of ideals as demanded by 3), and hence 1) ⇒ 3). We also have the obvious implications 3) ⇒ 2) and 2) ⇒ 1). So all these definitions are equivalent.

Theorem 6 (Lie). Let g be a solvable Lie algebra over an algebraically closed field k of characteristic zero, and (ρ, V) a finite dimensional representation of g. Then we can find a basis of V so that ρ(g) consists of upper triangular matrices.

By induction on dim V this reduces to

Theorem 7 (Lie). Under the same hypotheses, there exists a (non-zero) common eigenvector v for all the ρ(y), i.e. there is a vector v ∈ V and a function χ : g → k such that

ρ(y)v = χ(y)v ∀ y ∈ g.   (4.1)

Lemma 2. Suppose that i is an ideal of g and (4.1) holds for all y ∈ i. Then

χ([x, h]) = 0 ∀ x ∈ g, h ∈ i.

Proof of lemma. For x ∈ g let V_k be the subspace spanned by v, xv, ..., x^{k-1}v, and let n > 0 be minimal such that V_n = V_{n+1}. So V_n is finite dimensional, xV_n ⊂ V_n, and V_n = V_{n+k} for all k.

Also, for h ∈ i (dropping the ρ) we have:

hv = χ(h)v
hxv = xhv + [h, x]v ≡ χ(h)xv mod V_1
hx²v = xhxv + [h, x]xv ≡ χ(h)x²v mod V_2
...
hx^i v ≡ χ(h)x^i v mod V_i,

where at each stage we use that [h, x] ∈ i, so that, by the preceding step applied to [h, x], its contribution lies in the smaller subspace. Thus V_n is invariant under i, and for each h ∈ i,

tr_{V_n} h = nχ(h).

In particular both x and h leave V_n invariant, and tr_{V_n}[x, h] = 0 since the trace of any commutator is zero. Hence nχ([x, h]) = 0, and so χ([x, h]) = 0. This proves the lemma.

Proof of theorem by induction on dim g, which we may assume to be positive. Let m be any subspace of g with g ⊃ m ⊃ [g, g]. Then [g, m] ⊂ [g, g] ⊂ m, so m is an ideal in g. In particular, we may choose m to be a subspace of codimension 1 containing [g, g]. By induction we can find a v ∈ V and a χ : m → k such that (4.1) holds for all elements of m. Let

W := { w ∈ V | hw = χ(h)w ∀ h ∈ m }.
If x ∈ g, then

hxw = xhw + [h, x]w = χ(h)xw + χ([h, x])w = χ(h)xw,

since χ([h, x]) = 0 by the lemma. Thus W is stable under all of g. Pick x ∈ g, x ∉ m, and let v ∈ W be an eigenvector of x with eigenvalue λ, say. Then v is a simultaneous eigenvector for all of g, with χ extended to g by χ(h + rx) := χ(h) + rλ. QED

We had to divide by n in the above argument. In fact, the theorem is not true over a field of characteristic 2, with sl(2) as a counterexample. Applied to the adjoint representation, Lie's theorem says that there is a flag of ideals with commutative quotients, and hence that [g, g] is nilpotent.

4.3 Linear algebra

Let V be a finite dimensional vector space over an algebraically closed field of characteristic zero, let u ∈ End(V), and let

det(TI - u) = ∏_i (T - λ_i)^{m_i}

be the factorization of its characteristic polynomial, where the λ_i are distinct. Let S(T) be any polynomial satisfying

S(T) ≡ λ_i mod (T - λ_i)^{m_i}, S(T) ≡ 0 mod T,

which is possible by the Chinese remainder theorem. For each i let

V_i := the kernel of (u - λ_i)^{m_i}.

Then V = ⊕ V_i, and on V_i the operator S(u) is just the scalar operator λ_i I. In particular s := S(u) is semi-simple (its eigenvectors span V) and, since s is a polynomial in u, it commutes with u. So

u = s + n,

where n := N(u), N(T) := T - S(T), is nilpotent. Also ns = sn. We claim that these two elements are uniquely determined by

u = s + n, sn = ns, with s semisimple and n nilpotent.

Indeed, since sn = ns we have su = us, so s(u - λ_i)^k = (u - λ_i)^k s, so sV_i ⊂ V_i. Since s - u is nilpotent, s has the same eigenvalues on V_i as u does, i.e. λ_i. So s, and hence n, is uniquely determined.

If P(T) is any polynomial with vanishing constant term, then if A ⊂ B are subspaces with uB ⊂ A, we have P(u)B ⊂ A. So, in particular, sB ⊂ A and nB ⊂ A.

Define

V_{p,q} := V ⊗ ··· ⊗ V ⊗ V* ⊗ ··· ⊗ V*

with p copies of V and q copies of V*. Let u ∈ End(V) act on V* by -u* and on V_{p,q} by derivation, so, for example,

u_{1,2} = u ⊗ 1 ⊗ 1 - 1 ⊗ u* ⊗ 1 - 1 ⊗ 1 ⊗ u*.
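Returning for a moment to the construction of s = S(u) at the start of this section: it is easy to carry out for a concrete matrix. A sketch, with a hypothetical 3 × 3 example whose characteristic polynomial is (T - 2)²(T - 3); the conditions S(2) = 2, S'(2) = 0, S(3) = 3 give S(T) = T² - 4T + 6 (the extra condition S ≡ 0 mod T matters only when 0 is an eigenvalue, so it is omitted here):

```python
import numpy as np

u = np.array([[2., 1., 0.],
              [0., 2., 0.],
              [0., 0., 3.]])
I = np.eye(3)

# s = S(u) with S(T) = T^2 - 4T + 6, so s is a polynomial in u
s = u @ u - 4 * u + 6 * I
n = u - s                                   # n = N(u), N(T) = T - S(T)

assert np.allclose(s, np.diag([2., 2., 3.]))            # semisimple part
assert np.allclose(np.linalg.matrix_power(n, 2), 0)     # nilpotent part
assert np.allclose(s @ n, n @ s)                        # they commute
print("u = s + n with s semisimple, n nilpotent, sn = ns")
```

Since s is literally a polynomial in u, the assertions sB ⊂ A and nB ⊂ A for uB ⊂ A are immediate in this example.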
Similarly, u_{1,1} acts on V_{1,1} = V ⊗ V* by

u_{1,1}(x ⊗ ℓ) = ux ⊗ ℓ - x ⊗ u*ℓ.

Under the identification of V ⊗ V* with End(V), the element x ⊗ ℓ acts on y ∈ V by sending it into ℓ(y)x. So the element u_{1,1}(x ⊗ ℓ) sends y to

ℓ(y)u(x) - (u*ℓ)(y)x = ℓ(y)u(x) - ℓ(u(y))x.

This is the same as the commutator of the operator u with the operator (corresponding to) x ⊗ ℓ, acting on y. In other words, under the identification of V ⊗ V* with End(V), the linear transformation u_{1,1} gets identified with ad u.

Proposition 2. If u = s + n is the decomposition of u, then u_{pq} = s_{pq} + n_{pq} is the decomposition of u_{pq}.

Proof. [s_{pq}, n_{pq}] = 0, and the tensor products of an eigenbasis for s form an eigenbasis for s_{pq}. Also n_{pq} is a sum of commuting nilpotents, hence nilpotent. The map u ↦ u_{pq} is linear, hence u_{pq} = s_{pq} + n_{pq}. QED

If φ : k → k is a map, we define φ(s) by φ(s)|_{V_i} := φ(λ_i)I. If we choose a polynomial P such that P(0) = 0, P(λ_i) = φ(λ_i), then P(u) = φ(s).

Proposition 3. Suppose that φ is additive. Then (φ(s))_{pq} = φ(s_{pq}).

Proof. Decompose V_{pq} into a sum of tensor products of the V_i and the V_i*. On each such space we have

φ(s_{pq}) = φ(λ_{i_1} + ··· - λ_{j_1} - ···) = φ(λ_{i_1}) + ··· - φ(λ_{j_1}) - ··· = (φ(s))_{pq},

where the middle equation is just the additivity. QED

As an immediate consequence we obtain

Proposition 4. Notation as above. If A ⊂ B ⊂ V_{p,q} with u_{pq}B ⊂ A, then for any additive map φ, φ(s)_{pq}B ⊂ A.

Proposition 5 (over ℂ). Let u = s + n be as above, and let φ be complex conjugation, so φ(s) = s̄. If tr(u φ(s)) = 0, then u is nilpotent.

Proof. tr(u s̄) = Σ m_i λ_i λ̄_i = Σ m_i |λ_i|². So the condition implies that all the λ_i = 0. QED

4.4 Cartan's criterion.

Let g ⊂ End(V) be a Lie subalgebra, where V is a finite dimensional vector space over ℂ. Then

g is solvable ⟺ tr(xy) = 0 ∀ x ∈ g, y ∈ [g, g].

Proof. Suppose g is solvable. Choose a basis for which g is upper triangular. Then every y ∈ [g, g] has zeros on the diagonal, hence tr(xy) = 0. For the reverse implication, it is enough to show that [g, g] is nilpotent, and, by Engel, that each u ∈ [g, g] is nilpotent.
So it is enough to show that tr(u s̄) = 0, where s is the semisimple part of u, by Proposition 5 above. If it were true that s̄ ∈ g we would be done, but this need not be so. Write

u = Σ_i [x_i, y_i], x_i, y_i ∈ g.

Now for a, b, c ∈ End(V),

tr([a, b]c) = tr(abc - bac) = tr(bca - bac) = tr(b[c, a]),

so

tr(u s̄) = Σ tr([x_i, y_i] s̄) = Σ tr(y_i [s̄, x_i]).

So it is enough to show that ad s̄ : g → [g, g]. We know that ad u : g → [g, g], and, by Lagrange interpolation, we can find a polynomial P with P(0) = 0 and P(λ_i) = λ̄_i, so that s̄ = P(u). The result now follows from Proposition 4: since End(V) ≅ V_{1,1}, take A = [g, g] and B = g. Then ad u = u_{1,1}, so u_{1,1}g ⊂ [g, g], and hence s̄_{1,1}g ⊂ [g, g], i.e. [s̄, x] ∈ [g, g] ∀ x ∈ g. QED

4.5 Radical.

If i is an ideal of g and g/i is solvable, then D^n(g/i) = 0 implies that D^n g ⊂ i. If i itself is solvable with D^m i = 0, then D^{m+n} g = 0. So we have proved:

Proposition 6. If i ⊂ g is an ideal, and both i and g/i are solvable, so is g.

If i and j are solvable ideals, then (i + j)/j ≅ i/(i ∩ j) is solvable, being the homomorphic image of a solvable algebra. So, by the previous proposition:

Proposition 7. If i and j are solvable ideals in g, so is i + j.

In particular, every Lie algebra g has a largest solvable ideal which contains all other solvable ideals. It is denoted by rad g, or simply by r when g is fixed. An algebra g is called semi-simple if rad g = 0. Since Di is an ideal whenever i is (by Jacobi's identity), if r ≠ 0 then the last non-zero D^n r is an abelian ideal. So an equivalent definition is: g is semi-simple if it has no non-zero abelian ideals. We shall call a Lie algebra simple if it is not abelian and if it has no proper ideals. We shall show in the next section that every semi-simple Lie algebra is the direct sum of simple Lie algebras in a unique way.

4.6 The Killing form.

A bilinear form ( , ) : g × g → k is called invariant if

([x, y], z) + (y, [x, z]) = 0 ∀ x, y, z ∈ g.   (4.2)

Notice that if ( , ) is an invariant form, and i is an ideal, then i^⊥ is again an ideal.
One way of producing invariant forms is from representations: if (ρ, V) is a representation of g, then

(x, y)_ρ := tr ρ(x)ρ(y)

is invariant. Indeed,

([x, y], z)_ρ + (y, [x, z])_ρ = tr{(ρ(x)ρ(y) - ρ(y)ρ(x))ρ(z) + ρ(y)(ρ(x)ρ(z) - ρ(z)ρ(x))}
= tr{ρ(x)ρ(y)ρ(z) - ρ(y)ρ(z)ρ(x)} = 0.

In particular, if we take ρ = ad, V = g, the corresponding bilinear form is called the Killing form and will be denoted by ( , )_κ. We will also sometimes write κ(x, y) instead of (x, y)_κ.

Theorem 8. g is semi-simple if and only if its Killing form is non-degenerate.

Proof. Suppose g is not semi-simple and so has a non-zero abelian ideal a. We will show that (x, y)_κ = 0 ∀ x ∈ a, y ∈ g. Indeed, let σ := ad x ad y. Then σ maps g → a and a → 0. Hence in terms of a basis starting with elements of a and then extending, σ (is upper triangular and) has 0 along the diagonal. Hence tr σ = 0. So if g is not semisimple then its Killing form is degenerate.

Conversely, suppose that g is semi-simple. We wish to show that the Killing form is non-degenerate. So let

u := g^⊥ = { x | tr(ad x ad y) = 0 ∀ y ∈ g }.

If x ∈ u, z ∈ g, then

tr{ad[x, z] ad y} = tr{ad x ad z ad y - ad z ad x ad y} = tr{ad x (ad z ad y - ad y ad z)} = tr{ad x ad[z, y]} = 0,

so u is an ideal. In particular,

tr_u(ad_u x ad_u y) = tr_g(ad_g x ad_g y) for x, y ∈ u,

as can be seen from a block decomposition starting with a basis of u and extending to g. If we take y ∈ Du, we see that tr(ad x ad y) = 0 for all x ∈ u, y ∈ Du, so ad u is solvable by Cartan's criterion. But the kernel of the map u → ad u is the center of u. So if ad u is solvable, so is u. Since g is semi-simple, u = 0, i.e. the Killing form is non-degenerate. QED

Proposition 8. Let g be a semisimple algebra, i any ideal of g, and i^⊥ its orthocomplement with respect to the Killing form. Then i ∩ i^⊥ = 0. Indeed, i ∩ i^⊥ is an ideal on which tr(ad x ad y) ≡ 0, hence solvable by Cartan's criterion. Since g is semi-simple, there are no non-trivial solvable ideals. QED

Therefore

Proposition 9. Every semi-simple Lie algebra is the direct sum of simple Lie algebras.
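As an illustration of Theorem 8, the Killing form of sl(2) can be computed directly. The sketch below builds the matrices of ad h, ad e, ad f in the basis h, e, f (using the coordinate expansion of a traceless 2 × 2 matrix) and checks that the Killing form matrix is non-degenerate:

```python
import numpy as np

h = np.array([[1., 0.], [0., -1.]])
e = np.array([[0., 1.], [0., 0.]])
f = np.array([[0., 0.], [1., 0.]])
basis = [h, e, f]

def coords(m):                       # m = a·h + b·e + c·f  ->  (a, b, c)
    return np.array([m[0, 0], m[0, 1], m[1, 0]])

def ad(x):                           # matrix of ad x in the basis h, e, f
    return np.column_stack([coords(x @ b - b @ x) for b in basis])

K = np.array([[np.trace(ad(x) @ ad(y)) for y in basis] for x in basis])
print(K)                             # [[8,0,0],[0,0,4],[0,4,0]]
assert abs(np.linalg.det(K)) > 1e-9  # non-degenerate, as Theorem 8 predicts
```

The values tr(ad h)² = 8 and tr(ad e)(ad f) = 4 appearing here are used again in the Casimir computation at the end of this chapter.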
Proposition 10. Dg = g for a semi-simple Lie algebra. (Since this is true for each simple component.)

Proposition 11. Let φ : g → s be a surjective homomorphism of a semi-simple Lie algebra onto a simple Lie algebra. Then if g = ⊕ g_i is a decomposition of g into simple ideals, the restriction φ_i of φ to each summand is zero, except for one summand, where it is an isomorphism.

Proof. Since s is simple, the image of every φ_i is 0 or all of s. If φ_i is surjective for some i, then it is an isomorphism, since g_i is simple. There is at least one i for which it is surjective, since φ is surjective. On the other hand, it can not be surjective for two ideals g_i, g_j, i ≠ j, for then φ[g_i, g_j] = 0 ≠ [s, s] = s. QED

4.7 Complete reducibility.

The basic theorem is

Theorem 9 (Weyl). Every finite dimensional representation of a semi-simple Lie algebra is completely reducible.

Proof.

1. If ρ : g → End V is injective, then the form ( , )_ρ is non-degenerate. Indeed, the ideal consisting of all x such that (x, y)_ρ = 0 ∀ y ∈ g is solvable by Cartan's criterion, hence 0.

2. The Casimir operator. Let (e_i) and (f_i) be bases of g which are dual with respect to some non-degenerate invariant bilinear form ( , ), so (e_i, f_j) = δ_ij. As the form is non-degenerate and invariant, it defines a map

g ⊗ g → End g, (x ⊗ y)(w) := (y, w)x.

This map is an isomorphism and is a g-morphism. Under this map,

Σ e_i ⊗ f_i (w) = Σ (f_i, w) e_i = w

by the definition of dual bases. Hence under the inverse map End g → g ⊗ g the identity element id corresponds to Σ e_i ⊗ f_i (and so this expression is independent of the choice of dual bases). Since id is annihilated by bracketing with any element of End(g), we conclude that Σ e_i ⊗ f_i is annihilated by the action of all

(ad x)_2 := ad x ⊗ 1 + 1 ⊗ ad x, x ∈ g.

Indeed, for x, e, f, y ∈ g we have

((ad x)_2 (e ⊗ f)) y = (ad x e ⊗ f + e ⊗ ad x f) y
= (f, y)[x, e] + ([x, f], y) e
= (f, y)[x, e] - (f, [x, y]) e   by (4.2)
= ((ad x)(e ⊗ f) - (e ⊗ f)(ad x)) y.

Set

C := Σ e_i f_i ∈ U(g).
  (4.3)

Thus C is the image of the element Σ e_i ⊗ f_i under the multiplication map g ⊗ g → U(g), and is independent of the choice of dual bases. Furthermore, C is annihilated by ad x acting on U(g). In other words, it commutes with all elements of g, and hence with all of U(g); it is in the center of U(g). The C corresponding to the Killing form is called the Casimir element; its image in any representation is called the Casimir operator.

3. Suppose that ρ : g → End V is injective. The (image of the) central element corresponding to ( , )_ρ defines an element of End V denoted by C_ρ, and

tr C_ρ = tr Σ ρ(e_i)ρ(f_i) = Σ (e_i, f_i)_ρ = dim g.

With these preliminaries, we can state the main proposition:

Proposition 12. Let 0 → V → W → k → 0 be an exact sequence of g-modules, where g is semi-simple and the action of g on k is trivial (as it must be). Then this sequence splits, i.e. there is a line in W supplementary to V on which g acts trivially.

The proof of the proposition and of the theorem is almost identical to the proof we gave above for the special case of sl(2). We will need only one or two additional arguments. As in the case of sl(2), the proposition is a special case of the theorem we want to prove. But we shall see that it is sufficient to prove the theorem.

Proof of proposition. It is enough to prove the proposition for the case that V is an irreducible module. Indeed, if V_1 is a submodule, then by induction on dim V we may assume the theorem is known for

0 → V/V_1 → W/V_1 → k → 0,

so that there is a one dimensional invariant subspace M in W/V_1 supplementary to V/V_1 on which the action is trivial. Let N be the inverse image of M in W. By another application of the proposition, this time to the sequence

0 → V_1 → N → M → 0,

we find an invariant line P in N complementary to V_1. So N = V_1 ⊕ P. Since W/V_1 = (V/V_1) ⊕ M we must have P ∩ V = {0}. But since dim W = dim V + 1, we must have W = V ⊕ P.
In other words, P is a one dimensional invariant subspace of W complementary to V.

Next we can reduce to proving the proposition for the case that g acts faithfully on V. Indeed, let i be the kernel of the action on V. For all x ∈ g we have, by hypothesis, xW ⊂ V, and for x ∈ i we have xV = 0. Hence Di acts trivially on W. But i = Di, since i is semi-simple. Hence i acts trivially on W, and we may pass to g/i. This quotient is again semi-simple, since i is a sum of some of the simple ideals of g.

So we are reduced to the case that V is irreducible and the action ρ of g on V is injective. Then we have the invariant element C_ρ, whose image in End W must map W → V, since every element of g does. (We may assume that g ≠ 0.) On the other hand, C_ρ ≠ 0; indeed its trace is dim g. The restriction of C_ρ to V can not have a non-trivial kernel, since this would be an invariant subspace. Hence the restriction of C_ρ to V is an isomorphism. Hence ker C_ρ ⊂ W is an invariant line supplementary to V. We have proved the proposition.

Proof of theorem from proposition. Let 0 → E' → E be an exact sequence of g-modules, and we may assume that E' ≠ 0. We want to find an invariant complement to E' in E. Define W to be the subspace of Hom_k(E, E') consisting of those linear transformations whose restriction to E' is a scalar times the identity, and let V ⊂ W be the subspace consisting of those whose restriction to E' is zero. Each of these is a g-submodule of Hom_k(E, E'). We get a sequence

    0 → V → W → k → 0,

and hence a complementary line of invariant elements in W. In particular, we can find an invariant element T which maps E → E' and whose restriction to E' is a non-zero scalar. Then ker T is an invariant complement to E' in E. QED
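The sl(2) illustration that follows can be checked by machine. The following sketch (Python with NumPy, not part of the original notes) uses the standard basis h, e, f with [h,e] = 2e, [h,f] = -2f, [e,f] = h, computes the Killing form values quoted below, and evaluates the Casimir operator in the defining two dimensional representation:

```python
import numpy as np

# Standard basis of sl(2): [h,e] = 2e, [h,f] = -2f, [e,f] = h
h = np.array([[1, 0], [0, -1]], dtype=float)
e = np.array([[0, 1], [0, 0]], dtype=float)
f = np.array([[0, 0], [1, 0]], dtype=float)
basis = [h, e, f]

def ad(x):
    """Matrix of ad x = [x, .] in the basis (h, e, f)."""
    cols = []
    for b in basis:
        m = x @ b - b @ x                         # [x, b], traceless 2x2
        cols.append([m[0, 0], m[0, 1], m[1, 0]])  # coordinates in h, e, f
    return np.array(cols, dtype=float).T

# Killing form: kappa(h,h) = 8, kappa(e,f) = kappa(f,e) = 4, all else 0
kappa = np.array([[np.trace(ad(x) @ ad(y)) for y in basis] for x in basis])
print(kappa)

# Casimir operator in the defining representation, using the metric
# divided by 4 (dual basis h/2, f, e):  C = h^2/2 + ef + fe
C = h @ h / 2 + e @ f + f @ e
print(C)   # a scalar times the identity, with trace 3 = dim sl(2)
```

That C comes out as a scalar matrix is forced by Schur's lemma, since the defining representation is irreducible; its trace equals dim g, as in step 3 of the proof above.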
As an illustration of the construction of the Casimir operator, consider g = sl(2) with

    h = (1 0; 0 -1),  e = (0 1; 0 0),  f = (0 0; 1 0).

Then

    tr(ad h)^2 = 8,  tr(ad e)(ad f) = 4,

so the dual basis to the basis h, e, f relative to the Killing form is h/8, f/4, e/4; or, if we divide the metric by 4, the dual basis is h/2, f, e, and the Casimir operator is

    C = (1/2)h^2 + ef + fe = (1/2)h^2 + h + 2fe.

This coincides with the C that we used in Chapter II.

Chapter 5

Conjugacy of Cartan subalgebras.

It is a standard theorem in linear algebra that any unitary matrix can be diagonalized (by conjugation by unitary matrices). On the other hand, it is easy to check that the subgroup T ⊂ U(n) consisting of all diagonal unitary matrices is a maximal commutative subgroup: any matrix which commutes with all diagonal unitary matrices must itself be diagonal; indeed, if A is a diagonal matrix with distinct entries along the diagonal, any matrix which commutes with A must be diagonal. Notice that T is a product of circles, i.e. a torus.

This theorem has an immediate generalization to compact Lie groups. Let G be a compact Lie group, and let T and T' be two maximal tori. (So T and T' are connected commutative subgroups, hence necessarily tori, and each is not strictly contained in a larger connected commutative subgroup.) Then there exists an element a ∈ G such that aT'a^{-1} = T. To prove this, choose one parameter subgroups of T and T' which are dense in each. That is, choose x and x' in the Lie algebra g of G such that the curve t ↦ exp tx is dense in T and the curve t ↦ exp tx' is dense in T'. If we could find a ∈ G such that the elements

    a(exp tx')a^{-1} = exp t Ad_a x'

commute with all the exp sx, then a(exp tx')a^{-1} would commute with all elements of T, hence belong to T, and by continuity aT'a^{-1} ⊂ T and hence = T. So we would like to find an a ∈ G such that [Ad_a x', x] = 0.

Put a positive definite scalar product ( , ) on g, the Lie algebra of G, which is invariant under the adjoint action of G.
This is always possible: choose any positive definite scalar product and then average it over G. Choose a ∈ G such that (Ad_a x', x) is a maximum, and let y := Ad_a x'. We wish to show that [y, x] = 0. For any z ∈ g we have

    d/dt (Ad_{exp tz} y, x) |_{t=0} = ([z, y], x) = 0

by the maximality. But ([z, y], x) = (z, [y, x]) by the invariance of ( , ); hence [y, x] is orthogonal to all of g, hence 0. QED

We want to give an algebraic proof of the analogue of this theorem for Lie algebras over the complex numbers. In contrast to the elementary proof given above for compact groups, the proof in the general Lie algebra case will be quite involved, and the flavor of the proof will be quite different for the solvable and semi-simple cases. Nevertheless, some of the ingredients of the above proof (choosing "generic elements" analogous to the choice of x and x', for example) will make their appearance. The proofs in this chapter follow Humphreys.

5.1 Derivations.

Let δ be a derivation of the Lie algebra g; this means that

    δ([y, z]) = [δ(y), z] + [y, δ(z)]  ∀ y, z ∈ g.

Then, for a, b ∈ C,

    (δ-a-b)[y,z]   = [(δ-a)y, z] + [y, (δ-b)z]
    (δ-a-b)^2[y,z] = [(δ-a)^2 y, z] + 2[(δ-a)y, (δ-b)z] + [y, (δ-b)^2 z]
    (δ-a-b)^3[y,z] = [(δ-a)^3 y, z] + 3[(δ-a)^2 y, (δ-b)z] + 3[(δ-a)y, (δ-b)^2 z] + [y, (δ-b)^3 z]
    ...
    (δ-a-b)^n[y,z] = Σ_{k=0}^{n} (n choose k) [(δ-a)^k y, (δ-b)^{n-k} z].

Consequences:

• Let g_a = g_a(δ) denote the generalized eigenspace corresponding to the eigenvalue a, so (δ-a)^k = 0 on g_a for large enough k. Then

    [g_a, g_b] ⊂ g_{a+b}.    (5.1)

• Let s = s(δ) denote the diagonalizable (semi-simple) part of δ, so that s(δ) = a on g_a. Then, for y ∈ g_a, z ∈ g_b,

    s(δ)[y, z] = (a + b)[y, z] = [s(δ)y, z] + [y, s(δ)z],

so s, and hence also n = n(δ), the nilpotent part of δ, are both derivations.

• [δ, ad x] = ad(δx). Indeed, [δ, ad x](u) = δ([x, u]) - [x, δ(u)] = [δ(x), u]. In particular, the space of inner derivations, Inn g, is an ideal in Der g.

• If g is semisimple then Inn g = Der g.
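As an aside, the binomial identity for (δ-a-b)^n displayed above is formal: it holds for any derivation of any algebra, by induction from the case n = 1. A quick machine check (Python with NumPy, not part of the original notes) takes δ = ad x on the matrix algebra gl(3), with randomly chosen matrices standing in for x, y, z:

```python
import numpy as np
from math import comb

rng = np.random.default_rng(0)
x, y, z = (rng.standard_normal((3, 3)) for _ in range(3))
a, b = 0.7, -1.3   # arbitrary scalar shifts

bracket = lambda p, q: p @ q - q @ p
delta = lambda p: bracket(x, p)          # the inner derivation ad x

def power(shift, p, k):
    """(delta - shift)^k applied to p."""
    for _ in range(k):
        p = delta(p) - shift * p
    return p

n = 3
lhs = power(a + b, bracket(y, z), n)
rhs = sum(comb(n, k) * bracket(power(a, y, k), power(b, z, n - k))
          for k in range(n + 1))
print(np.allclose(lhs, rhs))   # True: the binomial Leibniz identity holds
```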
Indeed, split off an invariant complement to Inn g in Der g (possible by Weyl's theorem on complete reducibility). For any δ in this invariant complement we must have [δ, ad x] = 0, since [δ, ad x] = ad(δx) ∈ Inn g. This says that δx is in the center of g, which is 0. Hence δx = 0 ∀ x, hence δ = 0.

• Hence any x ∈ g can be uniquely written as x = s + n, s, n ∈ g, where ad s is semisimple and ad n is nilpotent. This is known as the decomposition into semi-simple and nilpotent parts for a semi-simple Lie algebra.

• (Back to general g.) Let k be a subalgebra containing g_0(ad x) for some x ∈ g. Then x belongs to g_0(ad x), hence to k, and ad x preserves N_g(k) (by Jacobi's identity). We have

    x ∈ g_0(ad x) ⊂ k ⊂ N_g(k) ⊂ g,

all of these subspaces being invariant under ad x. Therefore, the characteristic polynomial of ad x restricted to N_g(k) is a factor of the characteristic polynomial of ad x acting on g. But all the zeros of this characteristic polynomial are accounted for by the generalized zero eigenspace g_0(ad x), which is a subspace of k. This means that ad x acts on N_g(k)/k without zero eigenvalue. On the other hand, ad x acts trivially on this quotient space, since x ∈ k and hence [N_g(k), x] ⊂ k by the definition of the normalizer. Hence

    N_g(k) = k.    (5.2)

We now come to the key lemma.

Lemma 3 Let k ⊂ g be a subalgebra. Let z ∈ k be such that g_0(ad z) does not strictly contain any g_0(ad x), x ∈ k. Suppose that k ⊂ g_0(ad z). Then

    g_0(ad z) ⊂ g_0(ad y)  ∀ y ∈ k.

Proof. Choose z as in the lemma, and let x be an arbitrary element of k. By hypothesis, x ∈ g_0(ad z), and we know that [g_0(ad z), g_0(ad z)] ⊂ g_0(ad z). Therefore [x, g_0(ad z)] ⊂ g_0(ad z), and hence

    ad(z + cx) g_0(ad z) ⊂ g_0(ad z)

for all constants c. Thus ad(z + cx) acts on the quotient space g/g_0(ad z). We can factor the characteristic polynomial of ad(z + cx) acting on g as

    P_{ad(z+cx)}(T) = f(T, c) g(T, c)
where f is the characteristic polynomial of ad(z + cx) on g_0(ad z) and g is the characteristic polynomial of ad(z + cx) on g/g_0(ad z). Write

    f(T, c) = T^r + f_1(c)T^{r-1} + ... + f_r(c),          r = dim g_0(ad z),
    g(T, c) = T^{n-r} + g_1(c)T^{n-r-1} + ... + g_{n-r}(c), n = dim g.

The f_i and the g_j are polynomials of degree at most i (respectively j) in c. Since 0 is not an eigenvalue of ad z on g/g_0(ad z), we see that g_{n-r}(0) ≠ 0. So we can find r + 1 values of c for which g_{n-r}(c) ≠ 0, and hence for these values g_0(ad(z + cx)) ⊂ g_0(ad z). By the minimality, this forces

    g_0(ad(z + cx)) = g_0(ad z)

for these values of c. This means that f(T, c) = T^r for these values of c, so each of the polynomials f_1, ..., f_r has r + 1 distinct roots, and hence is identically zero. Hence g_0(ad(z + cx)) ⊃ g_0(ad z) for all c. Take c = 1, x = y - z to conclude the truth of the lemma. QED

5.2 Cartan subalgebras.

A Cartan subalgebra (CSA) is defined to be a nilpotent subalgebra which is its own normalizer. A Borel subalgebra (BSA) is defined to be a maximal solvable subalgebra. The goal is to prove

Theorem 10 Any two CSA's are conjugate. Any two BSA's are conjugate.

Here the word conjugate means the following. Define

    N(g) := {x | ∃ y ∈ g, a ≠ 0, with x ∈ g_a(ad y)}.

Notice that every element of N(g) is ad nilpotent, and that N(g) is stable under Aut(g). As any x ∈ N(g) is ad nilpotent, exp ad x is well defined as an automorphism of g, and we let E(g) denote the group generated by these elements. It is a normal subgroup of the group of automorphisms. Conjugacy means that there is a φ ∈ E(g) with φ(h_1) = h_2, where h_1 and h_2 are CSA's. Similarly for BSA's.

As a first step we give an alternative characterization of a CSA.

Proposition 13 h is a CSA if and only if h = g_0(ad z), where g_0(ad z) contains no proper subalgebra of the form g_0(ad x).

Proof. Suppose h = g_0(ad z) is minimal in the sense of the proposition. Then we know by (5.2) that h is its own normalizer. Also, by the lemma, h ⊂ g_0(ad x) ∀ x ∈ h.
Hence ad x acts nilpotently on h for all x ∈ h. Hence, by Engel's theorem, h is nilpotent, and hence is a CSA.

Conversely, suppose that h is a CSA. Since h is nilpotent, we have h ⊂ g_0(ad x) ∀ x ∈ h. Choose a minimal z ∈ h. By the lemma, g_0(ad z) ⊂ g_0(ad x) ∀ x ∈ h. Thus h acts nilpotently on g_0(ad z)/h. If this space were not zero, we could find a non-zero common eigenvector with eigenvalue zero by Engel's theorem. This means that there is a y ∉ h with [y, h] ⊂ h, contradicting the fact that h is its own normalizer. Hence h = g_0(ad z). QED

Lemma 4 If φ: g → g' is a surjective homomorphism and h is a CSA of g, then φ(h) is a CSA of g'.

Proof. Clearly φ(h) is nilpotent. Let k = Ker φ and identify g' = g/k, so φ(h) = (h + k)/k. If x + k normalizes (h + k)/k, then x normalizes h + k. But h = g_0(ad z) for some minimal such z, and, as an algebra containing a g_0(ad z), h + k is self-normalizing by (5.2). So x ∈ h + k. QED

Lemma 5 Let φ: g → g' be surjective, as above, and h' a CSA of g'. Then any CSA h of m := φ^{-1}(h') is a CSA of g.

Proof. h is nilpotent by assumption. We must show it is its own normalizer in g. By the preceding lemma, φ(h) is a CSA of h'. If φ(h) ≠ h', then, φ(h) being nilpotent, it would have a common eigenvector with eigenvalue zero in h'/φ(h), contradicting the self-normalizing property of φ(h). So φ(h) = h'. If x ∈ g normalizes h, then φ(x) normalizes h'. Hence φ(x) ∈ h', so x ∈ m, so x ∈ h. QED

5.3 Solvable case.

In this case a Borel subalgebra is all of g, so we must prove conjugacy for CSA's. In case g is nilpotent, we know that any CSA is all of g, since g = g_0(ad z) for any z ∈ g. So we may proceed by induction on dim g. Let h_1 and h_2 be Cartan subalgebras of g; we want to show that they are conjugate. Choose an abelian ideal a of smallest possible positive dimension, and let g' = g/a. By Lemma 4 the images h'_1 and h'_2 of h_1 and h_2 in g' are CSA's of g', and hence by induction there is a σ' ∈ E(g') with σ'(h'_1) = h'_2. We claim that we can lift this to a σ ∈ E(g). That is, we claim:

Lemma 6 Let φ: g → g' be a surjective homomorphism.
If σ' ∈ E(g'), then there exists a σ ∈ E(g) such that the diagram

    g  --σ-->  g
    |φ         |φ
    v          v
    g' --σ'--> g'

commutes.

Proof of lemma. It is enough to prove this on generators. Suppose that x' ∈ g_a(ad y'), a ≠ 0. Choose y ∈ g with φ(y) = y'; then φ(g_a(ad y)) = g_a(ad y'), and hence we can find an x ∈ N(g) mapping onto x'. Then σ = exp ad x is the desired automorphism in the above diagram if σ' = exp ad x'. QED

Back to the proof of the conjugacy theorem in the solvable case. Let m_1 := φ^{-1}(h'_1), m_2 := φ^{-1}(h'_2). We have a σ with σ(m_1) = m_2, so σ(h_1) and h_2 are both CSA's of m_2. If m_2 ≠ g we are done by induction. So the one new case is where

    g = a + h_1 = a + h_2.

Write h_2 = g_0(ad x) for some x ∈ g. Since a is an ideal, it is stable under ad x, and we can split it into its zero and non-zero generalized eigenspaces:

    a = a_0(ad x) ⊕ a_*(ad x).

Since a is abelian, ad of every element of a acts trivially on each summand; and since h_2 = g_0(ad x) and a is an ideal, this decomposition is stable under h_2, hence under all of g. By our choice of a as a minimal abelian ideal, one or the other of these summands must vanish. If a = a_0(ad x) we would have a ⊂ h_2, so g = h_2 and g is nilpotent, and there is nothing to prove. So the only case to consider is a = a_*(ad x). Since g = h_1 + a, write

    x = y + z,  y ∈ h_1,  z ∈ a_*(ad x).

Since ad x is invertible on a_*(ad x), we may write z = [x, z'] with z' ∈ a_*(ad x). Since a is an abelian ideal, (ad z')^2 = 0, so exp(ad z') = 1 + ad z'. So

    exp(ad z')(x) = x + [z', x] = x - z = y.

So h := g_0(ad y) = exp(ad z')(h_2) is a CSA of g, and since y ∈ h_1 we have h_1 ⊂ g_0(ad y) = h. Hence h_1 = h: if h_1 were strictly contained in the nilpotent algebra h, its normalizer in h would be strictly larger than h_1, contradicting the self-normalizing property of h_1. So exp ad z' conjugates h_2 into h_1. Writing z' as a sum of its generalized eigencomponents, and using the fact that all the elements of a commute, we can write this exponential as a product of exponentials of the summands, each of which lies in E(g). QED

5.4 Toral subalgebras and Cartan subalgebras.

The strategy is now to show that any two BSA's of an arbitrary Lie algebra are conjugate.
Any CSA is nilpotent, hence solvable, hence contained in a BSA. This reduces the proof of the conjugacy theorem for CSA's to that for BSA's, since we know the conjugacy of CSA's in a solvable algebra. Since the radical is contained in every BSA, it is enough to prove this theorem for semi-simple Lie algebras. So for this section the Lie algebra g will be assumed to be semi-simple.

Since g does not consist entirely of ad nilpotent elements, it contains some x which is not ad nilpotent, and the semi-simple part x_s of x is a non-zero ad semi-simple element of g. A subalgebra consisting entirely of semi-simple elements is called toral; for example, the line through x_s.

Lemma 7 Any toral subalgebra t is abelian.

Proof. The elements ad x, x ∈ t, can each be diagonalized. We must show that ad x has no eigenvectors in t with non-zero eigenvalue. Let y ∈ t be an eigenvector, so [x, y] = ay. Then (ad y)x = -ay is a zero eigenvector of ad y, which is impossible unless ay = 0, since ad y annihilates all its zero eigenvectors and is invertible on the subspace spanned by the eigenvectors corresponding to non-zero eigenvalues. QED

One of the consequences of the considerations in this section will be:

Theorem 11 A subalgebra h of a semi-simple Lie algebra g is a CSA if and only if it is a maximal toral subalgebra.

To prove this we want to develop some of the theory of roots. So fix a maximal toral subalgebra h. Decompose g into simultaneous eigenspaces:

    g = C_g(h) ⊕ ⊕_α g_α(h),

where

    C_g(h) := {x ∈ g | [h, x] = 0 ∀ h ∈ h}

is the centralizer of h, where α ranges over non-zero linear functions on h, and

    g_α(h) := {x ∈ g | [h, x] = α(h)x ∀ h ∈ h}.

As h will be fixed for most of the discussion, we will drop the (h) and write

    g = g_0 ⊕ ⊕_α g_α,  g_0 = C_g(h).

We have:

• [g_α, g_β] ⊂ g_{α+β} (by Jacobi), so

• ad x is nilpotent if x ∈ g_α, α ≠ 0;

• if α + β ≠ 0, then κ(x, y) = 0 ∀ x ∈ g_α, y ∈ g_β, where κ is the Killing form.

The last item follows by choosing an h ∈ h with α(h) + β(h) ≠ 0.
Then

    0 = κ([h, x], y) + κ(x, [h, y]) = (α(h) + β(h)) κ(x, y),

so κ(x, y) = 0. This implies that g_0 is orthogonal to all the g_α, α ≠ 0, and hence the non-degeneracy of κ implies

Proposition 14 The restriction of κ to g_0 × g_0 is non-degenerate.

Our next intermediate step is to prove:

Proposition 15 If h is a maximal toral subalgebra, then

    h = g_0.    (5.3)

Proceed according to the following steps:

    x ∈ g_0  ⟹  x_s ∈ g_0 and x_n ∈ g_0.    (5.4)

Indeed, x ∈ g_0 ⟺ ad x maps h to 0, and then ad x_s and ad x_n also map h to 0.

    x ∈ g_0, x semisimple  ⟹  x ∈ h.    (5.5)

Indeed, such an x commutes with all of h. As the sum of commuting semi-simple transformations is again semisimple, we conclude that h + Cx is a toral subalgebra. By maximality it must coincide with h.

We now show:

Lemma 8 The restriction of the Killing form κ to h × h is non-degenerate.

Proof. Suppose that h ∈ h satisfies κ(h, x) = 0 ∀ x ∈ h. This means that κ(h, x) = 0 for every semi-simple x ∈ g_0, by (5.5). Suppose that n ∈ g_0 is nilpotent. Since h commutes with n, (ad h)(ad n) is again nilpotent, hence has trace zero. Hence κ(h, n) = 0, and therefore, by (5.4), κ(h, x) = 0 ∀ x ∈ g_0. By Proposition 14, h = 0. QED

Next observe:

Lemma 9 g_0 is a nilpotent Lie algebra.

Proof. All semi-simple elements of g_0 commute with all of g_0, since they belong to h; in particular they are ad nilpotent on g_0. A nilpotent element is ad nilpotent on all of g, so certainly on g_0. Finally, any x ∈ g_0 can be written as a sum x = x_s + x_n of commuting elements which are ad nilpotent on g_0, hence x is ad nilpotent on g_0. Thus g_0 consists entirely of ad nilpotent elements and hence is nilpotent by Engel's theorem. QED

Now suppose that h ∈ h and x, y ∈ g_0. Then

    κ(h, [x, y]) = κ([h, x], y) = κ(0, y) = 0,

and hence, by the non-degeneracy of κ on h, we conclude:

Lemma 10 h ∩ [g_0, g_0] = 0.

We next prove:

Lemma 11 g_0 is abelian.

Proof. Suppose that [g_0, g_0] ≠ 0. Since g_0 is nilpotent, its center intersects the ideal [g_0, g_0] non-trivially. Choose a non-zero element z ∈ [g_0, g_0] in this center. It can not be semi-simple, for then it would lie in h, contradicting Lemma 10.
So z has a non-zero nilpotent part n, which also must lie in the center of g_0, by the B ⊂ A theorem we proved in our section on linear algebra. But then (ad n)(ad x) is nilpotent for any x ∈ g_0, since [x, n] = 0. This implies that κ(n, g_0) = 0, which contradicts Proposition 14. QED

Completion of proof of (5.3). We know that g_0 is abelian. If h ≠ g_0, then by (5.4) and (5.5) we would find a non-zero nilpotent element n ∈ g_0, which commutes with all of g_0 (proven to be abelian). Hence κ(n, g_0) = 0, which is impossible. This completes the proof of (5.3). QED

So we have the decomposition

    g = h ⊕ ⊕_{α≠0} g_α,

which shows that any maximal toral subalgebra h is a CSA. Conversely, suppose that h is a CSA. For any x = x_s + x_n ∈ g we have g_0(ad x_s) ⊂ g_0(ad x), since ad x_n is nilpotent and commutes with ad x_s. If we choose x ∈ h minimal so that h = g_0(ad x), we see that we may replace x by x_s and write h = g_0(ad x_s). But g_0(ad x_s) contains some maximal toral subalgebra containing x_s, which is then a Cartan subalgebra contained in h, and hence must coincide with h. This completes the proof of the theorem. QED

5.5 Roots.

We have proved that the restriction of κ to h is non-degenerate. This allows us to associate to every linear function φ on h the unique element t_φ ∈ h given by

    φ(h) = κ(t_φ, h).

The set of α ∈ h*, α ≠ 0, for which g_α ≠ 0 is called the set of roots and is denoted by Φ. We have:

• Φ spans h*; for otherwise there would be an h ≠ 0 with α(h) = 0 ∀ α ∈ Φ, implying [h, g_α] = 0 ∀ α, so [h, g] = 0, contradicting the fact that g has zero center.

• α ∈ Φ ⟹ -α ∈ Φ; for otherwise g_α would be orthogonal to all of g.

• For x ∈ g_α, y ∈ g_{-α}, α ∈ Φ:

    [x, y] = κ(x, y) t_α.

Indeed, κ(h, [x, y]) = κ([h, x], y) = κ(t_α, h) κ(x, y) for all h ∈ h.

• [g_α, g_{-α}] is one dimensional, with basis t_α. This follows from the preceding and the fact that g_α can not be perpendicular to g_{-α}, since otherwise it would be orthogonal to all of g.

• α(t_α) = κ(t_α, t_α) ≠ 0. Otherwise, choosing x ∈ g_α, y ∈ g_{-α} with κ(x, y) = 1, we get

    [x, y] = t_α,  [t_α, x] = [t_α, y] = 0.

So x, y, t_α span a solvable three dimensional algebra.
Acting on g by ad, this algebra can be put in simultaneous upper triangular form by Lie's theorem, and hence ad t_α, which lies in the commutator algebra of this subalgebra, is nilpotent. Since ad t_α is also semi-simple, by the definition of h, it would follow that ad t_α = 0, i.e. that t_α lies in the center of g, which is impossible.

• Choose e_α ∈ g_α, f_α ∈ g_{-α} with

    κ(e_α, f_α) = 2 / κ(t_α, t_α),

and set

    h_α := 2 t_α / κ(t_α, t_α).

Then e_α, f_α, h_α span a subalgebra isomorphic to sl(2); call it sl(2)_α. We shall soon see that this notation is justified, i.e. that g_α is one dimensional, and hence that sl(2)_α is well defined, independent of any "choices" of e_α, f_α, depending only on α.

• Consider the action of sl(2)_α on the subalgebra

    m := h ⊕ ⊕_{n∈Z, n≠0} g_{nα}.

The zero eigenvectors of h_α in m consist exactly of h. One of these corresponds to the adjoint representation of sl(2)_α ⊂ m. The orthocomplement of h_α in h (i.e. ker α) gives dim h - 1 trivial representations of sl(2)_α. This must exhaust all the even maximal weight representations occurring in m, as we have accounted for all the zero weights of sl(2)_α acting on m. In particular, dim g_α = 1, and no integer multiple of α other than ±α is a root.

• Now consider p := h ⊕ ⊕_c g_{cα}, the sum over all c ∈ C with g_{cα} ≠ 0. This is a module for sl(2)_α, so all such c's must be multiples of 1/2. But 1/2 can not occur, since the double of a root is not a root. Hence ±α are the only multiples of α which are roots.

• Now consider β ∈ Φ, β ≠ ±α. Let

    k := ⊕_{i∈Z} g_{β+iα}.

Each non-zero summand is one dimensional, and k is an sl(2)_α module. Also β + iα ≠ 0 for any i, and evaluation on h_α gives the weight β(h_α) + 2i. All weights differ by multiples of 2, and so k is irreducible. Let q be the maximal integer so that β + qα ∈ Φ, and r the maximal integer so that β - rα ∈ Φ. Then the entire string

    β - rα, β - (r-1)α, ..., β + qα

consists of roots, and, the lowest weight being the negative of the highest,

    -(β(h_α) - 2r) = β(h_α) + 2q,

or

    β(h_α) = r - q ∈ Z.

These integers are called the Cartan integers.

We can transfer the bilinear form κ from h to h* by defining

    (φ, ψ) := κ(t_φ, t_ψ).

So

    (β, α) = κ(t_β, t_α),  β(h_α) = 2 κ(t_β, t_α) / κ(t_α, t_α) = 2(β, α)/(α, α),

and thus

    2(β, α)/(α, α) = r - q ∈ Z.

Choose a basis α_1, ..., α_ℓ of h* consisting of roots. This is possible because the roots span h*. Any root β can be written uniquely as a linear combination

    β = c_1 α_1 + ... + c_ℓ α_ℓ,

where the c_i are complex numbers. We claim that in fact the c_i are rational. Indeed, taking the scalar product of this equation with each α_j gives the ℓ equations

    (β, α_j) = c_1 (α_1, α_j) + ... + c_ℓ (α_ℓ, α_j).

Multiplying the j-th equation by 2/(α_j, α_j) gives a set of ℓ equations for the ℓ coefficients c_i in which all the coefficients are rational numbers (indeed integers), as are the left hand sides. Solving these equations for the c_i shows that the c_i are rational.

Let E be the real vector space spanned by the α ∈ Φ. Then ( , ) restricts to a real scalar product on E. Also, for any λ ≠ 0 in E,

    (λ, λ) = κ(t_λ, t_λ) = tr(ad t_λ)^2 = Σ_{α∈Φ} α(t_λ)^2 = Σ_{α∈Φ} (λ, α)^2 > 0.

So the scalar product ( , ) on E is positive definite: E is a Euclidean space.

In the string of roots, β is q steps down from the top, so q steps up from the bottom is also a root. Hence

    β - (r - q)α

is a root. But

    β - (2(β, α)/(α, α)) α = s_α(β),

where s_α denotes Euclidean reflection in the hyperplane perpendicular to α. In other words, for every α ∈ Φ,

    s_α : Φ → Φ.    (5.6)

The subgroup of the orthogonal group of E generated by these reflections is called the Weyl group and is denoted by W. We have thus associated to every semi-simple Lie algebra, and to every choice of Cartan subalgebra, a finite subgroup of the orthogonal group generated by reflections. (This subgroup is finite, because all the generating reflections s_α, and hence the group they generate, preserve the finite set of all roots, which spans the space.) Once we have completed the proof of the conjugacy theorem for Cartan subalgebras of a semi-simple algebra, we will know that the Weyl group is determined, up to isomorphism, by the semi-simple algebra, and does not depend on the choice of Cartan subalgebra.
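The structure just described can be computed concretely for g = sl(3), taken here as a hypothetical worked example (Python with NumPy, not part of the original notes): h is the diagonal traceless matrices, the root vectors are the elementary matrices E_ij with roots ε_i - ε_j, and the sketch below checks that κ(g_α, g_β) = 0 unless α + β = 0 and computes the Cartan integers for a pair of simple roots:

```python
import numpy as np

n = 3  # illustrate with g = sl(3)

def E(i, j):
    m = np.zeros((n, n)); m[i, j] = 1.0
    return m

# Basis: two Cartan elements, then the root vectors E_ij (root eps_i - eps_j)
cartan = [E(0, 0) - E(1, 1), E(1, 1) - E(2, 2)]
off = [(i, j) for i in range(n) for j in range(n) if i != j]
basis = cartan + [E(i, j) for (i, j) in off]

def coords(m):
    d = np.diag(m)                                   # diagonal part is in h
    return [d[0], -d[2]] + [m[i, j] for (i, j) in off]

def ad(x):
    return np.array([coords(x @ b - b @ x) for b in basis]).T

kappa = lambda x, y: np.trace(ad(x) @ ad(y))

# kappa(g_alpha, g_beta) = 0 unless alpha + beta = 0, while opposite root
# spaces pair non-degenerately: kappa(E_ij, E_ji) = 2n
for (i, j) in off:
    for (k, l) in off:
        expect = 2 * n if (k, l) == (j, i) else 0.0
        assert abs(kappa(E(i, j), E(k, l)) - expect) < 1e-12

# Cartan integers <beta, alpha> = 2(beta,alpha)/(alpha,alpha) for the simple
# roots eps_0 - eps_1 and eps_1 - eps_2, with the standard inner product on R^3
simple = [np.array([1, -1, 0]), np.array([0, 1, -1])]
cartan_matrix = [[round(2 * (a @ b) / (b @ b)) for b in simple] for a in simple]
print(cartan_matrix)   # [[2, -1], [-1, 2]]: all entries integers, as proved
```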
We define

    ⟨β, α⟩ := 2(β, α)/(α, α).

So

    ⟨β, α⟩ = β(h_α),    (5.7)
    ⟨β, α⟩ = r - q ∈ Z,    (5.8)

and

    s_α(β) = β - ⟨β, α⟩ α.    (5.9)

So far, we have defined the reflection s_α purely in terms of the root structure on E, which is the real subspace of h* generated by the roots. But in fact s_α, and hence the entire Weyl group, arises from automorphisms of g which preserve h. Indeed, we know that e_α, f_α, h_α span a subalgebra sl(2)_α isomorphic to sl(2). Now exp ad e_α and exp ad(-f_α) are elements of E(g). Consider

    τ_α := (exp ad e_α)(exp ad(-f_α))(exp ad e_α) ∈ E(g).    (5.10)

We claim that

Proposition 16 The automorphism τ_α preserves h, and on h it is given by

    τ_α(h) = h - α(h) h_α.    (5.11)

In particular, the transformation induced by τ_α on E is s_α.

Proof. It suffices to prove (5.11). If α(h) = 0, then both ad e_α and ad f_α vanish on h, so τ_α(h) = h and (5.11) is true. Now h_α and ker α span h, so we need only check (5.11) for h = h_α, where it says that τ_α(h_α) = -h_α. But we have already verified this for the algebra sl(2). QED

We can also verify (5.11) directly. We have

    exp(ad e_α)(h) = h - α(h) e_α

for any h ∈ h. Now [f_α, e_α] = -h_α, so

    (ad f_α)^2(e_α) = [f_α, -h_α] = [h_α, f_α] = -2 f_α.

So

    exp(ad(-f_α)) exp(ad e_α)(h) = (id - ad f_α + (1/2)(ad f_α)^2)(h - α(h)e_α)
                                 = h - α(h)e_α - α(h)f_α - α(h)h_α + α(h)f_α
                                 = h - α(h)h_α - α(h)e_α.

If we now apply exp ad e_α to this last expression and use the fact that α(h_α) = 2, we get the right hand side of (5.11).

5.6 Bases.

Δ ⊂ Φ is called a Base if it is a basis of E (so #Δ = ℓ = dim_R E = dim_C h) and every β ∈ Φ can be written as

    β = Σ_α k_α α,  k_α ∈ Z,

with either all the coefficients k_α ≥ 0 or all ≤ 0. Roots are accordingly called positive or negative, and we define the height of a root by ht β := Σ_α k_α. Given a base, we get a partial order on E by defining λ ≻ μ if λ - μ is a sum of positive roots or zero.
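As an aside, returning to Proposition 16: the formula (5.11), and the group-level realization of τ_α, can be checked numerically for g = sl(2) itself, where τ_α is conjugation by the matrix (exp e)(exp(-f))(exp e). This is a sketch (Python with NumPy, not part of the original notes), with the standard basis h, e, f:

```python
import numpy as np

h = np.array([[1, 0], [0, -1]], dtype=float)
e = np.array([[0, 1], [0, 0]], dtype=float)
f = np.array([[0, 0], [1, 0]], dtype=float)

def exp_nilpotent(N, order=4):
    """exp(N) for a nilpotent matrix N, via the terminating power series."""
    out, term = np.eye(len(N)), np.eye(len(N))
    for k in range(1, order + 1):
        term = term @ N / k
        out = out + term
    return out

# tau_alpha = (exp ad e)(exp ad(-f))(exp ad e) acts as conjugation by
# g = (exp e)(exp -f)(exp e), which is a 90-degree rotation matrix
g = exp_nilpotent(e) @ exp_nilpotent(-f) @ exp_nilpotent(e)
tau = lambda x: g @ x @ np.linalg.inv(g)

print(tau(h))   # -h : on the Cartan, tau_alpha is h -> h - alpha(h) h_alpha
print(tau(e))   # -f : tau_alpha swaps the root spaces g_alpha and g_{-alpha}
```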
We have

    (α, β) ≤ 0 for α ≠ β ∈ Δ,    (5.12)

since otherwise (α, β) > 0 and

    s_α(β) = β - (2(β, α)/(α, α)) α

is a root with the coefficient of β equal to 1 > 0 and the coefficient of α negative, contradicting the definition, which says that a root must have all coefficients non-negative or all non-positive.

To construct a base, choose a γ ∈ E with (γ, β) ≠ 0 ∀ β ∈ Φ. Such an element is called regular. Then every root has positive or negative scalar product with γ, dividing the set of roots into two subsets:

    Φ = Φ+(γ) ∪ Φ-(γ),  Φ-(γ) = -Φ+(γ).

A root β ∈ Φ+ is called decomposable if β = β_1 + β_2 with β_1, β_2 ∈ Φ+, and indecomposable otherwise. Let Δ(γ) consist of the indecomposable elements of Φ+(γ).

Theorem 12 Δ(γ) is a base, and every base is of the form Δ(γ) for some γ.

Proof. Every β ∈ Φ+ can be written as a non-negative integer combination of elements of Δ(γ): otherwise choose one that can not be so written with (γ, β) as small as possible. In particular, β is not indecomposable. Write β = β_1 + β_2, β_i ∈ Φ+. Then

    (γ, β) = (γ, β_1) + (γ, β_2),

and hence (γ, β_1) < (γ, β) and (γ, β_2) < (γ, β). By our choice of β, this means β_1 and β_2 are non-negative integer combinations of elements of Δ(γ), and hence so is β, a contradiction.

Next, (5.12) holds for Δ = Δ(γ): if not, α - β is a root, so either α - β ∈ Φ+, in which case α = (α - β) + β is decomposable, or β - α ∈ Φ+ and β is decomposable.

This implies that Δ(γ) is linearly independent. For suppose Σ c_α α = 0; let the p_α be the positive coefficients and the -q_β the negative ones, so that

    Σ p_α α = Σ q_β β,

with all coefficients positive and no root occurring on both sides. Let ε be this common vector. Then

    (ε, ε) = Σ p_α q_β (α, β) ≤ 0,

so ε = 0. This is impossible unless all the coefficients vanish, since all scalar products with γ are strictly positive. Since the elements of Φ span E, this shows that Δ(γ) is a basis of E, and hence a base.
Now let us show that every base is of the desired form. For any base Δ, let Φ+ = Φ+(Δ) denote the set of those roots which are non-negative integral combinations of the elements of Δ, and let Φ- = Φ-(Δ) denote the ones which are non-positive integral combinations of elements of Δ. For α ∈ Δ, define δ_α to be the projection of α onto the orthogonal complement of the space spanned by the other elements of the base. Then

    (δ_α, α') = 0 for α' ≠ α, and (δ_α, α) = (δ_α, δ_α) > 0,

so γ := Σ r_α δ_α with all r_α > 0 satisfies

    (γ, α) > 0  ∀ α ∈ Δ.

Hence Φ+(Δ) ⊂ Φ+(γ) and Φ-(Δ) ⊂ Φ-(γ), and therefore

    Φ+(Δ) = Φ+(γ) and Φ-(Δ) = Φ-(γ).

Since every element of Φ+ can be written as a sum of elements of Δ with non-negative integer coefficients, the only indecomposable elements can be the elements of Δ; so Δ(γ) ⊂ Δ, and they must be equal since they have the same cardinality ℓ = dim E. QED

5.7 Weyl chambers.

For each root β, let P_β := {λ ∈ E | (λ, β) = 0} be the hyperplane perpendicular to β. Then E - ∪_β P_β is the union of the Weyl chambers, each consisting of the regular γ's determining the same Φ+. So the Weyl chambers are in one to one correspondence with the bases, and the Weyl group permutes them.

Fix a base Δ. Our goal in this section is to prove that the reflections s_α, α ∈ Δ, generate the Weyl group W, and that W acts simply transitively on the Weyl chambers.

Each s_α, α ∈ Δ, sends α ↦ -α. But acting on λ = Σ c_β β, the reflection s_α does not change the coefficient of any element of the base other than α. If λ ∈ Φ+ and λ ≠ α, we must have c_β > 0 for some β ≠ α in the base Δ. Then the coefficient of β in the expansion of s_α(λ) is positive, and hence all its coefficients must be non-negative. So s_α(λ) ∈ Φ+. In short, the only element of Φ+ sent into Φ- by s_α is α. So if

    δ := (1/2) Σ_{β∈Φ+} β,  then s_α δ = δ - α.

If β ∈ Φ+, β ∉ Δ, then we can not have (β, α) ≤ 0 ∀ α ∈ Δ, for then {β} ∪ Δ would be linearly independent (by the argument used in the proof of Theorem 12), which is impossible. So (β, α) > 0 for some α ∈ Δ; then β - α is a root, and since we have changed only one coefficient, it must be a positive root. Hence, by induction, any β ∈ Φ+ can be written as

    β = α_1 + ... + α_p,  α_j ∈ Δ,

where all the partial sums are positive roots.
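The construction of Δ(γ) in Theorem 12 is easy to run by machine. A sketch for the root system of sl(3) (Python with NumPy, not part of the original notes), with the roots ε_i - ε_j viewed as vectors in R^3 and γ an arbitrarily chosen regular vector:

```python
import numpy as np

# Roots of sl(3): eps_i - eps_j (i != j), as vectors in R^3
eye = np.eye(3)
roots = [tuple(eye[i] - eye[j]) for i in range(3) for j in range(3) if i != j]

gamma = np.array([2.0, 1.0, 0.0])   # regular: (gamma, beta) != 0 for all roots
positive = [r for r in roots if np.dot(gamma, r) > 0]

def decomposable(r):
    """Is r a sum of two positive roots?"""
    return any(tuple(np.add(a, b)) == r for a in positive for b in positive)

base = sorted(r for r in positive if not decomposable(r))
print(base)   # the two simple roots eps_0 - eps_1 and eps_1 - eps_2
```

With this γ the positive roots are ε_0 - ε_1, ε_1 - ε_2, ε_0 - ε_2, and only the first two are indecomposable, so Δ(γ) has cardinality ℓ = 2 = dim E, as the theorem asserts.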
Let γ be any vector in a Euclidean space, and let s_γ denote reflection in the hyperplane orthogonal to γ. Let R be any orthogonal transformation. Then

    s_{Rγ} = R s_γ R^{-1},    (5.13)

as follows immediately from the definition. Let α_1, ..., α_i ∈ Δ (not necessarily distinct), and, for short, let us write s_j := s_{α_j}.

Lemma 12 If s_1 ⋯ s_{i-1} α_i < 0, then there is a j, 1 ≤ j < i, such that

    s_1 ⋯ s_i = s_1 ⋯ s_{j-1} s_{j+1} ⋯ s_{i-1}.

Proof. Set β_{i-1} := α_i and β_j := s_{j+1} ⋯ s_{i-1} α_i for j < i - 1. Since β_{i-1} ∈ Φ+ and β_0 ∈ Φ-, there must be some j for which β_j ∈ Φ+ and s_j β_j = β_{j-1} ∈ Φ-, implying that β_j = α_j, since α_j is the only positive root sent negative by s_j. So by (5.13) with R = s_{j+1} ⋯ s_{i-1} we conclude that

    s_j = (s_{j+1} ⋯ s_{i-1}) s_i (s_{j+1} ⋯ s_{i-1})^{-1},

or

    s_j s_{j+1} ⋯ s_{i-1} = s_{j+1} ⋯ s_i,

implying the lemma. QED

As a consequence, if s = s_1 ⋯ s_i is a shortest expression for s, then s_1 ⋯ s_{i-1} α_i ∈ Φ+, and hence s α_i ∈ Φ-.

Keeping Δ fixed in the ensuing discussion, we will call the elements of Δ simple roots, and the corresponding reflections simple reflections. Let W' denote the subgroup of W generated by the simple reflections s_α, α ∈ Δ. (Eventually we will prove that this is all of W.)

It now follows that if s ∈ W' and sΔ = Δ, then s = id. Indeed, if s ≠ id, write s in a minimal fashion as a product of simple reflections; by what we have just proved, it must send some simple root into a negative root. So W' permutes the Weyl chambers without fixed points.

We now show that W' acts transitively on the Weyl chambers. Let γ ∈ E be a regular element. We claim there is an s ∈ W' with (s(γ), α) > 0 ∀ α ∈ Δ. Indeed, choose s ∈ W' so that (s(γ), δ) is as large as possible, where δ is half the sum of the positive roots. Then, for any α ∈ Δ,

    (s(γ), δ) ≥ (s_α s(γ), δ) = (s(γ), s_α δ) = (s(γ), δ) - (s(γ), α),

so (s(γ), α) ≥ 0 ∀ α ∈ Δ. We can't have equality in this last inequality, since s(γ) is not orthogonal to any root. This proves that W' acts transitively on all Weyl chambers, and hence on all bases.

We next claim that every root α belongs to at least one base. Choose a (non-regular) γ' with γ' ⊥ α but γ' ∉ P_β for all β ≠ ±α.
Then choose γ close enough to γ' so that (γ, α) > 0 while (γ, α) < |(γ, β)| for all β ≠ ±α. Then in Φ+(γ) the element α must be indecomposable, so α ∈ Δ(γ).

If β is any root, we have now shown that there is an s' ∈ W' with s'β = α_i ∈ Δ. By (5.13) this implies that every reflection s_β in W is conjugate by an element of W' to a simple reflection:

    s_β = s'^{-1} s_i s' ∈ W'.

Since W is generated by the s_β, this shows that W' = W.

5.8 Length.

Define the length ℓ(s) of an element of W as the minimal word length in its expression as a product of simple reflections. Define n(s) to be the number of positive roots made negative by s. We know that n(s) = ℓ(s) if ℓ(s) = 0 or 1. We claim that ℓ(s) = n(s) in general.

Proof, by induction on ℓ(s). Write s = s_1 ⋯ s_i in reduced form and let α = α_i. We have sα ∈ Φ-. Then n(s s_i) = n(s) - 1, since s_i leaves all positive roots positive except α. Also ℓ(s s_i) = ℓ(s) - 1. So apply induction. QED

Let C = C(Δ) be the Weyl chamber associated to the base Δ, and let C̄ denote its closure.

Lemma 13 If λ, μ ∈ C̄ and s ∈ W satisfies sλ = μ, then s is a product of simple reflections which fix λ. In particular, λ = μ. So C̄ is a fundamental domain for the action of W on E.

Proof. By induction on ℓ(s). If ℓ(s) = 0, then s = id and the assertion is clear, with the empty product. So we may assume that ℓ(s) > 0; then s sends some positive root to a negative root, and hence must send some simple root to a negative root. So let α ∈ Δ be such that sα ∈ Φ-. Since μ ∈ C̄, we have (μ, β) ≥ 0 ∀ β ∈ Φ+, and hence (μ, sα) ≤ 0. So

    0 ≥ (μ, sα) = (sλ, sα) = (λ, α) ≥ 0.

So (λ, α) = 0, so s_α λ = λ, and hence s s_α λ = μ. But n(s s_α) = n(s) - 1, since sα ∈ Φ- and s_α permutes all the positive roots other than α. So ℓ(s s_α) = ℓ(s) - 1, and we can apply induction to conclude that s = (s s_α) s_α is a product of simple reflections which fix λ. QED

5.9 Conjugacy of Borel subalgebras

We need only prove this for semi-simple algebras, since the radical is contained in every maximal solvable subalgebra.
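Before entering the Borel conjugacy argument: the identity ℓ(s) = n(s) of section 5.8 can be checked exhaustively for the Weyl group of sl(3), which is S3 permuting the three coordinates. A sketch (Python, standard library only, not part of the original notes); the simple reflections are taken to be the transpositions (0 1) and (1 2):

```python
from itertools import permutations

positive = [(0, 1), (1, 2), (0, 2)]   # positive roots eps_i - eps_j, i < j
simple = [(1, 0, 2), (0, 2, 1)]       # s1 = (0 1), s2 = (1 2)

def n(sigma):
    """Number of positive roots sent negative: sigma(i) > sigma(j)."""
    return sum(1 for (i, j) in positive if sigma[i] > sigma[j])

def length(sigma):
    """Minimal word length of sigma in the simple reflections, by BFS."""
    layer, seen, dist = {(0, 1, 2)}, {(0, 1, 2)}, 0
    while sigma not in layer:
        layer = {tuple(s[t[k]] for k in range(3))
                 for t in layer for s in simple} - seen
        seen |= layer
        dist += 1
    return dist

for sigma in permutations(range(3)):
    assert length(sigma) == n(sigma)
print("l(s) = n(s) for all six elements of W(A2)")
```

Here n(sigma) is just the number of inversions of the permutation, and the longest element (2, 1, 0) has ℓ = n = 3, the number of positive roots.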
Define a standard Borel subalgebra (relative to a choice of CSA h and a system of simple roots Δ) to be

b(Δ) := h ⊕ ⊕_{α ∈ Φ⁺} g_α.

Define the corresponding nilpotent Lie algebra by

n⁺(Δ) := ⊕_{α ∈ Φ⁺} g_α.

Since each s_α can be realized as (exp ad e_α)(exp ad(-f_α))(exp ad e_α), every element of W can be realized as an element of E(g). Hence all standard Borel subalgebras relative to a given Cartan subalgebra are conjugate.

Notice that if x normalizes a Borel subalgebra b, then [b + Cx, b + Cx] ⊂ b, and so b + Cx is a solvable subalgebra containing b and hence must coincide with b: N_g(b) = b. In particular, if x ∈ b then its semi-simple and nilpotent parts lie in b.

From now on, fix a standard BSA, b. We want to prove that any other BSA, b′, is conjugate to b. We may assume that the theorem is known for Lie algebras of smaller dimension, or for b′ with b ∩ b′ of greater dimension, since if dim b ∩ b′ = dim b, so that b′ ⊃ b, we must have b′ = b by maximality. Therefore we can proceed by downward induction on the dimension of the intersection b ∩ b′.

Suppose b ∩ b′ ≠ 0. Let n′ be the set of nilpotent elements in b ∩ b′, so n′ = n⁺ ∩ b′. Also [b ∩ b′, b ∩ b′] ⊂ n⁺ ∩ b′ = n′, so n′ is a nilpotent ideal in b ∩ b′.

Suppose that n′ ≠ 0. Then since g contains no solvable ideals, k := N_g(n′) ≠ g. Consider the action of n′ on b/(b ∩ b′). By Engel, there exists a y ∈ b, y ∉ b ∩ b′, with [x, y] ∈ b ∩ b′ for all x ∈ n′. But [x, y] ∈ [b, b] ⊂ n⁺, and so [x, y] ∈ n′. So y ∈ k. Thus y ∈ k ∩ b, y ∉ b ∩ b′. Similarly, we can interchange the roles of b and b′ in the above argument, replacing n⁺ by the nilpotent subalgebra [b′, b′] of b′, to conclude that there exists a y′ ∈ k ∩ b′, y′ ∉ b ∩ b′. In other words, the inclusions

k ∩ b ⊃ b ∩ b′,   k ∩ b′ ⊃ b ∩ b′

are strict. Both b ∩ k and b′ ∩ k are solvable subalgebras of k. Let c, c′ be BSA's of k containing them. By induction (k has smaller dimension than g), there is a σ ∈ E(k) ⊂ E(g) with σ(c′) = c. Now let b″ be a BSA of g containing c. We have

b″ ∩ b ⊃ c ∩ b ⊃ k ∩ b ⊃ b ∩ b′

with the last inclusion strict.
So by induction there is a τ ∈ E(g) with τ(b″) = b. Hence τσ(c′) ⊂ b. Then

b ∩ τσ(b′) ⊃ τσ(c′) ∩ τσ(b′) ⊃ τσ(b′ ∩ k) ⊃ τσ(b ∩ b′)

with the last inclusion strict. So by induction we can further conjugate τσ(b′) into b.

So we must now deal with the case that n′ = 0, but we will still assume that b ∩ b′ ≠ 0. Since any Borel subalgebra contains both the semi-simple and nilpotent parts of any of its elements, we conclude that b ∩ b′ consists entirely of semi-simple elements, and so is a toral subalgebra; call it t. If x ∈ b and t ∈ t with [x, t] ∈ t, then we must have [x, t] = 0, since [x, t] ∈ [b, b] is nilpotent while every element of t is semi-simple. So N_b(t) = C_b(t).

Let c be a CSA of C_b(t). Since a Cartan subalgebra is its own normalizer, we have t ⊂ c. So we have

t ⊂ c ⊂ C_b(t) = N_b(t).

Let t ∈ t and n ∈ N_b(c). Then [t, n] ∈ c, and successive brackets by t will eventually yield 0, since c is nilpotent. Thus (ad t)^k n = 0 for some k, and since t is semi-simple, [t, n] = 0. Thus n ∈ C_b(t), and hence n ∈ c, since c is its own normalizer in C_b(t). Thus c is a CSA of b. We can now apply the conjugacy theorem for CSA's of solvable algebras to conjugate c into h. So we may assume from now on that t ⊂ h.

If t = h, then decomposing b′ into root spaces under h, we find that the non-zero root spaces occurring in b′ must consist entirely of negative root spaces, and there must be at least one such, since b′ ≠ h. But then we can find a τ which conjugates this into a positive root space, preserving h, and then τ(b′) ∩ b has larger dimension and we can further conjugate into b.

So we may assume that the inclusion t ⊂ h is strict. If b′ ⊂ C_g(t), then, since we also have h ⊂ C_g(t), we can find a BSA, b″, of C_g(t) containing h, and conjugate b′ to b″, since we are assuming that t ≠ 0 and hence C_g(t) ≠ g. Since b″ ∩ b ⊃ h has bigger dimension than b′ ∩ b, we can further conjugate to b by the induction hypothesis.

If b′ ⊄ C_g(t), then there is a common non-zero eigenvector for ad t in b′, call it x.
So there is a t′ ∈ t such that [t′, x] = c′x, c′ ≠ 0. Setting t := (1/c′)t′, we have [t, x] = x. Let Φ₁ ⊂ Φ consist of those roots β for which β(t) is a positive rational number. Then

s := h ⊕ ⊕_{β ∈ Φ₁} g_β

is a solvable subalgebra, and so lies in a BSA, call it b″. Since t ⊂ b″ and x ∈ b″, we see that b″ ∩ b′ has strictly larger dimension than b ∩ b′. Also b″ ∩ b has strictly larger dimension than b ∩ b′, since h ⊂ b ∩ b″. So we can conjugate b′ to b″ and then b″ to b.

This leaves only the case b ∩ b′ = 0, which we will show is impossible. Let t be a maximal toral subalgebra of b′. We can not have t = 0, for then b′ would consist entirely of nilpotent elements, hence be nilpotent by Engel, and also self-normalizing, as is every BSA. Hence it would be a CSA, which is impossible since every CSA in a semi-simple Lie algebra is toral. So choose a CSA, h″, containing t, and then a standard BSA, b″, relative to h″. By the preceding, we know that b′ is conjugate to b″ and, in particular, has the same dimension as b″. But the dimension of each standard BSA (relative to any Cartan subalgebra) is strictly greater than half the dimension of g, so that dim b + dim b′ > dim g, contradicting b ∩ b′ = 0. QED

Chapter 6

The simple finite dimensional algebras.

In this chapter we classify all possible root systems of simple Lie algebras. A consequence, as we shall see, is the classification of the simple Lie algebras themselves. The amazing result, due to Killing with some repair work by Élie Cartan, is that with only five exceptions, the root systems of the classical algebras that we studied in Chapter III exhaust all possibilities.

The logical structure of this chapter is as follows: We first show that the root system of a simple Lie algebra is irreducible (definition below). We then develop some properties of the root structure of an irreducible root system; in particular, we will introduce its extended Cartan matrix.
We then use the Perron-Frobenius theorem to classify all possible such matrices. (For the expert, this means that we first classify the Dynkin diagrams of the affine algebras of the simple Lie algebras. Surprisingly, this is simpler and more efficient than the classification of the diagrams of the finite dimensional simple Lie algebras themselves.) From the extended diagrams it is an easy matter to get all possible bases of irreducible root systems. We then develop a few more facts about root systems which allow us to conclude that an isomorphism of irreducible root systems implies an isomorphism of the corresponding Lie algebras. We postpone the proof of the existence of the exceptional Lie algebras until Chapter VIII, where we prove Serre's theorem, which gives a unified presentation of all the simple Lie algebras in terms of generators and relations derived directly from the Cartan integers of the simple root system.

Throughout this chapter we will be dealing with semi-simple Lie algebras over the complex numbers.

6.1 Simple Lie algebras and irreducible root systems.

We choose a Cartan subalgebra h of a semi-simple Lie algebra g, so we have the corresponding set Φ of roots and the real (Euclidean) space E that they span. We say that Φ is irreducible if Φ can not be partitioned into two disjoint non-empty subsets Φ = Φ₁ ∪ Φ₂ such that every element of Φ₁ is orthogonal to every element of Φ₂.

Proposition 17 If g is simple then Φ is irreducible.

Proof. Suppose that Φ is not irreducible, so we have a decomposition as above. If α ∈ Φ₁ and β ∈ Φ₂ then

(α + β, α) = (α, α) > 0 and (α + β, β) = (β, β) > 0,

which means that α + β can not belong to either Φ₁ or Φ₂ and so is not a root. This means that [g_α, g_β] = 0.
In other words, the subalgebra g₁ of g generated by all the g_α, α ∈ Φ₁, is centralized by all the g_β, β ∈ Φ₂. So g₁ is a proper subalgebra of g, since if g₁ = g this would say that g has a non-zero center, which is not true for any semi-simple Lie algebra. The above equation also implies that the normalizer of g₁ contains all the g_γ, where γ ranges over all the roots. But these g_γ generate g. So g₁ is a proper ideal in g, contradicting the assumption that g is simple. QED

Let us choose a base Δ for the root system Φ of a semi-simple Lie algebra. We say that Δ is irreducible if we can not partition Δ into two non-empty mutually orthogonal sets as in the definition of irreducibility of Φ above.

Proposition 18 Φ is irreducible if and only if Δ is irreducible.

Proof. Suppose that Φ is not irreducible, so has a decomposition as above. This induces a partition of Δ, which is non-trivial unless Δ is wholly contained in Φ₁ or Φ₂. If Δ ⊂ Φ₁, say, then since E is spanned by Δ, this means that all the elements of Φ₂ are orthogonal to E, which is impossible. So if Δ is irreducible, so is Φ.

Conversely, suppose that Δ = Δ₁ ∪ Δ₂ is a partition of Δ into two non-empty mutually orthogonal subsets. We have proved that every root is conjugate to a simple root by an element of the Weyl group W, which is generated by the simple reflections. Let Φ₁ consist of those roots which are conjugate to an element of Δ₁, and let Φ₂ consist of those roots which are conjugate to an element of Δ₂. The reflections s_β, β ∈ Δ₂, commute with the reflections s_α, α ∈ Δ₁, and furthermore

s_β(α) = α

since (α, β) = 0. So any element of Φ₁ is conjugate to an element of Δ₁ by an element of the subgroup W₁ generated by the s_α, α ∈ Δ₁. But each such reflection adds or subtracts a multiple of some α ∈ Δ₁. So Φ₁ lies in the subspace E₁ of E spanned by Δ₁, and so is orthogonal to all the elements of Φ₂. So if Φ is irreducible, so is Δ. QED

We are now in the business of classifying irreducible bases.
6.2 The maximal root and the minimal root.

Suppose that Φ is an irreducible root system and Δ a base, so Δ is irreducible. Recall that once we have chosen Δ, every root β is an integer combination of the elements of Δ with all coefficients non-negative, or with all coefficients non-positive. We write β ≻ 0 in the first case, and β ≺ 0 in the second case. This defines a partial order on the elements of E: μ ≺ λ if and only if λ − μ is a non-negative integer combination of the elements of Δ.

Proposition 19 There exists a unique maximal root β = Σ k_α α relative to this order. It satisfies k_α > 0 for all α ∈ Δ, and (β, α) ≥ 0 for all α ∈ Δ, with (β, α) > 0 for at least one α ∈ Δ.

Proof. Choose a β = Σ k_α α which is maximal relative to the ordering. At least one of the k_α > 0. We claim that all the k_α > 0. Indeed, suppose not. This partitions Δ into Δ₁, the set of α for which k_α > 0, and Δ₂, the set of α for which k_α = 0. Now the scalar product of any two distinct simple roots is ≤ 0. (Recall that this followed from the fact that if (α₁, α₂) > 0, then s₂(α₁) = α₁ − ⟨α₁, α₂⟩α₂ would be a root whose α₁ coefficient is positive and whose α₂ coefficient is negative, which is impossible.) In particular,

(α₁, α₂) ≤ 0, α₁ ∈ Δ₁, α₂ ∈ Δ₂,

and so

(β, α₂) ≤ 0 for all α₂ ∈ Δ₂.

The irreducibility of Δ implies that (α₁, α₂) ≠ 0 for at least one pair α₁ ∈ Δ₁, α₂ ∈ Δ₂. But this scalar product must then be negative, so

(β, α₂) < 0

for this α₂, and hence

s₂β = β − ⟨β, α₂⟩α₂

is a root with s₂β ≻ β, contradicting the maximality of β. So we have proved that all the k_α are positive. Furthermore, this same argument shows that (β, α) ≥ 0 for all α ∈ Δ. Since the elements of Δ form a basis of E, at least one of the scalar products must not vanish, and so must be positive. We have established the second and third items in the proposition for any maximal β.

We will now show that the maximal root is unique. Suppose there were two, β and β′. Write β′ = Σ k′_α α, where all the k′_α > 0. Then (β, β′) > 0, since (β, α) ≥ 0 for all α and > 0 for at least one. Since s_β β′ is a root, this would imply that β − β′ is a root, unless β = β′.
But if β − β′ is a root, it is either positive or negative, contradicting the maximality of one or the other. QED

Let us label the elements of Δ as α₁, ..., α_ℓ, and let us set

α₀ := −β,

so that α₀ is the minimal root. From the second and third items in the proposition we know that

α₀ + k₁α₁ + ⋯ + k_ℓα_ℓ = 0    (6.3)

and that ⟨α₀, α_i⟩ ≤ 0 for all i, with < 0 for some i. Let us take the left hand side (call it γ) of (6.3) and successively compute ⟨γ, α_i⟩, i = 0, 1, ..., ℓ. We obtain

[ 2           ⟨α₁, α₀⟩    ⋯   ⟨α_ℓ, α₀⟩ ] [ 1  ]
[ ⟨α₀, α₁⟩    2           ⋯   ⟨α_ℓ, α₁⟩ ] [ k₁ ]
[   ⋮                            ⋮      ] [ ⋮  ]  =  0.
[ ⟨α₀, α_ℓ⟩   ⟨α₁, α_ℓ⟩   ⋯   2         ] [ k_ℓ ]

This means that if we write the matrix on the left of this equation as 2I − A, then A is a matrix with 0 on the diagonal and whose ij entry is −⟨α_j, α_i⟩. So A is a non-negative matrix with integer entries with the following properties:

• if A_ij ≠ 0 then A_ji ≠ 0,
• the diagonal entries of A are 0,
• A is irreducible, in the sense that we can not partition the indices into two non-empty subsets I and J such that A_ij = 0 for all i ∈ I, j ∈ J, and
• A has an eigenvector of eigenvalue 2 with all its entries positive.

We will show that the Perron-Frobenius theorem allows us to classify all such matrices. From here it is an easy matter to classify all irreducible root systems and then all simple Lie algebras. For this it is convenient to introduce the language of graph theory.

6.3 Graphs.

An undirected graph Γ = (N, E) consists of a set N (for us finite) and a subset E of the set of subsets of N of cardinality two. We call elements of N "nodes" or "vertices" and the elements of E "edges". If e = {i, j} ∈ E we say that the edge e joins the vertices i and j, or that "i and j are adjacent". Notice that in this definition our edges are "undirected": {i, j} = {j, i}, and we do not allow self-loops. An example of a graph is the "cycle" A_ℓ^(1) with ℓ + 1 vertices, so N = {0, 1, 2, ..., ℓ}, with 0 adjacent to ℓ and to 1, with 1 adjacent to 0 and to 2, etc.
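The four properties of A can be illustrated concretely. In the following sketch we take two affine examples, the cycle A₂^(1) and the diagram C₂^(1); the explicit matrices are our transcription of the diagrams, using the convention (introduced below) that a_ij is the number of lines pointing toward node i:

```python
import numpy as np

# A2^(1): the 3-cycle; all marks (eigenvector entries) equal 1.
A_cycle = np.array([[0, 1, 1],
                    [1, 0, 1],
                    [1, 1, 0]])
assert np.allclose(A_cycle @ np.ones(3), 2 * np.ones(3))

# C2^(1): a chain o=>o<=o with marks (1, 2, 1).
A_c2 = np.array([[0, 1, 0],
                 [2, 0, 2],
                 [0, 1, 0]])
marks = np.array([1, 2, 1])
assert np.allclose(A_c2 @ marks, 2 * marks)

# Equivalently, the vector of marks lies in the kernel of 2I - A,
# which is exactly equation (6.3) read row by row.
assert np.allclose((2 * np.eye(3) - A_c2) @ marks, 0)
print("marks are eigenvectors with eigenvalue 2")
```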
The adjacency matrix A of a graph Γ is the (symmetric) 0-1 matrix whose rows and columns are indexed by the elements of N and whose ij-th entry A_ij = 1 if i is adjacent to j and zero otherwise. For example, the adjacency matrix of the graph A₃^(1) is

[ 0 1 0 1 ]
[ 1 0 1 0 ]
[ 0 1 0 1 ]
[ 1 0 1 0 ].

We can think of A as follows: Let V be the vector space with basis given by the nodes, so we can think of the i-th coordinate of a vector x ∈ V as assigning the value x_i to the node i. Then y = Ax assigns to i the sum of the values x_j, summed over all nodes j adjacent to i.

A path of length r is a sequence of nodes x_{i₁}, x_{i₂}, ..., x_{i_r}, where each node is adjacent to the next. So, for example, the number of paths of length 2 joining i to j is the ij-th entry of A², and similarly, the number of paths of length r joining i to j is the ij-th entry of A^r.

The graph is said to be connected if there is a path (of some length) joining every pair of vertices. In terms of the adjacency matrix, this means that for every i and j there is some r such that the ij entry of A^r is non-zero. In terms of the theory of non-negative matrices (see below), this says that the matrix A is irreducible.

Notice that if 1 denotes the column vector all of whose entries are 1, then 1 is an eigenvector of the adjacency matrix of A_ℓ^(1) with eigenvalue 2, and all the entries of 1 are positive. In view of the Perron-Frobenius theorem to be stated below, this implies that 2 is the maximum eigenvalue of this matrix.

We modify the notion of the adjacency matrix as follows: We start with a connected graph Γ as before, but modify its adjacency matrix by replacing some of the ones that occur by positive integers a_ij. If, in this replacement, a_ij > 1, we redraw the graph so that there is an arrow with a_ij lines pointing towards the node i.
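A minimal computational illustration of these facts, using the 4-vertex cycle A₃^(1) from the text (the helper name `cycle_adjacency` is ours):

```python
import numpy as np

def cycle_adjacency(n):
    """Adjacency matrix of the cycle on n >= 3 vertices."""
    A = np.zeros((n, n), dtype=int)
    for i in range(n):
        A[i, (i + 1) % n] = A[(i + 1) % n, i] = 1
    return A

A = cycle_adjacency(4)  # the graph A_3^(1) from the text

# The all-ones vector is an eigenvector with eigenvalue 2.
assert np.allclose(A @ np.ones(4), 2 * np.ones(4))

# (A^r)_{ij} counts paths of length r from i to j: on a 4-cycle there are
# exactly 2 paths of length 2 from a vertex back to itself.
A2 = np.linalg.matrix_power(A, 2)
assert A2[0, 0] == 2

# Connectedness = irreducibility: for every i, j some power has (A^r)_{ij} > 0.
reach = sum(np.linalg.matrix_power(A, r) for r in range(1, 5))
assert (reach > 0).all()
print("cycle checks pass")
```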
For example, the graph labeled A₁^(1) in Table Aff 1 corresponds to the matrix

[ 0 2 ]
[ 2 0 ],

which clearly has (1, 1)ᵀ as a positive eigenvector with eigenvalue 2. Similarly, diagram A₂^(2) in Table Aff 2 corresponds to the matrix

[ 0 4 ]
[ 1 0 ],

which has (2, 1)ᵀ as eigenvector with eigenvalue 2. In the diagrams, the coefficient next to a node gives the corresponding coordinate of the eigenvector with eigenvalue 2, and it is immediate to check from the diagram that this is indeed an eigenvector with eigenvalue 2. For example, the 2 next to a node with an arrow pointing toward it in C_ℓ^(1) satisfies 2 · 2 = 2 · 1 + 2, etc.

It will follow from the Perron-Frobenius theorem, to be stated and proved below, that these are the only possible connected diagrams with maximal eigenvalue two.

All the graphs so far have zeros along the diagonal. If we relax this condition, and allow any non-negative integer on the diagonal, then the only new possibilities are those given in Figure 6.4.

Let us call a matrix A symmetrizable if A_ij ≠ 0 implies A_ji ≠ 0. The main result of this chapter will be to show that the lists in Figures 6.1-6.4 exhaust all irreducible matrices with non-negative integer entries which are symmetrizable and have maximum eigenvalue 2.

6.4 Perron-Frobenius.

We say that a real matrix T is non-negative (or positive) if all the entries of T are non-negative (or positive). We write T ≥ 0 or T > 0. We will use these definitions primarily for square (n × n) matrices and for column vectors ((n × 1) matrices). We let

Q := {x ∈ ℝⁿ : x ≥ 0, x ≠ 0},

so Q is the non-negative "orthant" excluding the origin. Also let

C := {x ≥ 0 : ‖x‖ = 1}.

So C is the intersection of the orthant with the unit sphere.

A non-negative square matrix T is called primitive if there is a k such that all the entries of T^k are positive. It is called irreducible if for any i, j there is a k = k(i, j) such that (T^k)_ij > 0. For example, as mentioned above, the adjacency matrix of a connected graph is irreducible.
Figure 6.1: Aff 1. (The diagrams A_ℓ^(1), B_ℓ^(1) (ℓ ≥ 3), C_ℓ^(1) (ℓ ≥ 2), D_ℓ^(1), E₆^(1), E₇^(1), E₈^(1), F₄^(1), G₂^(1); each node is labeled with the corresponding coordinate of the eigenvector with eigenvalue 2.)

Figure 6.2: Aff 2. (The diagrams A₂^(2), A_{2ℓ}^(2) (ℓ ≥ 2), A_{2ℓ-1}^(2), D_{ℓ+1}^(2), E₆^(2), with the same labeling convention.)

Figure 6.3: Aff 3. (The diagram D₄^(3), with labels 1, 2, 1.)

Figure 6.4: Loops allowed.

If T is irreducible then I + T is primitive. In this section we will assume that T is non-negative and irreducible.

Theorem 13 (Perron-Frobenius.)

1. T has a positive (real) eigenvalue λ_max such that all other eigenvalues λ of T satisfy |λ| ≤ λ_max.

2. λ_max has multiplicity one, and the corresponding eigenvector x can be chosen with x > 0.

3. Any non-negative eigenvector is a multiple of x.

4. More generally, if y ≥ 0, y ≠ 0 is a vector and μ is a number such that Ty ≤ μy, then y > 0 and μ ≥ λ_max, with μ = λ_max if and only if y is a multiple of x.

5. If 0 ≤ S ≤ T, S ≠ T, then every eigenvalue σ of S satisfies |σ| < λ_max.

6. In particular, all the diagonal minors T^(i) obtained from T by deleting the i-th row and column have eigenvalues all of which have absolute value < λ_max.

We will present a proof of this theorem after first showing how it classifies the possible connected diagrams with maximal eigenvalue two. But first let us clarify the meaning of the last two assertions of the theorem. The matrix T^(i) is usually thought of as an (n−1) × (n−1) matrix obtained by "striking out" the i-th row and column. But we can also consider the matrix T_i obtained from T by replacing the i-th row and column by all zeros. If x is an n-vector which is an eigenvector of T_i, then the (n−1)-vector y obtained from x by omitting the i-th entry of x is an eigenvector of T^(i) with the same eigenvalue (unless the vector x only had non-zero entries in the i-th position).
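Items 1, 2 and 6 can be illustrated on the adjacency matrix of the 4-cycle (a sketch; the variable names are ours):

```python
import numpy as np

# An irreducible non-negative matrix: adjacency matrix of the 4-cycle.
T = np.array([[0, 1, 0, 1],
              [1, 0, 1, 0],
              [0, 1, 0, 1],
              [1, 0, 1, 0]], dtype=float)

vals, vecs = np.linalg.eig(T)
lam_max = max(vals.real)  # equals 2 for the cycle
assert np.isclose(lam_max, 2.0)
assert all(abs(v) <= lam_max + 1e-9 for v in vals)  # item 1

# Item 2: the eigenvector for lam_max can be normalized to be positive
# (here it is proportional to the all-ones vector).
x = vecs[:, np.argmax(vals.real)].real
x = x / x[np.argmax(np.abs(x))]
assert (x > 0).all()

# Item 6: striking out a row and its column strictly lowers the top
# eigenvalue (the result is a path on 3 vertices, top eigenvalue sqrt(2)).
T0 = np.delete(np.delete(T, 0, axis=0), 0, axis=1)
assert max(np.linalg.eigvals(T0).real) < lam_max
print("Perron-Frobenius items 1, 2, 6 verified on the 4-cycle")
```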
Conversely, if y is an eigenvector of T^(i), then inserting 0 at the i-th position gives an n-vector which is an eigenvector of T_i with the same eigenvalue as that of y. More generally, suppose that S is obtained from T by replacing a certain number of rows and the corresponding columns by all zeros. Then we may apply item 5) of the theorem to this n × n matrix S, or to the "compressed version" of S obtained by eliminating all these rows and columns.

We will want to apply this to the following special case. A subgraph Γ′ of a graph Γ is the graph obtained by eliminating some nodes, and all edges emanating from these nodes. Thus, if A is the adjacency matrix of Γ and A′ is the adjacency matrix of Γ′, then A′ is obtained from A by striking out some rows and their corresponding columns. Thus if Γ is connected, so that we may apply the Perron-Frobenius theorem to A, and if Γ′ is a proper subgraph (so that we have actually deleted some rows and columns of A to obtain A′), then the maximum eigenvalue of A′ is strictly less than the maximum eigenvalue of A. Similarly, if an entry A_ij is > 1, the matrix A′ obtained from A by decreasing this entry while still keeping it positive will have a strictly smaller maximal eigenvalue.

We now apply this theorem to conclude that the diagrams listed in Figures Aff 1, Aff 2, and Aff 3 are all the possible connected diagrams with maximal eigenvalue two. A direct check shows that the vector whose coordinate at each node is the integer attached to that node in the figure is an eigenvector with eigenvalue 2. Perron-Frobenius then guarantees that 2 is the maximal eigenvalue. But now that we have shown that for each of these diagrams the maximal eigenvalue is two, any "larger" diagram must have maximal eigenvalue strictly greater than two, and any "smaller" diagram must have maximal eigenvalue strictly less than two.
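The two monotonicity facts just stated are easy to see numerically (a sketch; `lam_max` is our helper):

```python
import numpy as np

def lam_max(M):
    return max(np.linalg.eigvals(np.asarray(M, dtype=float)).real)

# A "smaller" diagram: a simple chain on 5 vertices (a proper subgraph of a
# cycle) has top eigenvalue 2 cos(pi/6) < 2.
chain = np.diag(np.ones(4), 1) + np.diag(np.ones(4), -1)
assert np.isclose(lam_max(chain), 2 * np.cos(np.pi / 6))
assert lam_max(chain) < 2

# A1^(1) itself has top eigenvalue exactly 2 ...
assert np.isclose(lam_max([[0, 2], [2, 0]]), 2.0)

# ... and increasing an entry (a "larger" diagram) pushes it above 2.
assert lam_max([[0, 3], [2, 0]]) > 2
print("monotonicity of the top eigenvalue verified")
```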
To get started, this argument shows that A₁^(1) is the only diagram for which there is a pair i, j with both a_ij and a_ji > 1. Indeed, if A were such a matrix, by striking out all but the i and j rows and columns, we would obtain a two by two matrix whose off-diagonal entries are both ≥ 2. If there were strict inequality anywhere, the maximum eigenvalue of this two by two matrix, and hence that of the original diagram, would have to be bigger than 2 by Perron-Frobenius. So other than A₁^(1), we may assume that if a_ij > 1 then a_ji = 1.

Since any diagram with some entry a_ij ≥ 4 must contain A₂^(2), we see that A₂^(2) is the only diagram with this property and with maximum eigenvalue 2. So other than this case, all a_ij ≤ 3.

A diagram with only two vertices and a triple bond has maximum eigenvalue strictly less than 2, since it is contained in G₂^(1) as a proper subdiagram. So any diagram with a triple bond must have at least three vertices. But then it must "contain" either G₂^(1) or D₄^(3). As both of these have maximal eigenvalue 2, it can not strictly contain either. So G₂^(1) and D₄^(3) are the only possibilities with a triple bond.

Since A_ℓ^(1), ℓ ≥ 2, is a cycle with maximum eigenvalue 2, no graph can contain a cycle without actually being a cycle, i.e. being A_ℓ^(1). On the other hand, a simple chain with only single bonds is contained in A_ℓ^(1), and so must have maximum eigenvalue strictly less than 2. So other than A_ℓ^(1), every candidate must contain at least one branch point or one double bond.

If the graph contains two double bonds, there are three possibilities as to the mutual orientation of the arrows: they could point toward one another as in C_ℓ^(1), away from one another as in D_{ℓ+1}^(2), or in the same direction as in A_{2ℓ}^(2). These are then the only possibilities for diagrams with two double bonds, as no diagram can strictly contain any of them.
Also, striking off one end vertex of C_ℓ^(1) yields a graph with one extreme vertex carrying a double bond, with the arrow pointing away from that vertex, and no branch points. Striking out one of the two vertices at the end opposite the double bond in B_ℓ^(1) yields a graph with one extreme vertex carrying a double bond with the arrow pointing toward that vertex. So either such diagram must have maximum eigenvalue < 2. Thus if there are no branch points, there must be at least one double bond and at least two vertices on either side of the double bond. The graph with exactly two vertices on either side is strictly contained in F₄^(1) and so is excluded. So there must be at least three vertices on one side and two on the other of the double bond. But then F₄^(1) and E₆^(2) exhaust the possibilities for one double bond and no branch points.

If there is a double bond and a branch point, then either the double bond points toward the branch, as in A_{2ℓ-1}^(2), or away from the branch, as in B_ℓ^(1). These then exhaust the possibilities for a diagram containing both a double bond and a branch point.

If there are two branch points, the diagram must contain D_ℓ^(1) and hence must coincide with D_ℓ^(1).

So we are left with the task of analyzing the possibilities for diagrams with no double bonds and a single branch point. Let m denote the minimum number of vertices on a leg of the branch (excluding the branch point itself). If m ≥ 2, then the diagram contains E₆^(1) and hence must coincide with E₆^(1). So we may assume that m = 1. If two of the legs have only one vertex, then the diagram is strictly contained in D_ℓ^(1) and hence excluded. So each of the two other legs has two or more vertices. If both of these legs have more than two vertices on them, the graph must contain, and hence coincide with, E₇^(1). We are left with the sole possibility that one of the legs emanating from the branch point has one vertex and a second leg has two vertices.
But then the graph either contains or is contained in E₈^(1), so E₈^(1) is the only such possibility.

We have completed the proof that the diagrams listed in Aff 1, Aff 2 and Aff 3 are the only diagrams without loops with maximum eigenvalue 2. If we allow loops, an easy extension of the above argument shows that the only new diagrams are the ones in the table "Loops allowed".

6.5 Classification of the irreducible Δ.

Notice that if we remove a vertex labeled 1 (and the bonds emanating from it) from any of the diagrams in Aff 2 or Aff 3, we obtain a diagram which can also be obtained by removing a vertex labeled 1 from one of the diagrams in Aff 1. (In the diagram so obtained we ignore the remaining labels.) Indeed, removing the right hand vertex labeled 1 from D₄^(3) yields A₂, which is obtained from A₂^(1) by removing a vertex. Removing the left vertex marked 1 gives G₂, the diagram obtained from G₂^(1) by removing the vertex marked 1. Removing a vertex from A₂^(2) gives A₁. Removing the vertex labeled 1 from A_{2ℓ}^(2) yields B_ℓ, obtained by removing one of the vertices labeled 1 from B_ℓ^(1). Removing a vertex labeled 1 from A_{2ℓ-1}^(2) yields D_ℓ or C_ℓ, removing a vertex labeled 1 from D_{ℓ+1}^(2) yields B_ℓ, and removing a vertex labeled 1 from E₆^(2) yields F₄ or C₄. Thus all irreducible Δ correspond to graphs obtained by removing a vertex labeled 1 from the table Aff 1. So we have classified all possible Dynkin diagrams of all irreducible Δ. They are given in the table labeled Dynkin diagrams.

6.6 Classification of the irreducible root systems.

It is useful to introduce here some notation due to Bourbaki: A subset Φ of a Euclidean space E is called a root system if the following axioms hold:

• Φ is finite, spans E, and does not contain 0.
• If α ∈ Φ then the only multiples of α which are in Φ are ±α.
• If α ∈ Φ then the reflection s_α in the hyperplane orthogonal to α sends Φ into itself.
• If α, β ∈ Φ then ⟨β, α⟩ ∈ ℤ.

Recall that

⟨β, α⟩ := 2(β, α)/(α, α),

so that the reflection s_α is given by

s_α(β) = β − ⟨β, α⟩α.

We have shown that each semi-simple Lie algebra gives rise to a root system, and we derived properties of the root system. If we go back to the various arguments, we will find that most of them apply to a "general" root system according to the above definition. The one place where we used Lie algebra arguments directly was in showing that if β ≠ ±α is a root, then the collection of j such that β + jα is a root forms an unbroken chain going from −r to q, where r − q = ⟨β, α⟩. For this we used the representation theory of sl(2). So we now pause to give an alternative proof of this fact based solely on the axioms above, and in the process derive some additional useful information about roots.

For any two non-zero vectors α and β in E, the cosine of the angle between them is given by ‖α‖‖β‖ cos θ = (α, β). So

⟨β, α⟩ = 2 (‖β‖/‖α‖) cos θ.

Interchanging the roles of α and β and multiplying gives

⟨β, α⟩⟨α, β⟩ = 4 cos² θ.

Figure 6.5: Dynkin diagrams. (The diagrams A_ℓ (ℓ ≥ 1), B_ℓ (ℓ ≥ 2), C_ℓ (ℓ ≥ 2), D_ℓ (ℓ ≥ 4), G₂, F₄, E₆, E₇, E₈.)

The right hand side is a non-negative integer between 0 and 4. So, assuming that α ≠ ±β and ‖β‖ ≥ ‖α‖, the possibilities are listed in the following table:

⟨α, β⟩   ⟨β, α⟩   θ       ‖β‖²/‖α‖²
0        0        π/2     undetermined
1        1        π/3     1
−1       −1       2π/3    1
1        2        π/4     2
−1       −2       3π/4    2
1        3        π/6     3
−1       −3       5π/6    3

Proposition 20 If α ≠ ±β and (α, β) > 0, then α − β is a root. If (α, β) < 0, then α + β is a root.

Proof. The second assertion follows from the first by replacing β by −β. So we need to prove the first assertion. From the table, one or the other of ⟨β, α⟩ or ⟨α, β⟩ equals one. So either s_α β = β − α is a root, or s_β α = α − β is a root. But roots occur along with their negatives, so in either event α − β is a root. QED

Proposition 21 Suppose that α ≠ ±β are roots.
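The rows of the table can be verified mechanically. The following sketch checks that each row satisfies ⟨α, β⟩⟨β, α⟩ = 4 cos²θ, and that the quotient ⟨β, α⟩/⟨α, β⟩ reproduces the length ratio:

```python
import numpy as np

# Each non-orthogonal row of the table: (<a,b>, <b,a>, theta, |b|^2/|a|^2).
rows = [(1, 1, np.pi / 3, 1),
        (-1, -1, 2 * np.pi / 3, 1),
        (1, 2, np.pi / 4, 2),
        (-1, -2, 3 * np.pi / 4, 2),
        (1, 3, np.pi / 6, 3),
        (-1, -3, 5 * np.pi / 6, 3)]

for ab, ba, theta, ratio in rows:
    # <b,a><a,b> = 4 cos^2(theta)
    assert np.isclose(ab * ba, 4 * np.cos(theta) ** 2)
    # <b,a> = 2(b,a)/(a,a) and <a,b> = 2(a,b)/(b,b), so their quotient
    # is the squared length ratio |b|^2/|a|^2.
    assert np.isclose(ba / ab, ratio)
print("all table rows consistent")
```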
Let r be the largest integer such that β − rα is a root, and let q be the largest integer such that β + qα is a root. Then β + iα is a root for all −r ≤ i ≤ q. Furthermore, r − q = ⟨β, α⟩, so in particular |q − r| ≤ 3.

Proof. Suppose the string were broken. Then we could find p and s with −r ≤ p < s ≤ q such that β + pα is a root but β + (p + 1)α is not, and β + sα is a root but β + (s − 1)α is not. The preceding proposition then implies that

(β + pα, α) ≥ 0 while (β + sα, α) ≤ 0,

which is impossible, since p < s and (α, α) > 0.

Now s_α adds a multiple of α to any root, and so preserves the string of roots β − rα, β − (r − 1)α, ..., β + qα. Furthermore,

s_α(β + iα) = β − (⟨β, α⟩ + i)α,

so s_α reverses the order of the string. In particular, s_α(β + qα) = β − rα. The left hand side is β − (⟨β, α⟩ + q)α, so r − q = ⟨β, α⟩, as stated in the proposition. QED

We can now apply all the preceding definitions and arguments to conclude that the Dynkin diagrams above classify all the irreducible bases Δ of root systems. Since every root is conjugate to a simple root, we can use the Dynkin diagrams to conclude that in an irreducible root system, either all roots have the same length (cases A, D, E) or there are two root lengths (the remaining cases). Furthermore, if β denotes a long root and α a short root, the ratio ‖β‖²/‖α‖² is 2 in the cases B, C, and F₄, and 3 in the case G₂.

Proposition 22 In an irreducible root system, the Weyl group W acts irreducibly on E. In particular, the W-orbit of any root spans E.

Proof. Let E′ be a proper invariant subspace, and let E″ denote its orthogonal complement, so E = E′ ⊕ E″. For any root α, if e ∈ E′ then s_α e = e − ⟨e, α⟩α ∈ E′. So either (e, α) = 0 for all e ∈ E′, in which case α ∈ E″, or α ∈ E′. Since the roots span E, they can't all belong to the same proper subspace. This contradicts the irreducibility.
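Proposition 21 can be confirmed exhaustively in a small root system. The following sketch (our own; it realizes B₂ by explicit vectors) walks every α-string and checks that it is unbroken with r − q = ⟨β, α⟩:

```python
import numpy as np

# The root system B2: short roots ±e1, ±e2 and long roots ±e1±e2.
roots = [np.array(v, dtype=float) for v in
         [(1, 0), (-1, 0), (0, 1), (0, -1),
          (1, 1), (1, -1), (-1, 1), (-1, -1)]]

def is_root(v):
    return any(np.allclose(v, r) for r in roots)

def cartan(b, a):
    """The integer <b, a> = 2(b,a)/(a,a)."""
    return 2 * (b @ a) / (a @ a)

# For every pair beta != ±alpha, walk the alpha-string through beta and
# check that it is unbroken and that r - q = <beta, alpha>.
for a in roots:
    for b in roots:
        if np.allclose(b, a) or np.allclose(b, -a):
            continue
        r = 0
        while is_root(b - (r + 1) * a):
            r += 1
        q = 0
        while is_root(b + (q + 1) * a):
            q += 1
        assert all(is_root(b + i * a) for i in range(-r, q + 1))
        assert np.isclose(r - q, cartan(b, a))
print("root strings in B2 verified")
```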
QED

Proposition 23 If there are two distinct root lengths in an irreducible root system, then all roots of the same length are conjugate under the Weyl group. Also, the maximal root is long.

Proof. Suppose that α and β have the same length. By the preceding proposition we can find a Weyl group element w such that wβ is not orthogonal to α. So we may assume that (β, α) ≠ 0. Since α and β have the same length, the table above shows that ⟨β, α⟩ = ±1. Replacing β by −β = s_β β if necessary, we may assume that ⟨β, α⟩ = 1. Then

(s_β s_α s_β)(α) = (s_β s_α)(α − β) = s_β(−α − β + α) = s_β(−β) = β.

QED

Let (E, Φ) and (E′, Φ′) be two root systems. We say that a linear map f : E → E′ is an isomorphism from the root system (E, Φ) to the root system (E′, Φ′) if f is a linear isomorphism of E onto E′ with f(Φ) = Φ′ and

⟨f(β), f(α)⟩ = ⟨β, α⟩

for all α, β ∈ Φ.

Theorem 14 Let Δ = {α₁, ..., α_ℓ} be a base of Φ. Suppose that (E′, Φ′) is a second root system with base Δ′ = {α′₁, ..., α′_ℓ}, and that

⟨α′_i, α′_j⟩ = ⟨α_i, α_j⟩, 1 ≤ i, j ≤ ℓ.

Then the bijection α_i ↦ α′_i extends to a unique isomorphism f : (E, Φ) → (E′, Φ′). In other words, the Cartan matrix A of Δ determines Φ up to isomorphism. In particular, the Dynkin diagrams characterize all possible irreducible root systems.

Proof. Since Δ is a basis of E and Δ′ is a basis of E′, the map α_i ↦ α′_i extends to a unique linear isomorphism f of E onto E′. The equality in the theorem implies that for α, β ∈ Δ we have

s_{f(α)} f(β) = f(β) − ⟨f(β), f(α)⟩f(α) = f(s_α β).

Since the Weyl groups are generated by these simple reflections, this implies that the map

w ↦ f ∘ w ∘ f^{-1}

is an isomorphism of W onto W′. Every β ∈ Φ is of the form w(α), where w ∈ W and α is a simple root. Thus

f(β) = (f ∘ w ∘ f^{-1})(f(α)) ∈ Φ′,

so f(Φ) = Φ′. Since s_α(β) = β − ⟨β, α⟩α, the number ⟨β, α⟩ is determined by the reflection s_α acting on β. But then the corresponding formula for Φ′, together with the fact that s_{f(α)} = f ∘ s_α ∘ f^{-1}, implies that

⟨f(β), f(α)⟩ = ⟨β, α⟩.
QED

6.7 The classification of the possible simple Lie algebras.

Suppose that $(g, h)$ is a pair consisting of a semi-simple Lie algebra $g$ and a Cartan subalgebra $h$. This determines the corresponding Euclidean space $E$ and root system $\Phi$. Suppose we have a second such pair $(g', h')$. We would like to show that an isomorphism of $(E, \Phi)$ with $(E', \Phi')$ determines a Lie algebra isomorphism of $g$ with $g'$. This would then imply that the Dynkin diagrams classify all possible simple Lie algebras. We would still be left with the problem of showing that the exceptional Lie algebras exist. We will defer this until Chapter VIII, where we prove Serre's theorem, which gives a direct construction of all the simple Lie algebras in terms of generators and relations determined by the Cartan matrix.

We need a few preliminaries.

Proposition 24 Every positive root can be written as a sum of simple roots in such a way that every partial sum is again a root.

Proof. By induction (on, say, the height) it is enough to prove that for every positive root $\beta$ which is not simple, there is a simple root $\alpha$ such that $\beta - \alpha$ is a root. We can not have $(\beta, \alpha) \le 0$ for all $\alpha \in \Delta$, for this would imply that the set $\{\beta\} \cup \Delta$ is independent (by the same method that we used to prove that $\Delta$ is independent). So $(\beta, \alpha) > 0$ for some $\alpha \in \Delta$, and so $\beta - \alpha$ is a root. Since $\beta$ is not simple, its height is at least two, and so subtracting $\alpha$ does not yield zero or a negative root; hence $\beta - \alpha$ is positive. QED

Proposition 25 Let $(g, h)$ be a semi-simple Lie algebra with a choice of Cartan subalgebra, let $\Phi$ be the corresponding root system, and let $\Delta$ be a base. Then $g$ is generated as a Lie algebra by the subspaces $g_\alpha,\ g_{-\alpha},\ \alpha \in \Delta$.

Proof. From the representation theory of sl(2), we know that $[g_\alpha, g_\beta] = g_{\alpha+\beta}$ if $\alpha + \beta$ is a root. Thus from the preceding proposition, we can successively obtain all the $g_\beta$ for $\beta$ positive by bracketing the $g_\alpha$, $\alpha \in \Delta$. Similarly we can get all the $g_\beta$ for $\beta$ negative from the $g_{-\alpha}$.
So we can get all the root spaces. But $[g_\alpha, g_{-\alpha}] = \mathbb{C}h_\alpha$, so we can get all of $h$, and the decomposition $g = h \oplus \bigoplus_{\phi\in\Phi}g_\phi$ shows that we obtain all of $g$. QED

Theorem. Suppose that $f$ is an isomorphism of the root system $(E,\Phi)$ of $(g,h)$ with the root system $(E',\Phi')$ of $(g',h')$, extended to a map $h^* \to h'^*$ via complexification. Let $f: h \to h'$ also denote the corresponding isomorphism on the Cartan subalgebras obtained by identifying $h$ and $h'$ with their duals using the Killing form. Fix a base $\Delta$ of $\Phi$ and the corresponding base $\Delta' = f(\Delta)$ of $\Phi'$. Choose $0 \ne x_\alpha \in g_\alpha$ for each $\alpha \in \Delta$, and $0 \ne x'_{\alpha'} \in g'_{\alpha'}$ for each $\alpha' \in \Delta'$. Extend $f$ to a linear map
$$f: h \oplus \bigoplus_{\alpha\in\Delta}g_\alpha \to h' \oplus \bigoplus_{\alpha'\in\Delta'}g'_{\alpha'}$$
by $f(x_\alpha) := x'_{f(\alpha)}$. Then $f$ extends to a unique isomorphism of $g \to g'$.

Proof. The uniqueness is easy: given $x_\alpha$ there is a unique $y_\alpha \in g_{-\alpha}$ for which $[x_\alpha, y_\alpha] = h_\alpha$, so $f$, if it exists, is determined on the $y_\alpha$, and hence on all of $g$, since the $x_\alpha$ and $y_\alpha$ generate $g$ by the preceding proposition.

To prove the existence, we will construct the graph of this isomorphism. That is, we will construct a subalgebra $k$ of $g \oplus g'$ whose projections onto the first and onto the second factor are isomorphisms. Use the $x_\alpha$ and $y_\alpha$ as above, with the corresponding elements $x'_{\alpha'}$ and $y'_{\alpha'}$ in $g'$. Let $\bar{x}_\alpha := x_\alpha \oplus x'_{\alpha'} \in g \oplus g'$, and similarly define $\bar{y}_\alpha := y_\alpha \oplus y'_{\alpha'}$ and $\bar{h}_\alpha := h_\alpha \oplus h'_{\alpha'}$. Let $\beta$ be the (unique) maximal root of $g$, and choose $x \in g_\beta$. Make a similar choice of $x' \in g'_{\beta'}$, where $\beta'$ is the maximal root of $g'$. Set $\bar{x} := x \oplus x'$. Let $m \subset g \oplus g'$ be the subspace spanned by all the
$$\operatorname{ad}\bar{y}_{\alpha_1}\cdots\operatorname{ad}\bar{y}_{\alpha_t}\,\bar{x}.$$
The element $\operatorname{ad}y_{\alpha_1}\cdots\operatorname{ad}y_{\alpha_t}\,x$ belongs to $g_{\beta - (\alpha_1+\cdots+\alpha_t)}$, and similarly for $g'$, so $m \cap (g_\beta \oplus g'_{\beta'})$ is one dimensional. In particular $m$ is a proper subspace of $g \oplus g'$. Let $k$ denote the subalgebra of $g \oplus g'$ generated by the $\bar{x}_\alpha$, the $\bar{y}_\alpha$ and the $\bar{h}_\alpha$.

We claim that $[k, m] \subset m$. Indeed, it is enough to prove that $m$ is invariant under the adjoint action of the generators of $k$. For the $\operatorname{ad}\bar{y}_\alpha$ this follows from the definition. For the $\operatorname{ad}\bar{h}_\alpha$ we use the fact that $[h, y_\gamma] = -\gamma(h)y_\gamma$ to move the $\operatorname{ad}\bar{h}_\alpha$ past all the $\operatorname{ad}\bar{y}$ at the cost of introducing a scalar multiple, while
$$\operatorname{ad}\bar{h}_\alpha\,\bar{x} = \beta(h_\alpha)x \oplus \beta'(h'_{\alpha'})x' = \langle\beta,\alpha\rangle\bar{x}$$
because $f$ is an isomorphism of root systems.
Finally, $[\bar{x}_{\alpha_1}, \bar{y}_{\alpha_2}] = 0$ if $\alpha_1 \ne \alpha_2 \in \Delta$, since $\alpha_1 - \alpha_2$ is not a root. On the other hand, $[\bar{x}_\alpha, \bar{y}_\alpha] = \bar{h}_\alpha$. So we can move the $\operatorname{ad}\bar{x}_\alpha$ past the $\operatorname{ad}\bar{y}_{\alpha_i}$ at the expense of introducing an $\operatorname{ad}\bar{h}_\alpha$ every time $\alpha_i = \alpha$. Now $\alpha + \beta$ is not a root, since $\beta$ is the maximal root. So $[\bar{x}_\alpha, \bar{x}] = 0$. Thus $\operatorname{ad}\bar{x}_\alpha\,m \subset m$, and we have proved that $[k, m] \subset m$. But since $m$ is a proper subspace of $g \oplus g'$, this implies that $k$ is a proper subalgebra, since otherwise $m$ would be a proper ideal, and the only proper ideals in $g \oplus g'$ are $g$ and $g'$.

Now the subalgebra $k$ can not contain any element of the form $z \oplus 0$, $z \ne 0$, for if it did, it would have to contain all of the elements of the form $u \oplus 0$: we could repeatedly apply the $\operatorname{ad}\bar{x}_\alpha$'s until we reached the maximal root space and then get all of $g \oplus 0$, which would mean that $k$ would also contain all of $0 \oplus g'$ and hence all of $g \oplus g'$, which we know not to be the case. Similarly $k$ can not contain any element of the form $0 \oplus z'$. So the projections of $k$ onto $g$ and onto $g'$ are linear isomorphisms. By construction they are Lie algebra homomorphisms. Hence the inverse of the projection of $k$ onto $g$ followed by the projection of $k$ onto $g'$ is a Lie algebra isomorphism of $g$ onto $g'$. By construction it sends $x_\alpha$ to $x'_{\alpha'}$ and $h_\alpha$ to $h'_{\alpha'}$, and so is an extension of $f$. QED

Chapter 7

Cyclic highest weight modules.

In this chapter, $g$ will denote a semi-simple Lie algebra for which we have chosen a Cartan subalgebra $h$ and a base $\Delta$ for the roots $\Phi = \Phi^+ \cup \Phi^-$ of $g$. We will be interested in describing its finite dimensional irreducible representations. If $W$ is a finite dimensional module for $g$, then $h$ has at least one simultaneous eigenvector; that is, there is a $\mu \in h^*$ and a $w \ne 0 \in W$ such that
$$hw = \mu(h)w \quad \forall h \in h. \tag{7.1}$$
The linear function $\mu$ is called a weight and the vector $w$ is called a weight vector. If $x \in g_\alpha$, then
$$hxw = [h, x]w + xhw = (\mu + \alpha)(h)xw.$$
This shows that the space spanned by all vectors $w$ satisfying an equation of the type (7.1) (for varying $\mu$) is an invariant subspace. If $W$ is irreducible, then the weight vectors (those satisfying an equation of the type (7.1)) must span all of $W$. Furthermore, since $W$ is finite dimensional, there must be a vector $v$ and a linear function $\lambda$ such that
$$hv = \lambda(h)v \quad \forall h \in h, \qquad e_\alpha v = 0 \quad \forall \alpha \in \Phi^+. \tag{7.2}$$
Using irreducibility again, we conclude that $W = U(g)v$. The module is cyclic, generated by $v$. In fact we can be more precise: Let $h_1,\dots,h_\ell$ be the basis of $h$ corresponding to the choice of simple roots, and let $e_i \in g_{\alpha_i}$, $f_i \in g_{-\alpha_i}$, where $\alpha_1,\dots,\alpha_m$ are all the positive roots. (We can choose them so that each $e_i$ and $f_i$ generate a little sl(2).) Then
$$g = n_- \oplus h \oplus n_+,$$
where $e_1,\dots,e_m$ is a basis of $n_+$, where $h_1,\dots,h_\ell$ is a basis of $h$, and $f_1,\dots,f_m$ is a basis of $n_-$. The Poincaré-Birkhoff-Witt theorem says that the monomials of the form
$$f_1^{i_1}\cdots f_m^{i_m}\,h_1^{j_1}\cdots h_\ell^{j_\ell}\,e_1^{k_1}\cdots e_m^{k_m}$$
form a basis of $U(g)$. Here we have chosen to place all the $e$'s to the extreme right, with the $h$'s in the middle and the $f$'s to the left. It now follows that the elements $f_1^{i_1}\cdots f_m^{i_m}v$ span $W$. Every such element, if non-zero, is a weight vector with weight $\lambda - (i_1\alpha_1 + \cdots + i_m\alpha_m)$. Recall that $\mu \prec \lambda$ means that
$$\lambda - \mu = \sum k_i\alpha_i,$$
where the $k_i$ are non-negative integers. We have shown that every weight $\mu$ of $W$ satisfies $\mu \prec \lambda$.

So we make the definition: A cyclic highest weight module for $g$ is a module (not necessarily finite dimensional) which has a vector $v^+$ such that
$$x^+v^+ = 0 \quad \forall x^+ \in n_+, \qquad hv^+ = \lambda(h)v^+ \quad \forall h \in h, \qquad V = U(g)v^+.$$
In any such cyclic highest weight module every submodule is a direct sum of its weight spaces (by van der Monde). The weight spaces $V_\mu$ all satisfy $\mu \prec \lambda$, and we have $V = \bigoplus V_\mu$. Any proper submodule can not contain the highest weight vector, and so the sum of two proper submodules is again a proper submodule.
Hence any such $V$ has a unique maximal submodule and hence a unique irreducible quotient. The quotient of any highest weight module by an invariant submodule, if not zero, is again a cyclic highest weight module with the same highest weight.

7.1 Verma modules.

There is a "biggest" cyclic highest weight module associated with any $\lambda \in h^*$, called the Verma module. It is defined as follows: Let us set
$$b := h \oplus n_+.$$
Given any $\lambda \in h^*$, let $\mathbb{C}_\lambda$ denote the one dimensional vector space $\mathbb{C}$ with basis $z^+$ and with the action of $b$ given by
$$(h + x)z^+ := \lambda(h)z^+, \quad h \in h,\ x \in n_+.$$
So it is a left $U(b)$ module. By the Poincaré-Birkhoff-Witt theorem, $U(g)$ is a free right $U(b)$ module with basis $\{f_1^{i_1}\cdots f_m^{i_m}\}$, and so we can form the Verma module
$$\mathrm{Verm}(\lambda) := U(g) \otimes_{U(b)} \mathbb{C}_\lambda,$$
which is a cyclic module with highest weight vector $v^+ := 1 \otimes z^+$.

Furthermore, any two irreducible cyclic highest weight modules with the same highest weight are isomorphic. Indeed, if $V$ and $W$ are two such, with highest weight vectors $v^+$ and $w^+$, consider $V \oplus W$, which has $(v^+, w^+)$ as a maximal weight vector with weight $\lambda$; hence $Z := U(g)(v^+, w^+)$ is cyclic and of highest weight $\lambda$. Projections onto the first and second factors give non-zero homomorphisms, which must be surjective. But $Z$ has a unique irreducible quotient. Hence these projections must induce isomorphisms on this quotient, and $V$ and $W$ are isomorphic. Hence, up to isomorphism, there is a unique irreducible cyclic highest weight module with highest weight $\lambda$. We call it $\mathrm{Irr}(\lambda)$. In short, we have constructed a "largest" highest weight module $\mathrm{Verm}(\lambda)$ and a "smallest" highest weight module $\mathrm{Irr}(\lambda)$.

7.2 When is dim Irr(λ) < ∞?

Suppose that $\mathrm{Irr}(\lambda)$ is finite dimensional. Applying the representation theory of sl(2) to each little sl(2) spanned by $e_i, h_i, f_i$, each $\lambda(h_i)$ must be a non-negative integer; that is, $\lambda$ is integral and $\langle\lambda, \alpha_i\rangle \ge 0$ for all $i$, so that $\lambda$ lies in the closure of the fundamental Weyl chamber. Such a weight is called dominant. So a necessary condition for $\mathrm{Irr}(\lambda)$ to be finite dimensional is that $\lambda$ be dominant integral. We now show that, conversely, $\mathrm{Irr}(\lambda)$ is finite dimensional whenever $\lambda$ is dominant integral.
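The sl(2) prototype of the argument that follows can be checked directly with matrices. Below is a minimal sketch in plain Python (all helper names such as `sl2_irrep` are ours, not the text's) that builds the $(n+1)$-dimensional irreducible sl(2) module with highest weight $n$ and verifies the highest weight relations $hv^+ = nv^+$, $ev^+ = 0$, $[e,f] = h$ and $f^{n+1} = 0$.

```python
# The (n+1)-dimensional irreducible sl(2) module with highest weight n:
# basis v_0, ..., v_n, where v_0 is the highest weight vector v+.
# Conventions: h v_k = (n-2k) v_k,  f v_k = v_{k+1},  e v_k = k(n-k+1) v_{k-1}.
def sl2_irrep(n):
    N = n + 1
    h = [[(n - 2*k) if i == k else 0 for k in range(N)] for i in range(N)]
    f = [[1 if i == k + 1 else 0 for k in range(N)] for i in range(N)]
    e = [[k*(n - k + 1) if i == k - 1 else 0 for k in range(N)] for i in range(N)]
    return e, h, f

def matmul(a, b):
    return [[sum(a[i][k]*b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def matsub(a, b):
    return [[a[i][j] - b[i][j] for j in range(len(a[0]))] for i in range(len(a))]

def matpow(a, p):
    r = [[1 if i == j else 0 for j in range(len(a))] for i in range(len(a))]
    for _ in range(p):
        r = matmul(r, a)
    return r

apply = lambda m, v: [sum(m[i][j]*v[j] for j in range(len(v))) for i in range(len(m))]

n = 3
e, h, f = sl2_irrep(n)
vplus = [1] + [0]*n                                  # the highest weight vector

assert apply(h, vplus) == [n] + [0]*n                # h v+ = n v+
assert apply(e, vplus) == [0]*(n + 1)                # e v+ = 0
assert matsub(matmul(e, f), matmul(f, e)) == h       # [e, f] = h
assert all(all(x == 0 for x in row) for row in matpow(f, n + 1))   # f^{n+1} = 0
```

The last assertion is exactly the statement $f^{\lambda(h)+1}v^+ = 0$ that drives the finite dimensionality argument below.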
For this we recall that in the universal enveloping algebra $U(g)$ we have

1. $[e_j, f_i^{k+1}] = 0$ if $i \ne j$,
2. $[h_j, f_i^{k+1}] = -(k+1)\alpha_i(h_j)f_i^{k+1}$,
3. $[e_i, f_i^{k+1}] = -(k+1)f_i^k(k - h_i)$,

where the first two equations are consequences of the fact that ad is a derivation, together with $[e_i, f_j] = 0$ if $i \ne j$ (since $\alpha_i - \alpha_j$ is not a root) and $[h_i, f_j] = -\alpha_j(h_i)f_j$. The last is the fact about sl(2) which we proved in Chapter II.

Notice that it follows from 1.) that $e_j(f_i^kv^+) = 0$ for all $k$ and all $i \ne j$, and from 3.) that $e_if_i^{\lambda(h_i)+1}v^+ = 0$, so that $f_i^{\lambda(h_i)+1}v^+$ is a maximal weight vector. If it were non-zero, the cyclic module it generates would be a proper submodule of $\mathrm{Irr}(\lambda)$, contradicting the irreducibility. Hence
$$f_i^{\lambda(h_i)+1}v^+ = 0.$$
So for each $i$ the subspace spanned by $v^+, f_iv^+, \dots, f_i^{\lambda(h_i)}v^+$ is a finite dimensional sl(2)$_i$ module. In particular, $\mathrm{Irr}(\lambda)$ contains some finite dimensional sl(2)$_i$ modules. Let $V'$ denote the sum of all such. If $X$ is a finite dimensional sl(2)$_i$ module, then $gX + X$ is again finite dimensional, and it is an sl(2)$_i$ module, since $e_i(gX) \subset [e_i, g]X + g(e_iX) \subset gX$, and similarly for $f_i$ and $h_i$. Hence $V'$ is $g$-stable, hence all of $\mathrm{Irr}(\lambda)$. In particular, the $e_i$ and the $f_i$ act as locally nilpotent operators on $\mathrm{Irr}(\lambda)$. So the operators
$$\tau_i := (\exp e_i)(\exp -f_i)(\exp e_i)$$
are well defined, and $\tau_i(\mathrm{Irr}(\lambda)_\mu) = \mathrm{Irr}(\lambda)_{s_i\mu}$, so
$$\dim \mathrm{Irr}(\lambda)_{w\mu} = \dim \mathrm{Irr}(\lambda)_\mu \quad \forall w \in W, \tag{7.3}$$
where $W$ denotes the Weyl group. These weight spaces are all finite dimensional: indeed, their dimension is at most the corresponding dimension in the Verma module $\mathrm{Verm}(\lambda)$, since $\mathrm{Irr}(\lambda)_\mu$ is a quotient space of $\mathrm{Verm}(\lambda)_\mu$. But $\mathrm{Verm}(\lambda)_\mu$ has a basis consisting of those $f_1^{i_1}\cdots f_m^{i_m}v^+$ with $\lambda - \mu = i_1\alpha_1 + \cdots + i_m\alpha_m$. The number of such elements is the number of ways of writing
$$\lambda - \mu = k_1\alpha_1 + \cdots + k_m\alpha_m.$$
So $\dim\mathrm{Verm}(\lambda)_\mu$ is the number of $m$-tuples of non-negative integers $(k_1,\dots,k_m)$ such that the above equation holds. This number is clearly finite, and is known as $P_K(\lambda - \mu)$, the Kostant partition function of $\lambda - \mu$, which will play a central role in what follows.
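For a small root system the Kostant partition function can be computed by brute force. The following sketch (the root system $A_2$, with positive roots $\alpha_1, \alpha_2, \alpha_1+\alpha_2$, is used purely for illustration; the helper name is ours) counts the number of ways to write a vector, given in the basis of simple roots, as a non-negative integer combination of the positive roots.

```python
from itertools import product

# Positive roots of A2, written in the basis of simple roots (a1, a2):
POS_ROOTS_A2 = [(1, 0), (0, 1), (1, 1)]

def kostant_partition(mu, pos_roots=POS_ROOTS_A2):
    """Number of ways to write mu = sum of k_alpha * alpha with k_alpha >= 0.

    mu is given by its coordinates in the simple-root basis."""
    if any(c < 0 for c in mu):
        return 0
    bound = max(mu) + 1          # no coefficient can exceed any coordinate of mu
    count = 0
    for ks in product(range(bound), repeat=len(pos_roots)):
        total = [sum(k*r[i] for k, r in zip(ks, pos_roots)) for i in range(len(mu))]
        if total == list(mu):
            count += 1
    return count

# alpha1 + alpha2 is either (alpha1) + (alpha2) or the single root (alpha1 + alpha2):
assert kostant_partition((1, 1)) == 2
# 2*alpha1 + alpha2: either 2(a1) + (a2) or (a1) + (a1 + a2):
assert kostant_partition((2, 1)) == 2
assert kostant_partition((0, 0)) == 1     # the empty sum
```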
Now every element of $E$ is conjugate under $W$ to an element of the closure of the fundamental Weyl chamber, i.e. to a $\mu$ satisfying
$$(\mu, \alpha_i) \ge 0,$$
i.e. to a $\mu$ that is dominant. We claim that there are only finitely many dominant weights $\mu \prec \lambda$, which will complete the proof of finite dimensionality. Indeed, the sum of two dominant weights is dominant, so $\lambda + \mu$ is dominant. On the other hand, $\lambda - \mu = \sum k_i\alpha_i$ with the $k_i \ge 0$. So
$$(\lambda, \lambda) - (\mu, \mu) = (\lambda + \mu,\ \lambda - \mu) = \sum k_i(\lambda + \mu, \alpha_i) \ge 0.$$
So $\mu$ lies in the intersection of the ball of radius $\|\lambda\|$ with the discrete set of weights $\prec \lambda$, which is finite.

We record a consequence of (7.3) which is useful under very special circumstances. Suppose we are given a finite dimensional representation of $g$ with the property that each weight space is one dimensional and all weights are conjugate under $W$. Then this representation must be irreducible. For example, take $g = \mathrm{sl}(n+1)$ and consider the representation of $g$ on $\Lambda^k(\mathbb{C}^{n+1})$, $1 \le k \le n$. Its weights are the $L_{i_1} + \cdots + L_{i_k}$ with $i_1 < \cdots < i_k$; each weight space is one dimensional, and the Weyl group (which permutes the $L_i$) acts transitively on these weights, so each of these representations is irreducible.

7.3 The value of the Casimir.

Recall the Casimir element associated to the Killing form $\kappa$: choose a basis $h_1,\dots,h_\ell$ of $h$ with $\kappa$-dual basis $k_1,\dots,k_\ell$, and for each positive root $\alpha$ choose $x_\alpha \in g_\alpha$ and $z_\alpha \in g_{-\alpha}$ with $\kappa(x_\alpha, z_\alpha) = 1$. Then
$$\mathrm{Cas}_\kappa = \sum_i h_ik_i + \sum_{\alpha>0}x_\alpha z_\alpha + \sum_{\alpha>0}z_\alpha x_\alpha = \sum_i h_ik_i + \sum_{\alpha>0}[x_\alpha, z_\alpha] + 2\sum_{\alpha>0}z_\alpha x_\alpha.$$
The last expression for $\mathrm{Cas}_\kappa$ has all the $n_+$ elements moved to the right; in particular, all of the summands in the last sum annihilate a highest weight vector $v^+$. Hence
$$\mathrm{Cas}_\kappa v^+ = \left(\sum_i h_ik_i + \sum_{\alpha>0}[x_\alpha, z_\alpha]\right)v^+$$
and
$$\chi_\lambda(\mathrm{Cas}_\kappa) = \sum_i\lambda(h_i)\lambda(k_i) + \sum_{\alpha>0}\lambda([x_\alpha, z_\alpha]).$$
For any $h \in h$ we have
$$\kappa(h, [x_\alpha, z_\alpha]) = \kappa([h, x_\alpha], z_\alpha) = \alpha(h)\kappa(x_\alpha, z_\alpha) = \alpha(h),$$
so
$$[x_\alpha, z_\alpha] = t_\alpha,$$
where $t_\alpha \in h$ is uniquely determined by
$$\kappa(t_\alpha, h) = \alpha(h) \quad \forall h \in h.$$
Let $(\ ,\ )_\kappa$ denote the bilinear form on $h^*$ obtained from the identification of $h$ with $h^*$ given by $\kappa$. Then
$$\sum_{\alpha>0}\lambda([x_\alpha, z_\alpha]) = \sum_{\alpha>0}\lambda(t_\alpha) = \sum_{\alpha>0}(\lambda, \alpha)_\kappa = 2(\lambda, \rho)_\kappa \tag{7.6}$$
where
$$\rho := \frac{1}{2}\sum_{\alpha>0}\alpha.$$
On the other hand, let the constants $a_i$ be defined by
$$\lambda(h) = \sum_i a_i\kappa(h_i, h) \quad \forall h \in h.$$
In other words, $\lambda$ corresponds to $\sum a_ih_i$ under the isomorphism of $h$ with $h^*$, so
$$(\lambda, \lambda)_\kappa = \sum a_ia_j\kappa(h_i, h_j).$$
Since $\kappa(h_i, k_j) = \delta_{ij}$ we have $\lambda(k_i) = a_i$. Combined with $\lambda(h_i) = \sum_j a_j\kappa(h_j, h_i)$ this gives
$$(\lambda, \lambda)_\kappa = \sum_i\lambda(h_i)\lambda(k_i). \tag{7.7}$$
Combined with (7.6) this yields
$$\chi_\lambda(\mathrm{Cas}_\kappa) = (\lambda+\rho, \lambda+\rho)_\kappa - (\rho, \rho)_\kappa. \tag{7.8}$$
We now use this innocuous looking formula (7.8) to prove the following. We let $L = L_g \subset h^*$ denote the lattice of integral linear forms on $h$, i.e.
$$L := \{\lambda \in h^* \mid \lambda(h_\alpha) \in \mathbb{Z}\ \ \forall \alpha \in \Delta\}. \tag{7.9}$$
$L$ is called the weight lattice of $g$. For $\mu, \lambda \in L$ recall that $\mu \prec \lambda$ if $\lambda - \mu$ is a sum of positive roots. Then

Proposition 26 Any cyclic highest weight module $Z(\lambda)$, $\lambda \in L$, has a composition series whose quotients are irreducible modules $\mathrm{Irr}(\mu)$, where $\mu \prec \lambda$ satisfies
$$(\mu+\rho, \mu+\rho)_\kappa = (\lambda+\rho, \lambda+\rho)_\kappa. \tag{7.10}$$
In fact, if
$$d := \sum_\mu \dim Z(\lambda)_\mu,$$
where the sum is over all $\mu$ satisfying (7.10), then there are at most $d$ steps in the composition series.

Remark. There are only finitely many $\mu \in L$ satisfying (7.10), since the set of all $\mu$ satisfying (7.10) is compact and $L$ is discrete. Each weight is of finite multiplicity. Therefore $d$ is finite.

Proof by induction on $d$. We first show that if $d = 1$ then $Z(\lambda)$ is irreducible. Indeed, if not, any proper submodule $W$, being the sum of its weight spaces, must have a highest weight vector with highest weight $\mu$, say. But then $\chi_\lambda(\mathrm{Cas}_\kappa) = \chi_\mu(\mathrm{Cas}_\kappa)$, since $W$ is a submodule of $Z(\lambda)$ and $\mathrm{Cas}_\kappa$ takes on the constant value $\chi_\lambda(\mathrm{Cas}_\kappa)$ on $Z(\lambda)$. Thus $\mu$ and $\lambda$ both satisfy (7.10), contradicting the assumption $d = 1$. In general, suppose that $Z(\lambda)$ is not irreducible, so it has a submodule $W$ and quotient module $Z(\lambda)/W$. Each of these is a cyclic highest weight module, and we have a corresponding composition series on each factor. In particular, $d = d_W + d_{Z(\lambda)/W}$, so that the $d$'s are strictly smaller for the submodule and the quotient module. Hence we can apply induction. QED

For each $\lambda \in L$ we introduce a formal symbol $e(\lambda)$, which we want to think of as an "exponential", and so the symbols are multiplied according to the rule
$$e(\mu)\cdot e(\nu) = e(\mu + \nu). \tag{7.11}$$
The character of a module $N$ is defined as
$$\mathrm{ch}_N = \sum_\mu \dim N_\mu\cdot e(\mu).$$
In all cases we will consider (cyclic highest weight modules and the like) all these dimensions will be finite, so the coefficients are well defined, but (in the case of Verma modules, for example) there may be infinitely many terms in the (formal) sum. Logically, such a formal sum is nothing other than a function on $L$ giving the "coefficient" of each $e(\mu)$. In the case that $N$ is finite dimensional, the above sum is finite. If $f = \sum f_\mu e(\mu)$ and $g = \sum g_\nu e(\nu)$ are two finite sums, then their product (using the rule (7.11)) corresponds to convolution:
$$(f * g)_\lambda := \sum_{\mu+\nu=\lambda}f_\mu g_\nu.$$
So we let $\mathbb{Z}_{\mathrm{fin}}(L)$ denote the set of $\mathbb{Z}$ valued functions on $L$ which vanish outside a finite set. It is a commutative ring under convolution, and we will oscillate in notation between writing an element of $\mathbb{Z}_{\mathrm{fin}}(L)$ as an "exponential sum" and thinking of it as a function of finite support. Since we also want to consider infinite sums such as the characters of Verma modules, we enlarge the space $\mathbb{Z}_{\mathrm{fin}}(L)$ by defining $\mathbb{Z}_{\mathrm{gen}}(L)$ to consist of $\mathbb{Z}$ valued functions whose supports are contained in finite unions of sets of the form
$$\lambda - \sum k_\alpha\alpha, \quad k_\alpha \ge 0.$$
The convolution of two functions belonging to $\mathbb{Z}_{\mathrm{gen}}(L)$ is well defined, and belongs to $\mathbb{Z}_{\mathrm{gen}}(L)$. So $\mathbb{Z}_{\mathrm{gen}}(L)$ is again a ring.

It now follows from Proposition 26 that
$$\mathrm{ch}_{Z(\lambda)} = \sum \mathrm{ch}_{\mathrm{Irr}(\mu)},$$
where the sum is over the finitely many terms in the composition series. In particular, we can apply this to $Z(\lambda) = \mathrm{Verm}(\lambda)$, the Verma module. Let us order the $\mu_i \prec \lambda$ satisfying (7.10) in such a way that $\mu_i \prec \mu_j \Rightarrow i > j$. Then for each of the finitely many $\mu_i$ occurring we get a corresponding formula for $\mathrm{ch}_{\mathrm{Verm}(\mu_i)}$, and so we get a collection of equations
$$\mathrm{ch}_{\mathrm{Verm}(\mu_i)} = \sum_j a_{ij}\,\mathrm{ch}_{\mathrm{Irr}(\mu_j)},$$
where $a_{ii} = 1$ and $i < j$ in the sum. We can invert this upper triangular matrix and therefore conclude that there is a formula of the form
$$\mathrm{ch}_{\mathrm{Irr}(\lambda)} = \sum b(\mu)\,\mathrm{ch}_{\mathrm{Verm}(\mu)} \tag{7.12}$$
where the sum is over $\mu \prec \lambda$ satisfying (7.10), with coefficients $b(\mu)$ that we shall soon determine. But we do know that $b(\lambda) = 1$.
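The ring $\mathbb{Z}_{\mathrm{fin}}(L)$ is easy to model on a computer: a character is a dictionary mapping weights to integer coefficients, and the product (7.11) is convolution. A minimal sketch (for sl(2), labelling weights by integers; all names are ours):

```python
from collections import defaultdict

def conv(f, g):
    """Product in Z_fin(L): e(mu) * e(nu) = e(mu + nu), extended bilinearly."""
    h = defaultdict(int)
    for mu, a in f.items():
        for nu, b in g.items():
            h[mu + nu] += a * b
    return {w: c for w, c in h.items() if c != 0}

# For sl(2), label weights by integers (the value on the coroot h).
# Character of the (n+1)-dimensional irreducible: e(n) + e(n-2) + ... + e(-n).
def ch_irr(n):
    return {n - 2*k: 1 for k in range(n + 1)}

# Clebsch-Gordan: ch(1) * ch(1) = ch(2) + ch(0).
lhs = conv(ch_irr(1), ch_irr(1))
rhs = defaultdict(int)
for part in (ch_irr(2), ch_irr(0)):
    for w, c in part.items():
        rhs[w] += c
assert lhs == dict(rhs)          # both equal {2: 1, 0: 2, -2: 1}
```

The same dictionary model extends to $\mathbb{Z}_{\mathrm{gen}}(L)$ as long as one only ever asks for finitely many coefficients at a time.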
7.4 The Weyl character formula.

We will now prove

Proposition 27 The non-zero coefficients in (7.12) occur only when
$$\mu = w(\lambda+\rho) - \rho,$$
where $w \in W$, the Weyl group of $g$, and then
$$b(\mu) = (-1)^w.$$
Here $(-1)^w := \det w$.

We will prove this by proving some combinatorial facts about multiplication of sums of exponentials. We recall our notation: For $\lambda \in h^*$, $\mathrm{Irr}(\lambda)$ denotes the unique irreducible module of highest weight $\lambda$, $\mathrm{Verm}(\lambda)$ denotes the Verma module of highest weight $\lambda$, and more generally $Z(\lambda)$ denotes an arbitrary cyclic module of highest weight $\lambda$. Also $\rho$ is one half the sum of the positive roots. Let $\lambda_i$, $i = 1,\dots,\dim h$, be the basis of the weight lattice $L$ dual to the base $\Delta$, so
$$\lambda_i(h_{\alpha_j}) = \langle\lambda_i, \alpha_j\rangle = \delta_{ij}.$$
Since $s_i(\alpha_i) = -\alpha_i$ while $s_i$ keeps all the other positive roots positive, we saw that this implied that $s_i\rho = \rho - \alpha_i$, and therefore
$$\langle\rho, \alpha_i\rangle = 1, \quad i = 1,\dots,\ell := \dim(h).$$
In other words,
$$\rho = \lambda_1 + \cdots + \lambda_\ell. \tag{7.13}$$
The Kostant partition function $P_K(\mu)$ is defined as the number of sets of non-negative integers $k_\alpha$, $\alpha \in \Phi^+$, such that
$$\mu = \sum_{\alpha>0}k_\alpha\alpha.$$
(The value is zero if $\mu$ can not be expressed as a sum of positive roots.) For any module $N$ and any $\mu \in h^*$, $N_\mu$ denotes the weight space of weight $\mu$. For example, in the Verma module $\mathrm{Verm}(\lambda)$, the only non-zero weight spaces are the ones where
$$\mu = \lambda - \sum_{\alpha\in\Phi^+}k_\alpha\alpha,$$
and the multiplicity of this weight space, i.e. the dimension of $\mathrm{Verm}(\lambda)_\mu$, is the number of ways of expressing $\lambda - \mu$ in this fashion, i.e.
$$\dim\mathrm{Verm}(\lambda)_\mu = P_K(\lambda - \mu). \tag{7.14}$$
In terms of the character notation introduced in the preceding section we can write this as
$$\mathrm{ch}_{\mathrm{Verm}(\lambda)} = \sum_\mu P_K(\lambda - \mu)e(\mu).$$
To be consistent with Humphreys' notation, define the Kostant function $p$ by
$$p(\nu) := P_K(-\nu),$$
and then in succinct language
$$\mathrm{ch}_{\mathrm{Verm}(\lambda)} = p(\cdot - \lambda). \tag{7.15}$$
Observe that if $f = \sum f(\mu)e(\mu)$ then
$$f\cdot e(\lambda) = \sum f(\mu)e(\lambda+\mu) = \sum f(\nu - \lambda)e(\nu).$$
We can express this by saying that
$$f\cdot e(\lambda) = f(\cdot - \lambda).$$
Thus, for example,
$$\mathrm{ch}_{\mathrm{Verm}(\lambda)} = p(\cdot - \lambda) = p\cdot e(\lambda).$$
Also observe that if
$$f_\alpha := 1 + e(-\alpha) + e(-2\alpha) + \cdots$$
then
$$(1 - e(-\alpha))f_\alpha = 1 \qquad\text{and}\qquad p = \prod_{\alpha\in\Phi^+}f_\alpha$$
by the definition of the Kostant function. Define the function $q$ by
$$q := \prod_{\alpha\in\Phi^+}(e(\alpha/2) - e(-\alpha/2)) = e(\rho)\prod_{\alpha\in\Phi^+}(1 - e(-\alpha)),$$
since $e(\rho) = \prod_{\alpha\in\Phi^+}e(\alpha/2)$. Notice that
$$wq = (-1)^wq.$$
It is enough to check this on fundamental reflections, but they have the property that they make exactly one positive root negative, hence change the sign of $q$. We have
$$qp = e(\rho). \tag{7.16}$$
Indeed,
$$qpe(-\rho) = \prod_{\alpha\in\Phi^+}(1 - e(-\alpha))\cdot p = \prod_{\alpha\in\Phi^+}(1 - e(-\alpha))\prod_{\alpha\in\Phi^+}f_\alpha = 1.$$
Therefore
$$q\,\mathrm{ch}_{\mathrm{Verm}(\lambda)} = qpe(\lambda) = e(\rho)e(\lambda) = e(\lambda+\rho).$$
Let us now multiply both sides of (7.12) by $q$ and use the preceding equation. We obtain
$$q\,\mathrm{ch}_{\mathrm{Irr}(\lambda)} = \sum b(\mu)e(\mu+\rho),$$
where the sum is over all $\mu \prec \lambda$ satisfying (7.10), and the $b(\mu)$ are coefficients we must determine. Now $\mathrm{ch}_{\mathrm{Irr}(\lambda)}$ is invariant under the Weyl group $W$, and $q$ transforms by $(-1)^w$. Hence if we apply $w \in W$ to the preceding equation we obtain
$$(-1)^wq\,\mathrm{ch}_{\mathrm{Irr}(\lambda)} = \sum b(\mu)e(w(\mu+\rho)).$$
This shows that the set of $\mu+\rho$ with non-zero coefficients is stable under $W$, and the coefficients transform by the sign representation on each $W$ orbit. In particular, each element of the form $\mu = w(\lambda+\rho) - \rho$ has $(-1)^w$ as its coefficient. We can thus write
$$q\,\mathrm{ch}_{\mathrm{Irr}(\lambda)} = \sum_{w\in W}(-1)^we(w(\lambda+\rho)) + R,$$
where $R$ is a sum of terms corresponding to $\mu+\rho$ which are not of the form $w(\lambda+\rho)$. We claim that there are no such terms and hence $R = 0$. Indeed, if there were such a term, the transformation properties under $W$ would demand that there be such a term with $\mu+\rho$ in the closure of the Weyl chamber, i.e.
$$\mu + \rho \in \Lambda := L \cap D,$$
where
$$D = D_g = \{\lambda \in E \mid (\lambda, \phi) \ge 0\ \ \forall \phi \in \Delta\}$$
and $E = h^*_{\mathbb{R}}$ denotes the space of real linear combinations of the roots. But we claim that $\mu + \rho = \lambda + \rho$. Indeed, write $\mu = \lambda - \pi$, $\pi = \sum k_\alpha\alpha$, $k_\alpha \ge 0$, so
$$0 = (\lambda+\rho, \lambda+\rho) - (\mu+\rho, \mu+\rho) = (\lambda+\rho+\mu+\rho,\ \pi) = (\lambda+\rho, \pi) + (\mu+\rho, \pi)$$
$$\ge (\lambda+\rho, \pi) \quad\text{since } \mu+\rho \in \Lambda$$
$$\ge 0 \quad\text{since } \lambda+\rho \in \Lambda,$$
and in fact $\lambda+\rho$ lies in the interior of $D$. So the last inequality is strict unless $\pi = 0$. Hence $\pi = 0$.
We will have occasion to use this type of argument several times again in the future. In any event we have derived the fundamental formula
$$q\,\mathrm{ch}_{\mathrm{Irr}(\lambda)} = \sum_{w\in W}(-1)^we(w(\lambda+\rho)). \tag{7.17}$$
Notice that if we take $\lambda = 0$, and so the trivial representation with character $1$ for $\mathrm{Irr}(0)$, (7.17) becomes
$$q = \sum_{w\in W}(-1)^we(w\rho),$$
and this is precisely the denominator in the Weyl character formula:
$$\mathrm{ch}_{\mathrm{Irr}(\lambda)} = \frac{\sum_{w\in W}(-1)^we(w(\lambda+\rho))}{\sum_{w\in W}(-1)^we(w\rho)}. \tag{7.18, WCF}$$

7.5 The Weyl dimension formula.

For any weight $\mu$ we define
$$A_\mu := \sum_{w\in W}(-1)^we(w\mu).$$
Then we can write the Weyl character formula as
$$\mathrm{ch}_{\mathrm{Irr}(\lambda)} = \frac{A_{\lambda+\rho}}{A_\rho}.$$
For any weight $\mu$ define the homomorphism $\Psi_\mu$ from the ring $\mathbb{Z}_{\mathrm{fin}}(L)$ into the ring of formal power series in one variable $t$ by the formula
$$\Psi_\mu(e(\nu)) := e^{(\nu,\mu)_\kappa t}$$
(and extend linearly). The left hand side of the Weyl character formula belongs to $\mathbb{Z}_{\mathrm{fin}}(L)$, and hence so does the right hand side, which is a quotient of two elements of $\mathbb{Z}_{\mathrm{fin}}(L)$. Therefore for any pair of weights $\mu, \nu$ we have
$$\Psi_\mu(A_\nu) = \Psi_\nu(A_\mu). \tag{7.19}$$
Indeed,
$$\Psi_\mu(A_\nu) = \sum_{w\in W}(-1)^we^{(w\nu,\mu)_\kappa t} = \sum_{w\in W}(-1)^we^{(\nu,w^{-1}\mu)_\kappa t} = \Psi_\nu(A_\mu),$$
since $(\ ,\ )_\kappa$ is $W$-invariant, $(-1)^w = (-1)^{w^{-1}}$, and summing over $w$ is the same as summing over $w^{-1}$. In particular,
$$\Psi_\rho(A_{\lambda+\rho}) = \Psi_{\lambda+\rho}(A_\rho) = \Psi_{\lambda+\rho}(q) = \Psi_{\lambda+\rho}\Big(\prod_{\alpha>0}(e(\alpha/2) - e(-\alpha/2))\Big)$$
$$= \prod_{\alpha>0}\left(e^{(\lambda+\rho,\alpha)_\kappa t/2} - e^{-(\lambda+\rho,\alpha)_\kappa t/2}\right) = \Big(\prod_{\alpha>0}(\lambda+\rho,\alpha)_\kappa\Big)t^{|\Phi^+|} + \text{terms of higher degree in } t.$$
Hence
$$\Psi_\rho(\mathrm{ch}_{\mathrm{Irr}(\lambda)}) = \frac{\Psi_\rho(A_{\lambda+\rho})}{\Psi_\rho(A_\rho)} = \frac{\prod_{\alpha>0}(\lambda+\rho,\alpha)_\kappa}{\prod_{\alpha>0}(\rho,\alpha)_\kappa} + \text{terms of positive degree in } t.$$
Now consider the composite homomorphism: first apply $\Psi_\rho$ and then set $t = 0$. This has the effect of replacing every $e(\mu)$ by the constant $1$. Hence applied to the left hand side of the Weyl character formula this gives the dimension of the representation $\mathrm{Irr}(\lambda)$. The previous equation shows that when this composite homomorphism is applied to the right hand side of the Weyl character formula, we get the right hand side of the Weyl dimension formula:
$$\dim\mathrm{Irr}(\lambda) = \prod_{\alpha\in\Phi^+}\frac{(\lambda+\rho,\alpha)_\kappa}{(\rho,\alpha)_\kappa}. \tag{7.20}$$

7.6 The Kostant multiplicity formula.
Let us multiply the fundamental equation (7.17) by $pe(-\rho)$ and use the fact (7.16) that $qpe(-\rho) = 1$ to obtain
$$\mathrm{ch}_{\mathrm{Irr}(\lambda)} = \sum_{w\in W}(-1)^wpe(-\rho)e(w(\lambda+\rho)).$$
But
$$pe(-\rho)e(w(\lambda+\rho)) = p(\cdot - w(\lambda+\rho) + \rho),$$
or, in more pedestrian terms, the left hand side of this equation has, as its coefficient of $e(\mu)$, the value $p(\mu + \rho - w(\lambda+\rho))$. On the other hand, by definition,
$$\mathrm{ch}_{\mathrm{Irr}(\lambda)} = \sum_\mu\dim(\mathrm{Irr}(\lambda)_\mu)e(\mu).$$
We thus obtain Kostant's formula for the multiplicity of a weight $\mu$ in the irreducible module with highest weight $\lambda$:
$$\dim(\mathrm{Irr}(\lambda)_\mu) = \sum_{w\in W}(-1)^wp(\mu + \rho - w(\lambda+\rho)). \tag{7.21, KMF}$$
It will be convenient to introduce some notation which simplifies the appearance of the Kostant multiplicity formula: For $w \in W$ and $\mu \in L$ (or in $E$ for that matter) define
$$w\odot\mu := w(\mu+\rho) - \rho. \tag{7.22}$$
This defines another action of $W$ on $E$, where the "origin" of the orthogonal transformations $w$ has been shifted from $0$ to $-\rho$. Then we can rewrite the Kostant multiplicity formula as
$$\dim(\mathrm{Irr}(\lambda))_\mu = \sum_{w\in W}(-1)^wP_K(w\odot\lambda - \mu) \tag{7.23}$$
or as
$$\mathrm{ch}(\mathrm{Irr}(\lambda)) = \sum_{w\in W}\sum_\mu(-1)^wP_K(w\odot\lambda - \mu)e(\mu), \tag{7.24}$$
where $P_K$ is the original Kostant partition function. For the purposes of the next section it will be useful to record the following lemma:

Lemma 14 If $\nu$ is a dominant weight and $e \ne w \in W$, then $w\odot\nu$ is not dominant.

Proof. If $\nu$ is dominant, so lies in the closure of the positive Weyl chamber, then $\nu + \rho$ lies in the interior of the positive Weyl chamber. Hence if $w \ne e$, then $\langle w(\nu+\rho), \alpha_i\rangle < 0$ for some $i$, and so $w\odot\nu = w(\nu+\rho) - \rho$ is not dominant. QED

7.7 Steinberg's formula.

Suppose that $\lambda'$ and $\lambda''$ are dominant integral weights. Decompose $\mathrm{Irr}(\lambda') \otimes \mathrm{Irr}(\lambda'')$ into irreducibles, and let $n(\lambda) = n(\lambda, \lambda' \otimes \lambda'')$ denote the multiplicity of $\mathrm{Irr}(\lambda)$ in this decomposition into irreducibles (with $n(\lambda) = 0$ if $\mathrm{Irr}(\lambda)$ does not appear as a summand in the decomposition). In particular, $n(\nu) = 0$ if $\nu$ is not a dominant weight, since $\mathrm{Irr}(\nu)$ is infinite dimensional in this case, and so can not appear as a summand in the decomposition.
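Before putting the multiplicity formula to work, it can be tried out on a small example. The sketch below (our own setup, purely for illustration) evaluates (7.23) for $g = \mathrm{sl}(3)$, i.e. the root system $A_2$, realized in $\mathbb{R}^3$ with coordinate sum zero, so that the Weyl group is $S_3$ permuting the coordinates; the multiplicity of the zero weight in the adjoint representation should come out to $2$, the rank.

```python
from itertools import permutations

# A2 = sl(3) in R^3 coordinates (entries sum to 0).
# Positive roots: a1 = (1,-1,0), a2 = (0,1,-1), a1+a2 = (1,0,-1).
RHO = (1, 0, -1)          # rho = a1 + a2
THETA = (1, 0, -1)        # highest root = highest weight of the adjoint rep

def kostant_p(v):
    """P_K(v) for A2: the number of (k1, k2, k3) >= 0 with
    v = k1*a1 + k2*a2 + k3*(a1 + a2)."""
    v1, v2, v3 = v
    if v1 + v2 + v3 != 0:
        return 0
    lo, hi = max(0, -v2), v1     # k3 = v1 - k1 >= 0 and k2 = v2 + k1 >= 0
    return hi - lo + 1 if hi >= lo else 0

def sign(perm):
    """Sign of a permutation of (0, 1, 2), counting transpositions."""
    s, p = 1, list(perm)
    for i in range(len(p)):
        while p[i] != i:
            j = p[i]
            p[i], p[j] = p[j], p[i]
            s = -s
    return s

def multiplicity(lam, mu):
    """Kostant multiplicity formula (7.23) for A2:
    dim Irr(lam)_mu = sum over W of (-1)^w P_K(w(lam+rho) - rho - mu)."""
    total = 0
    for perm in permutations(range(3)):
        wlr = tuple(lam[k] + RHO[k] for k in perm)            # w(lam + rho)
        arg = tuple(wlr[i] - RHO[i] - mu[i] for i in range(3))
        total += sign(perm) * kostant_p(arg)
    return total

assert multiplicity(THETA, (0, 0, 0)) == 2   # zero weight of the adjoint rep: the rank
assert multiplicity(THETA, THETA) == 1       # the highest weight line
```

Note how the alternating sum over $W$ cuts the naive Verma multiplicity $P_K(\lambda - \mu)$ down to the true multiplicity.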
In terms of characters, we have
$$\mathrm{ch}(\mathrm{Irr}(\lambda'))\,\mathrm{ch}(\mathrm{Irr}(\lambda'')) = \sum_\lambda n(\lambda)\,\mathrm{ch}(\mathrm{Irr}(\lambda)).$$
Steinberg's formula is a formula for $n(\lambda)$. To derive it, use the Weyl character formula
$$\mathrm{ch}(\mathrm{Irr}(\lambda'')) = \frac{A_{\lambda''+\rho}}{A_\rho}$$
in the above formula to obtain
$$\mathrm{ch}(\mathrm{Irr}(\lambda'))A_{\lambda''+\rho} = \sum_\lambda n(\lambda)A_{\lambda+\rho}.$$
Use the Kostant multiplicity formula (7.24) for $\lambda'$:
$$\mathrm{ch}(\mathrm{Irr}(\lambda')) = \sum_{w\in W}\sum_\mu(-1)^wP_K(w\odot\lambda' - \mu)e(\mu),$$
and the definition
$$A_{\lambda''+\rho} = \sum_{u\in W}(-1)^ue(u(\lambda''+\rho)),$$
and the similar expression for $A_{\lambda+\rho}$, to get
$$\sum_{u,w\in W}\sum_\mu(-1)^{uw}P_K(w\odot\lambda' - \mu)\,e(u(\lambda''+\rho) + \mu) = \sum_\lambda\sum_{w\in W}n(\lambda)(-1)^we(w(\lambda+\rho)).$$
Let us make a change of variables on the right hand side, writing
$$\nu + \rho = w(\lambda+\rho), \quad\text{i.e. } \lambda = w^{-1}\odot\nu,$$
so the right hand side becomes
$$\sum_\nu\sum_w(-1)^wn(w^{-1}\odot\nu)e(\nu+\rho).$$
If $\nu$ is a dominant weight, then by Lemma 14 $w^{-1}\odot\nu$ is not dominant if $w^{-1} \ne e$. So $n(w^{-1}\odot\nu) = 0$ if $w \ne e$, and so the coefficient of $e(\nu+\rho)$ is precisely $n(\nu)$ when $\nu$ is dominant. On the left hand side let
$$\mu = \nu - u\odot\lambda''$$
to obtain
$$\sum_{u,w}(-1)^{uw}P_K(w\odot\lambda' + u\odot\lambda'' - \nu)\,e(\nu+\rho).$$
Comparing coefficients for $\nu$ dominant gives
$$n(\nu) = \sum_{u,w\in W}(-1)^{uw}P_K(w\odot\lambda' + u\odot\lambda'' - \nu). \tag{7.25}$$

7.8 The Freudenthal - de Vries formula.

We return to the study of a semi-simple Lie algebra $g$ and get a refinement of the Weyl dimension formula by looking at the next order term in the expansion we used to derive the Weyl dimension formula from the Weyl character formula. By definition, the Killing form restricted to the Cartan subalgebra $h$ is given by
$$\kappa(h, h') = \sum_\alpha\alpha(h)\alpha(h'),$$
So N =p(Ap+,) =p+a(Ap) by (7.19) and A=q =7 (e/2 - e--a/2 (7.27) and therefore N(t) J (e(A+p,a)t/2 _e-(+p,a)t/2 a>O H A +pa)t[1+ (A+ p,2a)t2+...] 24 with a similar formula for D. Then N/D -> d(A) = the dimension of the representation as t -> 0 is the usual proof (that we reproduced above) of the Weyl dimension formula. Sticking this in to N/D gives N d( (1+1Z[(+p,a)Z - +... D =dA +24 a>0 For any weight p we have (p, t), =E(p, a)2 by (7.26), where the sum is over all roots so N= d1+ 1(+ A+p - t2(48p[( h t2+(,.), and we recognize the coefficient of4at2 in the above expression as x(Cash)), the scalar giving the value of the Casimir associated to the Killing form in the representation with highest weight A. On the other hand, the image under '1 of the character of the irreducible representation with highest weight A is e( ,p),t- \ (1 +(, p) t + (p, p)2 - - -.)  130 CHAPTER 7. CYCLIC HIGHEST WEIGHT MODULES. where the sum is over all weights in the irreducible representation counted with multiplicity. Comparing coefficients gives /11 (p2p!= 24d(A xCask) Applied to the adjoint representation the left hand side becomes (p, p)h by (7.26), while d(A) is the dimension of the Lie algebra. On the other hand, XA(Cas") = 1 since tr ad(Cash) = dim(g) by the definition of Cask. So we get 1 (p, p)= dim g (7.28) 24 for any semisimple Lie algebra g. An algebra which is the direct sum a commutative Lie and a semi-simple Lie algebra is called reductive. The previous result of Freudenthal and deVries has been generalized by Kostant from a semi-simple Lie algebra to all reductive Lie algebras: Suppose that g is merely reductive, and that we have chosen a symmtric bilinear form on g which is invariant under the adjoint representation, and denote the associated Casimir element by Casg. We claim that (7.28) generalizes to 1 24tr ad(Casg) (p, p). (7.29) (Notice that if g is semisimple and we take our symmetric bilinear form to be the Killing form ( , ), (7.29) becomes (7.28).) 
To prove (7.29), observe that both sides decompose into sums as we decompose $g$ into a sum of its center and its simple ideals, since this must be an orthogonal decomposition for our invariant scalar product. The contribution of the center is zero on both sides, so we are reduced to proving (7.29) for a simple algebra. Then our symmetric bilinear form $(\ ,\ )$ must be a scalar multiple of the Killing form:
$$(\ ,\ ) = c^2(\ ,\ )_\kappa$$
for some non-zero scalar $c$. If $Z_1,\dots,Z_N$ is an orthonormal basis of $g$ for $(\ ,\ )_\kappa$, then $Z_1/c,\dots,Z_N/c$ is an orthonormal basis for $(\ ,\ )$. Thus
$$\mathrm{Cas}_g = \frac{1}{c^2}\mathrm{Cas}_\kappa.$$
So
$$\frac{1}{24}\operatorname{tr}\operatorname{ad}(\mathrm{Cas}_g) = \frac{1}{c^2}\cdot\frac{1}{24}\operatorname{tr}\operatorname{ad}(\mathrm{Cas}_\kappa) = \frac{1}{c^2}\cdot\frac{1}{24}\dim g.$$
But on $h^*$ we have the dual relation
$$(\ ,\ ) = \frac{1}{c^2}(\ ,\ )_\kappa.$$
Combining the last two equations shows that (7.29) becomes (7.28). Notice that the same proof shows that we can generalize (7.8) as
$$\chi_\lambda(\mathrm{Cas}_g) = (\lambda+\rho, \lambda+\rho) - (\rho, \rho), \tag{7.30}$$
valid for any reductive Lie algebra equipped with a symmetric bilinear form invariant under the adjoint representation.

7.9 Fundamental representations.

We let $\omega_i$ denote the weight which satisfies
$$\langle\omega_i, \alpha_j\rangle = \delta_{ij},$$
so that the $\omega_i$ form an integral basis of $L$ and are dominant. We call these the basic weights. If $(V, \rho)$ and $(W, \sigma)$ are two finite dimensional irreducible representations with highest weights $\lambda$ and $\mu$, then $V \otimes W$, $\rho \otimes \sigma$ contains the irreducible representation with highest weight $\lambda + \mu$, with highest weight vector $v_\lambda \otimes w_\mu$, the tensor product of the highest weight vectors in $V$ and $W$. Taking this "highest" component in the tensor product is known as the Cartan product of the two irreducible representations. Let $(V_i, \rho_i)$ be the irreducible representation corresponding to the basic weight $\omega_i$. Then every finite dimensional irreducible representation of $g$ can be obtained by Cartan products from these, and for that reason they are called the fundamental representations.
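As an exercise with the Weyl dimension formula (7.20), the dimensions for $A_n = \mathrm{sl}(n+1)$ are easy to evaluate in the $L_i$ coordinates, where the positive roots are $L_i - L_j$ for $i < j$ and $(\rho, L_i - L_j)$ is proportional to $j - i$. A sketch (the helper name is ours; only differences of coordinates matter, so the overall normalization of the form drops out):

```python
from math import comb
from fractions import Fraction

def weyl_dim_A(lmbda):
    """Weyl dimension formula for sl(m): highest weight given by
    coordinates (l_1, ..., l_m) in the L_i basis (only differences matter).
    Positive roots are L_i - L_j for i < j, and (rho, L_i - L_j) ~ j - i."""
    m = len(lmbda)
    d = Fraction(1)
    for i in range(m):
        for j in range(i + 1, m):
            d *= Fraction(lmbda[i] - lmbda[j] + (j - i), j - i)
    return int(d)

n = 4                                    # sl(5)
for k in range(1, n + 1):
    omega_k = [1]*k + [0]*(n + 1 - k)    # omega_k = L_1 + ... + L_k
    assert weyl_dim_A(omega_k) == comb(n + 1, k)    # dim Lambda^k(C^5)

assert weyl_dim_A([1, 0, -1]) == 8       # adjoint of sl(3)
```

The first loop confirms that the representation with highest weight $\omega_k$ has the dimension of $\Lambda^k(\mathbb{C}^{n+1})$, consistent with the identification made in the next paragraph.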
For the case of $A_n = \mathrm{sl}(n+1)$ we have already verified that the fundamental representations are $\Lambda^k(V)$, where $V = \mathbb{C}^{n+1}$, and where the basic weights are
$$\omega_k = L_1 + \cdots + L_k.$$
We now sketch the results for the other classical simple algebras, leaving the details as an exercise in the use of the Weyl dimension formula.

For $C_n = \mathrm{sp}(2n)$ it is immediate to check that these same expressions give the basic weights. However, while $V = \mathbb{C}^{2n} = \Lambda^1(V)$ is irreducible, the higher order exterior powers are not: indeed, the symplectic form $\Omega \in \Lambda^2(V^*)$ is preserved, and hence so is the map
$$\Lambda^j(V) \to \Lambda^{j-2}(V)$$
given by contraction by $\Omega$. It is easy to check that this map is surjective (for $j = 2,\dots,n$). Its kernel is thus an invariant subspace of dimension
$$\binom{2n}{j} - \binom{2n}{j-2},$$
and a (not completely trivial) application of the Weyl dimension formula will show that these are indeed the dimensions of the irreducible representations with highest weight $\omega_j$. Thus these kernels are the fundamental representations of $C_n$. Here are some of the details: We have
$$\rho = \omega_1 + \cdots + \omega_n = \sum_i(n - i + 1)L_i.$$
The most general dominant weight is of the form
$$\sum_i k_i\omega_i = a_1L_1 + \cdots + a_nL_n,$$
where
$$a_1 = k_1 + \cdots + k_n,\quad a_2 = k_2 + \cdots + k_n,\quad\dots,\quad a_n = k_n,$$
and the $k_i$ are non-negative integers. So we can equally well use any decreasing sequence
$$a_1 \ge a_2 \ge \cdots \ge a_n \ge 0$$
of integers to parameterize the irreducible representations. We have
$$(\rho, L_i - L_j) = j - i, \qquad (\rho, L_i + L_j) = 2n + 2 - i - j.$$
Multiplying these all together gives the denominator in the Weyl dimension formula. Similarly, setting $r_i := a_i + n - i + 1$, the numerator becomes
$$\prod_{i<j}(r_i - r_j)\prod_{i\le j}(r_i + r_j),$$
up to common factors that cancel against the denominator. Consider the representation with highest weight $\omega_2$, so $a_1 = a_2 = 1$ and $a_i = 0$ for $i > 2$, giving $r_1 = n+1$, $r_2 = n$ and $r_j = n - j + 1$ for $j \ge 3$. In applying the preceding formula, all of the terms with $2 < i$ are the same as for the trivial representation, as is $r_1 - r_2$. The ratios of the remaining $r_i - r_j$ factors, $i = 1, 2$, to those of the trivial representation are
$$\prod_{j=3}^n\frac{j}{j-1}\cdot\prod_{j=3}^n\frac{j-1}{j-2} = \frac{n}{2}\cdot(n-1),$$
coming from the $r_i - r_j$ terms. Similarly the $r_i + r_j$ terms, $i < j$, give a factor
$$\frac{2n+1}{2n-1}\cdot\prod_{j=3}^n\frac{2n+2-j}{2n+1-j}\cdot\prod_{j=3}^n\frac{2n+1-j}{2n-j} = \frac{2n+1}{2n-1}\cdot\frac{2n-1}{n+1}\cdot\frac{2n-2}{n},$$
133 and the terms r1 + 1, r2 + 1 contribute a factor 12+1 In multiplying all of these terms together there is a huge cancellation and what is left for the dimension of this fundamental representation is (2n + 1)(2n - 2) 2 Notice that this equals 2 -1 =dim A2 V-. More generally this dimension argument will show that the fundamental repre- sentations are the kernels of the contraction maps i(Q) : Ak - (V) Ak-2 (V) where Q is the symplectic form. For Bn it is easy to check that w2 := L + - - - + Li (i n2- 1), andw n 2 (L1 + - - - + Ln) are the basic weights and the Weyl dimension formula gives 2n + 1 the value . for j n - 1 as the dimensions of the irreducibles with these weight, so that they are A3 (V), j =1, ... n - 1 while the dimension of the irreducible corresponding to wn is 2'. This is the spin representation which we will study later. Finally, for D0 = o(2n) the basic weights are og = L1 + - -+ Lj, jf, z H- h. So consider a vector space with basis v1,.... . , v and let A be the tensor algebra over this vector space. We drop the tensor product signs in the algebra, so write v v - -i : vi ® - - v.  8.2. THE FIRST FIVE RELATIONS. 139 for any finite sequence of integers with values from 1 to £. We make A into an f module as follows: We let the Z2 act as derivations of A, determined by its actions on generators by Zil = 0, Zjv2 (aga %)v3. So if we define cij := (a2, aj we have Zg(vi -... ve,) (c1 + . .. + citj) (v21 . . . vt The action of the Zi is diagonal in this basis, so their actions commute. We let the Y act by left multiplication by vi. So Yjvi1 ...vi, := v vi ...vi, and hence [Zi,Y] =-c2Y = -(agj, a )Y as desired. We now want to define the action of the Xi so that the relations analogous to (8.2) and (8.3) hold. Since Zil = 0 these relations will hold when applied to the element 1 if we set X31=0 Vj and X v =0 Vij. Suppose we define Xj(vpvq) -6jpCqjvq. Then while ZiXj(vpvq) =-6jpCqjCqiVq -cgiX (vpvq) -(cpi + ci)Xj (vjvq). 
Xj Zi(vpvq) 6jpcqj (cpi + cqi)vq Thus [Zi,Xj](vpvq) = cjiXj (vpvq) as desired. In general, define Xi (ops-... -vp) := vp1 (Xi (v2 ... vP6)) - 61(c 2 + ... + cPOg)(vp2 ... Vp) (8.8) for t > 2. We claim that ZiXi (vp1 ... vp) - (Cpl i + ... + Cpt 2 - Cj2) xj (Vp ... vpt ) - Indeed, we have verified this for the case t = 2. By induction, we may assume that Xj (vp2 . -Vp,) is an eigenvector of Z2 with eigenvalue cp2+ +"-"- - -c-  140 CHAPTER 8. SERRE'S THEOREM. cgi. Multiplying this on the left by vP1 produces the first term on the right of (8.8). On the other hand, this multiplication produces an eigenvector of Z2 with eigenvalue cp12 + - - - + cpt2 - cjZ. As for the second term on the right of (8.8), if j # P1 it does not appear. If j = pi then cp12+- - -+cpt2-cg = cp2i+- - -+cyt2. So in either case, the right hand side of (8.8) is an eigenvector of Z2 with eigenvalue cpi Z+ .. + cy2 - c2. But then [Zi,X] K=(aj, ai)Xj as desired. We have defined an action of f on A whose kernel contains I, hence descends to an action of m on A. Let #5: m - End A denote this action. Suppose that z: aizi + -"-- + a zf for some complex numbers a1, ... , af and that #(z) = 0. The operator #(z) has eigenvalues - Sa3 c~j when acting on the subspace V of A. All of these must be zero. But the Cartan matrix is non-singular. Hence all the a2= 0. This shows that the space spanned by the zZ is in fact £-dimensional and spans an £-dimensional abelian subalgebra of m. Call this subalgebra z. Now consider the 3f-dimensional subspace of f spanned by the XZ, Y and Z, i = 1, ... , £. We wish to show that it projects onto a 3? dimensional subspace of m under the natural passage to the quotient f - m = f/i. The image of this subspace is spanned by xi, y2 and z2. Since # (xi) # 0 and 5(yZ) # 0 we know that xi + 0 and y, # 0. Suppose we had a linear relation of the form aizi + biyi+z =0. Choose some z' E z such that a (z') + 0 and a (z') # a (z') for any i f j. 
This is possible since the a2 are all linearly independent. Bracketing the above equation by z' gives a(z')aixi - a (z')biyi = 0 by the relations (8.4) and (8.5). Repeated bracketing by z' and using the van der Monde (or induction) argument shows that a2 = 0, b2 = 0 and hence that z =0. We have proved that the elements xz, yj, zk in m are linearly independent. The element [x 1i, [xi2, [-.-.-[zi, 1, zi] .-.-.-J]]] is an eigenvector of zZ with eigenvalue cili +...+ cit. For any pair of elements p and A of z* (or of h*) recall that  8.2. THE FIRST FIVE RELATIONS. 141 denotes the fact that A - p = E kiai where the k2 are all non-negative integers. For any A E z* let m, denote the set of all m E m satisfying [z, m] =A(z)m Vz e z. Then we have shown that the subalgebra x of m generated by X1,. . , x is contained in m+ : m,. Similarly, the subalgebra y of m generated by the yi lies in m_ :®9m,\. A- 2. Thus the subspace y + z + x is closed under ad yi and hence under any product of these operators. Similarly for ad xi. Since these generate the algebra m we see that y + z + x = m and hence x =m+ and y=m_. We have shown that m= m_ e z e m+ where z is an abelian subalgebra of dimension £, where the subalgebra m+ is generated by x1, ... . , xo, where the subalgebra m_ is generated by Y1, ... , yf, and where the 3? elements x1, .. . , x, y1 ... , y , .. . , zf are linearly independent. There is a further property of m which we want to use in the next section in the proof of Serre's theorem. For all i j between 1 and £ define the elements xig and Yij by zig := (ad xi)-cii+1(xj), Yij := (ad yi)-ii+1(yj). Conditions (8.6) and (8.7) amount to setting these elements, and hence the ideal that they generate equal to zero. We claim that for all k and all i j between 1 and £ we have ad Xk(Yij) = 0 (8.9) and adYk(Xij)=0. (8.10)  142 CHAPTER 8. SERRE'S THEOREM. By symmetry, it is enough to prove the first of these equations. 
If k ≠ i then [x_k, y_i] = 0 by (8.3), and hence

ad x_k(y_ij) = (ad y_i)^{−c_ij+1}[x_k, y_j] = (ad y_i)^{−c_ij+1} δ_kj h_j

by (8.2) and (8.3). If k ≠ j this is zero. If k = j we can write this as

(ad y_i)^{−c_ij}(ad y_i)h_j = (ad y_i)^{−c_ij} c_ji y_i.

If c_ij = 0 there is nothing to prove. If c_ij ≠ 0 then c_ji ≠ 0 and in fact is strictly negative, since the angles between distinct elements of a base are obtuse. But then (ad y_i)^{−c_ij} y_i = 0. It remains to consider the case where k = i. The algebra generated by x_i, y_i, z_i is isomorphic to sl(2) with [x_i, y_i] = z_i, [z_i, x_i] = 2x_i, [z_i, y_i] = −2y_i. We have a decomposition of m into weight spaces for all of z, in particular into weight spaces for this little sl(2). Now [x_i, y_j] = 0 (from (8.3)), so y_j is a maximal weight vector for this sl(2) with weight −c_ji, and (8.9) is just a standard property of a maximal weight module for sl(2) with non-negative integer maximal weight.

8.3 Proof of Serre's theorem.

Let k be the ideal of m generated by the x_ij and y_ij as defined above. We wish to show that g = m/k is a semi-simple Lie algebra with Cartan subalgebra h, the image of z, and root system Φ. Here n₊ is a sum of weight spaces of h with λ ≻ 0, while n₋ is a sum of weight spaces of h with λ ≺ 0. We have to see which weight spaces survive the passage to the quotient. The sl(2) generated by x_i, y_i, z_i is not sent into zero by the projection of m onto g, since z_i is not sent into zero. Since sl(2) is simple, this means that the projection map is an isomorphism when restricted to this sl(2). Let us denote the images of x_i, y_i, z_i by e_i, f_i, h_i. Thus g is generated by the 3ℓ elements e₁, ..., e_ℓ, f₁, ..., f_ℓ, h₁, ..., h_ℓ, and all the axioms (8.1)–(8.7) are satisfied. We must show that g is finite dimensional, semi-simple, and has Φ as its root system. First observe that ad e_i acts nilpotently on each of the generators of the algebra g, and hence acts locally nilpotently on all of g. Similarly for ad f_i. Hence the automorphism

τ_i := (exp ad e_i)(exp ad(−f_i))(exp ad e_i)

is well defined on all of g.
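For g = sl(2) itself one can check directly that this automorphism implements the Weyl reflection. The sketch below (our illustration, not from the text) computes in the defining representation, using the matrix identity (exp ad x)(y) = (exp x) y (exp x)⁻¹, which holds whenever x is nilpotent:

```python
import numpy as np

E = np.array([[0., 1.], [0., 0.]])
F = np.array([[0., 0.], [1., 0.]])
H = np.array([[1., 0.], [0., -1.]])

def exp_nilpotent(X):
    """exp of a nilpotent 2x2 matrix: the series stops since X^2 = 0."""
    return np.eye(2) + X

# (exp ad e)(exp ad(-f))(exp ad e) is conjugation by
# w = (exp e)(exp(-f))(exp e):
w = exp_nilpotent(E) @ exp_nilpotent(-F) @ exp_nilpotent(E)
w_inv = np.linalg.inv(w)

tau = lambda y: w @ y @ w_inv

assert np.allclose(tau(H), -H)   # tau acts on h as the simple reflection
assert np.allclose(tau(E), -F)   # root space g_alpha -> g_{-alpha}
assert np.allclose(tau(F), -E)
```

The first assertion is the statement used next in the text: τ carries the weight space g_λ to g_{sλ}, where s is the reflection corresponding to the simple root.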
So if s2 denotes the reflection in the Weyl group W corresponding to i, we have T(gA) = gs2A. Notice that each of the ma, is finite dimensional, since the dimension of m, for A >- 0 is at most the number of ways to write A as a sum of successive a2, each such sum corresponding to the element [xz1, [zi2, [... , xi,] ... ]. (In particular mka =f{0} for k > 1.) Similarly for A - 0. So it follows that each of the gx, is finite dimensional, that dim g, = dim g. Vw E W and that gkA =0for k #-1,0,1. Furthermore, g, is one dimensional, and since every root is conjugate to a simple root, we conclude that dimg =l1 Va E Q. We now show that gA = {0} for)A#f 0, A 4P. Indeed, suppose that gx # {0}. We know that A is not a multiple of a for any a E 4, since we know this to be true for simple roots, and the dimensions of the gx are invariant under the Weyl group, each root being conjugate to a simple root. So A' does not coincide with any hyperplane orthogonal to any root. So we can find a p E A' such that (a, p) #0 for all roots. We may find a w E W which maps p into the positive Weyl chamber for A so that (al, p) > 0 and hence (a,wp) > 0 for i = 1,..., L. Now dim g,, = dim g,  144 CHAPTER 8. SERRE'S THEOREM. and for the latter to be non-zero, we must have wA =Z:kai with the coefficients all non-negative or non-positive integers. But 0 = (A, p) =_(wA,wp) =Zki(ai, p) with (ai, p) > 0 Vi. Hence all the k2 = 0. So dim g = f + Card 4. We conclude the proof if we show that g is semi-simple, i.e. contains no abelian ideals. So suppose that a is an abelian ideal. Since a is an ideal, it is stable under h and hence decomposes into weight spaces. If g, na / {0}, then g, c a and hence [ga, g,] c a and hence the entire sl(2) generated by g, and g_, is contained in a which is impossible since a is abelian and sl(2) is simple. So a c h. But then a must be annihilated by all the roots, which implies that a = {0} since the roots span h*. QED 8.4 The existence of the exceptional root sys- tems. 
The idea of the construction is as follows. For each Dynkin diagram we will choose a lattice 𝕃 in a Euclidean space V, and then let Φ consist of all vectors in this lattice having all the same length, or having one of two prescribed lengths. We then check that

2(α₁, α₂)/(α₁, α₁) ∈ Z  ∀ α₁, α₂ ∈ Φ.

This implies that reflection through the hyperplane orthogonal to α₁ preserves 𝕃, and since reflections preserve length, that these reflections preserve Φ. This will show that Φ is a root system, and then calculation shows that it is of the desired type.

(G₂). Let V be the plane in R³ consisting of all vectors (x, y, z) with x + y + z = 0. Let 𝕃 be the intersection of the three dimensional standard lattice Z³ with V. Let L₁, L₂, L₃ denote the standard basis of R³. Let Φ consist of all vectors in 𝕃 of squared length 2 or 6. So Φ consists of the six short vectors ±(L_i − L_j).

9.1 Definition and basic properties.

C(p) → Λp, x ↦ x1, where 1 ∈ Λ⁰p under the identification of Λ⁰p with the ground field. The element x1 on the extreme right means the image of 1 under the action of x ∈ C(p). For elements v₁, ..., v₄ ∈ p this map sends

v₁ ↦ v₁
v₁v₂ ↦ v₁ ∧ v₂ + (v₁, v₂)1
v₁v₂v₃ ↦ v₁ ∧ v₂ ∧ v₃ + (v₁, v₂)v₃ − (v₁, v₃)v₂ + (v₂, v₃)v₁
v₁v₂v₃v₄ ↦ v₁ ∧ v₂ ∧ v₃ ∧ v₄ + (v₂, v₃)v₁ ∧ v₄ − (v₂, v₄)v₁ ∧ v₃ + (v₃, v₄)v₁ ∧ v₂ + (v₁, v₂)v₃ ∧ v₄ − (v₁, v₃)v₂ ∧ v₄ + (v₁, v₄)v₂ ∧ v₃ + (v₁, v₄)(v₂, v₃) − (v₁, v₃)(v₂, v₄) + (v₁, v₂)(v₃, v₄)

If the v's form an "orthonormal" basis of p then the products v_{i₁} ··· v_{i_k}, i₁ < i₂ < ··· < i_k, k = 0, 1, ..., n form a basis of C(p) while the v_{i₁} ∧ ··· ∧ v_{i_k}, i₁ < i₂ < ··· < i_k, k = 0, 1, ..., n form a basis of Λp, and in fact

v₁ ··· v_k ↦ v₁ ∧ ··· ∧ v_k if (v_i, v_j) = 0 ∀ i ≠ j.  (9.1)

In particular, the map C(p) → Λp given above is an isomorphism of vector spaces, so we may identify C(p) with Λp as a vector space if we choose, and then consider that Λp has two products: the Clifford product, which we denote by juxtaposition, and the exterior product, which we denote with a ∧.
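For an "orthonormal" basis the Clifford product can be computed purely combinatorially: sort a word of generators, picking up a sign for each transposition of distinct generators and contracting a repeated pair e_i e_i to the scalar (e_i, e_i). The function below is an illustrative sketch of this bookkeeping (our code, not the text's); the dictionary `q` holds the values (e_i, e_i).

```python
def normalize(word, q):
    """Reduce a word in mutually orthogonal generators to a sorted
    Clifford monomial.  Returns (sorted index tuple, coefficient).
    Each swap of distinct generators flips the sign; an adjacent
    repeated pair e_i e_i contracts to the scalar q[i]."""
    coeff, w = 1, list(word)
    i = 0
    while i < len(w) - 1:
        if w[i] == w[i + 1]:
            coeff *= q[w[i]]
            del w[i:i + 2]
            i = max(i - 1, 0)
        elif w[i] > w[i + 1]:
            w[i], w[i + 1] = w[i + 1], w[i]
            coeff = -coeff
            i = max(i - 1, 0)
        else:
            i += 1
    return tuple(w), coeff

q = {1: 1, 2: 1, 3: 1}                        # an "orthonormal" basis
assert normalize((1, 2), q) == ((1, 2), 1)    # v1 v2 = v1 ^ v2
assert normalize((2, 1), q) == ((1, 2), -1)   # v2 v1 = -v1 ^ v2
assert normalize((1, 1), q) == ((), 1)        # v1 v1 = (v1, v1) 1
assert normalize((2, 1, 1), q) == ((2,), 1)
```

On such a basis the cross terms (v_i, v_j), i ≠ j, in the formulas above all vanish, which is why the Clifford and exterior products of distinct basis vectors agree, as in (9.1).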
Notice that this identification preserves the Z/2Z gradation: an even element of the Clifford algebra is identified with an even element of the exterior algebra, and an odd element is identified with an odd element.

9.1.5 The canonical antiautomorphism.

The Clifford algebra has a canonical anti-automorphism α which is the identity map on p. In particular, for v_i ∈ p we have α(v₁v₂) = v₂v₁, α(v₁v₂v₃) = v₃v₂v₁, etc. By abuse of language, we use the same letter α to denote the similar anti-automorphism on Λp, and observe from the above computations (in particular from the corresponding choice of bases) that α commutes with our identifying map C(p) → Λp, so the notation is consistent. We have

α = (−1)^{k(k−1)/2} id on Λ^k(p).

For small values of k we have

k : 0 1 2 3 4 5 6
(−1)^{k(k−1)/2} : 1 1 −1 −1 1 1 −1

We will use subscripts to denote the homogeneous components of elements of Λp. Notice that if u ∈ Λ²p then αu = −u by the above table, while α(u²) = (αu)² = u². Since u² is even (and hence has only even homogeneous components) and since the maximum degree of the homogeneous component of u² is 4, we conclude that

u² = (u²)₀ + (u²)₄  ∀ u ∈ Λ²p.  (9.2)

For the same reason

v² = (v²)₀ + (v²)₄  ∀ v ∈ Λ³p.  (9.3)

We also claim the following:

(ww′)₀ = (αw, w′) = (−1)^{k(k−1)/2}(w, w′)  ∀ w, w′ ∈ Λ^k(p).  (9.4)

Indeed, it is sufficient to verify this for w, w′ belonging to a basis of Λp, say the basis given by all elements of the form (9.1), in which case both sides of (9.4) vanish unless w = w′. If w = w′ = v₁ ∧ ··· ∧ v_k (say) then

(ww)₀ = ι(v₁) ··· ι(v_k) v₁ ∧ ··· ∧ v_k = (−1)^{k(k−1)/2}(v₁, v₁) ··· (v_k, v_k) = (−1)^{k(k−1)/2}(w, w),

proving (9.4). As special cases that we will use later on, observe that

(uu′)₀ = −(u, u′)  ∀ u, u′ ∈ Λ²p  (9.5)

and

(vv′)₀ = −(v, v′)  ∀ v, v′ ∈ Λ³p.  (9.6)

9.1.6 Commutator by an element of p.

For any y ∈ p consider the linear map

w ↦ [y, w] := yw − (−1)^k wy for w ∈ Λ^k p
DEFINITION AND BASIC PROPERTIES 151 which is (anti)commutator in the Clifford multiplication by y. We claim that [y, w] = 2t(y)w. (9.7) In particular, [y, .], which is automatically a derivation for the Clifford multi- plication, is also a derivation for the exterior multiplication. Alternatively, this equation says that t(y), which is a derivation for the exterior algebra multipli- cation, is also a derivation for the Clifford multiplication. To prove (9.7) write wy = a(ya(w)). Then yw = y A w + t(y)w, wy = a(y A a(w)) + a(t(y)aw) = w A y + (at(y)a)w. We may assume that w E Akp. Then y A w - (-l)kw A y =0, so we must show that at(y)aw = (-1)k"-146=w For this we may assume that y / 0 and we may write w =u A z + z', where t(y)u = 1 and t(y)z = t(y)z' = 0. In fact, we may assume that z and z' are sums of products of linear elements all of which are orthogonal to y. Then t(y)az = t(y)az' = 0 so t(y)aw = (-1)k-laz since z has degree one less than w and hence at(y)aw = (-1)k-iz = (-1)k-it(y)w. QED 9.1.7 Commutator by an element of /2p. Suppose that u E A2p. Then for y E p we have [u,y] -[y, u] = -2t(y)u. (9.8) In particular, if u = yj A Yj where yi, Yj E p we have [u,y] = 2(yj, y)yi - 2(yi, y)yj V y E p. (9.9) If (y2, y3) = 0 this is an "infinitesimal rotation" in the plane spanned by Yi and Yj. Since yj A Yj, i < j form a basis of A2p if Y1, ... , y, form an "orthonormal" basis of p, we see that the map u - [u, -]  152 CHAPTER 9. CLIFFORD ALGEBRAS AND SPIN REPRESENTATIONS. gives an isomorphism of A2p with the orthogonal algebra o(p). This identifi- cation differs by a factor of two from the identification that we had been using earlier. Now each element of o(p) (in fact any linear transformation on p) induces a derivation of Ap. We claim that under the above identification of A2p with o(p), the derivation corresponding to u E A2p is Clifford commutation by u. In symbols, if Ou denotes this induced derivation, we claim that Ou(w) = [u, w] = uw - wu V w E Ap. 
(9.10) To verify this, it is enough to check it on basis elements of the form (9.1), and hence by the derivation property for each vj, where this reduces to (9.8). We can now be more explicit about the degree four component of the Clifford square of an element of A2p, i.e. the element (u2)4 occurring on the right of (9.2). We claim that for any three elements y, y', y" E p 1 - t (y") t ')t(y) U2 = (yAy', u)t(y")u+ (y' Ay", u)t(y)u+ (y"A y, u)t(y')u. (9.11) 2 To prove this observe that t(y)u2 = (t(y)u) u + u (t(y)u) t(y.')t (y) U2 = (t(y')t(y)u) u - t(y)ut(y')u + t(y')ut(y)u + ut(y')t(y)u 2 ((y A y', u)u+ t(y')u A t(y)u) 1 S(y)(y) =(y A y', )t(y")u + t(y")t(y')u A t(y)u - t(y')u A t(y")t(y)u 2 (yAy',u)t(y")u+ (y' A y", u)t(y)u + (y" A y, u)t(y')u as required. We can also be explicit about the degree zero component of u2. Indeed, it follows from (9.9) that if u = yi A yj, i < j where y1, . . . , yn form an "orthonor- mal" basis of p then tr(adp U)2 = -8(yi, yi)(yj, yj), where adp u denotes the (commutator) action of u on p under our identification of A2p with o(p). But (yj A yj, yi A y3j) = (y, yi)(yj, y )(= +1). So using (9.5) we see that (vi2)o = -tr(adpvi)2 = (vi,v) (9.12) for u E A2p.  9.2. ORTHOGONAL ACTION OF A LIE ALGEBRA. 153 9.2 Orthogonal action of a Lie algebra. Let r be a Lie algebra. Suppose that we have a representation of r acting as infinitesimal orthogonal transformations of p which means, in view of the identification of A2p with o(p) that we have a map v :r - A2p such that x - y -2t(y)v(x) (9.13) where x- y denotes the action of x E r on y E p. 9.2.1 Expression for v in terms of dual bases. It will be useful for us to write equation (9.13) in terms of a basis. So let y1, . . , yn be a basis of p and let z1,. . . , z, be the dual basis relative to ( , )p. We claim that v(x) =-1 y A (x - z ). (9.14) j Indeed, it suffices to verify (9.13) for each of the elements zi. Now tz 1 11 S--jx.zi+ 1Z(Zi,X. Zj)yj. 
But (zi, z - zg) =-(x - zi, zy) since x acts as an infinitesimal orthogonal transformation relative to ( , ). So we can write the sum as 1 _ 1 _ 1 Zj,x~zj)yj - -Z (x "zi, zj)yj 1- "z yielding 1 1 which is (9.13). 9.2.2 The adjoint action of a reductive Lie algebra. For future use we record here a special case of (9.14): Suppose that p = r = g is a reductive Lie algebra with an invariant symmetric bilinear form, and the action is the adjoint action, i.e. x -y =[x, y]. Let h be a Cartan subalgebra of g  154CHAPTER 9. CLIFFORD ALGEBRAS AND SPIN REPRESENTATIONS. and let 4 denote the set of roots and suppose that we have chosen root vectors e4, e_ #0E 4so that (ee,e_4) =1. Let h1,... , h8 be a basis of h and k1,... k8 the dual basis. Let ': g A2g be the map v when applied to this adjoint action. Then (9.14) becomes (x)vI hiA[ki, x] + Ze-4 A[e4, x]) . (9.15) i=1 4EG In case x= h E h this formula simplifies. The [ki, h] = 0, and in the second sum we have e_4 A [ee, h] = -#(h)e_4 A c4 which is invariant under the interchange of # and -#. So let us make a choice 4+ of positive roots. Then we can write (9.15) as (h)=-1 (h)eAe4, heh. (9.16) Now e_4 A ec4= -1+ e_4e4. So if p.: (9.17) is one half the sum of the positive roots we have (h) = p(h) - #(h)eee, h e h. (9.18) In this equation, the multiplication on the right is in the Clifford algebra. 9.3 The spin representations. If P P1 e P2 is a direct sum decomposition of a vector space p with a symmetric bilinear form into two orthogonal subspaces then it follows from the definition of the Clifford algebra that C(p) =C(p1) ® C(p2)  9.3. THE SPIN REPRESENTATIONS. 155 where the multiplication on the tensor product is taken in the sense of superal- gebras, that is (a1 0 a2)(b1 0 b2) a1b1 0 a2b2 if either a2 or bi are even, but (a1 0 a2)(b1 0 b2) -a1b1 0 a2b2 if both a2 and b1 are odd. It costs a sign to move one odd symbol past another. 9.3.1 The even dimensional case. Suppose that p is even dimensional. 
If the metric is split (which is always the case if the metric is non-degenerate and we are over the complex numbers) then p is a direct sum of two dimensional mutually orthogonal split spaces W_i, so let us examine first the case of a two dimensional split space p, spanned by t, ε with (t, t) = (ε, ε) = 0, (t, ε) = 1/2. Let T be a one dimensional space with basis t and consider the linear map p → End(ΛT) determined by

ε ↦ e(t), t ↦ ι(t*)

where e(t) denotes exterior multiplication by t and ι(t*) denotes interior multiplication by t*, the dual element to t in T*. This is a Clifford map since

e(t)² = 0 = ι(t*)², e(t)ι(t*) + ι(t*)e(t) = id.

This therefore extends to a map C(p) → End(ΛT). Explicitly, if we use 1 ∈ Λ⁰T, t ∈ Λ¹T as a basis of ΛT, this map is given by

1 ↦ (1 0 ; 0 1), ε ↦ (0 0 ; 1 0), t ↦ (0 1 ; 0 0).

This shows that the map is an isomorphism. If now

p = W₁ ⊕ ··· ⊕ W_m

is a direct sum of two dimensional split spaces, and we write

T = T₁ ⊕ ··· ⊕ T_m

where the C(W_i) ≅ End(ΛT_i) as above, then since ΛT = ΛT₁ ⊗ ··· ⊗ ΛT_m we see that C(p) ≅ End(ΛT). In particular, C(p) is isomorphic to the full 2^m × 2^m matrix algebra and hence has a unique (up to isomorphism) irreducible module. One model of this is S = ΛT. We can write S = S⁺ ⊕ S⁻ as a supervector space, where we choose the standard Z₂ grading on ΛT to determine the grading on S if m is even, but use the opposite grading (for reasons which will become apparent in a moment) if m is odd. The even part, C₀(p), of C(p) acts irreducibly on each of S±. Since Λ²p together with the constants generates C₀(p), we see that the action of Λ²p on each of S± is irreducible. Since Λ²p under Clifford commutation is isomorphic to o(p), the two modules S± give irreducible modules for the even orthogonal algebra o(p). These are the half spin representations of the even orthogonal algebras.
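The two dimensional split case can be checked with literal 2×2 matrices. The sketch below uses the normalization (t, ε) = 1/2, so that the Clifford relation vw + wv = 2(v, w) makes the anticommutator of the two operators the identity, as in the text:

```python
import numpy as np

# Basis (1, t) of ΛT; e(t) is exterior, i(t*) interior multiplication.
eps = np.array([[0., 0.], [1., 0.]])   # ε ↦ e(t):  1 -> t,  t -> 0
tt  = np.array([[0., 1.], [0., 0.]])   # t ↦ i(t*): t -> 1,  1 -> 0

assert np.allclose(eps @ eps, 0)                     # (ε, ε) = 0
assert np.allclose(tt @ tt, 0)                       # (t, t) = 0
assert np.allclose(eps @ tt + tt @ eps, np.eye(2))   # = 2(t, ε) id = id

# The images of 1, ε, t, tε are linearly independent, so the map
# C(p) -> End(ΛT) onto the 2x2 matrices is an isomorphism.
span = np.stack([np.eye(2), eps, tt, tt @ eps]).reshape(4, 4)
assert np.linalg.matrix_rank(span) == 4
```

The last assertion is the dimension count behind "the map is an isomorphism": both C(p) and End(ΛT) are four dimensional.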
We can identify S = S+ S- as a left ideal in C(p) as follows: Suppose that we write P= P+E p- where p± are complementary isotropic subspaces. Choose a basis e..... , em of p+ and let e+:= el -...e = el A ...-A em E Amp+- We have y+e+ = O, V y+ E p+ and hence (Ap±)±e±= 0. In other words Ap+e+ consists of all scalar multiples of e+. Since Ap_ 0 Ap+ - C(p), w- ® w+ H -+ is a linear bijection, we see that C(p)e+_= Ap-e+. This means that the left ideal generated by e+ in 0(p) has dimension 2m, and hence must be isomorphic as a left 0(p) module to S. In particular it is a minimal left ideal.  9.3. THE SPIN REPRESENTATIONS. 157 Let e--, ... e-, be a basis of p_ and for any subset J {ii,. . . , ij}, i1 < i2 -" < i of {1,... .,m} let e A :e AA- e2 =e-.- e27. Then the elements eJe+ form a basis of this model of S as J ranges over all subsets of {1,... , m}. For example, suppose that we have a commutative Lie algebra h acting on p as infinitesimal isometries, so as to preserve each p±, that the e2 are weight vectors corresponding to weights 3 and that the e; form the dual basis, corresponding to the negative of these weights -32. Then it follows from (9.14) that the image, v(h) E A2(p) C C(p) of an element h e h is given by v(h) = Z/i(h)et A e;- = Zi(h)(1 - e-e). Thus v(h)e+ = pp(h)e+ (9.19) where 1 pp.=1 2(31+..+1m)"(9.20) 2 For a subset J of {1, ... , m} let us set 13J:E 3j. j-J Then we have [v(h), e_] =-#3J(h)el_ and so v(h)(e_ e+) = [v(h), ei']e+ + eJv(h)e+ = (pp(h) - 3J(h))eJ-e+- So if we denote the action of v(h) on S± by Spin1 v(h) and the action of v(h) on S = S+ S- by Spin v(h) we have proved that The eJe+ are weight vectors of Spin v with weights pp - 3J. (9.21) It follows from (9.21) that the difference of the characters of Spin+v and Spinv is given by chSin+v - chspin (e( #i) - e(- 3))= e(pp) J7J(1 - e(-3j)). 
(9.22) There are two special cases which are of particular importance: First, this applies to the case where we take h to be a Cartan subalgebra of o(p) =o(C2k)  158 CHAPTER 9. CLIFFORD ALGEBRAS AND SPIN REPRESENTATIONS. itself, say the diagonal matrices in the block decomposition of o(p) given by the decomposition C2k = Ck e Ck into two isotropic subspaces. In this case the /3 is just the i-th diagonal entry and (9.22) yields the standard formula for the difference of the characters of the spin representations of the even orthogonal algebras. A second very important case is where we take h to be the Cartan subalgebra of a semi-simple Lie algebra g, and take p :=n+ e n_ relative to a choice of positive roots. Then the /3 are just the positive roots, and we see that the right hand side of (9.22) is just the Weyl denominator, the denominator occurring in the Weyl character formula. This means that we can write the Weyl character formula as ch(Irr(A) 0 S+) - ch(Irr(A) 0 S_) =3(-1)we(w " A) wCW where we A =w(A + p). If we let U, denote the one dimensional module for h given by the weight p we can drop the characters from the preceding equation and simply write the Weyl character formula as an equation in virtual representations of h: Irr(A) ® S+ - Irr(A) 0 S_ =_ 3 (-1)wUw.A. (9.23) wCW The reader can now go back to the preceding chapter and to Theorem 16 where this version of the Weyl character formula has been generalized from the Cartan subalgebra to the case of a reductive subalgebra of equal rank. In the next chapter we shall see the meaning of this generalization in terms of the Kostant Dirac operator. 9.3.2 The odd dimensional case. Since every odd dimensional space with a non-singular bilinear form can be written as a sum of a one dimensional space and an even dimensional space (both non-degenerate), we need only look at the Clifford algebra of a one dimensional space with a basis element x such that (x, x) = 1 (since we are over the complex numbers). 
This Clifford algebra is two dimensional, spanned by 1 and x with x² = 1, the element x being odd. This algebra clearly has itself as a canonical module under left multiplication, and is irreducible as a Z/2Z graded module. We may call this the spin representation of the Clifford algebra of a one dimensional space. Under the even part of the Clifford algebra (i.e. under the scalars) it splits into two isomorphic (one dimensional) spaces corresponding to the basis 1, x of the Clifford algebra. Relative to this basis 1, x we have the left multiplication representation given by

1 ↦ (1 0 ; 0 1), x ↦ (0 1 ; 1 0).

Let us use C(C) to denote the Clifford algebra of the one dimensional orthogonal vector space just described, and S(C) its canonical module. Then if

q = p ⊕ C

is an orthogonal decomposition of an odd dimensional vector space into a direct sum of an even dimensional space and a one dimensional space (both non-degenerate) we have

C(q) ≅ C(p) ⊗ C(C) ≅ End(S(q)) where S(q) := S(p) ⊗ S(C),

all tensor products being taken in the sense of superalgebras. We have a decomposition

S(q) = S⁺(q) ⊕ S⁻(q)

as a super vector space, where

S⁺(q) = S⁺(p) ⊕ xS⁻(p), S⁻(q) = S⁻(p) ⊕ xS⁺(p).

These two spaces are equivalent and irreducible as C₀(q) modules. Since the even part of the Clifford algebra is generated by Λ²q together with the scalars, we see that either of these spaces is a model for the irreducible spin representation of o(q) in this odd dimensional case. Consider the decomposition p = p₊ ⊕ p₋ that we used to construct a model for S(p) as being the left ideal in C(p) generated by Λ^m p₊, where m = dim p₊. We have

Λ(C ⊕ p₋) = Λ(C) ⊗ Λp₋

and

Proposition 28 The left ideal in the Clifford algebra generated by Λ^m p₊ is a model for the spin representation.

Notice that this description is valid for both the even and the odd dimensional case.

9.3.3 Spin ad and V_ρ.
We want to consider the following situation: g is a simple Lie algebra and we take ( , ) to be the Killing form. We have CD:g - A2 g c 0(g)  160CHAPTER 9. CLIFFORD ALGEBRAS AND SPIN REPRESENTATIONS. which is the map v associated to the adjoint representation of g. Let h be a Cartan subalgebra and 4 the collection of roots. We choose root vectors eo,# E 4Pso that (ee,e~e) =1. Then it follows from (9.14) that 4(x)=11:h2 A [ki, zg+ e-4 A [e4, zx (9.24) where the brackets are the Lie brackets of g, where the h2 range over a basis of h and the k2 over a dual basis. This equation simplifies in the special cases where x= h E h and in the case where x= ep, @ E 4+ relative to a choice, 4+ of positive roots. In the case that x = h E h we have seen that [k2, h] = 0 and the equation simplifies to (h) = p(h)1- 3 #(h)eee4 (9.25) where is one half the sum of then positive roots. We claim that for b E + we have 4P (eg) _=1zy/e'(9.26) where the sum is over pairs (y', /') such that either 1. 7' =0, )' = andxzy' E hor 2. y' E 4, 2/' E P+ and y' +2>'=2, and x, E gy'. To see this, first observe that this first sum on the right of (9.24) gives 5 $/(k )hi A eo and so all these summands are of the form 1). For each summand e_4 A [ee, eP] of the second sum, we may assume that either # + @)= 0 or that # + E E 4 for otherwise [ee, eP] = 0. If # + L = 0, so =-# 0, we have [ee, ep] E h which is orthogonal to e_4 since # # 0. So e_4 A [ep, eg,]=-[,ee again has the form 1).  9.3. THE SPIN REPRESENTATIONS. 161 If # + =T #0 is a root, then (e_4, eT) = 0 since # # T. If T E 1+ then e_4 A [ee, e] = e4yOT, where YT is a multiple of eT so this summand is of the form 2). If T is a negative root, the # must be a negative root so -# is a positive root, and we can switch the order of the factors in the preceding expression at the expense of introducing a sign. So again this is of the form 2), completing the proof of (9.26). 
Let n₊ be the subalgebra of g generated by the positive root vectors, and similarly n₋ the subalgebra generated by the negative root vectors, so

g = n₊ ⊕ b₋, b₋ := n₋ ⊕ h

is an h stable decomposition of g into a direct sum of the nilradical and its opposite Borel subalgebra. Let N be the number of positive roots and let 0 ≠ n ∈ Λ^N n₊. Clearly

yn = 0 ∀ y ∈ n₊.

Hence by (9.26) we have

Φ(n₊)n = 0

while by (9.25)

Φ(h)n = ρ(h)n ∀ h ∈ h.

This implies that the cyclic module Φ(U(g))n is a model for the irreducible representation V_ρ of g with highest weight ρ. Left multiplication by Φ(x), x ∈ g, gives the action of g on this module. Furthermore, if nc ≠ 0 for some c ∈ C(g), then nc has the same property:

Φ(n₊)nc = 0, Φ(h)nc = ρ(h)nc ∀ h ∈ h.

Thus every nc ≠ 0 also generates a g module isomorphic to V_ρ. Now the map

Λn₊ ⊗ Λb₋ → C(g), x ⊗ b ↦ xb

is a linear isomorphism, and right Clifford multiplication of Λ^N n₊ by Λn₊ is just Λ^N n₊, all the elements of positive degree yielding 0. So we have the vector space isomorphism

nC(g) ≅ Λ^N n₊ ⊗ Λb₋.

In other words,

Φ(U(g))nC(g)

is a direct sum of irreducible modules all isomorphic to V_ρ, with multiplicity equal to dim Λb₋ = 2^{s+N}, where s = dim h and N = dim n₋ = dim n₊. Let us compute the dimension of V_ρ using the Weyl dimension formula, which asserts that for any irreducible finite dimensional representation V_λ with highest weight λ we have

dim V_λ = ∏_{φ ∈ Φ⁺} (λ + ρ, φ) / ∏_{φ ∈ Φ⁺} (ρ, φ).

If we plug in λ = ρ we see that each factor in the numerator is twice the corresponding factor in the denominator, so

dim V_ρ = 2^N.  (9.27)

But then

dim Φ(U(g))nC(g) = 2^{s+2N} = dim C(g).

This implies that

C(g) = Φ(U(g))nC(g) = Φ(U(g))n(Λb₋),  (9.28)

proving that C(g) is primary of type V_ρ with multiplicity 2^{s+N} as a representation of g under the left multiplication action of Φ(g). This implies that any submodule for this action, in particular any left ideal of C(g), is primary of type V_ρ.
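The doubling of each Weyl-formula factor at λ = ρ, and hence (9.27), can be confirmed for small root systems. A sketch in exact rational arithmetic (the root coordinates and ρ are hard-coded for A₂ and C₂; the helper names are ours):

```python
from fractions import Fraction

def weyl_dim(lam, rho, pos_roots, dot):
    """Weyl dimension formula: prod over positive roots of
    (lam + rho, a) / (rho, a)."""
    d = Fraction(1)
    for a in pos_roots:
        d *= Fraction(dot([x + y for x, y in zip(lam, rho)], a),
                      dot(rho, a))
    return d

dot = lambda u, v: sum(x * y for x, y in zip(u, v))

# A_2 = sl(3): positive roots e_i - e_j, i < j, realized in R^3.
pos = [(1, -1, 0), (0, 1, -1), (1, 0, -1)]
rho = [1, 0, -1]                  # half the sum of the positive roots
assert weyl_dim(rho, rho, pos, dot) == 2 ** len(pos)   # = 8

# C_2 = sp(4): positive roots L1 - L2, L1 + L2, 2L1, 2L2.
pos = [(1, -1), (1, 1), (2, 0), (0, 2)]
rho = [2, 1]
assert weyl_dim(rho, rho, pos, dot) == 2 ** len(pos)   # = 16
```

For sl(3), ρ = α₁ + α₂ is the highest root, so V_ρ is the adjoint representation and the count 2³ = 8 is its dimension.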
Since we have realized the spin representation of C(g) as a left ideal in C(g) we have proved the important Theorem 17 Spin ad is primary of type V. One consequence of this theorem is the following: Proposition 29 The weights of V are p - #1 (9.29) where J ranges over subsets of the positive roots and each occurring with multiplicity equal to the number of subsets J yielding the same value of #J. Indeed, (9.21) gives the weights of Spin ad, but several of the /3J are equal due to the trivial action of ad(h) on itself. However this contribution to the multiplicity of each weight occurring in (9.21) is the same, and hence is equal to the multiplicity of V, in Spin ad. So each weight vector of V, must be of the form (9.29) each occurring with the multiplicity given in the proposition.  Chapter 10 The Kostant Dirac operator Let p be a vector space with a non-degenerate symmetric bilinear form. We have the Clifford algebra C(p) and the identification of o(p) = A2(p) inside C(p). 10.1 Antisymmetric trilinear forms. Let # be an antisymmetric trilinear form on p. Then # defines an antisymmetric map b b:pp - p by the formula (b(y,y'),y") =(y,y',y") V y,y',y" E p. This bilinear map "leaves ( , ) invariant" in the sense that (b(y,y'),y") =_(y,b(y',y")). Conversely, any antisymmetric map b : p 0 p - p satisfying this condition defines an antisymmetric form #. Finally either of these two objects defines an element v E A3p by -2(v, y A y' A y") = (b(y, y'), y") =(y, y', y"). (10.1) We can write this relation in several alternative ways: Since -2(v, y A y' A y") -2(t(y')t(y)v, y") = 2(t(y)t(y')v, y") we have b(y, y') = 2t(y)t(y')v. (10.2) Also, t(y)v E A2p and so is identified with an element of o(p) by commutator in the Cliford algebra: ad(t(y)v)(y') = [t(y)v, y']=-2ty')t(y)v 163  164 CHAPTER 10. THE KOSTANT DIRAC OPERATOR so ad(t(y)v)(y') = [t(y)v, y'] = b(y, y'). (10.3) 10.2 Jacobi and Clifford. 
Given an antisymmetric bilinear map b : p ⊗ p → p we may define

Jac(b) : p ⊗ p ⊗ p → p, Jac(b)(y, y', y'') = b(b(y,y'), y'') + b(b(y',y''), y) + b(b(y'',y), y'),

so that the vanishing of Jac(b) is the usual Jacobi identity. It is easy to check that Jac(b) is antisymmetric and that if b satisfies (b(y,y'), y'') = (y, b(y',y'')) then the four-form

(y, y', y'', y''') ↦ (Jac(b)(y, y', y''), y''')

is antisymmetric. We claim that if v ∈ Λ³p is as in the preceding subsection, then

ι(y'')ι(y')ι(y)v² = -(1/2) Jac(b)(y, y', y''). (10.4)

To prove this observe that

ι(y)v² = (ι(y)v)v - v(ι(y)v),

ι(y')ι(y)v² = (ι(y')ι(y)v)v + (ι(y)v)(ι(y')v) - (ι(y')v)(ι(y)v) + v(ι(y')ι(y)v),

ι(y'')ι(y')ι(y)v² = -(ι(y')ι(y)v)(ι(y'')v) + (ι(y'')v)(ι(y')ι(y)v) + (ι(y'')ι(y)v)(ι(y')v) + (ι(y)v)(ι(y'')ι(y')v) - (ι(y'')ι(y')v)(ι(y)v) - (ι(y')v)(ι(y'')ι(y)v)

= [ι(y'')v, ι(y')ι(y)v] + [ι(y')v, ι(y)ι(y'')v] + [ι(y)v, ι(y'')ι(y')v]

= -(1/2) Jac(b)(y, y', y'')

by (10.2) and (10.3). Equation (10.4) describes the degree four component of v² in terms of Jac(b). We can also be explicit about the degree zero component of v². We claim that

(v²)₀ = (1/24) tr [ y ↦ Σ_{j=1}^n ε_j b(y_j, b(y_j, y)) ], ε_j := (y_j, y_j). (10.5)

Indeed, by (9.6) we know that (v²)₀ = -(v, v), and since the y_i ∧ y_j ∧ y_k, i < j < k, form an "orthonormal" basis of Λ³p, we have

-(v, v) = -Σ_{i<j<k} ε_i ε_j ε_k (v, y_i ∧ y_j ∧ y_k)².

10.3 Orthogonal extension of a Lie algebra.

Comparing with Cas_g ⊗ 1, we conclude that

K² + diag(Cas_r) = Cas_g ⊗ 1 + ((ρ, ρ) - (ρ_r, ρ_r))·1, (10.19)

so that on any g-module tensored with the spin module, the kernel of the Kostant Dirac operator K is an eigenspace of diag(Cas_r).

10.6 Eigenvalues of the Dirac operator.

Let

D := {λ ∈ h*_R | (λ, φ) ≥ 0 ∀ φ ∈ Φ+} and D_r := {λ ∈ h*_R | (λ, φ) ≥ 0 ∀ φ ∈ Φ_r+},

so D ⊂ D_r, and we have chosen a cross-section C of W_r in W as

C := {w ∈ W | wD ⊂ D_r},

so that

W = W_r · C, D_r = ∪_{w∈C} wD.

We let L = L_g ⊂ h*_R denote the lattice of g-integral linear forms on h, i.e.

L = {μ ∈ h* | 2(μ, φ)/(φ, φ) ∈ Z ∀ φ ∈ Φ}.

We let ρ denote half the sum of the positive roots of g and ρ_r half the sum of the positive roots of r. We set L_r := the lattice spanned by L and ρ_r, and

A := L ∩ D, A_r := L_r ∩ D_r.

For any r-module Z we let Γ(Z) denote its set of weights, and we shall assume that Γ(Z) ⊂ L_r.
For such a representation define

m_Z := max_{γ∈Γ(Z)} (γ + ρ_r, γ + ρ_r). (10.20)

For any μ ∈ A_r we let Z_μ denote the irreducible r-module with highest weight μ.

Proposition 30 Let Γ_max(Z) := {μ ∈ Γ(Z) | (μ + ρ_r, μ + ρ_r) = m_Z}. Let μ ∈ Γ_max(Z). Then

1. μ ∈ A_r.

2. If z ≠ 0 is a weight vector with weight μ then z is a highest weight vector, and hence the submodule U(r)z is irreducible and equivalent to Z_μ.

3. Let Y_max denote the sum of the weight spaces of Z corresponding to weights in Γ_max(Z), and Y := U(r)Y_max. Then m_Z - (ρ_r, ρ_r) is the maximal eigenvalue of Cas_r on Z and Y is the corresponding eigenspace.

Proof. We first show that μ ∈ Γ_max implies μ + ρ_r ∈ A_r. Suppose not, so there exists a w ≠ 1, w ∈ W_r, such that wμ + wρ_r ∈ A_r. But w changes the sign of some of the positive roots (the number of such changes being equal to the length of w in terms of the generating reflections), and so ρ_r - wρ_r is a non-trivial sum of positive roots. Therefore

(wμ + wρ_r, ρ_r - wρ_r) ≥ 0, (ρ_r - wρ_r, ρ_r - wρ_r) > 0,

and wμ + ρ_r = (wμ + wρ_r) + (ρ_r - wρ_r) satisfies

(wμ + ρ_r, wμ + ρ_r) > (wμ + wρ_r, wμ + wρ_r) = (μ + ρ_r, μ + ρ_r) = m_Z.

Since wμ is again a weight of Z, this contradicts the definition of m_Z. Now suppose that z is a weight vector with weight μ which is not a highest weight vector. Then there will be some irreducible component of Z containing z and having some weight μ' such that μ' - μ is a non-trivial sum of positive roots. We have

μ' + ρ_r = (μ' - μ) + (μ + ρ_r),

so by the same argument we conclude that (μ' + ρ_r, μ' + ρ_r) > m_Z since μ + ρ_r ∈ A_r, and again this is impossible. Hence z is a highest weight vector, implying that μ ∈ A_r. This proves 1) and 2). We have already verified that the eigenvalue of the Casimir Cas_r on any Z_γ is (γ + ρ_r, γ + ρ_r) - (ρ_r, ρ_r). This proves 3).

Consider the irreducible representation V_ρ of g corresponding to ρ = ρ_g. By the same arguments, any weight γ ≠ ρ of V_ρ lying in D must satisfy (γ, γ) < (ρ, ρ), and hence any weight γ of V_ρ satisfying (γ, γ) = (ρ, ρ) must be of the form γ = wρ for a unique w ∈ W.
But

wρ = ρ - β_{J_w} where J_w := w(-Φ+) ∩ Φ+,

so for any subset J of the positive roots

(ρ - β_J, ρ - β_J) ≤ (ρ, ρ), (10.21)

where we have strict inequality unless J = J_w for some w ∈ W. Now let λ ∈ A, let V_λ be the corresponding irreducible module with highest weight λ and let γ be a weight of V_λ. As usual, let J denote a subset of the positive roots, J ⊂ Φ+. We claim that

Proposition 31 We have

(λ + ρ, λ + ρ) ≥ (γ + ρ - β_J, γ + ρ - β_J) (10.22)

with strict inequality unless there exists a w ∈ W such that γ = wλ and J = J_w, in which case the w is unique.

Proof. Choose w such that w⁻¹(γ + ρ - β_J) ∈ D. Since w⁻¹γ is a weight of V_λ, λ - w⁻¹γ is a sum (possibly empty) of positive roots. Also w⁻¹(ρ - β_J) is a weight of V_ρ and hence ρ - w⁻¹(ρ - β_J) is a sum (possibly empty) of positive roots. Since

λ + ρ = (λ - w⁻¹γ) + (ρ - w⁻¹(ρ - β_J)) + w⁻¹(γ + ρ - β_J)

we conclude that

(λ + ρ, λ + ρ) ≥ (w⁻¹(γ + ρ - β_J), w⁻¹(γ + ρ - β_J)) = (γ + ρ - β_J, γ + ρ - β_J)

with strict inequality unless λ - w⁻¹γ = 0 = ρ - w⁻¹(ρ - β_J), and this last equality implies that J = J_w. QED

We have the spin representation Spin ν where ν : r → C(p). Call this module S. Consider V_λ ⊗ S as an r-module. Then, letting γ denote a weight of V_λ, we have

Γ(V_λ ⊗ S) = {μ = γ + ρ_p - β_J}, (10.23)

where J ranges over subsets of Φ_p+ and ρ_p := ρ - ρ_r. In other words, Φ_p consists of the roots of g which are not roots of r, or, put another way, its elements are the weights of p considered as an r-module. (Our equal rank assumption says that 0 does not occur as one of these weights.) For the weights μ of V_λ ⊗ S the form (10.23) gives

μ + ρ_r = γ + ρ - β_J, J ⊂ Φ_p.

So if we set Z = V_λ ⊗ S as an r-module, (10.22) says that (λ + ρ, λ + ρ) ≥ m_Z. But we may take γ = λ and J = ∅ as one of our weights, showing that

m_Z = (λ + ρ_g, λ + ρ_g). (10.24)

To determine Γ_max(Z) as in Prop. 30 we again use Prop. 31 and (10.23): μ = γ + ρ_p - β_J belongs to Γ_max(Z) if and only if γ = wλ and J = J_w. But then ρ_g - β_J = wρ_g.
Since ρ_g = ρ_r + ρ_p we see from the form (10.23) that

μ + ρ_r = w(λ + ρ_g), (10.25)

where w is unique and J_w ⊂ Φ_p. We claim that this condition is the same as the condition w(D) ⊂ D_r defining our cross-section C. Indeed, w ∈ C if and only if (φ, wρ_g) > 0 for all φ ∈ Φ_r+. But (φ, wρ_g) = (w⁻¹φ, ρ_g) > 0 if and only if φ ∈ w(Φ+). Since J_w = w(-Φ+) ∩ Φ+, we see that J_w ⊂ Φ_p is equivalent to the condition w ∈ C. Now for μ ∈ Γ_max(Z) we have

μ = w(λ + ρ) - ρ_r =: w • λ, (10.26)

where γ = wλ and so has multiplicity one in V_λ. Furthermore, we claim that the weight ρ_p - β_{J_w} has multiplicity one in S. Indeed, consider the representation Z_{ρ_r} ⊗ S of r. It has the weight ρ = ρ_r + ρ_p as a highest weight, and in fact all of the weights of V_{ρ_g} occur among its weights. Hence, on dimensional grounds, say from the Weyl character formula, we conclude that it coincides, as a representation of r, with the restriction of the representation V_{ρ_g} to r. But since ρ_g - β_{J_w} = wρ_g has multiplicity one in V_{ρ_g}, we conclude that ρ_p - β_{J_w} has multiplicity one in S. We have proved that each of the w • λ has multiplicity one in V_λ ⊗ S, with corresponding weight vector

z_{w•λ} := v_{wλ} ⊗ s_{J_w},

where v_{wλ} ∈ V_λ is a weight vector of weight wλ and s_{J_w} ∈ S a weight vector of weight ρ_p - β_{J_w}. So each of the submodules

Z_{w•λ} := U(r)z_{w•λ} (10.27)

occurs with multiplicity one in V_λ ⊗ S. The length of w ∈ C (in terms of the simple reflections of W) is the number of positive roots changed into negative roots, i.e. the cardinality of J_w. The parity of this cardinality gives the sign of det w and also determines whether s_{J_w} belongs to S+ or to S-. From Prop. 31 and equation (10.24) we know that the maximum eigenvalue of Cas_r on V_λ ⊗ S is

(λ + ρ_g, λ + ρ_g) - (ρ_r, ρ_r).

Now K_λ ∈ End(V_λ ⊗ S) commutes with the action of r, with

K_λ : V_λ ⊗ S+ → V_λ ⊗ S-, V_λ ⊗ S- → V_λ ⊗ S+.

Furthermore, by (10.19), the kernel of K_λ is the eigenspace of Cas_r corresponding to the eigenvalue (λ + ρ, λ + ρ) - (ρ_r, ρ_r). Thus

Ker(K_λ) = ⊕_{w∈C} Z_{w•λ}.

Each of these modules lies either in V_λ ⊗ S+ or in V_λ ⊗ S-, one or the other but not both.
Hence Ker(K_λ) splits along the decomposition S = S+ ⊕ S-, and so

Ker(K_λ) ∩ (V_λ ⊗ S+) = ⊕_{w∈C, det w=1} Z_{w•λ} (10.28)

and

Ker(K_λ) ∩ (V_λ ⊗ S-) = ⊕_{w∈C, det w=-1} Z_{w•λ}. (10.29)

Let

K± := ⊕_{w∈C, det w=±1} Z_{w•λ}. (10.30)

It follows from (10.28) that K_λ induces an injection of (V_λ ⊗ S+)/K+ → V_λ ⊗ S-, which we can follow by the projection V_λ ⊗ S- → (V_λ ⊗ S-)/K-. Hence K_λ induces a bijection

K_λ : (V_λ ⊗ S+)/K+ → (V_λ ⊗ S-)/K-. (10.31)

In short, we have proved that the sequence

0 → K+ → V_λ ⊗ S+ → V_λ ⊗ S- → K- → 0 (10.32)

is exact in a very precise sense, where the middle map is the Kostant Dirac operator: each summand of K+ occurs exactly once in V_λ ⊗ S+ and similarly for K-. This gives a much more precise statement of Theorem 16 and a completely different proof.

10.7 The geometric index theorem.

Let r be the representation of G on the space F(G) of smooth functions, or on the space L²(G) of L² functions on G, coming from right multiplication. Thus [r(g)f](a) = f(ag). Then K acts on F(G) ⊗ S or on L²(G) ⊗ S and centralizes the action of diag r. If U is a module for R, we may consider F(G) ⊗ S ⊗ U or L²(G) ⊗ S ⊗ U, and K ⊗ 1 commutes with diag r ⊗ 1 and with the action ρ of R on U, i.e. with 1 ⊗ 1 ⊗ ρ. If R is connected, this implies that K commutes with the diagonal action of R~, the universal cover of R, on F(G) ⊗ S ⊗ U or L²(G) ⊗ S ⊗ U given by

k ↦ r(k) ⊗ Spin(k) ⊗ ρ(k), k ∈ R~,

where Spin : R~ → Spin(p) is the group homomorphism corresponding to the Lie algebra homomorphism ν. If G/R is a spin manifold, the invariants under this R~-action correspond to smooth or L² sections of S ⊗ U, where S is the spin bundle of G/R and U is the vector bundle on G/R corresponding to U. Thus K descends (by restriction) to a differential operator D on G/R, and we shall compute its G-index for irreducible U. The key result, due to Landweber, asserts that if U belongs to a multiplet coming from an irreducible representation V of G, then this index is, up to a sign, equal to V. If U does not belong to a multiplet, then this index is zero.
We begin with some preliminary results due to Bott.

10.7.1 The index of equivariant Fredholm maps.

Let E and F be Hilbert spaces which are unitary modules for the compact Lie group G. Suppose that

E = ⊕_n E_n, F = ⊕_n F_n

are completed direct sum decompositions into subspaces which are G-invariant and finite dimensional, and that

T : E → F

is a Fredholm map (finite dimensional kernel and cokernel) such that T(E_n) ⊂ F_n. We write

Index_G T = Ker T - Coker T

as an element of R(G), the ring of virtual representations of G. Thus R(G) is the space of finite linear combinations Σ a_λ V_λ, a_λ ∈ Z, as V_λ ranges over the irreducible representations of G. (Here, and in what follows, we are regarding any finite dimensional representation of G as an element of R(G) by its decomposition into irreducibles, and similarly the difference of any two finite dimensional representations is an element of R(G).) If we denote the restriction of T to E_n by T_n, then Index_G T = Σ_n Index_G T_n, where all but a finite number of terms on the right vanish. For each n we have the exact sequence

0 → Ker T_n → E_n → F_n → Coker T_n → 0.

Thus Index_G T_n = E_n - F_n as elements of R(G). Therefore we can write

Index_G T = Σ_n (E_n - F_n) (10.33)

in R(G), where all but a finite number of terms on the right vanish. We shall refer to this as Bott's equation.

10.7.2 Induced representations and Bott's theorem.

Let R be a closed subgroup of G. Given any R-action ρ on a vector space U, we consider the associated vector bundle G ×_R U over the homogeneous space G/R. The sections of this bundle are then equivariant U-valued functions on G satisfying s(gk) = ρ(k)⁻¹s(g) for all k ∈ R. Applying the Peter-Weyl theorem, we can decompose the space of L² maps from G to U into a sum over the irreducible representations V_λ of G,

L²(G) ⊗ U ≅ ⊕_λ V_λ ⊗ V_λ* ⊗ U,

with respect to the G × G × R action l ⊗ r ⊗ ρ.
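Bott's equation (10.33) above can be sanity-checked in a toy, non-equivariant setting (G trivial, so R(G) collapses to Z): each block T_n is just a matrix, and by rank-nullity Index T_n = dim Ker T_n - dim Coker T_n = dim E_n - dim F_n. A sketch using sympy for exact ranks (the block data is made up for illustration):

```python
import sympy as sp

# a block "Fredholm" map T = (T_n), T_n : E_n -> F_n, finitely many nonzero blocks
blocks = [sp.Matrix([[1, 0], [0, 0]]),   # E_1 = Q^2 -> F_1 = Q^2
          sp.Matrix([[1, 2, 3]]),        # E_2 = Q^3 -> F_2 = Q^1
          sp.zeros(2, 1)]                # E_3 = Q^1 -> F_3 = Q^2

index = 0
for T in blocks:
    r = T.rank()
    dim_ker, dim_coker = T.cols - r, T.rows - r
    # Index T_n = dim Ker T_n - dim Coker T_n = dim E_n - dim F_n
    assert dim_ker - dim_coker == T.cols - T.rows
    index += dim_ker - dim_coker

# Bott's equation (10.33): Index T = sum over n of (dim E_n - dim F_n)
assert index == sum(T.cols - T.rows for T in blocks)
```

The point, as in the text, is that the index depends only on the spaces E_n and F_n, not on the particular blocks T_n.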
The R-equivariance condition is equivalent to requiring that the functions be invariant under the diagonal R-action k ↦ r(k) ⊗ ρ(k). Restricting the Peter-Weyl decomposition above to the R-invariant subspace, we obtain

L²(G ×_R U) ≅ ⊕_λ V_λ ⊗ (V_λ* ⊗ U)^R ≅ ⊕_λ V_λ ⊗ Hom_R(V_λ, U). (10.34)

The Lie group G acts on the space of sections by l(g), the left action of G on functions, which is preserved by this construction. The space L²(G ×_R U) is thus an infinite dimensional representation of G. The intertwining number of two representations gives us an inner product

(V, W)_G = dim_C Hom_G(V, W)

on R(G), with respect to which the irreducible representations of G form an orthonormal basis. Taking the formal completion of R(G), we define R^(G) to be the space of possibly infinite formal sums Σ a_λ V_λ. The intertwining number then extends to a pairing R^(G) × R(G) → Z. If R is a subgroup of G, every representation of G automatically restricts to a representation of R. This gives us a pullback map i* : R(G) → R(R), corresponding to the inclusion i : R ↪ G. The map U ↦ L²(G ×_R U) discussed above assigns to each R-representation an induced infinite dimensional G-representation. Expressed in terms of our representation ring notation, this induction map becomes the homomorphism i_* : R(R) → R^(G) given by

i_* U = Σ_λ (i*V_λ, U)_R V_λ,

the formal adjoint to the pullback i*. This is the content of the Frobenius reciprocity theorem. A homogeneous differential operator on G/R is a differential operator D : Γ(E) → Γ(F) between two homogeneous vector bundles E and F that commutes with the left action of G on sections. If the operator is elliptic, then its kernel and cokernel are both finite dimensional representations of G, and thus its G-index is a virtual representation in R(G). In this case, the index takes a particularly elegant form.
Theorem 18 (Bott) If D : Γ(G ×_R U₀) → Γ(G ×_R U₁) is an elliptic homogeneous differential operator, then the G-equivariant index of D is given by

Index_G D = i_*(U₀ - U₁),

where i_*(U₀ - U₁) is a finite element of R^(G), i.e. belongs to R(G).

In particular, note that the index of a homogeneous differential operator depends only on the vector bundles involved and not on the operator itself! To prove the theorem, just use Bott's formula (10.33), where the subscript n is replaced by λ labeling the G-irreducibles.

10.7.3 Landweber's index theorem.

Suppose that G is semi-simple and simply connected and R is a reductive subgroup of maximal rank. Suppose further that G/R is a spin manifold; then we can compose the spin representation S = S+ ⊕ S- of Spin(p) with the lifted map Spin : R → Spin(p) to obtain a homogeneous vector bundle, the spin bundle S over G/R. For any representation of R on U the Kostant Dirac operator descends to an operator

D_U : Γ(S+ ⊗ U) → Γ(S- ⊗ U).

(This operator has the same symbol as the Dirac operator arising from the Levi-Civita connection on G/R twisted by U, and has the same index by Bott's theorem. For the precise relation between this Dirac operator coming from K and the Dirac operator coming from the Levi-Civita connection we refer to Landweber's thesis.) The following theorem of Landweber gives an expression for the index of this Kostant Dirac operator. In particular, if we consider G/T, where T is a maximal torus (so that G/T is always a spin manifold), this theorem becomes a version of the Borel-Weil-Bott theorem expressed in terms of spinors and the Dirac operator, instead of in its customary form involving holomorphic sections and Dolbeault cohomology.

Theorem 19 (Landweber) Let G/R be a spin manifold, and let U_μ be an irreducible representation of R with highest weight μ.
The G-equivariant index of the Dirac operator D_{U_μ} is the virtual G-representation

Index_G D_{U_μ} = (-1)^{dim p/2} (-1)^w V_{w(μ+ρ_r)-ρ_g} (10.35)

if there exists an element w ∈ W_G in the Weyl group of G such that the weight w(μ + ρ_r) - ρ_g is dominant for G. If no such w exists, then Index_G D_{U_μ} = 0.

Proof. For any irreducible representation V_λ of G with highest weight λ we have

V_λ ⊗ (S+ - S-) = Σ_{w∈C} (-1)^w U_{w•λ}

by [GKRS]. Hence (V_λ ⊗ (S+ - S-), U_μ)_R = 0 if μ ≠ w • λ for all w ∈ C, while (V_λ ⊗ (S+ - S-), U_μ)_R = (-1)^w if μ = w • λ. But, by (10.33) and Theorem 18, we have

Index_G D_{U_μ} = Σ_λ (i*V_λ ⊗ (S+ - S-)*, U_μ)_R V_λ.

Now (S+ - S-)* = S+ - S- if dim p ≡ 0 mod 4, while (S+ - S-)* = S- - S+ if dim p ≡ 2 mod 4. Hence

Index_G D_{U_μ} = (-1)^{dim p/2} Σ_λ (V_λ ⊗ (S+ - S-), U_μ)_R V_λ. (10.36)

The right hand side of (10.36) vanishes if μ does not belong to a multiplet, i.e. is not of the form w • λ = w(λ + ρ_g) - ρ_r for some λ. If this equation does hold, then the condition w • λ = μ can be written as w⁻¹(μ + ρ_r) - ρ_g = λ, and we get the formula in the theorem (with w replaced by w⁻¹, which has the same determinant). QED

In general, if G/R is not a spin manifold, then in order to obtain a similar result we must instead consider the operator

D_{U_μ} : (L²(G) ⊗ S+ ⊗ U_μ)^r → (L²(G) ⊗ S- ⊗ U_μ)^r,

viewed as an operator on G, restricted to the space of (S ⊗ U_μ)-valued functions on G that are invariant under the diagonal r-action Z ↦ diag(Z) + σ(Z), where σ is the r-action on U_μ. Note that if S ⊗ U_μ is induced by a representation of the Lie group R, then this operator descends to a well-defined operator on G/R as before. In general, the G-equivariant index of this operator D_{U_μ} is once again given by (10.35). To prove this, we note that Bott's identity (10.33) and his theorem continue to hold for the induction map i_* : R(r) → R(g), using the representation rings for the Lie algebras instead of the Lie groups.
Working in the Lie algebra context, we no longer need concern ourselves with the topological obstructions occurring in the global Lie group picture. The rest of the proof of Theorem 19 continues unchanged.

Chapter 11

The center of U(g).

The purpose of this chapter is to study the center of the universal enveloping algebra of a semi-simple Lie algebra g. We have already made use of the (second order) Casimir element.

11.1 The Harish-Chandra isomorphism.

Let us return to the situation and notation of Section 7.3. We have the monomial basis f^A h^J e^B of U(g), the decomposition

U(g) = U(h) ⊕ (U(g)n+ + n-U(g)),

and the projection γ : U(g) → U(h) onto the first factor of this decomposition. This projection restricts to a projection, also denoted by γ,

γ : Z(g) → U(h).

The projection γ : Z(g) → U(h) is a bit awkward. However, Harish-Chandra showed that by making a slight modification in γ we get an isomorphism of Z(g) onto the ring of Weyl group invariants of U(h) = S(h). Harish-Chandra's modification is as follows: As usual, let

ρ = (1/2) Σ_{α>0} α.

Recall that for each i the reflection s_i sends α_i ↦ -α_i and permutes the remaining positive roots. Hence s_iρ = ρ - α_i. But by definition, s_iρ = ρ - ⟨ρ, α_i∨⟩α_i, and so ⟨ρ, α_i∨⟩ = 1 for all i = 1, ..., m. So ρ = ω₁ + ⋯ + ω_m, i.e. ρ is the sum of the fundamental weights.

11.1.1 Statement.

Define

σ : h → U(h), σ(h) = h - ρ(h)1. (11.1)

This is a linear map from h to the commutative algebra U(h) and hence, by the universal property of U(h), extends to an algebra homomorphism of U(h) to itself which is clearly an automorphism. We will continue to denote this automorphism by σ. Set

γ_H := σ ∘ γ.

Then Harish-Chandra's theorem asserts that

γ_H : Z(g) → U(h)^W

and is an isomorphism of algebras.

11.1.2 Example of sl(2).

To see what is going on let's look at this simplest case. The Casimir of degree two is

(1/2)h² + ef + fe,

as can be seen from the definition. Or it can be checked directly that this element is in the center.
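A direct check can at least be run in representations: if the element is central in U(g), its image must commute with e, f, h in every representation, and by Schur's lemma it must act as a scalar in each irreducible. A sketch with sympy, in the defining and adjoint representations of sl(2) (this verifies a necessary condition numerically rather than proving centrality in U(g) itself):

```python
import sympy as sp

# defining representation of sl(2)
e = sp.Matrix([[0, 1], [0, 0]])
f = sp.Matrix([[0, 0], [1, 0]])
h = sp.Matrix([[1, 0], [0, -1]])
cas = h * h / 2 + e * f + f * e
for x in (e, f, h):
    assert cas * x - x * cas == sp.zeros(2, 2)   # commutes with e, f, h
assert cas == sp.Rational(3, 2) * sp.eye(2)       # a scalar, as Schur predicts

# adjoint representation on the ordered basis (e, h, f)
ad_e = sp.Matrix([[0, -2, 0], [0, 0, 1], [0, 0, 0]])
ad_f = sp.Matrix([[0, 0, 0], [-1, 0, 0], [0, 2, 0]])
ad_h = sp.diag(2, 0, -2)
cas_ad = ad_h * ad_h / 2 + ad_e * ad_f + ad_f * ad_e
assert cas_ad == 4 * sp.eye(3)                    # again a scalar (eigenvalue 4)
```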
It is not written in our standard form, which requires that the f be on the left. But ef = fe + [e, f] = fe + h. So the way of writing this element in terms of the above basis is

(1/2)h² + h + 2fe,

and applying γ to it yields

(1/2)h² + h.

There is only one positive root and its value on h is 2, so ρ(h) = 1. Thus σ sends (1/2)h² + h into

(1/2)(h - 1)² + (h - 1) = (1/2)h² - 1/2.

The Weyl group in this case is just the identity together with the reflection h ↦ -h, and the expression on the right is clearly Weyl group invariant.

11.1.3 Using Verma modules to prove that γ_H maps Z(g) to U(h)^W.

Any μ ∈ h* can be thought of as a linear map of h into the commutative algebra C, and hence extends to an algebra homomorphism of U(h) to C. If we regard an element of U(h) = S(h) as a polynomial function on h*, then this homomorphism is just evaluation at μ. From our definitions,

(λ - ρ)(γ(z)) = γ_H(z)(λ) ∀ z ∈ Z(g).

Let us consider the Verma module Verm(λ - ρ), where we denote its highest weight vector by v₊. For any z ∈ Z(g), we have

h z v₊ = z h v₊ = (λ - ρ)(h) z v₊ and e_i z v₊ = z e_i v₊ = 0.

So z v₊ is a highest weight vector with weight λ - ρ and hence must be some multiple of v₊. Call this multiple φ_λ(z). Then

z f₁^{a₁} ⋯ f_m^{a_m} v₊ = f₁^{a₁} ⋯ f_m^{a_m} φ_λ(z) v₊,

so z acts as scalar multiplication by φ_λ(z) on all of Verm(λ - ρ). To see what this scalar is, observe that since z - γ(z) ∈ U(g)n+, we see that z has the same action on v₊ as does γ(z), which is multiplication by (λ - ρ)(γ(z)) = γ_H(z)(λ). In other words,

φ_λ(z) = γ_H(z)(λ).

Notice that in this argument we only used the fact that Verm(λ - ρ) is a cyclic highest weight module: if V is any cyclic highest weight module with highest weight μ - ρ, then z acts on V as multiplication by

φ_μ(z) = γ_H(z)(μ).

We will use this observation in a moment. Getting back to the case of Verm(λ - ρ), for a simple root α = α_i let m = m_i := ⟨λ, α_i∨⟩ and suppose that m is a positive integer. The element f_i^m v₊ ∈ Verm(λ - ρ) has weight

μ = λ - ρ - mα_i = s_iλ - ρ.
Now from the point of view of the sl(2) generated by e_i, f_i, the vector v₊ is a maximal weight vector with weight m - 1. Hence e_i f_i^m v₊ = 0. Since [e_j, f_i] = 0 for i ≠ j, we have e_j f_i^m v₊ = 0 as well. So f_i^m v₊ ≠ 0 is a maximal weight vector with weight s_iλ - ρ. Call the highest weight module it generates M. Then from M we see that φ_{s_iλ}(z) = φ_λ(z). Hence we have proved that

(γ_H(z))(wλ) = (γ_H(z))(λ) ∀ w ∈ W

if λ is dominant integral. But two polynomials which agree on all dominant integral weights must agree everywhere. We have shown that the image of γ_H lies in S(h)^W. Furthermore, we have

z₁z₂ - γ(z₁)γ(z₂) = z₁(z₂ - γ(z₂)) + γ(z₂)(z₁ - γ(z₁)) ∈ U(g)n+.

So γ(z₁z₂) = γ(z₁)γ(z₂). This says that γ is an algebra homomorphism on Z(g), and since γ_H = σ ∘ γ where σ is an automorphism, we conclude that γ_H is a homomorphism of algebras. Equally well, we can argue directly from the fact that z ∈ Z(g) acts as multiplication by φ_λ(z) = γ_H(z)(λ) on Verm(λ - ρ) that γ_H is an algebra homomorphism.

11.1.4 Outline of proof of bijectivity.

To complete the proof of Harish-Chandra's theorem we must prove that γ_H is a bijection. For this we will introduce some intermediate spaces and homomorphisms. Let Y(g) := S(g)^g denote the subspace fixed by the adjoint representation (extended to the symmetric algebra by derivations). This is a subalgebra, and the filtration on S(g) induces a filtration on Y(g). We shall produce an isomorphism

f : Y(g) → S(h)^W.

We also have a linear space isomorphism of S(g) with U(g) given by the symmetric embedding of elements of S^k(g) into U(g); let s denote the induced map on Z(g). We shall see that s : Z(g) → Y(g) is an isomorphism. Finally, define

S_k(g) := S⁰(g) ⊕ S¹(g) ⊕ ⋯ ⊕ S^k(g)

so as to get a filtration on S(g). This induces a filtration on S(h) ⊂ S(g). We shall show that for any z ∈ U_k(g) ∩ Z(g) we have

(f ∘ s)(z) ≡ γ_H(z) mod S_{k-1}(g).

This proves inductively that γ_H is an isomorphism, since s and f are.
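Before the details, the sl(2) computations of 11.1.2 and 11.1.3 can be confirmed symbolically: γ applied to the Casimir gives (1/2)h² + h, the shift σ turns this into (1/2)h² - 1/2 (Weyl invariant), and evaluating at λ = m + 1 reproduces the scalar (1/2)m² + m by which the Casimir acts on a highest weight module of highest weight m. A sketch with sympy:

```python
import sympy as sp

h, m = sp.symbols('h m')
gamma_cas = h**2 / 2 + h                        # gamma applied to the Casimir
gamma_H = sp.expand(gamma_cas.subs(h, h - 1))   # sigma: h -> h - rho(h) = h - 1
assert gamma_H == sp.expand(h**2 / 2 - sp.Rational(1, 2))
assert gamma_H == sp.expand(gamma_H.subs(h, -h))   # invariant under W: h -> -h
# on a highest weight module of highest weight m we have lambda = m + 1, and
# (1/2)h^2 + h + 2fe acts on the highest weight vector by (1/2)m^2 + m,
# which must agree with gamma_H evaluated at lambda
assert sp.expand(gamma_H.subs(h, m + 1) - (m**2 / 2 + m)) == 0
```

For m = 2 (the adjoint representation) this gives (1/2)·4 + 2 = 4, the Casimir eigenvalue found in the matrix check of the previous section.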
Also, since σ does not change the highest order component of an element of S(h), it will be enough to prove that for z ∈ U_k(g) ∩ Z(g) we have

(f ∘ s)(z) ≡ γ(z) mod S_{k-1}(g). (11.2)

We now proceed to the details.

11.1.5 Restriction from S(g*)^g to S(h*)^W.

We first discuss polynomials on g, that is, elements of S(g*). Let τ be a finite dimensional representation of g, and consider the symmetric function F of degree k on g given by

F(X₁, ..., X_k) = Σ_σ tr (τ(X_{σ(1)}) ⋯ τ(X_{σ(k)})),

where the sum is over all permutations σ. For any Y ∈ g, by definition,

(Y · F)(X₁, ..., X_k) = F([Y, X₁], X₂, ..., X_k) + ⋯ + F(X₁, ..., X_{k-1}, [Y, X_k]).

Applied to the summand F(X₁, ..., X_k) = tr τ(X₁) ⋯ τ(X_k) this gives the telescoping sum

tr τ([Y, X₁])τ(X₂) ⋯ τ(X_k) + tr τ(X₁)τ([Y, X₂]) ⋯ τ(X_k) + ⋯ = tr τ(Y)τ(X₁) ⋯ τ(X_k) - tr τ(X₁) ⋯ τ(X_k)τ(Y) = 0.

In other words, the function X ↦ tr τ(X)^k belongs to S(g*)^g. Now since h is a subspace of g, the restriction map induces a homomorphism

r : S(g*) → S(h*).

If F ∈ S(g*)^g then, as a function on g, it is invariant under the automorphisms

θ_i := (exp ad e_i)(exp ad(-f_i))(exp ad e_i),

which implement the simple reflections on h, and hence

r : S(g*)^g → S(h*)^W.

If F ∈ S(g*)^g is such that its restriction to h vanishes, then its value at any element which is conjugate to an element of h (under E(g), the subgroup of automorphisms of g generated by such exponentials) must also vanish. But these include a dense set in g, so F, being continuous, must vanish everywhere. So the restriction of r to S(g*)^g is injective. To prove that it is surjective, it is enough to prove that S(h*)^W is spanned by all functions of the form X ↦ tr τ(X)^k, as τ ranges over all finite dimensional representations and k ranges over all non-negative integers. Now the powers of any set of spanning elements of h* span S(h*). So we can write any element of S(h*)^W as a linear combination of the averages A(λ^k), where A denotes averaging over W. So it is enough to show that for any dominant weight λ we can express A(λ^k) in terms of the tr τ^k.
Let E_λ denote the (finite) set of all dominant weights strictly below λ. Let τ denote the finite dimensional representation with highest weight λ. Then tr τ(X)^k differs from a non-zero multiple of A(λ^k)(X) by a linear combination of the A(μ^k)(X) with μ ∈ E_λ. Hence by induction on the finite set E_λ we get the desired result. In short, we have proved that

r : S(g*)^g → S(h*)^W is bijective.

11.1.6 From S(g)^g to S(h)^W.

Now we transfer all this information from S(g*) to S(g): use the Killing form to identify g with g* and hence get an isomorphism α : S(g) → S(g*). Similarly, let β : S(h) → S(h*) be the isomorphism induced by the restriction of the Killing form to h, which we know to be non-degenerate. Notice that β commutes with the action of the Weyl group. We can write

S(g) = S(h) ⊕ J,

where J is the ideal in S(g) generated by n+ and n-. Let

j : S(g) → S(h)

denote the homomorphism obtained by quotienting out by this ideal. We claim that the diagram

S(g) --α--> S(g*)
 |j           |r
S(h) --β--> S(h*)

commutes. Indeed, since all maps are algebra homomorphisms, it is enough to check this on generators, that is on elements of g. If X ∈ g, then for h ∈ h

⟨h, r(α(X))⟩ = ⟨h, α(X)⟩ = (h, X),

where the scalar product on the right is the Killing form. But since h is orthogonal under the Killing form to n+ + n-, we have (h, X) = (h, jX) = ⟨h, β(jX)⟩. QED

Upon restriction to the g- and W-invariants, we have proved that the right hand column is a bijection, and hence so is the left hand column, since β is a W-module morphism. Recalling that we have defined Y(g) := S(g)^g, we have shown that the restriction of j to Y(g) is an isomorphism; call it f:

f : Y(g) → S(h)^W.

11.1.7 Completion of the proof.

Now we have a canonical linear bijection of S(g) with U(g) which maps

X_{a₁} ⋯ X_{a_r} ↦ (1/r!) Σ_{σ∈Σ_r} X_{a_{σ(1)}} ⋯ X_{a_{σ(r)}},

where the multiplication on the left is in S(g) and the multiplication on the right is in U(g), and where Σ_r denotes the permutation group on r letters. This map is a g-module morphism.
In particular this map induces a bijection

s : Z(g) → Y(g).

Our proof will be complete once we prove (11.2). This is a calculation: write

u_{AJB} := f^A h^J e^B

for our usual monomial basis, where the multiplication on the right is in the universal enveloping algebra. Let us also write

p_{AJB} := f^A h^J e^B = f₁^{a₁} ⋯ f_m^{a_m} h₁^{j₁} ⋯ h_ℓ^{j_ℓ} e₁^{b₁} ⋯ e_m^{b_m} ∈ S(g),

where now the powers and multiplication are in S(g). The image of u_{AJB} under the canonical isomorphism of U(g) with S(g) will not be p_{AJB} in general, but will differ from p_{AJB} by a term of lower filtration degree. Now the projection γ : U(g) → U(h) coming from the decomposition

U(g) = U(h) ⊕ (n-U(g) + U(g)n+)

sends u_{AJB} ↦ 0 unless A = 0 = B and is the identity on u_{0J0}. Similarly, j(p_{AJB}) = 0 unless A = 0 = B, and

j(p_{0J0}) = p_{0J0} = h^J.

These two facts complete the proof of (11.2). QED

11.2 Chevalley's theorem.

Harish-Chandra's theorem says that the center of the universal enveloping algebra of a semi-simple Lie algebra is isomorphic to the ring of Weyl group invariants in the polynomial algebra S(h). Chevalley's theorem asserts that this ring is in fact a polynomial ring in ℓ generators, where ℓ = dim h. To prove Chevalley's theorem we need to call on some facts from field theory and from the representation theory of finite groups.

11.2.1 Transcendence degrees.

A field extension L : K is finitely generated if there are elements α₁, ..., α_n of L so that L = K(α₁, ..., α_n). In other words, every element of L can be written as a rational expression in the α₁, ..., α_n. Elements t₁, ..., t_k of L are called (algebraically) independent (over K) if there is no non-trivial polynomial p with coefficients in K such that p(t₁, ..., t_k) = 0.

Lemma 15 If L : K is finitely generated, then there exists an intermediate field M such that M = K(α₁, ..., α_r), where the α₁, ..., α_r are independent transcendental elements and L : M is a finite extension (i.e. L has finite dimension over M as a vector space).

Proof.
We are assuming that L = K(β₁, ..., β_q). If all the β_i are algebraic, then L : K is a finite extension. Otherwise one of the β_i is transcendental. Call this α₁. If L : K(α₁) is a finite extension we are done. Otherwise one of the remaining β_i is transcendental over K(α₁). Call it α₂. So α₁, α₂ are independent. Proceed. QED

Lemma 16 If there is another collection γ₁, ..., γ_s so that L : K(γ₁, ..., γ_s) is finite, then r = s. This common number is called the transcendence degree of L over K.

Proof. If s = 0, then every element of L is algebraic, contradicting the assumption that the α₁, ..., α_r are independent, unless r = 0. So we may assume that s > 0. Since L : M is finite, there is a polynomial p such that p(γ₁, α₁, ..., α_r) = 0. This polynomial must contain at least one α_i, since γ₁ is transcendental. Renumber if necessary so that α₁ occurs in p. Then α₁ is algebraic over K(γ₁, α₂, ..., α_r) and L : K(γ₁, α₂, ..., α_r) is finite. Continuing this way, we can successively replace α_i by γ_i until we conclude that L : K(γ₁, ..., γ_r) is finite. If s > r this would make γ_{r+1} algebraic over K(γ₁, ..., γ_r), so the γ_i would not be algebraically independent. Hence s ≤ r, and by symmetry r ≤ s. QED

11.2.2 Symmetric functions.

Let t₁, ..., t_n be independent transcendental elements over K and let s₁, ..., s_n be the elementary symmetric functions of the t's, so that K(s₁, ..., s_n) is contained in the fixed field of the action of the symmetric group on L = K(t₁, ..., t_n). We claim that

dim K(t₁, ..., t_n) : K(s₁, ..., s_n) ≤ n!.

Letting s'₁, ..., s'_{n-1} denote the elementary symmetric functions in the n - 1 variables t₁, ..., t_{n-1}, we have

s_j = t_n s'_{j-1} + s'_j,

so K(s₁, ..., s_n, t_n) = K(t_n, s'₁, ..., s'_{n-1}). By induction we may assume that

dim K(t₁, ..., t_n) : K(s₁, ..., s_n, t_n) = dim K(t_n)(t₁, ..., t_{n-1}) : K(t_n)(s'₁, ..., s'_{n-1}) ≤ (n - 1)!,

and since t_n is a root of the degree n polynomial X^n - s₁X^{n-1} + ⋯ ± s_n over K(s₁, ..., s_n), this proves that

dim K(t₁, ..., t_n) : K(s₁, ..., s_n) ≤ n!.

A fundamental theorem of Galois theory says:

Theorem 20 Let G be a finite subgroup of the group of automorphisms of the field L over the field K, and let F be the fixed field. Then dim[L : F] = #G.

This theorem, whose proof we will recall in the next section, then completes the proof of the proposition: the fixed field F of the symmetric group contains K(s₁, ..., s_n), and n! = dim L : F ≤ dim L : K(s₁, ..., s_n) ≤ n! forces F = K(s₁, ..., s_n). The proposition implies that every symmetric rational function is a rational function of the elementary symmetric functions. In fact, every symmetric polynomial is a polynomial in the elementary symmetric functions, giving a stronger result.
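The stronger result has a constructive proof, given next: repeatedly subtract off a monomial in the elementary symmetric functions matching the leading monomial. That procedure can be run directly; in this sketch (all helper names mine) a polynomial is a dict mapping exponent tuples to coefficients, so that Python's lexicographic tuple ordering is exactly the monomial order used in the proof:

```python
from itertools import combinations

def p_mul(p, q):
    out = {}
    for m1, c1 in p.items():
        for m2, c2 in q.items():
            m = tuple(a + b for a, b in zip(m1, m2))
            out[m] = out.get(m, 0) + c1 * c2
    return {m: c for m, c in out.items() if c}

def p_sub(p, q):
    out = dict(p)
    for m, c in q.items():
        out[m] = out.get(m, 0) - c
    return {m: c for m, c in out.items() if c}

def elem_sym(k, n):
    # elementary symmetric polynomial e_k in n variables
    return {tuple(1 if i in idx else 0 for i in range(n)): 1
            for idx in combinations(range(n), k)}

def in_elementaries(p, n):
    # express a SYMMETRIC polynomial p as a polynomial in e_1, ..., e_n
    # by the leading-monomial reduction (may loop on non-symmetric input)
    e = [None] + [elem_sym(k, n) for k in range(1, n + 1)]
    result = {}
    while p:
        lead = max(p)               # leading monomial, lexicographic order
        a = p[lead]
        # exponents d_k = i_k - i_{k+1} (d_n = i_n) of the correction term
        d = tuple(lead[k] - (lead[k + 1] if k + 1 < n else 0) for k in range(n))
        result[d] = result.get(d, 0) + a
        q = {(0,) * n: a}
        for k in range(1, n + 1):
            for _ in range(d[k - 1]):
                q = p_mul(q, e[k])
        p = p_sub(p, q)             # strictly smaller leading monomial
    return result

# x^2 + y^2 + z^2 = e_1^2 - 2 e_2
p = {(2, 0, 0): 1, (0, 2, 0): 1, (0, 0, 2): 1}
assert in_elementaries(p, 3) == {(2, 0, 0): 1, (0, 1, 0): -2}
```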
This is proved as follows: put the lexicographic order on the set of n-tuples of integers, and therefore on the set of monomials; so x₁^{i₁} ⋯ x_n^{i_n} is greater than x₁^{j₁} ⋯ x_n^{j_n} in this ordering if i₁ > j₁, or i₁ = j₁ and i₂ > j₂, or etc. Any polynomial has a "leading monomial", the greatest monomial relative to this lexicographic order. The leading monomial of a product of polynomials is the product of their leading monomials. We shall prove our contention by induction on the order of the leading monomial. Notice that if p is a symmetric polynomial, then the exponents i₁, ..., i_n of its leading term must satisfy

i₁ ≥ i₂ ≥ ⋯ ≥ i_n,

for otherwise the monomial obtained by switching two adjacent exponents (which occurs with the same coefficient in the symmetric polynomial p) would be strictly higher in our lexicographic order. Suppose that the coefficient of this leading monomial is a. Then

q := a s₁^{i₁-i₂} s₂^{i₂-i₃} ⋯ s_{n-1}^{i_{n-1}-i_n} s_n^{i_n}

has the same leading monomial with the same coefficient. Hence p - q has a smaller leading monomial. QED

11.2.3 Fixed fields.

We now turn to the proof of the theorem of the previous section.

Lemma 17 Any set of distinct monomorphisms of a field K into a field L is linearly independent over L.

Let λ₁, ..., λ_n be distinct monomorphisms K → L. The assertion is that there cannot exist a₁, ..., a_n ∈ L, not all zero, such that

a₁λ₁(x) + ⋯ + a_nλ_n(x) = 0 ∀ x ∈ K.

Assume the contrary, so that such an equation holds, and we may assume that none of the a_i = 0. Looking at all such possible equations, we may pick one which involves the fewest number of terms, and we may assume that this is the equation we are studying. In other words, no such equation holds with fewer terms. Since λ₁ ≠ λ_n, there exists a y ∈ K such that λ₁(y) ≠ λ_n(y), and in particular y ≠ 0. Substituting yx for x gives

a₁λ₁(yx) + ⋯ + a_nλ_n(yx) = 0,

so a₁λ₁(y)λ₁(x) + ⋯
$+\, a_n \lambda_n(y)\lambda_n(x) = 0$, and multiplying our original equation by $\lambda_1(y)$ and subtracting gives
$$a_2\bigl(\lambda_1(y) - \lambda_2(y)\bigr)\lambda_2(x) + \cdots + a_n\bigl(\lambda_1(y) - \lambda_n(y)\bigr)\lambda_n(x) = 0,$$
which is a non-trivial equation (the last coefficient is non-zero) with fewer terms. Contradiction. QED

Proof of Theorem 20. Let $n = \#G$, and let the elements of $G$ be $g_1 = 1, \ldots, g_n$. Suppose first that $\dim L : F = m < n$. Let $x_1, \ldots, x_m$ be a basis of $L$ over $F$. The system of equations
$$g_1(x_j) y_1 + \cdots + g_n(x_j) y_n = 0, \quad j = 1, \ldots, m,$$
has more unknowns than equations, and so we can find $y_1, \ldots, y_n$, not all zero, solving these equations. Any $b \in L$ can be expanded as $b = b_1 x_1 + \cdots + b_m x_m$ with $b_j \in F$, and so
$$g_1(b) y_1 + \cdots + g_n(b) y_n = \sum_j b_j \bigl[ g_1(x_j) y_1 + \cdots + g_n(x_j) y_n \bigr] = 0,$$
showing that the monomorphisms $g_i$ are linearly dependent. This contradicts the lemma.

Suppose next that $\dim L : F > n$. Let $x_1, \ldots, x_{n+1}$ be linearly independent over $F$, and find $y_1, \ldots, y_{n+1} \in L$, not all zero, solving the $n$ equations
$$g_j(x_1) y_1 + \cdots + g_j(x_{n+1}) y_{n+1} = 0, \quad j = 1, \ldots, n.$$

11.2. CHEVALLEY'S THEOREM.

Choose a solution with the fewest possible non-zero $y$'s and relabel so that the first $r$ are the non-vanishing ones, so the equations now read
$$g_j(x_1) y_1 + \cdots + g_j(x_r) y_r = 0, \quad j = 1, \ldots, n,$$
and no such equations hold with fewer than $r$ $y$'s. Applying $g \in G$ to the preceding equation gives
$$g g_j(x_1)\, g(y_1) + \cdots + g g_j(x_r)\, g(y_r) = 0.$$
But $g g_j$ runs over all the elements of $G$, and so $g(y_1), \ldots, g(y_r)$ is also a solution of our original equations. In other words we have
$$g_j(x_1) y_1 + \cdots + g_j(x_r) y_r = 0 \quad \text{and} \quad g_j(x_1) g(y_1) + \cdots + g_j(x_r) g(y_r) = 0.$$
Multiplying the first equations by $g(y_1)$, the second by $y_1$, and subtracting gives
$$g_j(x_2)\bigl[ y_2 g(y_1) - g(y_2) y_1 \bigr] + \cdots + g_j(x_r)\bigl[ y_r g(y_1) - g(y_r) y_1 \bigr] = 0,$$
a system with fewer $y$'s. This cannot happen unless all the coefficients vanish, i.e.
$$y_i g(y_1) = g(y_i) y_1, \quad \text{or} \quad y_i y_1^{-1} = g\bigl(y_i y_1^{-1}\bigr) \quad \forall g \in G.$$
This means that $y_i y_1^{-1} \in F$. Setting $z_i := y_i / y_1$ and $k := y_1$, the first of our system of equations (the one with $g_j = 1$) becomes
$$z_1 k\, x_1 + \cdots + z_r k\, x_r = 0.$$
Dividing by $k$ gives a linear relation over $F$ among $x_1, \ldots, x_r$, contradicting the assumption that they are independent. QED
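Lemma 17 and Theorem 20 can be sanity-checked in the smallest non-trivial example, $L = \mathbb{Q}(\sqrt{2})$, $G = \{1, \sigma\}$ with $\sigma(\sqrt{2}) = -\sqrt{2}$, and fixed field $F = \mathbb{Q}$. A minimal sketch, assuming Python with sympy; the matrix `M` recording the values of the two monomorphisms on a basis is our own device for exhibiting linear independence, not taken from the text.

```python
from sympy import sqrt, Matrix

# L = Q(sqrt(2)); its automorphisms over Q are the identity and
# sigma: a + b*sqrt(2) -> a - b*sqrt(2).
r = sqrt(2)

# Lemma 17: id and sigma are linearly independent over L. A relation
# a1*id(x) + a2*sigma(x) = 0 for all x would in particular hold on the
# basis {1, sqrt(2)}, making the matrix of values below singular.
M = Matrix([[1, 1],        # id(1),        sigma(1)
            [r, -r]])      # id(sqrt(2)),  sigma(sqrt(2))
print(M.det())             # -> -2*sqrt(2), non-zero: no such relation

# Theorem 20 here: the fixed field of G = {id, sigma} is Q, and
# dim L : Q = 2 = #G, realized by the basis {1, sqrt(2)}.
```

The non-vanishing determinant is exactly the two-element case of Lemma 17, and $\dim L : F = 2 = \#G$ matches Theorem 20.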
11.2.4 Invariants of finite groups.

Let $G$ be a finite group acting on a vector space $V$. It then acts on the symmetric algebra $S(V^*)$, which is the same as the algebra of polynomial functions on $V$, by
$$(gf)(v) = f(g^{-1} v).$$
Let $R = S(V^*)^G$ be the ring of invariants. Let $S = S(V^*)$ and let $L$ be the field of quotients of $S$, so that $L = K(t_1, \ldots, t_n)$ where $n = \dim V$. From the theorem on fixed fields, we know that the dimension of $L$ as an extension of $L^G$ is equal to the number of elements in $G$, in particular finite. So $L^G$ has transcendence degree $n$ over the ground field.

Clearly the field of fractions of $R$ is contained in $L^G$. We claim that they coincide. Indeed, suppose that $p, q \in S$ with $p/q \in L^G$. Multiply the numerator and denominator by $\prod gp$, the product taken over all $g \in G$, $g \ne 1$. The new numerator, $\prod_{g \in G} gp$, is $G$-invariant; since the quotient is invariant, so is the new denominator, and we have expressed $p/q$ as the quotient of two elements of $R$.

If the finite group $G$ acts on a vector space $E$, then averaging over the group, i.e. the map
$$f \mapsto f^\natural := \frac{1}{\#G} \sum_{g \in G} gf,$$
is a projection $A$ onto the subspace $E^G$ of invariant elements. In particular, if $E$ is finite dimensional,
$$\dim E^G = \operatorname{tr} A. \tag{11.3}$$
We may apply the averaging operator to our (infinite dimensional) situation where $S = S(V^*)$ and $R = S^G$, in which case we have the additional obvious fact that
$$(pq)^\natural = p^\natural q \quad \forall p \in S,\ q \in R.$$

Let $R_+ \subset R$ denote the set of elements of $R$ with constant term zero, and let $I := S R_+$, so that $I$ is an ideal in $S$. By the Hilbert basis theorem (whose proof we recall in the next section) the ideal $I$ is finitely generated, and hence, from any set of generators, we may choose a finite set of generators.

Theorem 21 Let $f_1, \ldots, f_r$ be homogeneous elements of $R_+$ which generate $I$ as an ideal of $S$. Then $f_1, \ldots, f_r$ together with $1$ generate $R$ as a $K$-algebra. In particular, $R$ is a finitely generated $K$-algebra.

Proof. We must show that any $f \in R$ can be expressed as a polynomial in the
$f_1, \ldots, f_r$; since every $f$ is a sum of its homogeneous components, it is enough to do this for homogeneous $f$, and we proceed by induction on its degree. The statement is obvious for degree zero. For positive degree, $f \in R_+ \subset I$, so
$$f = s_1 f_1 + \cdots + s_r f_r, \quad s_i \in S,$$
and since $f, f_1, \ldots, f_r$ are homogeneous, we may assume each $s_i$ is homogeneous of degree $\deg f - \deg f_i$, since all other contributions must cancel. Now apply the averaging operator to get
$$f = s_1^\natural f_1 + \cdots + s_r^\natural f_r.$$
The $s_i^\natural$ lie in $R$ and have lower homogeneous degree than $f$, and hence can be expressed as polynomials in $f_1, \ldots, f_r$. Hence so can $f$. QED

11.2.5 The Hilbert basis theorem.

A commutative ring is called Noetherian if any of the following equivalent conditions holds:

1. If $I_1 \subset I_2 \subset \cdots$ is an ascending chain of ideals, then there is a $k$ such that $I_k = I_{k+1} = I_{k+2} = \cdots$.

2. Every non-empty set of ideals has a maximal element with respect to inclusion.

3. Every ideal is finitely generated.

The Hilbert basis theorem asserts that if $R$ is a Noetherian ring, then so is the polynomial ring $R[X]$. In particular, all ideals in $K[X_1, \ldots, X_n]$ are finitely generated.

Let $I$ be an ideal in $R[X]$, and for any non-negative integer $k$ let $L_k(I) \subset R$ be defined by
$$L_k(I) := \Bigl\{ a_k \in R \;\Bigm|\; \exists\, a_{k-1}, \ldots, a_0 \in R \text{ with } \sum_{j=0}^{k} a_j X^j \in I \Bigr\}.$$
For each $k$, $L_k(I)$ is an ideal in $R$. Multiplying by $X$ shows that $L_k(I) \subset L_{k+1}(I)$; since $R$ is Noetherian, these ideals stabilize. If $I \subset J$ and $L_k(I) = L_k(J)$ for all $k$, we claim that this implies that $I = J$. Indeed, suppose not, and choose a polynomial of smallest degree belonging to $J$ but not to $I$; say this degree is $k$. Its leading coefficient belongs to $L_k(J) = L_k(I)$, so some polynomial of degree $k$ in $I$ has the same leading coefficient; subtracting it, we could find a polynomial of smaller degree belonging to $J$ and not to $I$. Contradiction.

Proof of the Hilbert basis theorem. Let $I_0 \subset I_1 \subset \cdots$ be an ascending chain of ideals in $R[X]$. Consider the set of ideals $L_p(I_q)$; since $R$ is Noetherian, we can choose a maximal member, say $L_p(I_q)$. So for $k \ge p$ we have $L_k(I_j) = L_k(I_q)$ for all $j \ge q$. For each of the finitely many values
$i = 0, \ldots, p - 1$, the ascending chain $L_i(I_0) \subset L_i(I_1) \subset \cdots$ stabilizes. So we can find $r$ large enough (bigger than $q$ and than the finitely many indices needed to stabilize the various chains) so that
$$L_i(I_j) = L_i(I_r) \quad \forall j \ge r,\ \forall i.$$
By the claim above, this shows that $I_j = I_r$ for all $j \ge r$. QED

11.2.6 Proof of Chevalley's theorem.

This says that if $K = \mathbb{R}$ and $W$ is a finite subgroup of $O(V)$ generated by reflections, then its ring of invariants is a polynomial ring in $n$ generators, where $n = \dim V$. Without loss of generality we may assume that $W$ acts effectively, i.e. that no non-zero vector is fixed by all of $W$. Let $f_1, \ldots, f_r$ be a minimal set of homogeneous generators of $R$. Suppose we could prove that they are algebraically independent. Since the transcendence degree of the quotient field of $R$ is $n = \dim V$, we could conclude that $r = n$. So the whole point is to prove that a minimal set of homogeneous generators must be algebraically independent: that there cannot exist a non-zero polynomial $h = h(y_1, \ldots, y_r)$ so that
$$h(f_1, \ldots, f_r) = 0. \tag{11.4}$$
So we want to get a smaller set of generators assuming that such a relation exists.

Let $d_1 := \deg f_1, \ldots, d_r := \deg f_r$. For any non-zero monomial $a\, y_1^{e_1} \cdots y_r^{e_r}$ occurring in $h$, the term $a f_1^{e_1} \cdots f_r^{e_r}$ we get by substituting $f$'s for $y$'s has degree $e_1 d_1 + \cdots + e_r d_r$. Since the homogeneous components of $h(f_1, \ldots, f_r)$ must vanish separately, we may fix a value $d$ and throw away all monomials in $h$ which do not satisfy
$$d = e_1 d_1 + \cdots + e_r d_r.$$
Now set
$$h_i := \frac{\partial h}{\partial y_i}(f_1, \ldots, f_r),$$
so that $h_i \in R$ is homogeneous of degree $d - d_i$, and let $J$ be the ideal in $R$ generated by the $h_i$. Renumber $f_1, \ldots, f_r$ so that $h_1, \ldots, h_m$ is a minimal generating set for $J$. This means that
$$h_i = \sum_{j=1}^{m} g_{ij} h_j, \quad g_{ij} \in R,$$
for $i > m$ (if $m < r$; if $m = r$ we have no such equations). Once again, since the $h_i$ are homogeneous of degree $d - d_i$, we may assume that each $g_{ij}$ is homogeneous of degree $d_j - d_i$ by throwing away extraneous terms. Now let us differentiate the equation (11.4) with respect to $x_k$, $k = 1, \ldots, n$, to obtain
$$\sum_{i=1}^{r} h_i \frac{\partial f_i}{\partial x_k} = 0, \quad k = 1, \ldots, n,$$
and substitute the above expressions for $h_i$, $i > m$, to get
$$\sum_{i=1}^{m} h_i \left[ \frac{\partial f_i}{\partial x_k} + \sum_{j=m+1}^{r} g_{ji} \frac{\partial f_j}{\partial x_k} \right] = 0, \quad k = 1, \ldots, n.$$
For each fixed $k$, set
$$p_i := \frac{\partial f_i}{\partial x_k} + \sum_{j=m+1}^{r} g_{ji} \frac{\partial f_j}{\partial x_k}, \quad i = 1, \ldots, m,$$
so that each $p_i$ is homogeneous with $\deg p_i = d_i - 1$, and we have the equation
$$h_1 p_1 + \cdots + h_m p_m = 0. \tag{11.5}$$
We will prove that this implies that
$$p_1 \in I. \tag{11.6}$$
Assuming this for the moment, (11.6) means that
$$\frac{\partial f_1}{\partial x_k} + \sum_{j=m+1}^{r} g_{j1} \frac{\partial f_j}{\partial x_k} = \sum_{i=1}^{r} q_{ik} f_i, \quad k = 1, \ldots, n,$$
where $q_{ik} \in S$. Multiply these equations by $x_k$, sum over $k$, and apply Euler's formula for homogeneous polynomials,
$$\sum_k x_k \frac{\partial f}{\partial x_k} = (\deg f)\, f.$$
We get
$$d_1 f_1 + \sum_{j=m+1}^{r} d_j\, g_{j1} f_j = \sum_{i=1}^{r} f_i r_i, \quad r_i := \sum_k x_k\, q_{ik},$$
with $\deg r_i > 0$ if $r_i \ne 0$, since every term of $r_i$ contains some $x_k$. Once again, the left hand side is homogeneous of degree $d_1$, so we can throw away all terms on the right which are not of this degree because of cancellation. But since $r_1$ has no constant term, this means that we throw away the term involving $f_1$, and we have expressed $f_1$ in terms of $f_2, \ldots, f_r$, contradicting our choice of $f_1, \ldots, f_r$ as a minimal generating set.

So the proof of Chevalley's theorem is reduced to proving that (11.5) implies (11.6), and for this we must use the fact that $W$ is generated by reflections, which we have not yet used. The desired implication is a consequence of the following

Proposition 33 Let $h_1, \ldots, h_m \in R$ be homogeneous, with $h_1$ not in the ideal of $R$ generated by $h_2, \ldots, h_m$. Suppose that (11.5) holds with homogeneous elements $p_i \in S$. Then (11.6) holds.

Notice that $h_1$ then cannot lie in the ideal of $S$ generated by $h_2, \ldots, h_m$, because we could apply the averaging operator to an equation
$$h_1 = k_2 h_2 + \cdots + k_m h_m, \quad k_i \in S,$$
to arrange that the same equation holds with each $k_i$ replaced by $k_i^\natural \in R$.

We prove the proposition by induction on the degree of $p_1$. This degree must be positive, since $p_1$ equal to a non-zero constant would put $h_1$ in the ideal generated by the remaining $h_i$ (and $p_1 = 0$ lies in $I$ trivially). Let $s$ be a reflection in $W$ and $H$ its hyperplane of fixed vectors. Then $sp_i - p_i = 0$ on $H$. Let $\ell$ be a non-zero linear function whose zero set is this hyperplane.
With no loss of generality, we may assume that the last variable $x_n$ occurs with non-zero coefficient in $\ell$, relative to some choice of orthogonal coordinates. In fact, by a rotation, we can arrange (temporarily) that $\ell = x_n$. Expanding out the polynomial $sp_i - p_i$ in powers of the (rotated) variables, we see that $sp_i - p_i$ can have no terms which are powers of $x_1, \ldots, x_{n-1}$ alone, since it vanishes on the hyperplane $x_n = 0$. Put invariantly,
$$sp_i - p_i = \ell\, r_i,$$
where $r_i$ is homogeneous of degree one less than that of $p_i$. Apply $s$ to equation (11.5) and subtract, using $sh_i = h_i$, to get
$$\ell\,(h_1 r_1 + \cdots + h_m r_m) = 0.$$
Since $\ell \ne 0$ we may divide by $\ell$ to get an equation of the form (11.5) with $p_1$ replaced by $r_1$, which has lower degree. So $r_1 \in I$ by induction, and hence $sp_1 - p_1 \in I$ for every reflection $s \in W$. Now $W$ stabilizes $R_+$ and hence $I$, and since $W$ is generated by reflections, it follows that each $w \in W$ acts trivially on the image of $p_1$ in the quotient space $S/I$. Thus
$$p_1^\natural \equiv p_1 \mod I.$$
But $p_1^\natural$ is invariant and homogeneous of positive degree, so $p_1^\natural \in R_+ \subset I$. Hence $p_1 \in I$. QED
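To see Chevalley's theorem and the averaging operator at work, consider the symmetry group of the square, a two-dimensional reflection group of order 8. A minimal sketch, assuming Python with sympy; the choice of basic invariants $x^2 + y^2$ and $x^2 y^2$ (degrees 2 and 4, with $2 \cdot 4 = 8 = \#W$) is the standard one for this group, not taken from the text.

```python
from sympy import symbols, expand, Rational

x, y = symbols('x y')

# W: the 8 symmetries of the square in O(2), generated by the
# reflections (x, y) -> (y, x) and (x, y) -> (-x, y).
W = [(x, y), (-x, y), (x, -y), (-x, -y),
     (y, x), (-y, x), (y, -x), (-y, -x)]

def act(f, g):
    """Apply the group element g = (u, v) to the polynomial f."""
    return f.subs({x: g[0], y: g[1]}, simultaneous=True)

# Chevalley: the invariant ring is a polynomial ring on n = dim V = 2
# homogeneous generators; here f1, f2 of degrees 2 and 4.
f1 = x**2 + y**2
f2 = x**2 * y**2
for g in W:
    assert expand(act(f1, g) - f1) == 0
    assert expand(act(f2, g) - f2) == 0

# The averaging operator projects onto the invariants: the average of
# x^4 over W is (x^4 + y^4)/2, which is (f1^2 - 2*f2)/2 as a
# polynomial in f1, f2.
avg = sum(act(x**4, g) for g in W) / len(W)
print(expand(avg - Rational(1, 2) * (f1**2 - 2*f2)))   # -> 0
```

That the product of the degrees of the basic invariants equals the group order is a general fact about finite reflection groups, visible here as $2 \cdot 4 = 8$.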