Concepts in Calculus III
Multivariable Calculus

Sergei Shabanov
University of Florida Department of Mathematics

Orange Grove Texts Plus

UNIVERSITY PRESS OF FLORIDA
Florida A&M University, Tallahassee
Florida Atlantic University, Boca Raton
Florida Gulf Coast University, Ft. Myers
Florida International University, Miami
Florida State University, Tallahassee
New College of Florida, Sarasota
University of Central Florida, Orlando
University of Florida, Gainesville
University of North Florida, Jacksonville
University of South Florida, Tampa
University of West Florida, Pensacola

Copyright 2012 by the University of Florida Board of Trustees on behalf of the University of Florida Department of Mathematics.

This work is licensed under a modified Creative Commons Attribution-Noncommercial-No Derivative Works 3.0 Unported License. To view a copy of this license, visit http://creativecommons.org/licenses/by-nc-nd/3.0/. You are free to electronically copy, distribute, and transmit this work if you attribute authorship. However, all printing rights are reserved by the University Press of Florida (http://www.upf.com). Please contact UPF for information about how to obtain copies of the work for print distribution. You must attribute the work in the manner specified by the author or licensor (but not in any way that suggests that they endorse you or your use of the work). For any reuse or distribution, you must make clear to others the license terms of this work. Any of the above conditions can be waived if you get permission from the University Press of Florida. Nothing in this license impairs or restricts the author's moral rights.
ISBN 978-1-61610-162-6

Orange Grove Texts Plus is an imprint of the University Press of Florida, which is the scholarly publishing agency for the State University System of Florida, comprising Florida A&M University, Florida Atlantic University, Florida Gulf Coast University, Florida International University, Florida State University, New College of Florida, University of Central Florida, University of Florida, University of North Florida, University of South Florida, and University of West Florida.

University Press of Florida
15 Northwest 15th Street
Gainesville, FL 32611-2079
http://www.upf.com

Contents

Chapter 11. Vectors and the Space Geometry
71. Rectangular Coordinates in Space
72. Vectors in Space
73. The Dot Product
74. The Cross Product
75. The Triple Product
76. Planes in Space
77. Lines in Space
78. Quadric Surfaces

Chapter 12. Vector Functions
79. Curves in Space and Vector Functions
80. Differentiation of Vector Functions
81. Integration of Vector Functions
82. Arc Length of a Curve
83. Curvature of a Space Curve
84. Applications to Mechanics and Geometry

Chapter 13. Differentiation of Multivariable Functions
85. Functions of Several Variables
86. Limits and Continuity
87. A General Strategy for Studying Limits
88. Partial Derivatives
89. Higher-Order Partial Derivatives
90. Linearization of Multivariable Functions
91. Chain Rules and Implicit Differentiation
92. The Differential and Taylor Polynomials
93. Directional Derivative and the Gradient
94. Maximum and Minimum Values
95. Maximum and Minimum Values (Continued)
96. Lagrange Multipliers

Chapter 14. Multiple Integrals
97. Double Integrals
98. Properties of the Double Integral
99. Iterated Integrals
100. Double Integrals Over General Regions
101. Double Integrals in Polar Coordinates
102. Change of Variables in Double Integrals
103. Triple Integrals
104. Triple Integrals in Cylindrical and Spherical Coordinates
105. Change of Variables in Triple Integrals
106. Improper Multiple Integrals
107. Line Integrals
108. Surface Integrals
109. Moments of Inertia and Center of Mass

Chapter 15. Vector Calculus
110. Line Integrals of a Vector Field
111. Fundamental Theorem for Line Integrals
112. Green's Theorem
113. Flux of a Vector Field
114. Stokes' Theorem
115. Gauss-Ostrogradsky (Divergence) Theorem

Acknowledgments

Chapter and section numbering continues from the previous volume in the series, Concepts in Calculus II.

CHAPTER 11
Vectors and the Space Geometry

Our space may be viewed as a collection of points. Every geometrical figure, such as a sphere, plane, or line, is a special subset of points in space. The main purpose of an algebraic description of various objects in space is to develop a systematic representation of these objects by numbers. Interestingly enough, our experience shows that so far real numbers and basic rules of their algebra appear to be sufficient to describe all fundamental laws of nature, model everyday phenomena, and even predict many of them. The evolution of the Universe, forces binding particles in atomic nuclei, and atomic nuclei and electrons forming atoms and molecules, star and planet formation, chemistry, DNA structures, and so on, all can be formulated as relations between quantities that are measured and expressed as real numbers. Perhaps, this is the most intriguing property of the Universe, which makes mathematics the main tool of our understanding of the Universe. The deeper our understanding of nature becomes, the more sophisticated are the mathematical concepts required to formulate the laws of nature. But they remain based on real numbers.
In this course, basic mathematical concepts needed to describe various phenomena in a three-dimensional Euclidean space are studied. The very fact that the space in which we live is a three-dimensional Euclidean space should not be viewed as an absolute truth. All one can say is that this mathematical model of the physical space is sufficient to describe a rather large set of physical phenomena in everyday life. As a matter of fact, this model fails to describe phenomena on a large scale (e.g., our galaxy). It might also fail at tiny scales, but this has yet to be verified by experiments.

71. Rectangular Coordinates in Space

The elementary object in space is a point. So the discussion should begin with the question: How can one describe a point in space by real numbers? The following procedure can be adopted. Select a particular point in space called the origin and usually denoted O. Set up three mutually perpendicular lines through the origin. A real number is associated with every point on each line in the following way. The origin corresponds to 0. Distances from the origin to points on one side of the line are denoted by positive real numbers, while distances to points on the other half of the line are denoted by negative numbers (the absolute value of a negative number is the distance). The half-lines with the grid of positive numbers will be indicated by arrows pointing from the origin to distinguish them from the half-lines with the grid of negative numbers. The described system of lines with the grid of real numbers on them is called a rectangular coordinate system at the origin O. The lines with the constructed grid of real numbers are called coordinate axes.

71.1. Points in Space as Ordered Triples of Real Numbers. The position of any point in space can be uniquely specified as an ordered triple of real numbers relative to a given rectangular coordinate system.
Consider a rectangular box whose two opposite vertices (the endpoints of the largest diagonal) are the origin and a point P, while its sides that are adjacent at the origin lie on the axes of the coordinate system. For every point P, there is only one such rectangular box. It is uniquely determined by its three sides adjacent at the origin. Let the number x denote the position of one such side that lies on the first axis; the numbers y and z do so for the second and third sides, respectively. Note that, depending on the position of P, the numbers x, y, and z may be negative, positive, or even 0. In other words, any point in space is associated with a unique ordered triple of real numbers (x, y, z) determined relative to a rectangular coordinate system. This ordered triple of numbers is called the rectangular coordinates of the point. To reflect the order in (x, y, z), the axes of the coordinate system will be denoted as the x, y, and z axes. Thus, to find a point in space with rectangular coordinates (1, 2, -3), one has to construct a rectangular box with a vertex at the origin such that its sides adjacent at the origin occupy the intervals [0, 1], [0, 2], and [-3, 0] along the x, y, and z axes, respectively. The point in question is the vertex opposite to the origin.

71.2. A Point as an Intersection of Coordinate Planes. The plane containing the x and y axes is called the xy plane. For all points in this plane, the z coordinate is 0. The condition that a point lies in the xy plane can therefore be stated as $z = 0$. The xz and yz planes can be defined similarly. The condition that a point lies in the xz or yz plane reads $y = 0$ or $x = 0$, respectively. The origin (0, 0, 0) can be viewed as the intersection of three coordinate planes $x = 0$, $y = 0$, and $z = 0$. Consider all points in space whose z coordinate is fixed to a particular value $z = z_0$ (e.g., $z = 1$). They form a plane parallel to the xy plane that lies $|z_0|$ units of length above it if $z_0 > 0$ or below it if $z_0 < 0$.

FIGURE 11.1. Left: Any point P in space can be viewed as the intersection of three coordinate planes $x = x_0$, $y = y_0$, and $z = z_0$; hence, P can be given an algebraic description as an ordered triple of numbers $P = (x_0, y_0, z_0)$. Right: Translation of the coordinate system. The origin is moved to a point $(x_0, y_0, z_0)$ relative to the old coordinate system while the coordinate axes remain parallel to the axes of the old system. This is achieved by translating the origin first along the x axis by the distance $x_0$ (as shown in the figure), then along the y axis by the distance $y_0$, and finally along the z axis by the distance $z_0$. As a result, a point P that had coordinates (x, y, z) in the old system will have the coordinates $x' = x - x_0$, $y' = y - y_0$, and $z' = z - z_0$ in the new coordinate system.

A point P with coordinates $(x_0, y_0, z_0)$ can therefore be viewed as an intersection of three coordinate planes $x = x_0$, $y = y_0$, and $z = z_0$ as shown in Figure 11.1. The faces of the rectangular box introduced to specify the position of P relative to a rectangular coordinate system lie in the coordinate planes. The coordinate planes are perpendicular to the corresponding coordinate axes: the plane $x = x_0$ is perpendicular to the x axis, and so on.

71.3. Changing the Coordinate System. Since the origin and directions of the axes of a coordinate system can be chosen arbitrarily, the coordinates of a point depend on this choice. Suppose a point P has coordinates (x, y, z). Consider a new coordinate system whose axes are parallel to the corresponding axes of the old coordinate system, but whose origin is shifted to the point O' with coordinates $(x_0, 0, 0)$. It is straightforward to see that the point P would have the coordinates $(x - x_0, y, z)$ relative to the new coordinate system (Figure 11.1, right panel).
Similarly, if the origin is shifted to a point O' with coordinates $(x_0, y_0, z_0)$, while the axes remain parallel to the corresponding axes of the old coordinate system, then the coordinates of P are transformed as

(11.1)   $(x, y, z) \to (x - x_0, y - y_0, z - z_0)$.

One can change the orientation of the coordinate axes by rotating them about the origin. The coordinates of the same point in space are different in the original and rotated rectangular coordinate systems. Algebraic relations between old and new coordinates can be established. A simple case, when a coordinate system is rotated about one of its axes, is discussed in Study Problem 11.2.

It is important to realize that no physical or geometrical quantity should depend on the choice of a coordinate system. For example, the length of a straight line segment must be the same in any coordinate system, while the coordinates of its endpoints depend on the choice of the coordinate system. When studying a practical problem, a coordinate system can be chosen in any way convenient to describe objects in space. Algebraic rules for real numbers (coordinates) can then be used to compute physical and geometrical characteristics of the objects. The numerical values of these characteristics do not depend on the choice of the coordinate system.

71.4. Distance Between Two Points. Consider two points in space, $P_1$ and $P_2$. Let their coordinates relative to some rectangular coordinate system be $(x_1, y_1, z_1)$ and $(x_2, y_2, z_2)$, respectively. How can one calculate the distance between these points, or the length of a straight line segment with endpoints $P_1$ and $P_2$? The point $P_1$ is the intersection point of three coordinate planes $x = x_1$, $y = y_1$, and $z = z_1$. The point $P_2$ is the intersection point of three coordinate planes $x = x_2$, $y = y_2$, and $z = z_2$. These six planes contain faces of the rectangular box whose largest diagonal is the straight line segment between the points $P_1$ and $P_2$. The question therefore is how to find the length of this diagonal.
Consider three sides of this rectangular box that are adjacent, say, at the vertex $P_1$. The side parallel to the x axis lies between the coordinate planes $x = x_1$ and $x = x_2$ and is perpendicular to them. So the length of this side is $|x_2 - x_1|$. The absolute value is necessary as the difference $x_2 - x_1$ may be negative, depending on the values of $x_1$ and $x_2$, whereas the distance must be nonnegative. Similar arguments lead to the conclusion that the lengths of the other two adjacent sides are $|y_2 - y_1|$ and $|z_2 - z_1|$. If a rectangular box has adjacent sides of length a, b, and c, then the length d of its largest diagonal satisfies the equation

$d^2 = a^2 + b^2 + c^2$.

Its proof is based on the Pythagorean theorem (see Figure 11.2). Consider the rectangular face that contains the sides a and b. The length f of its diagonal is determined by the Pythagorean theorem $f^2 = a^2 + b^2$. Consider the cross section of the rectangular box by the plane that contains the face diagonal f and the side c. This cross section is a rectangle with two adjacent sides c and f and the diagonal d. They are related as $d^2 = f^2 + c^2$ by the Pythagorean theorem, and the desired conclusion follows.

FIGURE 11.2. Distance between two points with coordinates $P_1 = (x_1, y_1, z_1)$ and $P_2 = (x_2, y_2, z_2)$. The line segment $P_1P_2$ is viewed as the largest diagonal of the rectangular box whose faces are the coordinate planes corresponding to the coordinates of the points. Therefore, the distances between the opposite faces are $a = |x_1 - x_2|$, $b = |y_1 - y_2|$, and $c = |z_1 - z_2|$. The length of the diagonal d is obtained by the double use of the Pythagorean theorem in each of the indicated rectangles: $d^2 = c^2 + f^2$ (top right) and $f^2 = a^2 + b^2$ (bottom right).

Put $a = |x_2 - x_1|$, $b = |y_2 - y_1|$, and $c = |z_2 - z_1|$. Then $d = |P_1P_2|$ is the distance between $P_1$ and $P_2$. The distance formula is immediately found:

(11.2)   $|P_1P_2| = \sqrt{(x_2 - x_1)^2 + (y_2 - y_1)^2 + (z_2 - z_1)^2}$.

Note that the numbers (coordinates) $(x_1, y_1, z_1)$ and $(x_2, y_2, z_2)$ depend on the choice of the coordinate system, whereas the number $|P_1P_2|$ remains the same in any coordinate system! For example, if the origin of the coordinate system is translated to a point $(x_0, y_0, z_0)$ while the orientation of the coordinate axes remains unchanged, then, according to rule (11.1), the coordinates of $P_1$ and $P_2$ relative to the new coordinate system become $(x_1 - x_0, y_1 - y_0, z_1 - z_0)$ and $(x_2 - x_0, y_2 - y_0, z_2 - z_0)$, respectively. The numerical value of the distance does not change because the coordinate differences, $(x_2 - x_0) - (x_1 - x_0) = x_2 - x_1$ (similarly for the y and z coordinates), do not change.

EXAMPLE 11.1. A point moves 3 units of length parallel to a line, then it moves 6 units parallel to the second line that is perpendicular to the first line, and then it moves 6 units parallel to the third line that is perpendicular to the first and second lines. Find the distance between the initial and final positions.

SOLUTION: The distance between points does not depend on the choice of the coordinate system. Let the origin be positioned at the initial point of the motion and let the coordinate axes be directed along the three mutually perpendicular lines parallel to which the point has moved. In this coordinate system, the final point has the coordinates (3, 6, 6). The distance between this point and the origin (0, 0, 0) is

$D = \sqrt{3^2 + 6^2 + 6^2} = \sqrt{9(1 + 4 + 4)} = 9$.

Rotations in Space. The origin can always be translated to $P_1$ so that in the new coordinate system $P_1$ is (0, 0, 0) and $P_2$ is $(x_2 - x_1, y_2 - y_1, z_2 - z_1)$. Since the distance should not depend on the orientation of the coordinate axes, any rotation can now be described algebraically as a linear transformation of an ordered triple (x, y, z) under which the combination $x^2 + y^2 + z^2$ remains invariant.
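As a numerical illustration of the distance formula (11.2) and of the invariance just described, the following Python sketch computes the distance of Example 11.1 and compares the distance between two points before and after a translation of the origin, rule (11.1), and after a rotation of the axes about the z axis. The rotation uses the relations derived in Study Problem 11.2; the function names, the sample points, the new origin, and the angle are illustrative assumptions, not part of the text.

```python
import math

def distance(p1, p2):
    """Distance between two points in rectangular coordinates, formula (11.2)."""
    return math.sqrt(sum((b - a) ** 2 for a, b in zip(p1, p2)))

def translate(p, origin):
    """Coordinates of p after the origin is moved to `origin`, rule (11.1)."""
    return tuple(c - c0 for c, c0 in zip(p, origin))

def rotate_about_z(p, phi):
    """Coordinates of p after the x and y axes are rotated about the z axis
    counterclockwise through the angle phi (relations of Study Problem 11.2)."""
    x, y, z = p
    return (x * math.cos(phi) + y * math.sin(phi),
            y * math.cos(phi) - x * math.sin(phi),
            z)

# Example 11.1: the final point is (3, 6, 6) relative to the initial point.
d_example = distance((0, 0, 0), (3, 6, 6))

# Illustrative (assumed) data: two points, a new origin, a rotation angle.
p1, p2 = (1.0, 2.0, 3.0), (-1.0, 0.0, 2.0)
new_origin = (5.0, -4.0, 0.5)
phi = 0.7

d_old = distance(p1, p2)
d_translated = distance(translate(p1, new_origin), translate(p2, new_origin))
d_rotated = distance(rotate_about_z(p1, phi), rotate_about_z(p2, phi))

print(d_example)
print(d_old, d_translated, d_rotated)
```

The coordinates of the two points differ in the three coordinate systems, while the three computed distances agree up to rounding errors of floating-point arithmetic.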
A linear transformation means that the new coordinates are linear combinations of the old ones. It should be noted that reflections of the coordinate axes, $x \to -x$ (similarly for y and z), are linear and also preserve the distance. However, a coordinate system obtained by an odd number of reflections of the coordinate axes cannot be obtained by any rotation of the original coordinate system. So, in the above algebraic definition of a rotation, the reflections should be excluded. An algebraic description of rotations in a plane and in space is given in Study Problems 11.2 and 11.20.

71.5. Spheres in Space. In this course, relations between two equivalent descriptions of objects in space, the geometrical and the algebraic, will always be emphasized. One of the course objectives is to learn how to interpret an algebraic equation by geometrical means and how to describe geometrical objects in space algebraically. One of the simplest examples of this kind is a sphere.

Geometrical Description of a Sphere. A sphere is a set of points in space that are equidistant from a fixed point. The fixed point is called the center of the sphere. The distance from the sphere center to any point of the sphere is called the radius of the sphere.

Algebraic Description of a Sphere. An algebraic description of a sphere implies finding an algebraic condition on coordinates (x, y, z) of points in space that belong to the sphere. So let the center of the sphere be a point $P_0$ with coordinates $(x_0, y_0, z_0)$ (defined relative to some rectangular coordinate system). If a point P with coordinates (x, y, z) belongs to the sphere, then the numbers (x, y, z) must be such that the distance $|PP_0|$ is the same for any such P and equal to the radius of the sphere, denoted R, that is, $|PP_0| = R$ or $|PP_0|^2 = R^2$ (see Figure 11.3, left panel). Using the distance formula, this condition can be written as

(11.3)   $(x - x_0)^2 + (y - y_0)^2 + (z - z_0)^2 = R^2$.
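Condition (11.3) can be turned directly into a test of whether a given point belongs to a sphere. The Python sketch below is a minimal illustration under stated assumptions: the function name `on_sphere`, the tolerance, and the sample points are hypothetical choices, and the comparison is made with a small tolerance because floating-point arithmetic is not exact.

```python
import math

def on_sphere(point, center, radius, tol=1e-9):
    """Return True if `point` satisfies equation (11.3) for the sphere
    with the given center and radius, up to the tolerance `tol`."""
    squared_distance = sum((c - c0) ** 2 for c, c0 in zip(point, center))
    return math.isclose(squared_distance, radius ** 2, abs_tol=tol)

# Sphere of radius 2 centered at the origin (assumed sample points).
print(on_sphere((2, 0, 0), (0, 0, 0), 2))
print(on_sphere((1, 1, math.sqrt(2)), (0, 0, 0), 2))
print(on_sphere((1, 1, 1), (0, 0, 0), 2))

# Sphere of radius 5 centered at (1, 2, 3).
print(on_sphere((4, 6, 3), (1, 2, 3), 5))
```

Comparing the squared distance with $R^2$, rather than the distance with R, avoids taking a square root and mirrors the form of (11.3).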
For example, the set of points with coordinates (x, y, z) that satisfy the condition $x^2 + y^2 + z^2 = 4$ is a sphere of radius $R = 2$ centered at the origin $x_0 = y_0 = z_0 = 0$.

EXAMPLE 11.2. Find the center and the radius of the sphere

$x^2 + y^2 + z^2 - 2x + 4y - 6z + 5 = 0$.

SOLUTION: In order to find the coordinates of the center and the radius of the sphere, the equation must be transformed to the standard form (11.3) by completing the squares: $x^2 - 2x = (x - 1)^2 - 1$, $y^2 + 4y = (y + 2)^2 - 4$, and $z^2 - 6z = (z - 3)^2 - 9$. Then the equation of the sphere becomes

$(x - 1)^2 - 1 + (y + 2)^2 - 4 + (z - 3)^2 - 9 + 5 = 0$,
$(x - 1)^2 + (y + 2)^2 + (z - 3)^2 = 9$.

A comparison with (11.3) shows that the center is at $(x_0, y_0, z_0) = (1, -2, 3)$ and the radius is $R = \sqrt{9} = 3$. □

FIGURE 11.3. Left: A sphere is defined as a point set in space. Each point P of the set has a fixed distance R from a fixed point $P_0$. The point $P_0$ is the center of the sphere, and R is the radius of the sphere. Right: An illustration to Study Problem 11.2. Transformation of coordinates under a rotation of the coordinate system in a plane.

71.6. Algebraic Description of Point Sets in Space. The idea of an algebraic description of a sphere can be extended to other sets in space. It is convenient to introduce some brief notation for an algebraic description of sets. For example, for a set S of points in space with coordinates (x, y, z) such that they satisfy the algebraic condition (11.3), one writes

$S = \{(x, y, z) \mid (x - x_0)^2 + (y - y_0)^2 + (z - z_0)^2 = R^2\}$.

This relation means that the set S is a collection of all points (x, y, z) such that (the vertical bar) their rectangular coordinates satisfy (11.3). Similarly, the xy plane can be viewed as a set of points whose z coordinates vanish:

$P = \{(x, y, z) \mid z = 0\}$.

The solid region in space that consists of points whose coordinates are nonnegative is called the first octant:

$O_1 = \{(x, y, z) \mid x \ge 0,\ y \ge 0,\ z \ge 0\}$.

The spatial region

$B = \{(x, y, z) \mid x > 0,\ y > 0,\ z > 0,\ x^2 + y^2 + z^2 < 4\}$

is the collection of all points in the portion of a ball of radius 2 that lies in the first octant. The strict inequalities imply that the boundary of this portion of the ball does not belong to the set B.

71.7. Study Problems.

Problem 11.1. Show that the coordinates of the midpoint of a straight line segment are

$\left(\frac{x_1 + x_2}{2}, \frac{y_1 + y_2}{2}, \frac{z_1 + z_2}{2}\right)$

if the coordinates of its endpoints are $(x_1, y_1, z_1)$ and $(x_2, y_2, z_2)$.

SOLUTION: Let $P_1$ and $P_2$ be the endpoints and let M be the point with coordinates equal to half-sums of the corresponding coordinates of $P_1$ and $P_2$. One has to prove that $|MP_1| = |MP_2| = \frac{1}{2}|P_1P_2|$. These two conditions define M as the midpoint. Consider a rectangular box $B_1$ whose largest diagonal is $P_1M$. The length of its side parallel to the x axis is $|(x_1 + x_2)/2 - x_1| = |x_2 - x_1|/2$. Similarly, its sides parallel to the y and z axes have the lengths $|y_2 - y_1|/2$ and $|z_2 - z_1|/2$, respectively. Consider a rectangular box $B_2$ whose largest diagonal is the segment $MP_2$. Then its side parallel to the x axis has the length $|x_2 - (x_1 + x_2)/2| = |x_2 - x_1|/2$. Similarly, the sides parallel to the y and z axes have lengths $|y_2 - y_1|/2$ and $|z_2 - z_1|/2$, respectively. Thus, the sides of $B_1$ and $B_2$ parallel to each coordinate axis have the same length. By the distance formula (11.2), the diagonals of $B_1$ and $B_2$ must have the same length $|P_2M| = |MP_1|$. The lengths of the sides of a rectangular box whose largest diagonal is $P_1P_2$ are $|x_2 - x_1|$, $|y_2 - y_1|$, and $|z_2 - z_1|$. They are twice as long as the corresponding sides of $B_1$ and $B_2$. If the length of each side of a rectangular box is scaled by a positive factor q, then the length of its diagonal is also scaled by q. In particular, this implies that $|MP_2| = \frac{1}{2}|P_1P_2|$. □

Problem 11.2. Let (x, y, z) be coordinates of a point P. Consider a new coordinate system that is obtained by rotating the x and y axes about the z axis counterclockwise as viewed from the top of the z axis through an angle $\phi$.
Let $(x', y', z')$ be coordinates of P in the new coordinate system. Find the relations between the old and new coordinates.

SOLUTION: The height of P relative to the xy plane does not change upon rotation. So $z' = z$. It is therefore sufficient to consider rotations in the xy plane, that is, for points P with coordinates (x, y, 0). Let $r = |OP|$ (the distance between the origin and P) and let $\theta$ be the angle counted from the positive x axis toward the ray OP counterclockwise (see Figure 11.3, right panel). Then $x = r\cos\theta$ and $y = r\sin\theta$ (the polar coordinates of P). In the new coordinate system, the angle between the positive x' axis and the ray OP becomes $\theta' = \theta - \phi$. Therefore,

$x' = r\cos\theta' = r\cos(\theta - \phi) = r\cos\theta\cos\phi + r\sin\theta\sin\phi = x\cos\phi + y\sin\phi$,
$y' = r\sin\theta' = r\sin(\theta - \phi) = r\sin\theta\cos\phi - r\cos\theta\sin\phi = y\cos\phi - x\sin\phi$.

These equations define the transformation $(x, y) \to (x', y')$ of the old coordinates to the new ones. The inverse transformation $(x', y') \to (x, y)$ can be found by solving the equations for (x, y). A simpler way is to note that if (x', y') are viewed as "old" coordinates and (x, y) as "new" coordinates, then the transformation that relates them is the rotation through the angle $-\phi$ (a clockwise rotation). Therefore, the inverse relations can be obtained by swapping the "old" and "new" coordinates and replacing $\phi$ by $-\phi$ in the direct relations. This yields

$x = x'\cos\phi - y'\sin\phi$,   $y = y'\cos\phi + x'\sin\phi$.

Problem 11.3. Give a geometrical description of the set

$S = \{(x, y, z) \mid x^2 + y^2 + z^2 - 4z = 0\}$.

SOLUTION: The condition on the coordinates of points that belong to the set contains the sum of squares of the coordinates just like the equation of a sphere. The difference is that (11.3) contains the sum of perfect squares. So the squares must be completed in the above equation and the resulting expression compared with (11.3). One has $z^2 - 4z = (z - 2)^2 - 4$ so that the condition becomes $x^2 + y^2 + (z - 2)^2 = 4$.
It describes a sphere of radius $R = 2$ that is centered at the point $(x_0, y_0, z_0) = (0, 0, 2)$; that is, the center of the sphere is on the z axis at a distance of 2 units above the xy plane. □

Problem 11.4. Give a geometrical description of the set

$C = \{(x, y, z) \mid x^2 + y^2 - 2x - 4y \le 4\}$.

SOLUTION: As in the previous problem, the condition can be written via the sum of perfect squares $(x - 1)^2 + (y - 2)^2 \le 9$ by means of the relations $x^2 - 2x = (x - 1)^2 - 1$ and $y^2 - 4y = (y - 2)^2 - 4$. In the xy plane, the inequality describes a disk of radius 3 whose center is the point (1, 2, 0). As the algebraic condition imposes no restriction on the z coordinate of points in the set, in any plane $z = z_0$ parallel to the xy plane, the x and y coordinates satisfy the same inequality, and hence the corresponding points also form a disk of radius 3 with the center $(1, 2, z_0)$. Thus, the set is a solid cylinder of radius 3 whose axis is parallel to the z axis and passes through the point (1, 2, 0). □

Problem 11.5. Give a geometrical description of the set

$P = \{(x, y, z) \mid z(y - x) = 0\}$.

SOLUTION: The condition is satisfied if either $z = 0$ or $y = x$. The former equation describes the xy plane. The latter represents a line in the xy plane. Since it does not impose any restriction on the z coordinate, each point of the line can be moved up and down parallel to the z axis. The resulting set is a plane that contains the line $y = x$ in the xy plane and the z axis. Thus, the set P is the union of this plane and the xy plane. □

71.8. Exercises.

(1) Find the distance between the following specified points:
(i) (1, 2, 3) and (-1, 0, 2)
(ii) (-1, 3, -2) and (-1, 2, -1)
(2) Find the distance from the point (1, 2, 3) to each of the coordinate planes and to each of the coordinate axes.
(3) Find the length of the medians of the triangle with vertices A(1, 2, 3), B(-3, 2, -1), and C(-1, -4, 1).
(4) Let the set S consist of points (t, 2t, 3t), where $-\infty < t < \infty$.
Find the point of S that is the closest to the point (3, 2, 1). Sketch the set S.
(5) Give a geometrical description of the following sets defined algebraically and sketch them:
(i) $x^2 + y^2 + z^2 - 2x + 4y - 6z = 0$
(ii) $x^2 + y^2 + z^2 > 4$
(iii) $x^2 + y^2 + z^2 < 4$, $z > 0$
(iv) $x^2 + y^2 - 4y < 0$, $z > 0$
(v) $4 \le x^2 + y^2 + z^2 \le 9$
(vi) $x^2 + y^2 \ge 1$, $x^2 + y^2 + z^2 < 4$
(vii) $x^2 + y^2 + z^2 - 2z < 0$, $z > 1$
(viii) $x^2 + y^2 + z^2 - 2z = 0$, $z = 1$
(ix) $(x - a)(y - b)(z - c) = 0$
(x) $|x| \le 1$, $|y| \le 2$, $|z| < 3$
(6) Sketch each of the following sets and give their algebraic description:
(i) A sphere whose diameter is the straight line segment AB, where A = (1, 2, 3) and B = (3, 2, 1).
(ii) Three spheres centered at (1, 2, 3) that touch the xy, yz, and xz planes, respectively.
(iii) Three spheres centered at (1, -2, 3) that touch the x, y, and z coordinate axes, respectively.
(iv) The largest solid cube that is contained in a ball of radius R centered at the origin. Solve the same problem if the ball is not centered at the origin. Compare the cases when the boundaries of the solid are included in the set or excluded from it.
(v) The solid region that is a ball of radius R that has a cylindrical hole of radius R/2 whose axis is at a distance of R/2 from the center of the ball. Choose a convenient coordinate system. Compare the cases when the boundaries of the solid are included in the set or excluded from it.
(vi) The part of a ball of radius R that lies between two parallel planes each of which is at a distance of a < R from the center of the ball. Choose a convenient coordinate system. Compare the cases when the boundaries of the solid are included in the set or excluded from it.
(7) Consider the points P such that the distance from P to the point (-3, 6, 9) is twice the distance from P to the origin. Show that the set of all such points is a sphere and find its center and radius.
(8) Find the volume of the solid bounded by the spheres $x^2 + y^2 + z^2 - 6z = 0$ and $x^2 + y^2 - 2y + z^2 - 6z = -9$.
(9) The solid region is described by the inequalities Iz-al R2. If R < min(b, c), sketch the solid and find its volume.
(10) Sketch the set of all points in the xy plane that are equidistant from two given points A and B. Let A and B be (1, 2) and (-2, -1), respectively. Give an algebraic description of the set. Sketch the set of all points in space that are equidistant from two given points A and B. Let A and B be (1, 2, 3) and (-3, -2, -1), respectively. Give an algebraic description of the set.

72. Vectors in Space

72.1. Oriented Segments and Vectors. Suppose there is a pointlike object moving in space with a constant rate, say, 5 m/s. If the object was initially at a point $P_1$, and in 1 second it arrives at a point $P_2$, then the distance traveled is $|P_1P_2| = 5$ m. The rate (or speed) 5 m/s does not provide a complete description of the motion of the object in space because it only answers the question "How fast does the object move?" but it does not say anything about "Where does the object move to?" Since the initial and final positions of the object are known, both questions can be answered if one associates an oriented segment $\overrightarrow{P_1P_2}$ with the moving object. The arrow specifies the direction, "from $P_1$ to $P_2$," and the length $|P_1P_2|$ defines the rate (speed) at which the object moves. So, for every moving object, one can assign an oriented segment whose length equals its speed and whose direction coincides with the direction of motion. This oriented segment is called a velocity. Consider two objects moving parallel with the same speed. The oriented segments corresponding to the velocities of the objects have the same length and the same direction, but they are still different because their initial points do not coincide.
On the other hand, the velocity should describe a particular physical property of the motion itself ("how fast and where to"), and therefore the spatial position where the motion occurs should not matter. This observation leads to the concept of a vector as an abstract mathematical object that represents all oriented segments that are parallel and have the same length. Vectors will be denoted by boldface letters. Two oriented segments $\overrightarrow{AB}$ and $\overrightarrow{CD}$ represent the same vector $\mathbf{a}$ if they are parallel and $|AB| = |CD|$; that is, they can be obtained from one another by transporting them parallel to themselves in space. A representation of an abstract vector by a particular oriented segment is denoted by the equality $\mathbf{a} = \overrightarrow{AB}$ or $\mathbf{a} = \overrightarrow{CD}$. The fact that the oriented segments $\overrightarrow{AB}$ and $\overrightarrow{CD}$ represent the same vector is denoted by the equality $\overrightarrow{AB} = \overrightarrow{CD}$.

72.2. Vector as an Ordered Triple of Numbers. Here an algebraic representation of vectors in space will be introduced. Consider an oriented segment $\overrightarrow{AB}$ that represents a vector $\mathbf{a}$ (i.e., $\mathbf{a} = \overrightarrow{AB}$). An oriented segment $\overrightarrow{A'B'}$ represents the same vector if it is obtained by transporting $\overrightarrow{AB}$ parallel to itself. In particular, let us take A' = O, where O is the origin of some rectangular coordinate system. Then $\mathbf{a} = \overrightarrow{AB} = \overrightarrow{OB'}$. The direction and length of the oriented segment $\overrightarrow{OB'}$ are uniquely determined by the coordinates of the point B'. Thus, the following algebraic definition of a vector can be adopted.

FIGURE 11.4. Left: Oriented segments obtained from one another by parallel transport. They all represent the same vector. Right: A vector as an ordered triple of numbers. An oriented segment is transported parallel so that its initial point coincides with the origin of a rectangular coordinate system. The coordinates of the terminal point of the transported segment, $(a_1, a_2, a_3)$, are components of the corresponding vector.
So a vector can always be written as an ordered triple of numbers: $\mathbf{a} = (a_1, a_2, a_3)$. By construction, the components of a vector depend on the choice of the coordinate system.

DEFINITION 11.1. (Vectors). A vector in space is an ordered triple of real numbers: $\mathbf{a} = (a_1, a_2, a_3)$. The numbers $a_1$, $a_2$, and $a_3$ are called components of the vector $\mathbf{a}$.

Consider a point $A$ with coordinates $(a_1, a_2, a_3)$ in some rectangular coordinate system. The vector $\mathbf{a} = \overrightarrow{OA} = (a_1, a_2, a_3)$ is called the position vector of $A$ relative to the given coordinate system. This establishes a one-to-one correspondence between vectors and points in space. In particular, if the coordinate system is changed by a rotation of its axes about the origin, the components of a vector $\mathbf{a}$ are transformed in the same way as the coordinates of a point whose position vector is $\mathbf{a}$.

DEFINITION 11.2. (Equality of Two Vectors). Two vectors $\mathbf{a}$ and $\mathbf{b}$ are equal or coincide if their corresponding components are equal: $\mathbf{a} = \mathbf{b} \iff a_1 = b_1,\ a_2 = b_2,\ a_3 = b_3$.

This definition agrees with the geometrical definition of a vector as a class of all oriented segments that are parallel and have the same length. Indeed, if two oriented segments represent the same vector, then, after parallel transport such that their initial points coincide with the origin, their final points coincide too and hence have the same coordinates. By virtue of the correspondence between vectors and points in space, this definition reflects the fact that coinciding points have the same position vector.

EXAMPLE 11.3. Find the components of a vector $\overrightarrow{P_1P_2}$ if the coordinates of $P_1$ and $P_2$ are $(x_1, y_1, z_1)$ and $(x_2, y_2, z_2)$, respectively.

SOLUTION: Consider a rectangular box whose largest diagonal coincides with the segment $P_1P_2$ and whose sides are parallel to the coordinate axes.
After parallel transport of the segment so that $P_1$ moves to the origin, the coordinates of the other endpoint are the components of $\overrightarrow{P_1P_2}$. Alternatively, the origin of the coordinate system can be moved to the point $P_1$, keeping the directions of the coordinate axes. Therefore, $\overrightarrow{P_1P_2} = (x_2 - x_1,\ y_2 - y_1,\ z_2 - z_1)$, according to the coordinate transformation law (11.1), where $P_0 = P_1$. Thus, in order to find the components of the vector $\overrightarrow{P_1P_2}$, one has to subtract the coordinates of the initial point $P_1$ from the corresponding coordinates of the final point $P_2$. □

DEFINITION 11.3. (Norm of a Vector). The number $\|\mathbf{a}\| = \sqrt{a_1^2 + a_2^2 + a_3^2}$ is called the norm of a vector $\mathbf{a}$.

By Example 11.3 and the distance formula (11.2), the norm of a vector is the length of any oriented segment representing the vector. The norm of a vector is also called the magnitude or length of a vector.

DEFINITION 11.4. (Zero Vector). A vector with vanishing components, $\mathbf{0} = (0, 0, 0)$, is called a zero vector.

A vector $\mathbf{a}$ is a zero vector if and only if its norm vanishes, $\|\mathbf{a}\| = 0$. Indeed, if $\mathbf{a} = \mathbf{0}$, then $a_1 = a_2 = a_3 = 0$ and hence $\|\mathbf{a}\| = 0$. For the converse, it follows from the condition $\|\mathbf{a}\| = 0$ that $a_1^2 + a_2^2 + a_3^2 = 0$, which is only possible if $a_1 = a_2 = a_3 = 0$, or $\mathbf{a} = \mathbf{0}$. Recall that an "if and only if" statement implies two statements. First, if $\mathbf{a} = \mathbf{0}$, then $\|\mathbf{a}\| = 0$ (the direct statement). Second, if $\|\mathbf{a}\| = 0$, then $\mathbf{a} = \mathbf{0}$ (the converse statement).

72.3. Vector Algebra. Continuing the analogy between vectors and velocities of a moving object, consider two objects moving parallel but with different rates (speeds). Their velocities as vectors are parallel, but they have different magnitudes. What is the relation between the components of such vectors? Take a vector $\mathbf{a} = (a_1, a_2, a_3)$. It can be viewed as the largest diagonal of a rectangular box with one vertex at the origin and the opposite vertex at the point $(a_1, a_2, a_3)$.
The adjacent sides of the rectangular box have lengths given by the corresponding components of $\mathbf{a}$ (modulo the signs if the components happen to be negative). When the lengths of the sides are scaled by a factor $s > 0$, a new rectangular box is obtained whose largest diagonal is parallel to $\mathbf{a}$. This geometrical observation leads to the following algebraic rule.

DEFINITION 11.5. (Multiplication of a Vector by a Number). A vector $\mathbf{a}$ multiplied by a number $s$ is a vector whose components are multiplied by $s$: $s\mathbf{a} = (sa_1, sa_2, sa_3)$.

If $s > 0$, then the vector $s\mathbf{a}$ has the same direction as $\mathbf{a}$. If $s < 0$, then the vector $s\mathbf{a}$ has the direction opposite to $\mathbf{a}$. For example, the vector $-\mathbf{a}$ has the same magnitude as $\mathbf{a}$ but points in the direction opposite to $\mathbf{a}$.

FIGURE 11.5. Left: Multiplication of a vector $\mathbf{a}$ by a number $s$. If $s > 0$, the result of the multiplication is a vector parallel to $\mathbf{a}$ whose length is scaled by the factor $s$. If $s < 0$, then $s\mathbf{a}$ is a vector whose direction is opposite to that of $\mathbf{a}$ and whose length is scaled by $|s|$. Middle: Construction of a unit vector parallel to $\mathbf{a}$. The unit vector $\hat{\mathbf{a}}$ is a vector parallel to $\mathbf{a}$ whose length is 1. Therefore, it is obtained from $\mathbf{a}$ by dividing the latter by its length $\|\mathbf{a}\|$, that is, $\hat{\mathbf{a}} = s\mathbf{a}$, where $s = 1/\|\mathbf{a}\|$. Right: A unit vector in a plane can always be viewed as an oriented segment whose initial point is at the origin of a coordinate system and whose terminal point lies on the circle of unit radius centered at the origin. If $\theta$ is the polar angle in the plane, then $\hat{\mathbf{a}} = (\cos\theta, \sin\theta, 0)$.

The magnitude of $s\mathbf{a}$ is
$\|s\mathbf{a}\| = \sqrt{(sa_1)^2 + (sa_2)^2 + (sa_3)^2} = \sqrt{s^2}\,\sqrt{a_1^2 + a_2^2 + a_3^2} = |s|\,\|\mathbf{a}\|$;
that is, when a vector is multiplied by a number, its magnitude changes by the factor $|s|$. The geometrical analysis of the multiplication of a vector by a number leads to the following simple algebraic criterion for two vectors being parallel.
Two nonzero vectors are parallel if they are proportional: $\mathbf{a} \parallel \mathbf{b} \iff \mathbf{a} = s\mathbf{b}$ for some real $s$. If none of the components of the vectors in question vanish, then this criterion may also be written as
$\mathbf{a} \parallel \mathbf{b} \iff \dfrac{a_1}{b_1} = \dfrac{a_2}{b_2} = \dfrac{a_3}{b_3}$,
which is easy to verify. If, say, $b_1 = 0$, then $\mathbf{b}$ is parallel to $\mathbf{a}$ when $a_1 = b_1 = 0$ and $a_2/b_2 = a_3/b_3$. Owing to the geometrical interpretation of $s\mathbf{b}$, all points in space whose position vectors are parallel to a given nonzero vector $\mathbf{b}$ form a line (through the origin) that is parallel to $\mathbf{b}$.

DEFINITION 11.6. (Unit Vector). A vector $\hat{\mathbf{a}}$ is called a unit vector if its norm equals 1, $\|\hat{\mathbf{a}}\| = 1$.

Any nonzero vector $\mathbf{a}$ can be turned into a unit vector $\hat{\mathbf{a}}$ that is parallel to $\mathbf{a}$. The norm (length) of the vector $s\mathbf{a}$ reads $\|s\mathbf{a}\| = |s|\,\|\mathbf{a}\| = s\|\mathbf{a}\|$ if $s > 0$. So, by choosing $s = 1/\|\mathbf{a}\|$, the unit vector in the direction of $\mathbf{a}$ is obtained:
$\hat{\mathbf{a}} = \dfrac{1}{\|\mathbf{a}\|}\,\mathbf{a} = \left(\dfrac{a_1}{\|\mathbf{a}\|},\ \dfrac{a_2}{\|\mathbf{a}\|},\ \dfrac{a_3}{\|\mathbf{a}\|}\right)$.
For example, owing to the trigonometric identity $\cos^2\theta + \sin^2\theta = 1$, any unit vector in the $xy$ plane can always be written in the form $\hat{\mathbf{a}} = (\cos\theta, \sin\theta, 0)$, where $\theta$ is the angle counted counterclockwise from the positive $x$ axis toward the vector $\hat{\mathbf{a}}$ (see the right panel of Figure 11.5). Note that, in many practical applications, the components of a vector have dimensions. For instance, the components of a position vector are measured in units of length (meters, inches, etc.), the components of a velocity vector are measured in, for example, meters per second, and so on. The magnitude of a vector $\mathbf{a}$ has the same dimension as its components. Therefore, the corresponding unit vector $\hat{\mathbf{a}}$ is dimensionless. It specifies only the direction of the vector $\mathbf{a}$.

EXAMPLE 11.4. Let $A = (1, 2, 3)$ and $B = (3, 1, 1)$. Find $\mathbf{a} = \overrightarrow{AB}$, $\mathbf{b} = \overrightarrow{BA}$, the unit vectors $\hat{\mathbf{a}}$ and $\hat{\mathbf{b}}$, and the vector $\mathbf{c} = -2\overrightarrow{AB}$ and its norm.

SOLUTION: By Example 11.3, $\mathbf{a} = (3 - 1,\ 1 - 2,\ 1 - 3) = (2, -1, -2)$. The norm of $\mathbf{a}$ is $\|\mathbf{a}\| = \sqrt{2^2 + (-1)^2 + (-2)^2} = \sqrt{9} = 3$. The unit vector in the direction of $\mathbf{a}$ is $\hat{\mathbf{a}} = (1/3)\mathbf{a} = (2/3, -1/3, -2/3)$.
Using the rule of multiplication of a vector by a number, $\mathbf{c} = -2\mathbf{a} = -2(2, -1, -2) = (-4, 2, 4)$ and $\|\mathbf{c}\| = \|(-2)\mathbf{a}\| = |-2|\,\|\mathbf{a}\| = 2\|\mathbf{a}\| = 6$. The direction of $\overrightarrow{BA}$ is opposite to that of $\overrightarrow{AB}$, and both vectors have the same length. Therefore, $\mathbf{b} = (-2, 1, 2)$, $\|\mathbf{b}\| = 3$, and $\hat{\mathbf{b}} = -\hat{\mathbf{a}} = (-2/3, 1/3, 2/3)$. □

The Parallelogram Rule. Suppose a person is walking on the deck of a ship with speed $v$ m/s. In 1 second, the person goes a distance $v$ from point $A$ to point $B$ of the deck. The velocity vector relative to the deck is $\mathbf{v} = \overrightarrow{AB}$ and $\|\mathbf{v}\| = |AB| = v$ (the speed). The ship moves relative to the water so that in 1 second it comes to a point $D$ from a point $C$ on the surface of the water. The ship's velocity vector relative to the water is then $\mathbf{u} = \overrightarrow{CD}$ with magnitude $u = \|\mathbf{u}\| = |CD|$. What is the velocity vector of the person relative to the water? Suppose the point $A$ on the deck coincides with the point $C$ on the surface of the water. Then the velocity vector is the displacement vector of the person relative to the water in 1 second. As the person walks on the deck along the segment $AB$, this segment travels the distance $u$ parallel to itself along the vector $\mathbf{u}$ relative to the water. In 1 second, the point $B$ of the deck is moved to a point $B'$ on the surface of the water so that the displacement vector of the person relative to the water will be $\overrightarrow{AB'}$. The displacement vector $\overrightarrow{BB'}$ coincides with the ship's velocity $\mathbf{u}$ because $B$ travels the distance $u$ parallel to $\mathbf{u}$. This suggests a simple geometrical rule for finding $\overrightarrow{AB'}$ as shown in Figure 11.6. Take the vector $\overrightarrow{AB} = \mathbf{v}$, place the vector $\mathbf{u}$ so that its initial point coincides with $B$, and make the oriented segment from the initial point of $\mathbf{v}$ to the final point of $\mathbf{u}$ in this diagram. The resulting vector is the displacement vector of the person relative to the surface of the water in 1 second and hence defines the velocity of the person relative to the water. This geometrical procedure is called addition of vectors.
Consider a parallelogram whose adjacent sides, the vectors $\mathbf{a}$ and $\mathbf{b}$, extend from a vertex of the parallelogram. The sum of the vectors $\mathbf{a}$ and $\mathbf{b}$ is a vector, denoted $\mathbf{a} + \mathbf{b}$, that is the diagonal of the parallelogram extended from the same vertex. Note that the parallel sides of the parallelogram represent the same vector (they are parallel and have the same length). This geometrical rule for adding vectors is called the parallelogram rule. It follows from the parallelogram rule that the addition of vectors is commutative: $\mathbf{a} + \mathbf{b} = \mathbf{b} + \mathbf{a}$; that is, the order in which the vectors are added does not matter.

FIGURE 11.6. Left: Parallelogram rule for adding two vectors. If two vectors form adjacent sides of a parallelogram at a vertex $A$, then the sum of the vectors is a vector that coincides with the diagonal of the parallelogram and originates at the vertex $A$. Right: Adding several vectors by using the parallelogram rule. Given the first vector in the sum, all other vectors are transported parallel so that the initial point of the next vector in the sum coincides with the terminal point of the previous one. The sum is the vector that originates from the initial point of the first vector and terminates at the terminal point of the last vector. It does not depend on the order of vectors in the sum.

To add several vectors (e.g., $\mathbf{a} + \mathbf{b} + \mathbf{c}$), one can first find $\mathbf{a} + \mathbf{b}$ by the parallelogram rule and then add $\mathbf{c}$ to the vector $\mathbf{a} + \mathbf{b}$. Alternatively, the vectors $\mathbf{b}$ and $\mathbf{c}$ can be added first, and then the vector $\mathbf{a}$ can be added to $\mathbf{b} + \mathbf{c}$. According to the parallelogram rule, the resulting vector is the same: $(\mathbf{a} + \mathbf{b}) + \mathbf{c} = \mathbf{a} + (\mathbf{b} + \mathbf{c})$. This means that the addition of vectors is associative. So several vectors can be added in any order. Take the first vector, then move the second vector parallel to itself so that its initial point coincides with the terminal point of the first vector.
The third vector is moved parallel so that its initial point coincides with the terminal point of the second vector, and so on. Finally, make a vector whose initial point coincides with the initial point of the first vector and whose terminal point coincides with the terminal point of the last vector in the sum. To visualize this process, imagine a man walking along the first vector, then going parallel to the second vector, then parallel to the third vector, and so on. The endpoint of his walk is independent of the order in which he chooses the vectors.

Algebraic Addition of Vectors.

DEFINITION 11.7. The sum of two vectors $\mathbf{a} = (a_1, a_2, a_3)$ and $\mathbf{b} = (b_1, b_2, b_3)$ is a vector whose components are the sums of the corresponding components of $\mathbf{a}$ and $\mathbf{b}$: $\mathbf{a} + \mathbf{b} = (a_1 + b_1,\ a_2 + b_2,\ a_3 + b_3)$.

This definition is equivalent to the geometrical definition of adding vectors, that is, the parallelogram rule that has been motivated by studying the velocity of a combined motion. Indeed, put $\mathbf{a} = \overrightarrow{OA}$, where the endpoint $A$ has the coordinates $(a_1, a_2, a_3)$. A vector $\mathbf{b}$ represents all parallel segments of the same length $\|\mathbf{b}\|$. In particular, $\mathbf{b}$ is represented by one such oriented segment whose initial point coincides with $A$. Suppose that $\mathbf{a} + \mathbf{b} = \overrightarrow{OC} = (c_1, c_2, c_3)$, where $C$ has coordinates $(c_1, c_2, c_3)$. By the parallelogram rule, $\mathbf{b} = \overrightarrow{AC} = (c_1 - a_1,\ c_2 - a_2,\ c_3 - a_3)$, where the relation between the components of a vector and the coordinates of its endpoints has been used (see Example 11.3). The equality of two vectors means the equality of the corresponding components, that is, $b_1 = c_1 - a_1$, $b_2 = c_2 - a_2$, and $b_3 = c_3 - a_3$, or $c_1 = a_1 + b_1$, $c_2 = a_2 + b_2$, and $c_3 = a_3 + b_3$, as required by the algebraic addition of vectors.

Rules of Vector Algebra. Combining addition of vectors with multiplication by real numbers, the following simple rules can be established by either geometrical or algebraic means: $s(\mathbf{a} + \mathbf{b}) = s\mathbf{a} + s\mathbf{b}$, $(s + t)\mathbf{a} = s\mathbf{a} + t\mathbf{a}$.
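The componentwise rules introduced so far (components from endpoints, multiplication by a number, the norm, the unit vector, and addition) translate directly into a few lines of code. The following Python fragment is only an illustrative sketch: the helper names `components`, `scale`, `add`, `norm`, and `unit` are hypothetical and are not part of the text. It recomputes the numbers of Example 11.4 and checks one of the distributive rules on sample vectors.

```python
import math

def components(p1, p2):
    """Components of the vector from point p1 to point p2 (Example 11.3)."""
    return tuple(q - p for p, q in zip(p1, p2))

def scale(s, a):
    """Multiply every component of a by the number s (Definition 11.5)."""
    return tuple(s * ai for ai in a)

def add(a, b):
    """Componentwise sum of two vectors (Definition 11.7)."""
    return tuple(ai + bi for ai, bi in zip(a, b))

def norm(a):
    """Length of a vector (Definition 11.3)."""
    return math.sqrt(sum(ai * ai for ai in a))

def unit(a):
    """Unit vector parallel to a nonzero vector a."""
    n = norm(a)
    if n == 0:
        raise ValueError("the zero vector has no direction")
    return scale(1 / n, a)

# Data of Example 11.4
A, B = (1, 2, 3), (3, 1, 1)
a = components(A, B)
c = scale(-2, a)
print(a, norm(a))   # (2, -1, -2) 3.0
print(c, norm(c))   # (-4, 2, 4) 6.0

# Distributive rule s(a + b) = sa + sb on a sample pair of vectors
b = (-3, 0, 4)
print(scale(3, add(a, b)) == add(scale(3, a), scale(3, b)))   # True
```

Tuples are used here only to keep the sketch dependency-free; with floating-point components, equalities such as the last one should be tested up to a small tolerance rather than exactly.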
The difference of two vectors can be defined as $\mathbf{a} - \mathbf{b} = \mathbf{a} + (-1)\mathbf{b}$. In the parallelogram with adjacent sides $\mathbf{a}$ and $\mathbf{b}$, the sum of the vectors $\mathbf{a}$ and $(-1)\mathbf{b}$ represents the vector that originates from the endpoint of $\mathbf{b}$ and ends at the endpoint of $\mathbf{a}$ because $\mathbf{b} + [\mathbf{a} + (-1)\mathbf{b}] = \mathbf{a}$ in accordance with the geometrical rule for adding vectors; that is, $\mathbf{a} \pm \mathbf{b}$ are the two diagonals of the parallelogram. The procedure is illustrated in Figure 11.7 (left panel).

FIGURE 11.7. Left: Subtraction of two vectors. The difference $\mathbf{a} - \mathbf{b}$ is viewed as the sum of $\mathbf{a}$ and $-\mathbf{b}$, the vector that has the direction opposite to $\mathbf{b}$ and the same length as $\mathbf{b}$. The parallelogram rule for adding $\mathbf{a}$ and $-\mathbf{b}$ shows that the difference $\mathbf{a} - \mathbf{b} = \mathbf{a} + (-\mathbf{b})$ is the vector that originates from the terminal point of $\mathbf{b}$ and ends at the terminal point of $\mathbf{a}$ if $\mathbf{a}$ and $\mathbf{b}$ are adjacent sides of a parallelogram; that is, the sum $\mathbf{a} + \mathbf{b}$ and the difference $\mathbf{a} - \mathbf{b}$ are the two diagonals of the parallelogram. Right: Illustration to Study Problem 11.6. Any vector in a plane can always be represented as a linear combination of two nonparallel vectors.

EXAMPLE 11.5. An object travels 3 seconds with velocity $\mathbf{v} = (1, 2, 4)$, where the components are given in meters per second, and then 2 seconds with velocity $\mathbf{u} = (2, 4, 1)$. Find the distance between the initial and terminal points of the motion.

SOLUTION: Let the initial and terminal points be $A$ and $B$, respectively. Let $C$ be the point at which the velocity was changed. Then $\overrightarrow{AC} = 3\mathbf{v}$ and $\overrightarrow{CB} = 2\mathbf{u}$. Therefore,
$\overrightarrow{AB} = \overrightarrow{AC} + \overrightarrow{CB} = 3\mathbf{v} + 2\mathbf{u} = 3(1, 2, 4) + 2(2, 4, 1) = (3, 6, 12) + (4, 8, 2) = (7, 14, 14) = 7(1, 2, 2)$.
The distance $|AB|$ is the length (or the norm) of the vector $\overrightarrow{AB}$. So $|AB| = \|7(1, 2, 2)\| = 7\|(1, 2, 2)\| = 7\sqrt{1 + 4 + 4} = 21$ meters. □

72.4. Study Problems.

Problem 11.6. Consider two nonparallel vectors $\mathbf{a}$ and $\mathbf{b}$ in a plane. Show that any vector $\mathbf{c}$ in this plane can be written as a linear combination $\mathbf{c} = t\mathbf{a} + s\mathbf{b}$ for some real $t$ and $s$.
SOLUTION: By parallel transport, the vectors $\mathbf{a}$, $\mathbf{b}$, and $\mathbf{c}$ can be moved so that their initial points coincide. The vectors $t\mathbf{a}$ and $s\mathbf{b}$ are parallel to $\mathbf{a}$ and $\mathbf{b}$, respectively, for all values of $s$ and $t$. Consider the lines $L_a$ and $L_b$ that contain the vectors $\mathbf{a}$ and $\mathbf{b}$, respectively. Construct two lines through the terminal point of $\mathbf{c}$; one is parallel to $L_a$ and the other to $L_b$, as shown in Figure 11.7 (right panel). The points of intersection of these lines with $L_a$ and $L_b$ and the initial and terminal points of $\mathbf{c}$ form the vertices of the parallelogram whose diagonal is $\mathbf{c}$ and whose adjacent sides are parallel to $\mathbf{a}$ and $\mathbf{b}$. Therefore, $\mathbf{a}$ and $\mathbf{b}$ can always be scaled so that $t\mathbf{a}$ and $s\mathbf{b}$ become the adjacent sides of the constructed parallelogram. For a given $\mathbf{c}$, the reals $t$ and $s$ are uniquely defined by the proposed geometrical construction. By the parallelogram rule, $\mathbf{c} = t\mathbf{a} + s\mathbf{b}$. □

Problem 11.7. Find the coordinates of a point $B$ that is at a distance of 6 units of length from the point $A(1, -1, 2)$ in the direction of the vector $\mathbf{v} = (2, 1, -2)$.

SOLUTION: The position vector of the point $A$ is $\mathbf{a} = \overrightarrow{OA} = (1, -1, 2)$. The position vector of the point $B$ is $\mathbf{b} = \mathbf{a} + s\mathbf{v}$, where $s$ is a positive number to be chosen such that the length $|AB| = s\|\mathbf{v}\|$ equals 6. Since $\|\mathbf{v}\| = 3$, one finds $s = 2$. Therefore, $\mathbf{b} = (1, -1, 2) + 2(2, 1, -2) = (5, 1, -2)$. □

Problem 11.8. Consider a straight line segment with the endpoints $A(1, 2, 3)$ and $B(-2, -1, 0)$. Find the coordinates of the point $C$ on the segment such that it is twice as far from $A$ as it is from $B$.

SOLUTION: Let $\mathbf{a} = (1, 2, 3)$, $\mathbf{b} = (-2, -1, 0)$, and $\mathbf{c}$ be the position vectors of $A$, $B$, and $C$, respectively. The question is now to express $\mathbf{c}$ via $\mathbf{a}$ and $\mathbf{b}$. One has $\mathbf{c} = \mathbf{a} + \overrightarrow{AC}$. The vector $\overrightarrow{AC}$ is parallel to $\overrightarrow{AB} = (-3, -3, -3)$ and hence $\overrightarrow{AC} = s\overrightarrow{AB}$. Since $|AC| = 2|CB|$ and $|AC| + |CB| = |AB|$, one has $|AC| = \frac{2}{3}|AB|$ and therefore $s = \frac{2}{3}$. Thus, $\mathbf{c} = \mathbf{a} + \frac{2}{3}\overrightarrow{AB} = \mathbf{a} + \frac{2}{3}(\mathbf{b} - \mathbf{a}) = (-1, 0, 1)$. □

Problem 11.9. In Study Problem 11.6, let $\|\mathbf{a}\| = 1$, $\|\mathbf{b}\| = 2$, and the angle between $\mathbf{a}$ and $\mathbf{b}$ be $2\pi/3$.
Find the coefficients $s$ and $t$ if the vector $\mathbf{c}$ has a norm of 6 and bisects the angle between $\mathbf{a}$ and $\mathbf{b}$.

SOLUTION: It follows from the solution of Study Problem 11.6 that the numbers $s$ and $t$ do not depend on the coordinate system relative to which the components of all the vectors are defined. So choose the coordinate system so that $\mathbf{a}$ is parallel to the $x$ axis and $\mathbf{b}$ lies in the $xy$ plane. With this choice, $\mathbf{a} = (1, 0, 0)$ and $\mathbf{b} = (\|\mathbf{b}\|\cos(2\pi/3),\ \|\mathbf{b}\|\sin(2\pi/3),\ 0) = (-1, \sqrt{3}, 0)$. Similarly, $\mathbf{c}$ is the vector of length $\|\mathbf{c}\| = 6$ that makes the angle $\pi/3$ with the $x$ axis, and therefore $\mathbf{c} = (3, 3\sqrt{3}, 0)$. Equating the corresponding components in the relation $\mathbf{c} = t\mathbf{a} + s\mathbf{b}$, one finds $3 = t - s$ and $3\sqrt{3} = s\sqrt{3}$, or $s = 3$ and $t = 6$. Hence, $\mathbf{c} = 6\mathbf{a} + 3\mathbf{b}$. □

Problem 11.10. Suppose the three coordinate planes are all mirrored. A light ray strikes the mirrors. Determine the direction in which the reflected ray will go.

SOLUTION: Let $\mathbf{u}$ be a vector parallel to the incident ray. Under a reflection from a plane mirror, the component of $\mathbf{u}$ perpendicular to the plane changes its sign. Therefore, after three consecutive reflections, one from each coordinate plane, all three components of $\mathbf{u}$ change their signs, and the reflected ray will go parallel to the incident ray but in the exact opposite direction. For example, suppose the ray is reflected first by the $xz$ plane, then by the $yz$ plane, and finally by the $xy$ plane. In this case, $\mathbf{u} = (u_1, u_2, u_3) \to (u_1, -u_2, u_3) \to (-u_1, -u_2, u_3) \to (-u_1, -u_2, -u_3) = -\mathbf{u}$. □

Remark. This principle is used to design reflectors like the cat's-eyes on bicycles and those that mark the border lines of a road. No matter from which direction such a reflector is illuminated (e.g., by the headlights of a car), it reflects the light in the opposite direction (so that it will always be seen by the driver).

72.5. Exercises.
(1) Find the components of each of the following vectors and their norms:
(i) The vector that has endpoints $A(1, 2, 3)$ and $B(-1, 5, 1)$ and is directed from $A$ to $B$
(ii) The vector that has endpoints $A(1, 2, 3)$ and $B(-1, 5, 1)$ and is directed from $B$ to $A$
(iii) The vector that has the initial point $A(1, 2, 3)$ and the final point $C$ that is the midpoint of the line segment $AB$, where $B = (-1, 5, 1)$
(iv) The position vector of a point $P$ obtained from the point $A(-1, 2, -1)$ by transporting the latter along the vector $\mathbf{u} = (2, 2, 1)$ 3 units of length and then along the vector $\mathbf{w} = (-3, 0, -4)$ 10 units of length
(v) The position vector of the vertex $C$ of a triangle $ABC$ in the $xy$ plane if $A$ is at the origin, $B = (a, 0, 0)$, the angle at the vertex $B$ is $\pi/3$, and $|BC| = 3a$
(2) Let $\mathbf{a}$ and $\mathbf{b}$ be two vectors that are neither parallel nor perpendicular. Sketch each of the following vectors: $\mathbf{a} + 2\mathbf{b}$, $\mathbf{b} - 2\mathbf{a}$, a - jb, and $2\mathbf{a} + 3\mathbf{b}$.
(3) Let $\mathbf{a}$, $\mathbf{b}$, and $\mathbf{c}$ be three vectors in a plane, no two of which are parallel. Sketch each of the following vectors: $\mathbf{a} + (\mathbf{b} - \mathbf{c})$, $(\mathbf{a} + \mathbf{b}) - \mathbf{c}$, $2\mathbf{a} - 3(\mathbf{b} + \mathbf{c})$, and $(2\mathbf{a} - 3\mathbf{b}) - 3\mathbf{c}$.
(4) Let $\mathbf{a} = (2, -1, -2)$ and $\mathbf{b} = (-3, 0, 4)$. Find the unit vectors $\hat{\mathbf{a}}$ and $\hat{\mathbf{b}}$. Express $6\hat{\mathbf{a}} - 15\hat{\mathbf{b}}$ via $\mathbf{a}$ and $\mathbf{b}$.
(5) Let $\mathbf{a}$ and $\mathbf{b}$ be vectors in the $xy$ plane such that their sum $\mathbf{c} = \mathbf{a} + \mathbf{b}$ makes the angle $\pi/3$ with $\mathbf{a}$ and has length twice the length of $\mathbf{a}$. Find $\mathbf{b}$ if $\mathbf{a}$ lies in the first quadrant, makes the angle $\pi/3$ with the positive $x$ axis, and has length 2.
(6) Consider a triangle $ABC$. Let $\mathbf{a}$ be a vector from the vertex $A$ to the midpoint of the side $BC$, let $\mathbf{b}$ be a vector from $B$ to the midpoint of $AC$, and let $\mathbf{c}$ be a vector from $C$ to the midpoint of $AB$. Use vector algebra to find $\mathbf{a} + \mathbf{b} + \mathbf{c}$.
(7) Let $\hat{\mathbf{n}}_k$, $k = 1, 2, \dots, n$, be unit vectors in a plane such that the smallest angle between the two vectors $\hat{\mathbf{n}}_k$ and $\hat{\mathbf{n}}_{k+1}$ is $2\pi/n$. What is the sum $\mathbf{v}_n = \hat{\mathbf{n}}_1 + \hat{\mathbf{n}}_2 + \cdots + \hat{\mathbf{n}}_n$ for an even $n$? Sketch the sum for $n = 1$, $n = 3$, and $n = 5$. Compare the norms $\|\mathbf{v}_n\|$ for $n = 1, 3, 5$.
Investigate the limit of $\mathbf{v}_n$ as $n \to \infty$ by studying the limit of $\|\mathbf{v}_n\|$ as $n \to \infty$.
(8) Let $\hat{\mathbf{n}}_k$, $k = 1, 2, \dots, n$, be unit vectors as defined in exercise (7). Let $\mathbf{w}_k = \hat{\mathbf{n}}_{k+1} - \hat{\mathbf{n}}_k$ for $k = 1, 2, \dots, n - 1$ and $\mathbf{w}_n = \hat{\mathbf{n}}_1 - \hat{\mathbf{n}}_n$. Find the limit of $\|\mathbf{w}_1\| + \|\mathbf{w}_2\| + \cdots + \|\mathbf{w}_n\|$ as $n \to \infty$.
(9) A plane flies at a speed of $v$ mi/h relative to the air. There is a wind blowing at a speed of $u$ mi/h in the direction that makes the angle $\theta$ with the direction in which the plane moves. What is the speed of the plane relative to the ground?
(10) Use vector algebra to show that the line segment joining the midpoints of two sides of a triangle is parallel to the third side and half its length.
(11) Let $\mathbf{a}$ and $\mathbf{b}$ be position vectors in the $xy$ plane. Describe the set of all points in the plane whose position vectors $\mathbf{r}$ satisfy the condition $\|\mathbf{r} - \mathbf{a}\| + \|\mathbf{r} - \mathbf{b}\| = k$, where $k > \|\mathbf{a} - \mathbf{b}\|$.
(12) Let pointlike massive objects be positioned at $P_i$, $i = 1, 2, \dots, n$, and let $m_i$ be the mass at $P_i$. The point $P_0$ is called the center of mass if $m_1\overrightarrow{P_0P_1} + m_2\overrightarrow{P_0P_2} + \cdots + m_n\overrightarrow{P_0P_n} = \mathbf{0}$. Express the position vector $\mathbf{r}_0$ of $P_0$ via the position vectors $\mathbf{r}_i$ of $P_i$. In particular, find the center of mass of three point masses, $m_1 = m_2 = m_3 = m$, located at the vertices of a triangle $ABC$ for $A(1, 2, 3)$, $B(-1, 0, 1)$, and $C(1, 1, -1)$.
(13) Consider the graph $y = f(x)$ of a differentiable function and the line tangent to it at a point $x = a$. Express the components of a vector parallel to the line via $f'(a)$ and find a vector perpendicular to the line. In particular, find such vectors for the graph y =--2 at the point $x = 1$.
(14) Let the vectors $\mathbf{a}$, $\mathbf{b}$, and $\mathbf{c}$ have fixed lengths $a$, $b$, and $c$, respectively, while their directions may be changed. Is it always possible to achieve $\mathbf{a} + \mathbf{b} + \mathbf{c} = \mathbf{0}$? If not, formulate the condition under which it is possible.
(15) Let the vectors $\mathbf{a}$ and $\mathbf{b}$ have fixed lengths, while their directions may be changed. Put $c_\pm = \|\mathbf{a} \pm \mathbf{b}\|$.
Is it always possible to achieve $c_- > c_+$, or $c_- = c_+$, or $c_- < c_+$? If so, give examples of the corresponding relative directions of $\mathbf{a}$ and $\mathbf{b}$.
(16) A point object travels so that its trajectory is in a plane and consists of straight line segments. The object always makes a turn of $90^\circ$ counterclockwise after traveling a distance $d$ and then travels the distance $sd$, $0 < s < 1$, before making the next turn. If the object travels a distance $a$ before the first turn, how far can the object get from the initial point if it keeps moving forever? Hint: Investigate the components of the position vector of the object in an appropriate coordinate system.

73. The Dot Product

DEFINITION 11.8. (Dot Product). The dot product $\mathbf{a} \cdot \mathbf{b}$ of two vectors $\mathbf{a} = (a_1, a_2, a_3)$ and $\mathbf{b} = (b_1, b_2, b_3)$ is a number: $\mathbf{a} \cdot \mathbf{b} = a_1b_1 + a_2b_2 + a_3b_3$.

It follows from this definition that the dot product has the following properties:
$\mathbf{a} \cdot \mathbf{b} = \mathbf{b} \cdot \mathbf{a}$, $(s\mathbf{a}) \cdot \mathbf{b} = s(\mathbf{a} \cdot \mathbf{b})$, $\mathbf{a} \cdot (\mathbf{b} + \mathbf{c}) = \mathbf{a} \cdot \mathbf{b} + \mathbf{a} \cdot \mathbf{c}$,
which hold for any vectors $\mathbf{a}$, $\mathbf{b}$, and $\mathbf{c}$ and a number $s$. The first property states that the order in which two vectors are multiplied in the dot product does not matter; that is, the dot product is commutative. The second property means that the result of the dot product does not depend on whether the vector $\mathbf{a}$ is scaled first and then multiplied by $\mathbf{b}$ or the dot product $\mathbf{a} \cdot \mathbf{b}$ is computed first and the result multiplied by $s$. The third relation shows that the dot product is distributive.

EXAMPLE 11.6. Let $\mathbf{a} = (1, 2, 3)$, $\mathbf{b} = (2, -1, 1)$, and $\mathbf{c} = (1, 1, -1)$. Find $\mathbf{a} \cdot (2\mathbf{b} - 5\mathbf{c})$.

SOLUTION: One has $\mathbf{a} \cdot \mathbf{b} = 1 \cdot 2 + 2 \cdot (-1) + 3 \cdot 1 = 2 - 2 + 3 = 3$ and, similarly, $\mathbf{a} \cdot \mathbf{c} = 1 + 2 - 3 = 0$. By the properties of the dot product, $\mathbf{a} \cdot (2\mathbf{b} - 5\mathbf{c}) = 2\mathbf{a} \cdot \mathbf{b} - 5\mathbf{a} \cdot \mathbf{c} = 6 - 0 = 6$. □

73.1. Geometrical Significance of the Dot Product. As it stands, the dot product is an algebraic rule for calculating a number out of six given numbers that are components of the two vectors involved.
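The algebraic rule of Definition 11.8 is short enough to sketch directly in code. The Python fragment below is an illustration only; the helper names `dot`, `add`, and `scale` are hypothetical and not part of the text. It evaluates the expression of Example 11.6 in two ways: directly from the components of $2\mathbf{b} - 5\mathbf{c}$, and through the distributive and scaling properties of the dot product.

```python
def dot(a, b):
    """Dot product a . b = a1*b1 + a2*b2 + a3*b3 (Definition 11.8)."""
    return sum(ai * bi for ai, bi in zip(a, b))

def add(a, b):
    """Componentwise sum of two vectors."""
    return tuple(ai + bi for ai, bi in zip(a, b))

def scale(s, a):
    """Multiply every component of a by the number s."""
    return tuple(s * ai for ai in a)

# Data of Example 11.6
a, b, c = (1, 2, 3), (2, -1, 1), (1, 1, -1)

# Direct evaluation: build 2b - 5c first, then take the dot product with a
direct = dot(a, add(scale(2, b), scale(-5, c)))

# Evaluation through the properties: a.(2b - 5c) = 2 a.b - 5 a.c
by_rules = 2 * dot(a, b) - 5 * dot(a, c)

print(dot(a, b), dot(a, c))   # 3 0
print(direct, by_rules)       # 6 6
```

Both routes give the same number, as the properties of the dot product require.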
The components of a vector depend on the choice of the coordinate system. Naturally, one should ask whether the numerical value of the dot product depends on the coordinate system relative to which the components of the vectors are determined. It turns out that it does not. Therefore, it represents an intrinsic geometrical quantity associated with the two vectors involved in the product. To elucidate the geometrical significance of the dot product, note first the relation between the dot product and the norm (length) of a vector:
$\mathbf{a} \cdot \mathbf{a} = a_1^2 + a_2^2 + a_3^2 = \|\mathbf{a}\|^2$ or $\|\mathbf{a}\| = \sqrt{\mathbf{a} \cdot \mathbf{a}}$.
Thus, if $\mathbf{a} = \mathbf{b}$ in the dot product, then the latter does not depend on the coordinate system with respect to which the components of $\mathbf{a}$ are defined. Next, consider the triangle whose adjacent sides are the vectors $\mathbf{a}$ and $\mathbf{b}$ as depicted in Figure 11.8 (left panel). Then the other side of the triangle can be represented by the difference $\mathbf{c} = \mathbf{b} - \mathbf{a}$. The squared length of this latter side is
(11.4) $\mathbf{c} \cdot \mathbf{c} = (\mathbf{b} - \mathbf{a}) \cdot (\mathbf{b} - \mathbf{a}) = \mathbf{b} \cdot \mathbf{b} + \mathbf{a} \cdot \mathbf{a} - 2\mathbf{a} \cdot \mathbf{b}$,
where the algebraic properties of the dot product have been used. Therefore, the dot product can be expressed via the geometrical invariants, namely, the lengths of the sides of the triangle:
(11.5) $\mathbf{a} \cdot \mathbf{b} = \frac{1}{2}\left(\|\mathbf{a}\|^2 + \|\mathbf{b}\|^2 - \|\mathbf{c}\|^2\right)$.
This means that the numerical value of the dot product is independent of the choice of a coordinate system. In particular, let us take the coordinate system in which the vector $\mathbf{a}$ is parallel to the $x$ axis and the vector $\mathbf{b}$ lies in the $xy$ plane as shown in Figure 11.8 (right panel). Let the angle between $\mathbf{a}$ and $\mathbf{b}$ be $\theta$. By definition, this angle lies in the interval $[0, \pi]$. When $\theta = 0$, the vectors $\mathbf{a}$ and $\mathbf{b}$ point in the same direction. When $\theta = \pi/2$, they are said to be orthogonal, and they point in opposite directions if $\theta = \pi$. In the chosen coordinate system, $\mathbf{a} = (\|\mathbf{a}\|, 0, 0)$ and $\mathbf{b} = (\|\mathbf{b}\|\cos\theta,\ \|\mathbf{b}\|\sin\theta,\ 0)$. Hence,
(11.6) $\mathbf{a} \cdot \mathbf{b} = \|\mathbf{a}\|\,\|\mathbf{b}\|\cos\theta$ or $\cos\theta = \dfrac{\mathbf{a} \cdot \mathbf{b}}{\|\mathbf{a}\|\,\|\mathbf{b}\|}$.
Equation (11.6) reveals the geometrical significance of the dot product.
It determines the angle between two oriented segments in space. It provides a simple algebraic method to establish the mutual orientation of two straight line segments in space.

FIGURE 11.8. Left: Independence of the dot product from the choice of a coordinate system. The dot product of two vectors that are adjacent sides of a triangle can be expressed via the lengths of the triangle sides as shown in (11.5). Right: Geometrical significance of the dot product. It determines the angle between two vectors as stated in (11.6). Two nonzero vectors are perpendicular if and only if their dot product vanishes. This follows from (11.5) and the Pythagorean theorem: $\|\mathbf{a}\|^2 + \|\mathbf{b}\|^2 = \|\mathbf{c}\|^2$ for a right-angled triangle.

THEOREM 11.1. (Geometrical Significance of the Dot Product). If $\theta$ is the angle between the vectors $\mathbf{a}$ and $\mathbf{b}$, then $\mathbf{a} \cdot \mathbf{b} = \|\mathbf{a}\|\,\|\mathbf{b}\|\cos\theta$. In particular, two nonzero vectors are orthogonal if and only if their dot product vanishes: $\mathbf{a} \perp \mathbf{b} \iff \mathbf{a} \cdot \mathbf{b} = 0$.

For a triangle with sides $a$, $b$, and $c$ and an angle $\theta$ between sides $a$ and $b$, it follows from the relation (11.4) that $c^2 = a^2 + b^2 - 2ab\cos\theta$. For a right-angled triangle, $c^2 = a^2 + b^2$ (the Pythagorean theorem).

EXAMPLE 11.7. Consider a triangle whose vertices are $A(1, 1, 1)$, $B(-1, 2, 3)$, and $C(1, 4, -3)$. Find all the angles of the triangle.

SOLUTION: Let the angles at the vertices $A$, $B$, and $C$ be $\alpha$, $\beta$, and $\gamma$, respectively. Then $\alpha + \beta + \gamma = 180^\circ$. So it is sufficient to find any two angles. To find the angle $\alpha$, define the vectors $\mathbf{a} = \overrightarrow{AB} = (-2, 1, 2)$ and $\mathbf{b} = \overrightarrow{AC} = (0, 3, -4)$. The initial point of these vectors is $A$, and hence the angle between the vectors coincides with $\alpha$. Since $\|\mathbf{a}\| = 3$ and $\|\mathbf{b}\| = 5$, by the geometrical property of the dot product,
$\cos\alpha = \dfrac{\mathbf{a} \cdot \mathbf{b}}{\|\mathbf{a}\|\,\|\mathbf{b}\|} = \dfrac{0 + 3 - 8}{15} = -\dfrac{1}{3} \ \Rightarrow\ \alpha = \cos^{-1}(-1/3) \approx 109.5^\circ$.
To find the angle $\beta$, define the vectors $\mathbf{a} = \overrightarrow{BA} = (2, -1, -2)$ and $\mathbf{b} = \overrightarrow{BC} = (2, 2, -6)$ with the initial point at the vertex $B$.
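Then the angle between these vectors coincides with $\beta$. Since $\|\mathbf{a}\| = 3$, $\|\mathbf{b}\| = 2\sqrt{11}$, and $\mathbf{a} \cdot \mathbf{b} = 4 - 2 + 12 = 14$, one finds $\cos\beta = 14/(6\sqrt{11})$ and $\beta = \cos^{-1}(7/(3\sqrt{11})) \approx 45.3^\circ$. Therefore, $\gamma \approx 180^\circ - 109.5^\circ - 45.3^\circ = 25.2^\circ$. Note that the range of the function $\cos^{-1}$ must be taken from $0^\circ$ to $180^\circ$ in accordance with the definition of the angle between two vectors. □

73.2. Further Geometrical Properties of the Dot Product.

COROLLARY 11.1. (Orthogonal Decomposition of a Vector). Given a nonzero vector $\mathbf{a}$, any vector $\mathbf{b}$ can be uniquely decomposed into the sum of two orthogonal vectors, one of which is parallel to $\mathbf{a}$:
$\mathbf{b} = \mathbf{b}_\perp + \mathbf{b}_\parallel$, $\mathbf{b}_\perp = \mathbf{b} - s\mathbf{a}$, $\mathbf{b}_\parallel = s\mathbf{a}$, $s = \dfrac{\mathbf{a} \cdot \mathbf{b}}{\|\mathbf{a}\|^2}$.

Indeed, given $\mathbf{a}$ and $\mathbf{b}$, put $\mathbf{b}_\perp = \mathbf{b} - s\mathbf{a}$ and require that $\mathbf{b}_\perp$ be orthogonal to $\mathbf{a}$, that is, $\mathbf{a} \cdot \mathbf{b}_\perp = 0$. This condition uniquely determines the coefficient $s$: $\mathbf{a} \cdot \mathbf{b} - s\,\mathbf{a} \cdot \mathbf{a} = 0$ or $s = \mathbf{b} \cdot \mathbf{a}/\|\mathbf{a}\|^2$. The vectors $\mathbf{b}_\perp$ and $\mathbf{b}_\parallel$ are called the orthogonal and parallel components of $\mathbf{b}$ relative to the vector $\mathbf{a}$. The vector $\mathbf{b}_\parallel$ is also called a vector projection of $\mathbf{b}$ onto $\mathbf{a}$. The orthogonal decomposition $\mathbf{b} = \mathbf{b}_\perp + \mathbf{b}_\parallel$ is shown in Figure 11.10 (right panel). If $\hat{\mathbf{a}} = \mathbf{a}/\|\mathbf{a}\|$ is the unit vector along $\mathbf{a}$, then $\mathbf{b}_\parallel = b_\parallel\hat{\mathbf{a}}$, where the coefficient $b_\parallel = \mathbf{a} \cdot \mathbf{b}/\|\mathbf{a}\| = \|\mathbf{b}\|\cos\theta$ is called a scalar projection of $\mathbf{b}$ onto $\mathbf{a}$. It is also easy to see from the figure that $\|\mathbf{b}_\perp\| = \|\mathbf{b}\|\sin\theta$.

EXAMPLE 11.8. Let $\mathbf{a} = (1, -2, 1)$ and $\mathbf{b} = (5, 1, 9)$. Find the orthogonal decomposition $\mathbf{b} = \mathbf{b}_\perp + \mathbf{b}_\parallel$ relative to the vector $\mathbf{a}$.

SOLUTION: One has $\mathbf{a} \cdot \mathbf{b} = 5 - 2 + 9 = 12$ and $\|\mathbf{a}\|^2 = \mathbf{a} \cdot \mathbf{a} = 1 + (-2)^2 + 1 = 6$. Therefore, $s = 12/6 = 2$, $\mathbf{b}_\parallel = s\mathbf{a} = 2(1, -2, 1) = (2, -4, 2)$, and $\mathbf{b}_\perp = \mathbf{b} - \mathbf{b}_\parallel = (5, 1, 9) - (2, -4, 2) = (3, 5, 7)$. The result can also be verified: $\mathbf{a} \cdot \mathbf{b}_\perp = 3 - 10 + 7 = 0$; that is, $\mathbf{a}$ is orthogonal to $\mathbf{b}_\perp$ as required. □

THEOREM 11.2. (Cauchy-Schwarz Inequality). For any two vectors $\mathbf{a}$ and $\mathbf{b}$, $|\mathbf{a} \cdot \mathbf{b}| \le \|\mathbf{a}\|\,\|\mathbf{b}\|$, where the equality is reached only if the vectors are parallel.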
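Before turning to the justification of the inequality, the computations of this section can be checked numerically. The Python sketch below is an illustration under stated assumptions: the helper names `dot`, `norm`, `angle_deg`, and `decompose` are hypothetical, and `decompose` assumes integer components so that exact rational arithmetic (the standard `fractions` module) can be used. It recomputes the angle $\alpha$ of Example 11.7, the decomposition of Example 11.8, and tests the Cauchy-Schwarz inequality on the same data.

```python
import math
from fractions import Fraction

def dot(a, b):
    """Dot product of two vectors given as tuples."""
    return sum(ai * bi for ai, bi in zip(a, b))

def norm(a):
    """Length of a vector, the square root of a . a."""
    return math.sqrt(dot(a, a))

def angle_deg(a, b):
    """Angle between two nonzero vectors in degrees, from relation (11.6)."""
    cos_theta = dot(a, b) / (norm(a) * norm(b))
    # Clamp to [-1, 1] to guard against rounding slightly outside the range
    cos_theta = max(-1.0, min(1.0, cos_theta))
    return math.degrees(math.acos(cos_theta))

def decompose(b, a):
    """Return (b_par, b_perp) relative to a nonzero vector a (Corollary 11.1).

    Assumes integer components so that s = (a . b) / (a . a) is exact.
    """
    aa = dot(a, a)
    if aa == 0:
        raise ValueError("a must be a nonzero vector")
    s = Fraction(dot(a, b), aa)
    b_par = tuple(s * ai for ai in a)
    b_perp = tuple(bi - pi for bi, pi in zip(b, b_par))
    return b_par, b_perp

# Angle at the vertex A in Example 11.7
print(round(angle_deg((-2, 1, 2), (0, 3, -4)), 1))   # 109.5

# Orthogonal decomposition of Example 11.8
a, b = (1, -2, 1), (5, 1, 9)
b_par, b_perp = decompose(b, a)
print(b_par == (2, -4, 2), b_perp == (3, 5, 7))      # True True
print(dot(a, b_perp) == 0)                           # True

# Cauchy-Schwarz inequality on the same pair of vectors
print(abs(dot(a, b)) <= norm(a) * norm(b))           # True
```

Exact fractions are a design choice for this sketch only; with floating-point components one would compute $s$ by ordinary division and test orthogonality up to a small tolerance.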
This inequality is a direct consequence of the first relation in (11.6) and the inequality $|\cos\theta| \le 1$. The equality is reached only when $\theta = 0$ or $\theta = \pi$, that is, when $\mathbf{a}$ and $\mathbf{b}$ are parallel or antiparallel.

THEOREM 11.3. (Triangle Inequality). For any two vectors $\mathbf{a}$ and $\mathbf{b}$, $\|\mathbf{a} + \mathbf{b}\| \le \|\mathbf{a}\| + \|\mathbf{b}\|$.

PROOF. Put $\|\mathbf{a}\| = a$ and $\|\mathbf{b}\| = b$ so that $\mathbf{a} \cdot \mathbf{a} = \|\mathbf{a}\|^2 = a^2$ and, similarly, $\mathbf{b} \cdot \mathbf{b} = b^2$. Using the algebraic rules for the dot product,
$\|\mathbf{a} + \mathbf{b}\|^2 = (\mathbf{a} + \mathbf{b}) \cdot (\mathbf{a} + \mathbf{b}) = a^2 + b^2 + 2\mathbf{a} \cdot \mathbf{b} \le a^2 + b^2 + 2ab = (a + b)^2$,
where the Cauchy-Schwarz inequality has been used. By taking the square root of both sides, the triangle inequality is obtained. □

The triangle inequality has a simple geometrical meaning. Consider a triangle with sides $\mathbf{a}$, $\mathbf{b}$, and $\mathbf{c}$. The directions of the vectors are chosen so that $\mathbf{c} = \mathbf{a} + \mathbf{b}$. The triangle inequality states that the length $\|\mathbf{c}\|$ cannot exceed the total length of the other two sides. It is also clear that the maximal length $\|\mathbf{c}\| = \|\mathbf{a}\| + \|\mathbf{b}\|$ is attained only if $\mathbf{a}$ and $\mathbf{b}$ are parallel and point in the same direction. If they are parallel but point in opposite directions, then the length $\|\mathbf{c}\|$ becomes minimal and coincides with $\big|\,\|\mathbf{a}\| - \|\mathbf{b}\|\,\big|$. The absolute value is necessary as the length of $\mathbf{a}$ may be less than the length of $\mathbf{b}$. This observation can be stated in the following algebraic form:
(11.7) $\big|\,\|\mathbf{a}\| - \|\mathbf{b}\|\,\big| \le \|\mathbf{a} + \mathbf{b}\| \le \|\mathbf{a}\| + \|\mathbf{b}\|$.

73.3. Direction Angles. Consider three unit vectors $\hat{\mathbf{e}}_1 = (1, 0, 0)$, $\hat{\mathbf{e}}_2 = (0, 1, 0)$, and $\hat{\mathbf{e}}_3 = (0, 0, 1)$ that are parallel to the coordinate axes $x$, $y$, and $z$, respectively. By the rules of vector algebra, any vector can be written as the sum of three mutually perpendicular vectors:
$\mathbf{a} = (a_1, a_2, a_3) = a_1\hat{\mathbf{e}}_1 + a_2\hat{\mathbf{e}}_2 + a_3\hat{\mathbf{e}}_3$.
The vectors $a_1\hat{\mathbf{e}}_1$, $a_2\hat{\mathbf{e}}_2$, and $a_3\hat{\mathbf{e}}_3$ are adjacent sides of the rectangular box whose largest diagonal coincides with the vector $\mathbf{a}$ as shown in Figure 11.9 (right panel). Define the angle $\alpha$ that is counted from the positive direction of the $x$ axis toward the vector $\mathbf{a}$. In other words, the angle $\alpha$ is the angle between $\hat{\mathbf{e}}_1$ and $\mathbf{a}$.
Similarly, the angles $\beta$ and $\gamma$ are, by definition, the angles between $\mathbf{a}$ and the unit vectors $\hat{\mathbf{e}}_2$ and $\hat{\mathbf{e}}_3$, respectively. Then, with $a = \|\mathbf{a}\|$,

$$\cos\alpha = \frac{\hat{\mathbf{e}}_1\cdot\mathbf{a}}{\|\hat{\mathbf{e}}_1\|\,\|\mathbf{a}\|} = \frac{a_1}{a}, \qquad \cos\beta = \frac{\hat{\mathbf{e}}_2\cdot\mathbf{a}}{\|\hat{\mathbf{e}}_2\|\,\|\mathbf{a}\|} = \frac{a_2}{a}, \qquad \cos\gamma = \frac{\hat{\mathbf{e}}_3\cdot\mathbf{a}}{\|\hat{\mathbf{e}}_3\|\,\|\mathbf{a}\|} = \frac{a_3}{a}.$$

These cosines are nothing but the components of the unit vector parallel to $\mathbf{a}$:

$$\hat{\mathbf{a}} = \frac{1}{a}\,\mathbf{a} = (\cos\alpha, \cos\beta, \cos\gamma).$$

Thus, the angles $\alpha$, $\beta$, and $\gamma$ uniquely determine the direction of a vector. For this reason, they are called direction angles. Note that they cannot be set independently because they always satisfy the condition $\|\hat{\mathbf{a}}\| = 1$ or

$$\cos^2\alpha + \cos^2\beta + \cos^2\gamma = 1.$$

FIGURE 11.9. Left: The direction angles of a vector are defined as the angles between the vector and three coordinate axes. Each angle ranges between $0$ and $\pi$ and is counted from the corresponding positive coordinate semiaxis toward the vector. The cosines of the direction angles of a vector are components of the unit vector parallel to that vector. Right: The decomposition of a vector $\mathbf{a}$ into the sum of three mutually perpendicular vectors that are parallel to the coordinate axes of a rectangular coordinate system. The vector is the diagonal of the rectangular box whose edges are formed by the vectors in the sum.

In practice (physics, mechanics, etc.), vectors are often specified by their magnitude $\|\mathbf{a}\| = a$ and direction angles. The components are then found by $a_1 = a\cos\alpha$, $a_2 = a\cos\beta$, and $a_3 = a\cos\gamma$.

73.4. Basis Vectors in Space.

A collection of all ordered triples of real numbers $(a_1, a_2, a_3)$ in which the addition, the multiplication by a number, the dot product, and the norm are defined as in vector algebra is also called a three-dimensional Euclidean space. Similarly, a collection of ordered pairs of real numbers is a two-dimensional Euclidean space.
As noted, any element of a three-dimensional Euclidean space can be uniquely represented as a linear combination of three particular elements $\hat{\mathbf{e}}_1 = (1, 0, 0)$, $\hat{\mathbf{e}}_2 = (0, 1, 0)$, and $\hat{\mathbf{e}}_3 = (0, 0, 1)$. They are called the standard basis. There are other triples of vectors with the characteristic property that any vector is a unique linear combination of them. Given any three mutually orthogonal unit vectors $\hat{\mathbf{u}}_i$, $i = 1, 2, 3$, any vector in space can be uniquely expanded into the sum $\mathbf{a} = a_1\hat{\mathbf{u}}_1 + a_2\hat{\mathbf{u}}_2 + a_3\hat{\mathbf{u}}_3$, where the numbers $a_i$ are the scalar projections of $\mathbf{a}$ onto $\hat{\mathbf{u}}_i$. Any such triple of vectors is called an orthonormal basis in space. So with any orthonormal basis one can associate a rectangular coordinate system in which the coordinates of a point are given by the scalar projections of its position vector onto the basis vectors.

DEFINITION 11.9. (Basis in Space). A triple of vectors $\mathbf{u}_1$, $\mathbf{u}_2$, and $\mathbf{u}_3$ is called a basis in space if any vector $\mathbf{a}$ can be uniquely represented as a linear combination of them: $\mathbf{a} = a_1\mathbf{u}_1 + a_2\mathbf{u}_2 + a_3\mathbf{u}_3$.

A basis may contain vectors that are not necessarily orthogonal or unit. For example, a vector in a plane is a unique linear combination of two given nonparallel vectors in the plane (Study Problem 11.6). In this sense, any two nonparallel vectors in a plane define a (nonorthogonal) basis in a plane. Consider three vectors in space. If one of them is a linear combination of the others, then the vectors are in one plane and are called coplanar. Suppose that none of the vectors is a linear combination of the other two; that is, they are not coplanar. Such vectors are called linearly independent. Thus, three vectors $\mathbf{a}$, $\mathbf{b}$, and $\mathbf{c}$ are linearly independent if and only if the vector equation $x\mathbf{a} + y\mathbf{b} + z\mathbf{c} = \mathbf{0}$ has only the trivial solution $x = y = z = 0$ because, otherwise, one of the vectors is a linear combination of the others. For example, if $x \neq 0$, then $\mathbf{a} = -(y/x)\mathbf{b} - (z/x)\mathbf{c}$.
It can be proved that any three linearly independent vectors form a basis in space (Study Problems 11.11 and 11.12).

73.5. Applications of the Dot Product.

Static Problems. According to Newton's mechanics, a pointlike object that is at rest remains at rest if the vector sum of all forces applied to it vanishes. This is the fundamental law of statics:

$$\mathbf{F}_1 + \mathbf{F}_2 + \cdots + \mathbf{F}_n = \mathbf{0}.$$

This vector equation implies three scalar equations that require each of the three components of the total force to vanish. A system of objects is at rest if all its elements are at rest. Thus, for any element of a system at rest, the scalar projection of the total force onto any vector vanishes. In particular, the components of the total force should vanish in any orthonormal basis or, in fact, in any basis in space (see Study Problem 11.11). This principle is used to determine either the magnitudes of some forces or the values of some geometrical parameters at which the system in question is at rest.

EXAMPLE 11.9. Let a ball of mass $m$ be attached to the ceiling by two ropes so that the smallest angle between the first rope and the ceiling is $\theta_1$ and the angle $\theta_2$ is defined similarly for the second rope. Find the magnitudes of the tension forces in the ropes.

SOLUTION: The system in question is shown in Figure 11.10 (left panel). The equilibrium condition is $\mathbf{T}_1 + \mathbf{T}_2 + \mathbf{G} = \mathbf{0}$. Let $\hat{\mathbf{e}}_1$ be a unit vector that is horizontal and directed from left to right and let $\hat{\mathbf{e}}_2$ be a unit vector directed upward. They form an orthonormal basis in the plane. Using the scalar projections, the forces can be expanded in this basis as

$$\mathbf{T}_1 = -T_1\cos\theta_1\,\hat{\mathbf{e}}_1 + T_1\sin\theta_1\,\hat{\mathbf{e}}_2, \qquad \mathbf{T}_2 = T_2\cos\theta_2\,\hat{\mathbf{e}}_1 + T_2\sin\theta_2\,\hat{\mathbf{e}}_2, \qquad \mathbf{G} = -mg\,\hat{\mathbf{e}}_2,$$

where $T_1$ and $T_2$ are the magnitudes of the tension forces.
The scalar projections of the total force onto the horizontal and vertical directions defined by $\hat{\mathbf{e}}_1$ and $\hat{\mathbf{e}}_2$ should vanish:

$$-T_1\cos\theta_1 + T_2\cos\theta_2 = 0, \qquad T_1\sin\theta_1 + T_2\sin\theta_2 - mg = 0.$$

This system is then solved for $T_1$ and $T_2$. By multiplying the first equation by $\sin\theta_1$ and the second by $\cos\theta_1$ and then adding them, one gets $T_2 = mg\cos\theta_1/\sin(\theta_1 + \theta_2)$. Substituting $T_2$ into the first equation, the tension $T_1 = mg\cos\theta_2/\sin(\theta_1 + \theta_2)$ is obtained. □

FIGURE 11.10. Left: Illustration to Example 11.9. At equilibrium, the vector sum of all forces acting on the ball vanishes. The components of the forces are easy to find in the coordinate system in which the $x$ axis is horizontal and the $y$ axis is vertical. Right: The vector $\mathbf{c}$ is the vector projection of a vector $\mathbf{b}$ onto $\mathbf{a}$. The line through the terminal points of $\mathbf{b}$ and $\mathbf{c}$ is perpendicular to $\mathbf{a}$. The scalar projection of $\mathbf{b}$ onto $\mathbf{a}$ is $\|\mathbf{b}\|\cos\theta$, where $\theta$ is the angle between $\mathbf{a}$ and $\mathbf{b}$. It is positive if $\theta < \pi/2$, vanishes if $\theta = \pi/2$, and is negative if $\theta > \pi/2$.

Work Done by a Force. Suppose that an object of mass $m$ moves with speed $v$. The quantity $K = mv^2/2$ is called the kinetic energy of the object. Suppose that the object has moved along a straight line segment from a point $P_1$ to a point $P_2$ under the action of a constant force $\mathbf{F}$. A law of physics states that a change in an object's kinetic energy is equal to the work $W$ done by this force:

$$K_2 - K_1 = \mathbf{F}\cdot\overrightarrow{P_1P_2} = W,$$

where $K_1$ and $K_2$ are the kinetic energies at the initial and final points of the motion, respectively.

EXAMPLE 11.10. Let an object slide on an inclined plane without friction under the gravitational force. The magnitude of the gravitational force is equal to $mg$, where $m$ is the mass of the object and $g$ is a universal constant for all objects near the surface of the Earth, $g \approx 9.8$ m/s$^2$. Find the final speed $v$ of the object if the relative height of the initial and final points is $h$ and the object was initially at rest.
SOLUTION: Choose the coordinate system so that the displacement vector $\overrightarrow{P_1P_2}$ and the gravitational force are in the $xy$ plane. Let the $y$ axis be vertical so that the gravitational force is $\mathbf{F} = (0, -mg, 0)$, where $m$ is the mass and $g$ is the free-fall acceleration. The initial point is chosen to have the coordinates $(0, h, 0)$ while the final point is $(L, 0, 0)$, where $L$ is the distance the object travels in the horizontal direction while sliding. The displacement vector is $\overrightarrow{P_1P_2} = (L, -h, 0)$. Since $K_1 = 0$, one has

$$\frac{mv^2}{2} = W = \mathbf{F}\cdot\overrightarrow{P_1P_2} = (0, -mg, 0)\cdot(L, -h, 0) = mgh \quad\Longrightarrow\quad v = \sqrt{2gh}.$$

Note that the speed is independent of the mass of the object and the inclination angle of the plane (its tangent is $h/L$); it is fully determined by the relative height only. □

73.6. Study Problems.

Problem 11.11. (General Basis in Space). Let $\mathbf{u}_i$, $i = 1, 2, 3$, be three linearly independent (non-coplanar) vectors. Show that they form a basis in space; that is, any vector $\mathbf{a}$ can be uniquely expanded into the sum $\mathbf{a} = s_1\mathbf{u}_1 + s_2\mathbf{u}_2 + s_3\mathbf{u}_3$. The numbers $s_i$ are called components of $\mathbf{a}$ relative to the basis $\mathbf{u}_i$.

SOLUTION: A solution employs the same approach as in the solution of Study Problem 11.6. Let $P_1$ be the parallelogram with adjacent sides $\mathbf{u}_2$ and $\mathbf{u}_3$, $P_2$ be the parallelogram with sides $\mathbf{u}_1$ and $\mathbf{u}_3$, and $P_3$ be the parallelogram with sides $\mathbf{u}_1$ and $\mathbf{u}_2$. Consider a box whose faces are the parallelograms $P_1$, $P_2$, and $P_3$. This box is called a parallelepiped. By the rules of vector algebra, the diagonal of the parallelepiped extended from the common initial point of its edges is the sum $\mathbf{u}_1 + \mathbf{u}_2 + \mathbf{u}_3$. Let the vectors $\mathbf{u}_i$ and a vector $\mathbf{a}$ have a common initial point. Consider three planes through the terminal point of $\mathbf{a}$ that are parallel to the parallelograms $P_1$, $P_2$, $P_3$ and similar planes through the initial point of $\mathbf{a}$.
These six planes enclose a parallelepiped one of whose diagonals is the vector $\mathbf{a}$ and whose adjacent sides are parallel to the vectors $\mathbf{u}_i$ and therefore are proportional to them; that is, the adjacent edges are the vectors $s_1\mathbf{u}_1$, $s_2\mathbf{u}_2$, and $s_3\mathbf{u}_3$, where the numbers $s_1$, $s_2$, and $s_3$ are uniquely determined by the proposed construction of the parallelepiped. Hence, $\mathbf{a} = s_1\mathbf{u}_1 + s_2\mathbf{u}_2 + s_3\mathbf{u}_3$. Note that the same geometrical construction has been used to expand a vector in an orthonormal basis $\hat{\mathbf{e}}_i$ as shown in Figure 11.9. □

Problem 11.12. Let $\mathbf{u}_1 = (1, 1, 0)$, $\mathbf{u}_2 = (1, 0, 1)$, and $\mathbf{u}_3 = (2, 2, 1)$. Show that these vectors are linearly independent and hence form a basis in space. Find the components of $\mathbf{a} = (1, 2, 3)$ in this basis.

SOLUTION: If the vectors $\mathbf{u}_i$ are not linearly independent, then there should exist numbers $c_1$, $c_2$, and $c_3$ that do not simultaneously vanish such that $c_1\mathbf{u}_1 + c_2\mathbf{u}_2 + c_3\mathbf{u}_3 = \mathbf{0}$. Indeed, this algebraic condition means that one of the vectors is a linear combination of the other two whenever the $c_i$ do not vanish simultaneously. This vector equation can be written in components:

$$\begin{cases} c_1 + c_2 + 2c_3 = 0\\ c_1 + 2c_3 = 0\\ c_2 + c_3 = 0 \end{cases} \quad\Longleftrightarrow\quad \begin{cases} c_1 + c_2 + 2c_3 = 0\\ c_1 = -2c_3\\ c_2 = -c_3 \end{cases}$$

The substitution of the last two equations into the first one yields $-c_3 - 2c_3 + 2c_3 = 0$ or $c_3 = 0$ and hence $c_1 = c_2 = 0$. Thus, the vectors $\mathbf{u}_i$ are linearly independent and form a basis in space. For any vector $\mathbf{a} = s_1\mathbf{u}_1 + s_2\mathbf{u}_2 + s_3\mathbf{u}_3$, the numbers $s_i$, $i = 1, 2, 3$, are components of $\mathbf{a}$ in the basis $\mathbf{u}_i$. By writing this vector equation in components relative to the standard basis for $\mathbf{a} = (1, 2, 3)$, the system of equations is obtained:

$$\begin{cases} s_1 + s_2 + 2s_3 = 1\\ s_1 + 2s_3 = 2\\ s_2 + s_3 = 3 \end{cases}$$

[...] $a/2$ and $c > b/2$). Find the angles between the line segments connecting $P$ with the vertices of the triangle. Hint: Consider vectors with the initial point $P$ and terminal points at the vertices of the triangle.

(16) Show that the vectors $\mathbf{u}_1 = (1, 1, 2)$, $\mathbf{u}_2 = (1, -1, 0)$, and $\mathbf{u}_3 = (2, 2, -2)$ are mutually orthogonal.
For a vector $\mathbf{a} = (4, 3, 4)$, find the scalar orthogonal projections of $\mathbf{a}$ onto $\mathbf{u}_i$, $i = 1, 2, 3$, and the numbers $s_i$ such that $\mathbf{a} = s_1\mathbf{u}_1 + s_2\mathbf{u}_2 + s_3\mathbf{u}_3$.

(17) A point object traveled 3 meters from a point $A$ in a particular direction, then it changed the direction by $60^\circ$ and traveled 4 meters, and then it changed the direction again so that it was traveling at $60^\circ$ with each of the previous two directions. If the last stretch was 2 meters long, how far from $A$ is the object?

(18) Two balls of the same mass $m$ are connected by a piece of rope of length $h$. Then the balls are attached to different points on a horizontal ceiling by a piece of rope with the same length $h$ so that the distance $L$ between the points is greater than $h$ but less than $3h$. Find the equilibrium positions of the balls and the magnitude of tension forces in the ropes.

(19) A ball of mass $m$ is attached by three ropes of the same length $a$ to a horizontal ceiling so that the attachment points on the ceiling form a triangle with sides of length $a$. Find the magnitude of the tension force in the ropes.

(20) Four dogs are at the vertices of a square. Each dog starts running toward its neighbor on the right. The dogs run with the same speed $v$. At every moment of time, each dog keeps running in the direction of its right neighbor (its velocity vector always points to the neighbor). Eventually, the dogs meet in the center of the square. When will this happen if the sides of the square have length $a$? What is the distance traveled by each dog? Hint: Is there a particular direction relative to which the velocity vector of a dog has the same component at each moment of time?

74. The Cross Product

74.1. Determinant of a Square Matrix.

DEFINITION 11.10. The determinant of a $2\times 2$ matrix is the number computed by the following rule:

$$\det\begin{pmatrix} a_{11} & a_{12}\\ a_{21} & a_{22} \end{pmatrix} = a_{11}a_{22} - a_{12}a_{21},$$

that is, the product of the diagonal elements minus the product of the off-diagonal elements.

DEFINITION 11.11.
The determinant of a $3\times 3$ matrix $A$ is the number obtained by the following rule:

$$\det\begin{pmatrix} a_{11} & a_{12} & a_{13}\\ a_{21} & a_{22} & a_{23}\\ a_{31} & a_{32} & a_{33} \end{pmatrix} = a_{11}\det A_{11} - a_{12}\det A_{12} + a_{13}\det A_{13} = \sum_{k=1}^{3} (-1)^{k+1} a_{1k}\det A_{1k},$$

$$A_{11} = \begin{pmatrix} a_{22} & a_{23}\\ a_{32} & a_{33} \end{pmatrix}, \qquad A_{12} = \begin{pmatrix} a_{21} & a_{23}\\ a_{31} & a_{33} \end{pmatrix}, \qquad A_{13} = \begin{pmatrix} a_{21} & a_{22}\\ a_{31} & a_{32} \end{pmatrix},$$

where the matrices $A_{1k}$, $k = 1, 2, 3$, are obtained from the original matrix $A$ by removing the row and column containing the element $a_{1k}$. It is straightforward to verify that the determinant can be expanded over any row or column:

$$\det A = \sum_{k=1}^{3} (-1)^{m+k} a_{mk}\det A_{mk} \quad\text{for any } m = 1, 2, 3,$$

$$\det A = \sum_{m=1}^{3} (-1)^{m+k} a_{mk}\det A_{mk} \quad\text{for any } k = 1, 2, 3,$$

where the matrix $A_{mk}$ is obtained from $A$ by removing the row and column containing $a_{mk}$. This definition of the determinant is extended recursively to $N\times N$ square matrices by letting $k$ and $m$ range over $1, 2, \ldots, N$.

In particular, the determinant of a triangular matrix (i.e., a matrix all of whose elements either above or below the diagonal vanish) is the product of its diagonal elements:

$$\det\begin{pmatrix} a_1 & b & c\\ 0 & a_2 & d\\ 0 & 0 & a_3 \end{pmatrix} = \det\begin{pmatrix} a_1 & 0 & 0\\ b & a_2 & 0\\ c & d & a_3 \end{pmatrix} = a_1a_2a_3$$

for any numbers $b$, $c$, and $d$. Also, it follows from the expansion of the determinant over any column or row that, if any two rows or any two columns are swapped in the matrix, its determinant changes sign. For $2\times 2$ matrices, this is easy to see directly from Definition 11.10. In general, if the matrix $B$ is obtained from $A$ by swapping the first and second rows, that is, $b_{1k} = a_{2k}$ and $b_{2k} = a_{1k}$, then the matrices $B_{2k}$ and $A_{1k}$ coincide and so do their determinants. By expanding $\det B$ over its second row $b_{2k} = a_{1k}$, one infers that

$$\det B = \sum_{k=1}^{3} (-1)^{2+k} b_{2k}\det B_{2k} = \sum_{k=1}^{3} (-1)^{2+k} a_{1k}\det A_{1k} = -\sum_{k=1}^{3} (-1)^{1+k} a_{1k}\det A_{1k} = -\det A.$$

This argument can be applied to any two rows or columns in a square matrix of any dimension.

EXAMPLE 11.11. Calculate $\det A$, where

$$A = \begin{pmatrix} 1 & 2 & 3\\ 0 & 1 & 3\\ -1 & 2 & 1 \end{pmatrix}.$$

SOLUTION: Expanding the determinant over the first row yields

$$\det A = 1\,(1\cdot 1 - 2\cdot 3) - 2\,(0\cdot 1 - (-1)\cdot 3) + 3\,(0\cdot 2 - (-1)\cdot 1) = -8.$$
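The expansion over a row can also be checked numerically. The following Python sketch is only an illustration under stated assumptions: the helper names `minor` and `det` are hypothetical and not part of the text, and rows are indexed from 0 rather than from 1, which does not change the sign factor $(-1)^{m+k}$. It applies the recursive rule of Definition 11.11 to the matrix of Example 11.11, expanding over each row in turn.

```python
def minor(matrix, row, col):
    """Return the matrix with the given row and column removed."""
    return [
        [value for j, value in enumerate(r) if j != col]
        for i, r in enumerate(matrix)
        if i != row
    ]


def det(matrix, row=0):
    """Determinant by cofactor expansion over the chosen row (0-based)."""
    n = len(matrix)
    if n == 1:
        return matrix[0][0]
    # Sum of (-1)^(row+k) * a_{row,k} * det(A_{row,k}); the smaller
    # determinants are always expanded over their first row.
    return sum(
        (-1) ** (row + k) * matrix[row][k] * det(minor(matrix, row, k))
        for k in range(n)
    )


A = [[1, 2, 3],
     [0, 1, 3],
     [-1, 2, 1]]

# Expansion over the first, second, and third rows.
print([det(A, row=m) for m in range(3)])  # prints [-8, -8, -8]
```

The recursion mirrors the definition directly, so it is suitable only for small matrices; its cost grows factorially with the matrix size.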
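Alternatively, expanding the determinant over the second row yields the same result:

$$\det A = -0\,(2\cdot 1 - 3\cdot 2) + 1\,(1\cdot 1 - (-1)\cdot 3) - 3\,(1\cdot 2 - (-1)\cdot 2) = -8.$$

One can check that the same result can be obtained by expanding the determinant over any row or column. □

74.2. The Cross Product of Two Vectors.

DEFINITION 11.12. (Cross Product). Let $\hat{\mathbf{e}}_1 = (1, 0, 0)$, $\hat{\mathbf{e}}_2 = (0, 1, 0)$, and $\hat{\mathbf{e}}_3 = (0, 0, 1)$. The cross product of two vectors $\mathbf{a} = (a_1, a_2, a_3)$ and $\mathbf{b} = (b_1, b_2, b_3)$ is a vector that is the determinant of the formal matrix expanded over the first row:

$$\mathbf{a}\times\mathbf{b} = \det\begin{pmatrix} \hat{\mathbf{e}}_1 & \hat{\mathbf{e}}_2 & \hat{\mathbf{e}}_3\\ a_1 & a_2 & a_3\\ b_1 & b_2 & b_3 \end{pmatrix} = \det\begin{pmatrix} a_2 & a_3\\ b_2 & b_3 \end{pmatrix}\hat{\mathbf{e}}_1 - \det\begin{pmatrix} a_1 & a_3\\ b_1 & b_3 \end{pmatrix}\hat{\mathbf{e}}_2 + \det\begin{pmatrix} a_1 & a_2\\ b_1 & b_2 \end{pmatrix}\hat{\mathbf{e}}_3$$

$$= (a_2b_3 - a_3b_2,\; a_3b_1 - a_1b_3,\; a_1b_2 - a_2b_1). \tag{11.8}$$

Note that the first row of the matrix consists of the unit vectors parallel to the coordinate axes rather than numbers. For this reason, it is referred to as a formal matrix. The use of the determinant is merely a compact way to write the algebraic rule to compute the components of the cross product.

EXAMPLE 11.12. Evaluate the cross product $\mathbf{a}\times\mathbf{b}$ if $\mathbf{a} = (1, 2, 3)$ and $\mathbf{b} = (2, 0, 1)$.

SOLUTION: By definition,

$$(1, 2, 3)\times(2, 0, 1) = \det\begin{pmatrix} \hat{\mathbf{e}}_1 & \hat{\mathbf{e}}_2 & \hat{\mathbf{e}}_3\\ 1 & 2 & 3\\ 2 & 0 & 1 \end{pmatrix} = \det\begin{pmatrix} 2 & 3\\ 0 & 1 \end{pmatrix}\hat{\mathbf{e}}_1 - \det\begin{pmatrix} 1 & 3\\ 2 & 1 \end{pmatrix}\hat{\mathbf{e}}_2 + \det\begin{pmatrix} 1 & 2\\ 2 & 0 \end{pmatrix}\hat{\mathbf{e}}_3$$

$$= (2 - 0)\hat{\mathbf{e}}_1 - (1 - 6)\hat{\mathbf{e}}_2 + (0 - 4)\hat{\mathbf{e}}_3 = 2\hat{\mathbf{e}}_1 + 5\hat{\mathbf{e}}_2 - 4\hat{\mathbf{e}}_3 = (2, 5, -4).$$

The cross product has the following properties that follow from its definition:

$$\mathbf{a}\times\mathbf{b} = -\mathbf{b}\times\mathbf{a}, \qquad (\mathbf{a} + \mathbf{c})\times\mathbf{b} = \mathbf{a}\times\mathbf{b} + \mathbf{c}\times\mathbf{b}, \qquad (s\mathbf{a})\times\mathbf{b} = s(\mathbf{a}\times\mathbf{b}).$$

The first property is obtained by swapping the components of $\mathbf{b}$ and $\mathbf{a}$ in (11.8). Alternatively, recall that the determinant of a matrix changes its sign if two rows are swapped in the matrix (the rows $\mathbf{a}$ and $\mathbf{b}$ in Definition 11.12). So the cross product is skew-symmetric; that is, it is not commutative, and the order in which the vectors are multiplied is essential. Changing the order leads to the opposite vector. In particular, if $\mathbf{b} = \mathbf{a}$, then $\mathbf{a}\times\mathbf{a} = -\mathbf{a}\times\mathbf{a}$ or $2(\mathbf{a}\times\mathbf{a}) = \mathbf{0}$ or $\mathbf{a}\times\mathbf{a} = \mathbf{0}$.

The cross product is distributive according to the second property.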
To prove this, change $a_i$ to $a_i + c_i$, $i = 1, 2, 3$, in (11.8). If a vector $\mathbf{a}$ is scaled by a number $s$ and the resulting vector is multiplied by $\mathbf{b}$, the result is the same as the cross product $\mathbf{a}\times\mathbf{b}$ computed first and then scaled by $s$ (change $a_i$ to $sa_i$ in (11.8) and then factor out $s$).

The double cross product satisfies the so-called bac - cab rule

$$\mathbf{a}\times(\mathbf{b}\times\mathbf{c}) = \mathbf{b}(\mathbf{a}\cdot\mathbf{c}) - \mathbf{c}(\mathbf{a}\cdot\mathbf{b}) \tag{11.9}$$

and the Jacobi identity

$$\mathbf{a}\times(\mathbf{b}\times\mathbf{c}) + \mathbf{b}\times(\mathbf{c}\times\mathbf{a}) + \mathbf{c}\times(\mathbf{a}\times\mathbf{b}) = \mathbf{0}. \tag{11.10}$$

Note that the second and third terms on the left side of (11.10) are obtained from the first by cyclic permutations of the vectors. The proofs of the bac - cab rule and the Jacobi identity are given in Study Problems 11.16 and 11.17. The Jacobi identity implies that, in general,

$$\mathbf{a}\times(\mathbf{b}\times\mathbf{c}) \neq (\mathbf{a}\times\mathbf{b})\times\mathbf{c}.$$

This means that the multiplication of vectors defined by the cross product is not associative, in contrast to multiplication of numbers. This observation is further discussed in Study Problem 11.18.

74.3. Geometrical Significance of the Cross Product.

The above algebraic definition of the cross product uses a particular coordinate system relative to which the components of the vectors are defined. Does the cross product depend on the choice of the coordinate system? To answer this question, one should investigate whether its direction and its magnitude depend on the choice of the coordinate system. Let us first investigate the mutual orientation of the oriented segments $\mathbf{a}$, $\mathbf{b}$, and $\mathbf{a}\times\mathbf{b}$. A simple algebraic calculation leads to the following result:

$$\mathbf{a}\cdot(\mathbf{a}\times\mathbf{b}) = a_1(a_2b_3 - a_3b_2) + a_2(a_3b_1 - a_1b_3) + a_3(a_1b_2 - a_2b_1) = 0.$$

By the skew symmetry of the cross product, it is also concluded that $\mathbf{b}\cdot(\mathbf{a}\times\mathbf{b}) = -\mathbf{b}\cdot(\mathbf{b}\times\mathbf{a}) = 0$. By the geometrical property of the dot product, the cross product must be perpendicular to both vectors $\mathbf{a}$ and $\mathbf{b}$:

$$\mathbf{a}\cdot(\mathbf{a}\times\mathbf{b}) = \mathbf{b}\cdot(\mathbf{a}\times\mathbf{b}) = 0 \quad\Longleftrightarrow\quad \mathbf{a}\times\mathbf{b}\perp\mathbf{a} \ \text{ and } \ \mathbf{a}\times\mathbf{b}\perp\mathbf{b}. \tag{11.11}$$

Let us calculate the length of the cross product. By the definition (11.8),

$$\|\mathbf{a}\times\mathbf{b}\|^2 = (\mathbf{a}\times\mathbf{b})\cdot(\mathbf{a}\times\mathbf{b}) = (a_2b_3 - a_3b_2)^2 + (a_3b_1 - a_1b_3)^2 + (a_1b_2 - a_2b_1)^2$$

$$= (a_1^2 + a_2^2 + a_3^2)(b_1^2 + b_2^2 + b_3^2) - (a_1b_1 + a_2b_2 + a_3b_3)^2 = \|\mathbf{a}\|^2\|\mathbf{b}\|^2 - (\mathbf{a}\cdot\mathbf{b})^2,$$

where the third equality is obtained by computing the squares of the components of the cross product and regrouping terms in the obtained expression. The last equality uses the definitions of the norm and the dot product. Next, recall the geometrical property of the dot product (11.6). If $\theta$ is the angle between the vectors $\mathbf{a}$ and $\mathbf{b}$, then

$$\|\mathbf{a}\times\mathbf{b}\|^2 = \|\mathbf{a}\|^2\|\mathbf{b}\|^2 - \|\mathbf{a}\|^2\|\mathbf{b}\|^2\cos^2\theta = \|\mathbf{a}\|^2\|\mathbf{b}\|^2(1 - \cos^2\theta) = \|\mathbf{a}\|^2\|\mathbf{b}\|^2\sin^2\theta.$$

Since $0 \le \theta \le \pi$, $\sin\theta \ge 0$ and the square root of both sides of this equation can be taken with the result that

$$\|\mathbf{a}\times\mathbf{b}\| = \|\mathbf{a}\|\,\|\mathbf{b}\|\sin\theta.$$

This relation shows that the length of the cross product defined by (11.8) does not depend on the choice of the coordinate system because it is expressed via the geometrical invariants, the lengths of $\mathbf{a}$ and $\mathbf{b}$ and the angle between them.

Now consider the parallelogram with adjacent sides $\mathbf{a}$ and $\mathbf{b}$. If $\|\mathbf{a}\|$ is the length of its base, then $h = \|\mathbf{b}\|\sin\theta$ is its height. Therefore, the norm of the cross product, $\|\mathbf{a}\times\mathbf{b}\| = \|\mathbf{a}\|h = A$, is the area of the parallelogram. Owing to the mutual orientation of the vectors $\mathbf{a}$, $\mathbf{b}$, and $\mathbf{a}\times\mathbf{b} \neq \mathbf{0}$ established in (11.11), and to the fact that their lengths are preserved under rotations of the coordinate system, the coordinate system can be oriented so that $\mathbf{a}$ is along the $x$ axis, $\mathbf{b}$ is in the $xy$ plane, while $\mathbf{a}\times\mathbf{b}$ is parallel to the $z$ axis. In this coordinate system, $\mathbf{a} = (\|\mathbf{a}\|, 0, 0)$ and $\mathbf{b} = (b_1, b_2, 0)$, where $b_1 = \|\mathbf{b}\|\cos\theta$ and $b_2 = \|\mathbf{b}\|\sin\theta$ if $\mathbf{b}$ lies in either the first or second quadrant of the $xy$ plane and $b_2 = -\|\mathbf{b}\|\sin\theta$ if $\mathbf{b}$ lies in either the third or fourth quadrant. In the former case, $\mathbf{a}\times\mathbf{b} = (0, 0, A)$, where $A$ is the area of the parallelogram. In the latter case, the definition (11.8) yields $\mathbf{a}\times\mathbf{b} = (0, 0, -A)$.
It turns out that the direction of the cross product in both cases can be described by a simple rule known as the right-hand rule: If the fingers of the right hand curl in the direction of a rotation from $\mathbf{a}$ toward $\mathbf{b}$ through the smallest angle between them, then the thumb points in the direction of $\mathbf{a}\times\mathbf{b}$.

FIGURE 11.11. Left: Geometrical interpretation of the cross product of two vectors. The cross product is a vector that is perpendicular to both vectors in the product. Its length equals the area of the parallelogram whose adjacent sides are the vectors in the product. If the fingers of the right hand curl in the direction of a rotation from the first vector to the second vector through the smallest angle between them, then the thumb points in the direction of the cross product of the vectors. Right: Illustration to Study Problem 11.15.

In particular, by Definition 11.12, $\hat{\mathbf{e}}_1\times\hat{\mathbf{e}}_2 = (1, 0, 0)\times(0, 1, 0) = (0, 0, 1) = \hat{\mathbf{e}}_3$. If $\mathbf{a}$ is orthogonal to $\mathbf{b}$, then the relative orientation of the triple of vectors $\mathbf{a}$, $\mathbf{b}$, and $\mathbf{a}\times\mathbf{b}$ is the same as that of the standard basis vectors $\hat{\mathbf{e}}_1$, $\hat{\mathbf{e}}_2$, and $\hat{\mathbf{e}}_3$. The stated geometrical properties are depicted in the left panel of Figure 11.11 and summarized in the following theorem.

THEOREM 11.4. (Geometrical Significance of the Cross Product). The cross product $\mathbf{a}\times\mathbf{b}$ of vectors $\mathbf{a}$ and $\mathbf{b}$ is the vector that is perpendicular to both vectors, $\mathbf{a}\times\mathbf{b}\perp\mathbf{a}$ and $\mathbf{a}\times\mathbf{b}\perp\mathbf{b}$, has a magnitude equal to the area of the parallelogram with adjacent sides $\mathbf{a}$ and $\mathbf{b}$, and is directed according to the right-hand rule.

Two useful consequences can be deduced from this theorem.

COROLLARY 11.2. Two nonzero vectors are parallel if and only if their cross product vanishes:

$$\mathbf{a}\times\mathbf{b} = \mathbf{0} \quad\Longleftrightarrow\quad \mathbf{a}\parallel\mathbf{b}.$$

If $\mathbf{a}\times\mathbf{b} = \mathbf{0}$, then the area of the corresponding parallelogram vanishes, $\|\mathbf{a}\times\mathbf{b}\| = 0$, which is only possible if the adjacent sides of the parallelogram are parallel.
Conversely, for two parallel vectors, there is a number $s$ such that $\mathbf{a} = s\mathbf{b}$. Hence, $\mathbf{a}\times\mathbf{b} = (s\mathbf{b})\times\mathbf{b} = s(\mathbf{b}\times\mathbf{b}) = \mathbf{0}$.

If in the cross product $\mathbf{a}\times\mathbf{b}$ the vector $\mathbf{b}$ is changed by adding to it any vector parallel to $\mathbf{a}$, the cross product does not change:

$$\mathbf{a}\times(\mathbf{b} + s\mathbf{a}) = \mathbf{a}\times\mathbf{b} + s(\mathbf{a}\times\mathbf{a}) = \mathbf{a}\times\mathbf{b}.$$

Let $\mathbf{b} = \mathbf{b}_\perp + \mathbf{b}_\parallel$ be the orthogonal decomposition of $\mathbf{b}$ relative to a nonzero vector $\mathbf{a}$. By Corollary 11.1, $\mathbf{a}\times\mathbf{b}_\parallel = \mathbf{0}$ because $\mathbf{b}_\parallel$ is parallel to $\mathbf{a}$. It is then concluded that the cross product depends only on the component $\mathbf{b}_\perp$ of $\mathbf{b}$ that is orthogonal to $\mathbf{a}$. Thus, $\mathbf{a}\times\mathbf{b} = \mathbf{a}\times\mathbf{b}_\perp$ and $\|\mathbf{a}\times\mathbf{b}\| = \|\mathbf{a}\|\,\|\mathbf{b}_\perp\|$.

Area of a Triangle. One of the most important applications of the cross product is in calculations of the areas of planar figures in space.

COROLLARY 11.3. (Area of a Triangle). Let vectors $\mathbf{a}$ and $\mathbf{b}$ be two sides of a triangle that have the same initial point at a vertex of the triangle. Then the area of the triangle is

$$\text{Area}\,\Delta = \frac{1}{2}\|\mathbf{a}\times\mathbf{b}\|.$$

Indeed, by the geometrical construction, the area of the triangle is half of the area of a parallelogram with adjacent sides $\mathbf{a}$ and $\mathbf{b}$.

EXAMPLE 11.13. Let $A = (1, 1, 1)$, $B = (2, -1, 3)$, and $C = (-1, 3, 1)$. Find the area of the triangle $ABC$ and a vector orthogonal to the plane that contains the triangle.

SOLUTION: Take two vectors with the initial point at any of the vertices of the triangle that form the adjacent sides of the triangle at that vertex. For example, $\mathbf{a} = \overrightarrow{AB} = (1, -2, 2)$ and $\mathbf{b} = \overrightarrow{AC} = (-2, 2, 0)$. Then, by (11.8), $\mathbf{a}\times\mathbf{b} = (-4, -4, -2)$. Since $\|(-4, -4, -2)\| = 2\|(2, 2, 1)\| = 2\cdot 3 = 6$, the area of the triangle $ABC$ is $3$ by Corollary 11.3. The units here are squared units of length used to measure the coordinates of the triangle vertices (e.g., m$^2$ if the coordinates are measured in meters). Any vector in the plane that contains the triangle is a linear combination of $\mathbf{a}$ and $\mathbf{b}$. Therefore, the vector $\mathbf{a}\times\mathbf{b}$ is orthogonal to any such vector and hence to the plane because $\mathbf{a}\times\mathbf{b}$ is orthogonal to both $\mathbf{a}$ and $\mathbf{b}$.
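The computation in Example 11.13 can be reproduced numerically, which is a convenient way to check the components of the cross product and the resulting area. The Python sketch below is an illustration only: the function names `sub`, `dot`, `cross`, `norm`, and `triangle_area` are assumptions introduced here and do not come from the text. The function `cross` implements the component rule (11.8), and `triangle_area` applies Corollary 11.3.

```python
import math


def sub(p, q):
    """Vector from point q to point p, that is, p - q."""
    return tuple(pi - qi for pi, qi in zip(p, q))


def dot(a, b):
    """Dot product of two vectors."""
    return sum(ai * bi for ai, bi in zip(a, b))


def cross(a, b):
    """Cross product by the component rule (11.8)."""
    return (
        a[1] * b[2] - a[2] * b[1],
        a[2] * b[0] - a[0] * b[2],
        a[0] * b[1] - a[1] * b[0],
    )


def norm(a):
    """Euclidean length of a vector."""
    return math.sqrt(dot(a, a))


def triangle_area(A, B, C):
    """Half the area of the parallelogram spanned by AB and AC."""
    return 0.5 * norm(cross(sub(B, A), sub(C, A)))


A, B, C = (1, 1, 1), (2, -1, 3), (-1, 3, 1)
a, b = sub(B, A), sub(C, A)   # the sides AB and AC
n = cross(a, b)               # a vector orthogonal to the plane of ABC

print("a x b =", n)
print("area  =", triangle_area(A, B, C))
# Both dot products vanish when n is orthogonal to the two sides.
print("checks:", dot(n, a), dot(n, b))
```

The two dot products printed at the end provide an independent test of (11.11): a vector that fails to be orthogonal to both sides cannot be the cross product.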
The choice $\mathbf{a} = \overrightarrow{CB}$ and $\mathbf{b} = \overrightarrow{CA}$ or $\mathbf{a} = \overrightarrow{BA}$ and $\mathbf{b} = \overrightarrow{BC}$ would give the same answer (modulo the sign change in the cross product). □

Applications in Physics: Torque. Torque, or moment of force, is the tendency of a force to rotate an object about an axis or a pivot. Just as a force is a push or a pull, a torque can be thought of as a twist. If $\mathbf{r}$ is the vector from a pivot point to the point where a force $\mathbf{F}$ is applied, then the torque is defined as the cross product

$$\boldsymbol{\tau} = \mathbf{r}\times\mathbf{F}.$$

The torque depends only on the component $\mathbf{F}_\perp$ of the force that is orthogonal to $\mathbf{r}$, that is, $\boldsymbol{\tau} = \mathbf{r}\times\mathbf{F}_\perp$. If $\theta$ is the angle between $\mathbf{r}$ and $\mathbf{F}$, then the magnitude of the torque is $\tau = \|\boldsymbol{\tau}\| = rF\sin\theta$; here $r = \|\mathbf{r}\|$ is the distance from the pivot point to the point where the force of magnitude $F$ is applied. One can think of $\mathbf{r}$ as a lever attached to a pivot point, with the force $\mathbf{F}$ applied to the other end of the lever to rotate it about the pivot point. Naturally, the lever would not rotate if the force is parallel to it ($\theta = 0$ or $\theta = \pi$), whereas the maximal rotational effect is created when the force is applied in the direction perpendicular to the lever ($\theta = \pi/2$). The direction of $\boldsymbol{\tau}$ determines the axis about which the lever rotates. By the property of the cross product, this axis is perpendicular to the plane containing the force and position vectors. According to the right-hand rule, the rotation occurs counterclockwise when viewed from the top of the torque vector. When driving a car, a torque is applied to the steering wheel to change the direction of the car. When a bolt is tightened by applying a force to a wrench, the produced turning effect is the torque.

An extended object is said to be rigid if the distance between any two of its points remains constant in time regardless of the external forces exerted on it. Let $P$ be a fixed (pivot) point about which a rigid object can rotate.
Suppose that the forces $\mathbf{F}_i$, $i = 1, 2, \ldots, n$, are applied to the object at the points whose position vectors relative to the point $P$ are $\mathbf{r}_i$. The principle of moments states that a rigid object does not rotate about the point $P$ if it was initially at rest and the total torque vanishes:

$$\boldsymbol{\tau} = \boldsymbol{\tau}_1 + \boldsymbol{\tau}_2 + \cdots + \boldsymbol{\tau}_n = \mathbf{r}_1\times\mathbf{F}_1 + \mathbf{r}_2\times\mathbf{F}_2 + \cdots + \mathbf{r}_n\times\mathbf{F}_n = \mathbf{0}.$$

If, in addition, the total force vanishes, $\mathbf{F} = \mathbf{F}_1 + \mathbf{F}_2 + \cdots + \mathbf{F}_n = \mathbf{0}$, then a rigid object remains at rest and will not rotate about any other pivot point. Indeed, suppose that the torque about $P$ vanishes and let $\mathbf{r}_0$ be a position vector of $P$ relative to another point $P_0$. Then the position vectors of the points at which the forces are applied relative to the new pivot point $P_0$ are $\mathbf{r}_i + \mathbf{r}_0$. The total torque, or the total moment of the forces, about $P_0$ also vanishes:

$$\boldsymbol{\tau}_0 = (\mathbf{r}_1 + \mathbf{r}_0)\times\mathbf{F}_1 + \cdots + (\mathbf{r}_n + \mathbf{r}_0)\times\mathbf{F}_n = \mathbf{r}_1\times\mathbf{F}_1 + \cdots + \mathbf{r}_n\times\mathbf{F}_n + \mathbf{r}_0\times(\mathbf{F}_1 + \cdots + \mathbf{F}_n) = \boldsymbol{\tau} + \mathbf{r}_0\times\mathbf{F} = \mathbf{0}$$

because, by the hypothesis, $\boldsymbol{\tau} = \mathbf{0}$ and $\mathbf{F} = \mathbf{0}$. The conditions $\boldsymbol{\tau} = \mathbf{0}$ and $\mathbf{F} = \mathbf{0}$ comprise the fundamental law of statics for rigid objects.

EXAMPLE 11.14. The ends of rigid rods of length $L_1$ and $L_2$ are rigidly joined at the angle $\pi/2$. A ball of mass $m_1$ is attached to the free end of the rod of length $L_1$ and a ball of mass $m_2$ is attached to the free end of the rod of length $L_2$. The system is hung by the joining point so that the system can rotate freely about it under the gravitational force. Find the equilibrium position of the system if the masses of the rods can be neglected as compared to the masses of the balls.

SOLUTION: The gravitational forces have magnitudes $F_1 = m_1g$ and $F_2 = m_2g$ for the first and second balls, respectively ($g$ is the free-fall acceleration). They are directed downward and therefore lie in the plane that contains the position vectors of the balls relative to the pivot point.
So the torques of the gravitational forces are orthogonal to this plane, and the equilibrium condition $\boldsymbol{\tau}_1 + \boldsymbol{\tau}_2 = \mathbf{0}$ is equivalent to $\tau_1 - \tau_2 = 0$, where $\tau_{1,2}$ are the magnitudes of the torques. The minus sign follows from the right-hand rule by which the vectors $\boldsymbol{\tau}_1$ and $\boldsymbol{\tau}_2$ are parallel but have opposite directions. In other words, the gravitational forces applied to the balls generate opposite rotational moments. When the latter are equal in magnitude, the system is at rest. In the plane that contains the system, let $\theta_1$ and $\theta_2$ be the smallest angles between the rods and a horizontal line. Since the rods are perpendicular, $\theta_1 + \theta_2 = \pi/2$. The angle between the position vector of the first ball and the gravitational force acting on it is $\phi_1 = \pi/2 - \theta_1$, and similarly $\phi_2 = \pi/2 - \theta_2$ is the angle between the position vector of the second ball and the gravitational force acting on it. Therefore, $\tau_1 = L_1F_1\sin\phi_1$ and $\tau_2 = L_2F_2\sin\phi_2$. Owing to the identity $\sin(\pi/2 - \theta) = \cos\theta$, it follows that

$$\tau_1 = \tau_2 \quad\Longrightarrow\quad m_1L_1\cos\theta_1 = m_2L_2\cos\theta_2 \quad\Longrightarrow\quad \tan\theta_1 = \frac{m_1L_1}{m_2L_2},$$

where the relation $\theta_2 = \pi/2 - \theta_1$ has been used. □

74.4. Study Problems.

Problem 11.14. Find the most general vector $\mathbf{r}$ that satisfies the equations $\mathbf{a}\cdot\mathbf{r} = 0$ and $\mathbf{b}\cdot\mathbf{r} = 0$, where $\mathbf{a}$ and $\mathbf{b}$ are nonzero, nonparallel vectors.

SOLUTION: The conditions imposed on $\mathbf{r}$ hold if and only if the vector $\mathbf{r}$ is orthogonal to both vectors $\mathbf{a}$ and $\mathbf{b}$. Therefore, it must be parallel to their cross product. Thus, $\mathbf{r} = t(\mathbf{a}\times\mathbf{b})$ for any real $t$. □

Problem 11.15. Use geometrical means to find the cross products of the unit vectors parallel to the coordinate axes.

SOLUTION: Consider $\hat{\mathbf{e}}_1\times\hat{\mathbf{e}}_2$. Since $\hat{\mathbf{e}}_1\perp\hat{\mathbf{e}}_2$ and $\|\hat{\mathbf{e}}_1\| = \|\hat{\mathbf{e}}_2\| = 1$, their cross product must be a unit vector perpendicular to both $\hat{\mathbf{e}}_1$ and $\hat{\mathbf{e}}_2$. There are only two such vectors, $\pm\hat{\mathbf{e}}_3$. By the right-hand rule, $\hat{\mathbf{e}}_1\times\hat{\mathbf{e}}_2 = \hat{\mathbf{e}}_3$. Similarly, the other cross products are shown to be obtained by cyclic permutations of the indices 1, 2, and 3 in the above relation.
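A permutation of any two indices leads to a change in sign (e.g., $\hat{\mathbf{e}}_2\times\hat{\mathbf{e}}_1 = -\hat{\mathbf{e}}_3$). Since a cyclic permutation of three indices $\{ijk\} \to \{kij\}$ (and so on) consists of two permutations of any two indices, the relation between the unit vectors can be cast in the form

$$\hat{\mathbf{e}}_i = \hat{\mathbf{e}}_j\times\hat{\mathbf{e}}_k, \qquad \{ijk\} = \{123\} \text{ and cyclic permutations}.$$

Problem 11.16. Prove the bac - cab rule (11.9).

SOLUTION: If $\mathbf{c}$ and $\mathbf{b}$ are parallel, $\mathbf{b} = s\mathbf{c}$ for some real $s$, then the relation is true because both its sides vanish. If $\mathbf{c}$ and $\mathbf{b}$ are not parallel, then, by the remark after Corollary 11.2, the double cross product $\mathbf{a}\times(\mathbf{b}\times\mathbf{c})$ depends only on the component of $\mathbf{a}$ that is orthogonal to $\mathbf{b}\times\mathbf{c}$. The right side of (11.9) also depends only on this component because $\mathbf{b}\times\mathbf{c}$ is orthogonal to both $\mathbf{b}$ and $\mathbf{c}$. This component lies in the plane containing $\mathbf{b}$ and $\mathbf{c}$ and hence is a linear combination of them (see Study Problem 11.6). So, without loss of generality, $\mathbf{a} = t\mathbf{b} + p\mathbf{c}$. Also, $\mathbf{b}\times\mathbf{c} = \mathbf{b}\times\mathbf{c}_\perp$, where $\mathbf{c}_\perp = \mathbf{c} - s\mathbf{b}$, $s = \mathbf{c}\cdot\mathbf{b}/\|\mathbf{b}\|^2$, is the component of $\mathbf{c}$ orthogonal to $\mathbf{b}$ (note $\mathbf{b}\cdot\mathbf{c}_\perp = 0$). The vectors $\mathbf{b}$, $\mathbf{c}_\perp$, and $\mathbf{b}\times\mathbf{c}_\perp$ are mutually orthogonal and oriented according to the right-hand rule. In particular, $\|\mathbf{b}\times\mathbf{c}_\perp\| = \|\mathbf{b}\|\,\|\mathbf{c}_\perp\|$. By applying the right-hand rule twice, it is concluded that $\mathbf{b}\times(\mathbf{b}\times\mathbf{c}_\perp)$ has the direction opposite to $\mathbf{c}_\perp$. Since $\mathbf{b}$ and $\mathbf{b}\times\mathbf{c}_\perp$ are orthogonal, $\|\mathbf{b}\times(\mathbf{b}\times\mathbf{c})\| = \|\mathbf{b}\|\,\|\mathbf{b}\times\mathbf{c}_\perp\| = \|\mathbf{b}\|^2\|\mathbf{c}_\perp\|$. Therefore,

$$\mathbf{b}\times(\mathbf{b}\times\mathbf{c}) = -\mathbf{c}_\perp\|\mathbf{b}\|^2 = \mathbf{b}(\mathbf{b}\cdot\mathbf{c}) - \mathbf{c}(\mathbf{b}\cdot\mathbf{b}).$$

By swapping the vectors $\mathbf{b}$ and $\mathbf{c}$ in this equation, one also obtains

$$\mathbf{c}\times(\mathbf{b}\times\mathbf{c}) = -\mathbf{c}\times(\mathbf{c}\times\mathbf{b}) = \mathbf{b}_\perp\|\mathbf{c}\|^2 = -\mathbf{c}(\mathbf{c}\cdot\mathbf{b}) + \mathbf{b}(\mathbf{c}\cdot\mathbf{c}),$$

where $\mathbf{b}_\perp$ is the component of $\mathbf{b}$ orthogonal to $\mathbf{c}$. It follows from these relations that

$$\mathbf{a}\times(\mathbf{b}\times\mathbf{c}) = t\,\mathbf{b}\times(\mathbf{b}\times\mathbf{c}) + p\,\mathbf{c}\times(\mathbf{b}\times\mathbf{c}) = \mathbf{b}[(t\mathbf{b} + p\mathbf{c})\cdot\mathbf{c}] - \mathbf{c}[(t\mathbf{b} + p\mathbf{c})\cdot\mathbf{b}] = \mathbf{b}(\mathbf{a}\cdot\mathbf{c}) - \mathbf{c}(\mathbf{a}\cdot\mathbf{b}).$$

Problem 11.17. Prove the Jacobi identity (11.10).

SOLUTION: By the bac - cab rule (11.9) applied to each term,

$$\mathbf{a}\times(\mathbf{b}\times\mathbf{c}) = \mathbf{b}(\mathbf{a}\cdot\mathbf{c}) - \mathbf{c}(\mathbf{a}\cdot\mathbf{b}),$$
$$\mathbf{b}\times(\mathbf{c}\times\mathbf{a}) = \mathbf{c}(\mathbf{b}\cdot\mathbf{a}) - \mathbf{a}(\mathbf{b}\cdot\mathbf{c}),$$
$$\mathbf{c}\times(\mathbf{a}\times\mathbf{b}) = \mathbf{a}(\mathbf{c}\cdot\mathbf{b}) - \mathbf{b}(\mathbf{a}\cdot\mathbf{c}).$$

By adding these equalities, it is easy to see that the coefficients of each of the vectors $\mathbf{a}$, $\mathbf{b}$, and $\mathbf{c}$ on the right side add up to 0. □

Problem 11.18.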
Consider all vectors in a plane. Any such vector $\mathbf{a}$ can be uniquely determined by specifying its length $a = \|\mathbf{a}\|$ and the angle $\theta_a$ that is counted counterclockwise from the positive $x$ axis toward the vector $\mathbf{a}$ (i.e., $0\le\theta_a<2\pi$). The relation $(a_1, a_2) = (a\cos\theta_a, a\sin\theta_a)$ establishes a one-to-one correspondence between ordered pairs $(a_1, a_2)$ and $(a, \theta_a)$. Define the vector product of two vectors $\mathbf{a}$ and $\mathbf{b}$ as the vector $\mathbf{c}$ for which $c = ab$ and $\theta_c = \theta_a + \theta_b$. Show that, in contrast to the cross product, this product is associative and commutative, that is, that $\mathbf{c}$ does not depend on the order of vectors in the product.

SOLUTION: Let us denote the vector product by a small circle to distinguish it from the dot and cross products, $\mathbf{a}\circ\mathbf{b} = \mathbf{c}$. Since $\mathbf{c} = (ab\cos(\theta_a + \theta_b), ab\sin(\theta_a + \theta_b))$, the commutativity of the vector product $\mathbf{a}\circ\mathbf{b} = \mathbf{b}\circ\mathbf{a}$ follows from the commutativity of the product and addition of numbers: $ab = ba$ and $\theta_a + \theta_b = \theta_b + \theta_a$. Similarly, the associativity of the vector product $(\mathbf{a}\circ\mathbf{b})\circ\mathbf{c} = \mathbf{a}\circ(\mathbf{b}\circ\mathbf{c})$ follows from the associativity of the product and addition of ordinary numbers: $(ab)c = a(bc)$ and $(\theta_a + \theta_b) + \theta_c = \theta_a + (\theta_b + \theta_c)$. □

Remark. The vector product introduced for vectors in a plane is known as the product of complex numbers, which can be viewed as two-dimensional vectors. It is interesting to note that no commutative and associative vector product (i.e., "vector times vector = vector") can be defined in a Euclidean space of more than two dimensions.

Problem 11.19. Let $\mathbf{u}$ be a vector rotating in the $xy$ plane about the $z$ axis. Given a vector $\mathbf{v}$, find the position of $\mathbf{u}$ such that the magnitude of the cross product $\mathbf{v}\times\mathbf{u}$ is maximal.

SOLUTION: For any two vectors, $\|\mathbf{v}\times\mathbf{u}\| = \|\mathbf{v}\|\|\mathbf{u}\|\sin\theta$, where $\theta$ is the angle between $\mathbf{v}$ and $\mathbf{u}$. The magnitude of $\mathbf{v}$ is fixed, while the magnitude of $\mathbf{u}$ does not change when rotating. Therefore, the absolute maximum of the cross-product magnitude is reached when $\sin\theta = 1$ or $\cos\theta = 0$ (i.e., when the vectors are orthogonal). The
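corresponding algebraic condition is $\mathbf{v}\cdot\mathbf{u} = 0$. Since $\mathbf{u}$ is rotating in the $xy$ plane, its components are $\mathbf{u} = (\|\mathbf{u}\|\cos\phi, \|\mathbf{u}\|\sin\phi, 0)$, where $0\le\phi<2\pi$ is the angle counted counterclockwise from the $x$ axis toward the current position of $\mathbf{u}$. Put $\mathbf{v} = (v_1, v_2, v_3)$. Then the direction of $\mathbf{u}$ is determined by the equation $\mathbf{v}\cdot\mathbf{u} = \|\mathbf{u}\|(v_1\cos\phi + v_2\sin\phi) = 0$, and hence $\tan\phi = -v_1/v_2$. This equation has two solutions in the range $0\le\phi<2\pi$: $\phi = -\tan^{-1}(v_1/v_2)$ and $\phi = -\tan^{-1}(v_1/v_2) + \pi$, with the angles taken modulo $2\pi$. Geometrically, these solutions correspond to the case when $\mathbf{u}$ is parallel to the line $v_2y + v_1x = 0$ in the $xy$ plane. □

74.5. Exercises.

(1) Find the cross product $\mathbf{a}\times\mathbf{b}$ if
(i) $\mathbf{a} = (1, 2, 3)$ and $\mathbf{b} = (-1, 0, 1)$
(ii) $\mathbf{a} = (1, -1, 2)$ and $\mathbf{b} = (3, -2, 1)$
(iii) $\mathbf{a} = \hat{\mathbf{e}}_1 + 3\hat{\mathbf{e}}_2 - \hat{\mathbf{e}}_3$ and $\mathbf{b} = 3\hat{\mathbf{e}}_1 - 2\hat{\mathbf{e}}_2 + \hat{\mathbf{e}}_3$
(iv) $\mathbf{a} = 2\mathbf{c} - \mathbf{d}$ and $\mathbf{b} = 3\mathbf{c} + 4\mathbf{d}$, where $\mathbf{c}\times\mathbf{d} = (1, 2, 3)$

(2) Let $\mathbf{a} = (3, 2, 1)$, $\mathbf{b} = (-2, 1, -1)$, and $\mathbf{c} = (1, 0, -1)$. Find $\mathbf{a}\times(\mathbf{b}\times\mathbf{c})$, $\mathbf{b}\times(\mathbf{c}\times\mathbf{a})$, and $\mathbf{c}\times(\mathbf{a}\times\mathbf{b})$.

(3) Let $\mathbf{a}$ be a unit vector orthogonal to $\mathbf{b}$ and $\mathbf{c}$. If $\mathbf{c} = (1, 2, 2)$, find the length of the vector $\mathbf{a}\times[(\mathbf{a}+\mathbf{b})\times(\mathbf{a}+\mathbf{b}+\mathbf{c})]$.

(4) Given two nonparallel vectors $\mathbf{a}$ and $\mathbf{b}$, show that the vectors $\mathbf{a}$, $\mathbf{a}\times\mathbf{b}$, and $\mathbf{a}\times(\mathbf{a}\times\mathbf{b})$ are mutually orthogonal.

(5) Suppose $\mathbf{a}$ lies in the $xy$ plane, its initial point is at the origin, and its terminal point is in the first quadrant of the $xy$ plane. Let $\mathbf{b}$ be parallel to $\hat{\mathbf{e}}_3$. Use the right-hand rule to determine whether the angle between $\mathbf{a}\times\mathbf{b}$ and the unit vectors parallel to the coordinate axes lies in the interval $(0, \pi/2)$ or $(\pi/2, \pi)$ or equals $\pi/2$.

(6) If vectors $\mathbf{a}$, $\mathbf{b}$, and $\mathbf{c}$ have the initial point at the origin and lie, respectively, in the positive quadrants of the $xy$, $yz$, and $xz$ planes, find the octants in which the pairwise cross products of these vectors lie.

(7) Find the area of a triangle $ABC$ for $A(1, 0, 1)$, $B(1, 2, 3)$, and $C(0, 1, 1)$ and a nonzero vector orthogonal to the plane containing the triangle.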
(8) Use the cross product to show that the area of the triangle whose vertices are midpoints of the sides of a triangle with area $A$ is $A/4$.

(9) Consider a triangle whose vertices are midpoints of any three sides of a parallelogram. If the area of the parallelogram is $A$, find the area of the triangle.

(10) Let $A = (1, 2, 1)$ and $B = (-1, 0, 2)$ be vertices of a parallelogram. If the other two vertices are obtained by moving $A$ and $B$ by 3 units of length along the vector $\mathbf{a} = (2, 1, -2)$, find the area of the parallelogram.

(11) Consider four points in space. Suppose that the coordinates of the points are known. Describe a procedure based on the properties of the cross product to determine whether the points are in one plane. In particular, are the points $(1, 2, 3)$, $(-1, 0, 1)$, $(1, 3, -1)$, and $(0, 1, 2)$ in one plane?

(12) Let the sides of a triangle have lengths $a$, $b$, and $c$ and let the angles at the vertices opposite to the sides $a$, $b$, and $c$ be, respectively, $\alpha$, $\beta$, and $\gamma$. Prove that

$$\frac{\sin\alpha}{a} = \frac{\sin\beta}{b} = \frac{\sin\gamma}{c}.$$

Hint: Define the sides as vectors and express the area of the triangle via the vectors at each vertex of the triangle.

(13) Consider a polygon with four vertices $A$, $B$, $C$, and $D$. If the coordinates of the vertices are specified, describe the procedure based on vector algebra to calculate the area of the polygon. In particular, put $A = (0, 0)$, $B = (x_1, y_1)$, $C = (x_2, y_2)$, and $D = (x_3, y_3)$ and express the area via $x_i$ and $y_i$, $i = 1, 2, 3$.

(14) Consider a parallelogram. Construct another parallelogram whose adjacent sides are diagonals of the first parallelogram. Find the relation between the areas of the parallelograms.

(15) Given two nonparallel vectors $\mathbf{a}$ and $\mathbf{b}$, show that any vector $\mathbf{r}$ in space can be written as a linear combination $\mathbf{r} = x\mathbf{a} + y\mathbf{b} + z\,\mathbf{a}\times\mathbf{b}$ and that the numbers $x$, $y$, and $z$ are unique for every $\mathbf{r}$. Express $z$ via $\mathbf{r}$, $\mathbf{a}$, and $\mathbf{b}$. In particular, put $\mathbf{a} = (1, 1, 1)$ and $\mathbf{b} = (1, 1, 0)$.
Find the coefficients $x$, $y$, and $z$ for $\mathbf{r} = (1, 2, 3)$. Hint: See Study Problems 11.6 and 11.14.

(16) A tetrahedron is a solid with four vertices and four triangular faces. Let $\mathbf{v}_1$, $\mathbf{v}_2$, $\mathbf{v}_3$, and $\mathbf{v}_4$ be vectors with lengths equal to the areas of the faces and directions perpendicular to the faces and pointing outward. Show that $\mathbf{v}_1 + \mathbf{v}_2 + \mathbf{v}_3 + \mathbf{v}_4 = \mathbf{0}$.

(17) If $\mathbf{a}\cdot\mathbf{b} = \mathbf{a}\cdot\mathbf{c}$ and $\mathbf{a}\times\mathbf{b} = \mathbf{a}\times\mathbf{c}$, does it follow that $\mathbf{b} = \mathbf{c}$?

(18) Given two nonparallel vectors $\mathbf{a}$ and $\mathbf{b}$, construct three mutually orthogonal unit vectors $\hat{\mathbf{u}}_i$, $i = 1, 2, 3$, one of which is parallel to $\mathbf{a}$. Are such unit vectors unique? In particular, put $\mathbf{a} = (1, 2, 2)$ and $\mathbf{b} = (1, 0, 2)$ and find $\hat{\mathbf{u}}_i$.

(19) Let $\hat{\mathbf{u}}_i$, $i = 1, 2, 3$, be an orthonormal basis in space with the property that $\hat{\mathbf{u}}_3 = \hat{\mathbf{u}}_1\times\hat{\mathbf{u}}_2$. If $a_1$, $a_2$, and $a_3$ are the components of vector $\mathbf{a}$ relative to this basis and $b_1$, $b_2$, and $b_3$ are the components of $\mathbf{b}$, show that the components of the cross product $\mathbf{a}\times\mathbf{b}$ can also be computed by the determinant rule given in Definition 11.12 where $\hat{\mathbf{e}}_i$ are replaced by $\hat{\mathbf{u}}_i$. Hint: Use the bac - cab rule to find all pairwise cross products of the basis vectors $\hat{\mathbf{u}}_i$.

(20) Let the angle between the rigid rods in Example 11.14 be $\varphi$, $0 < \varphi < \pi$. Find the equilibrium position of the system.

(21) Two rigid rods of the same length are rigidly attached to a ball of mass $m$ so that the angle between the rods is $\pi/2$. A ball of mass $2m$ is attached to one of the free ends of the system. The remaining free end is used to hang the system. Find the angle between the rod connecting the pivot point and the ball of mass $m$ and the vertical axis along which the gravitational force is acting. Assume that the masses of the rods can be neglected as compared to $m$.

(22) Three rigid rods of the same length are rigidly joined by one end so that the rods lie in a plane and the other end of each rod is free. Let three balls of masses $m_1$, $m_2$, and $m_3$ be attached to the free ends of the rods.
The system is hung by the joining point and can rotate freely about it. Assume that the masses of the rods can be neglected as compared to the masses of the balls. Find the angles between the rods at which the balls remain in a horizontal plane under gravitational forces acting vertically. Do such angles exist for any masses of the balls?

75. The Triple Product

DEFINITION 11.13. The triple product of three vectors $\mathbf{a}$, $\mathbf{b}$, and $\mathbf{c}$ is a number obtained by the rule: $\mathbf{a}\cdot(\mathbf{b}\times\mathbf{c})$.

It follows from the algebraic definition of the cross product and the definition of the determinant of a $3\times 3$ matrix that

$$\mathbf{a}\cdot(\mathbf{b}\times\mathbf{c}) = a_1\det\begin{pmatrix} b_2 & b_3\\ c_2 & c_3\end{pmatrix} - a_2\det\begin{pmatrix} b_1 & b_3\\ c_1 & c_3\end{pmatrix} + a_3\det\begin{pmatrix} b_1 & b_2\\ c_1 & c_2\end{pmatrix} = \det\begin{pmatrix} a_1 & a_2 & a_3\\ b_1 & b_2 & b_3\\ c_1 & c_2 & c_3\end{pmatrix}.$$

This provides a convenient way to calculate the numerical value of the triple product. If two rows of a matrix are swapped, then its determinant changes sign. Therefore,

$$\mathbf{a}\cdot(\mathbf{b}\times\mathbf{c}) = -\mathbf{b}\cdot(\mathbf{a}\times\mathbf{c}) = -\mathbf{c}\cdot(\mathbf{b}\times\mathbf{a}).$$

This means, in particular, that the absolute value of the triple product is independent of the order of the vectors in the triple product. Also, the value of the triple product is invariant under cyclic permutations of vectors in it: $\mathbf{a}\cdot(\mathbf{b}\times\mathbf{c}) = \mathbf{b}\cdot(\mathbf{c}\times\mathbf{a}) = \mathbf{c}\cdot(\mathbf{a}\times\mathbf{b})$.

FIGURE 11.12. Left: Geometrical interpretation of the triple product as the volume of the parallelepiped whose adjacent sides are the vectors in the product: $h = \|\mathbf{a}\|\cos\theta$, $A = \|\mathbf{b}\times\mathbf{c}\|$, $V = hA = \|\mathbf{a}\|\|\mathbf{b}\times\mathbf{c}\|\cos\theta = \mathbf{a}\cdot(\mathbf{b}\times\mathbf{c})$. Right: Test for the coplanarity of three vectors. Three vectors are coplanar if and only if their triple product vanishes: $\mathbf{a}\cdot(\mathbf{b}\times\mathbf{c}) = 0$.

75.1. Geometrical Significance of the Triple Product. Suppose that $\mathbf{b}$ and $\mathbf{c}$ are not parallel (otherwise, $\mathbf{b}\times\mathbf{c} = \mathbf{0}$). Let $\theta$ be the angle between $\mathbf{a}$ and $\mathbf{b}\times\mathbf{c}$ as shown in Figure 11.12 (left panel). If $\mathbf{a}\perp\mathbf{b}\times\mathbf{c}$ (i.e., $\theta = \pi/2$), then the triple product vanishes. Let $\theta\ne\pi/2$. Consider parallelograms whose adjacent sides are pairs of the vectors $\mathbf{a}$, $\mathbf{b}$, and $\mathbf{c}$.
They enclose a nonrectangular box whose edges are the vectors $\mathbf{a}$, $\mathbf{b}$, and $\mathbf{c}$. A box with parallelogram faces is called a parallelepiped with adjacent sides $\mathbf{a}$, $\mathbf{b}$, and $\mathbf{c}$. The cross product $\mathbf{b}\times\mathbf{c}$ is orthogonal to the face containing the vectors $\mathbf{b}$ and $\mathbf{c}$, whereas $A = \|\mathbf{b}\times\mathbf{c}\|$ is the area of this face of the parallelepiped (the area of the parallelogram with adjacent sides $\mathbf{b}$ and $\mathbf{c}$). By the geometrical property of the dot product, $\mathbf{a}\cdot(\mathbf{b}\times\mathbf{c}) = A\|\mathbf{a}\|\cos\theta$. On the other hand, the distance between the two faces parallel to both $\mathbf{b}$ and $\mathbf{c}$ (or the height of the parallelepiped) is $h = \|\mathbf{a}\|\cos\theta$ if $\theta<\pi/2$ and $h = -\|\mathbf{a}\|\cos\theta$ if $\theta>\pi/2$, or $h = \|\mathbf{a}\||\cos\theta|$. The volume of the parallelepiped is $V = Ah$. This leads to the following theorem.

THEOREM 11.5. (Geometrical Significance of the Triple Product). The volume $V$ of a parallelepiped whose adjacent sides are the vectors $\mathbf{a}$, $\mathbf{b}$, and $\mathbf{c}$ is the absolute value of their triple product: $V = |\mathbf{a}\cdot(\mathbf{b}\times\mathbf{c})|$.

Thus, the triple product is a convenient algebraic tool for calculating volumes. Note also that the vectors can be taken in any order in the triple product to compute the volume because the triple product only changes its sign when two vectors are swapped in it.

EXAMPLE 11.15. Find the volume of a parallelepiped with adjacent sides $\mathbf{a} = (1, 2, 3)$, $\mathbf{b} = (-2, 0, 1)$, and $\mathbf{c} = (2, 1, 2)$.

SOLUTION: The expansion of the determinant over the first row yields

$$\mathbf{b}\cdot(\mathbf{a}\times\mathbf{c}) = \det\begin{pmatrix} -2 & 0 & 1\\ 1 & 2 & 3\\ 2 & 1 & 2\end{pmatrix} = -2(4 - 3) + 1(1 - 4) = -5.$$

Taking the absolute value of the triple product, the volume is obtained, $V = |-5| = 5$. The components of the vectors must be given in the same units of length (e.g., meters). Then the volume is 5 cubic meters.

Any vector in a plane is a linear combination of two nonparallel vectors in the plane. Vectors that lie in a plane are called coplanar (see Section 73.4). Clearly, any two vectors are always coplanar. However, three nonzero vectors do not generally lie in one plane.
None of three non-coplanar vectors can be expressed as a linear combination of the other two vectors. Such vectors are said to be linearly independent. As noted in Section 73.4, any three linearly independent vectors form a basis in space. Simple criteria for three vectors to be either coplanar or linearly independent can be deduced from Theorem 11.5.

COROLLARY 11.4. (Criterion for Three Vectors to Be Coplanar). Three vectors are coplanar if and only if their triple product vanishes:

$$\mathbf{a},\ \mathbf{b},\ \mathbf{c} \text{ are coplanar} \iff \mathbf{a}\cdot(\mathbf{b}\times\mathbf{c}) = 0.$$

Consequently, three nonzero vectors are linearly independent if and only if their triple product does not vanish.

Indeed, if the vectors are coplanar (Figure 11.12, right panel), then the cross product of any two vectors must be perpendicular to the plane where the vectors are and therefore the triple product vanishes. If, conversely, the triple product vanishes, then either $\mathbf{b}\times\mathbf{c} = \mathbf{0}$ or $\mathbf{a}\perp\mathbf{b}\times\mathbf{c}$. In the former case, $\mathbf{b}$ is parallel to $\mathbf{c}$, or $\mathbf{c} = t\mathbf{b}$, and hence $\mathbf{a}$ always lies in a plane with $\mathbf{b}$ and $\mathbf{c}$. In the latter case, all three vectors $\mathbf{a}$, $\mathbf{b}$, and $\mathbf{c}$ are perpendicular to $\mathbf{b}\times\mathbf{c}$ and therefore must be in one plane (orthogonal to $\mathbf{b}\times\mathbf{c}$).

In Section 73.4, it was stated that three linearly independent vectors form a basis in space. The linear independence means that none of the vectors is a linear combination of the other two. Geometrically, this means that the vectors are not coplanar. Therefore, the following simple criterion holds for three vectors to form a basis in space.

COROLLARY 11.5. (Basis in Space). Three vectors $\mathbf{u}_1$, $\mathbf{u}_2$, and $\mathbf{u}_3$ are linearly independent and hence form a basis in space if and only if their triple product does not vanish.

EXAMPLE 11.16. Determine whether the points $A(1, 1, 1)$, $B(2, 0, 2)$, $C(3, 1, -1)$, and $D(0, 2, 3)$ are in the same plane.

SOLUTION: Consider the vectors $\mathbf{a} = \overrightarrow{AB} = (1, -1, 1)$, $\mathbf{b} = \overrightarrow{AC} = (2, 0, -2)$, and $\mathbf{c} = \overrightarrow{AD} = (-1, 1, 2)$. The points in question are in the same plane if and only if the vectors $\mathbf{a}$, $\mathbf{b}$, and $\mathbf{c}$ are coplanar, or $\mathbf{a}\cdot(\mathbf{b}\times\mathbf{c}) = 0$. The evaluation of the triple product yields

$$\mathbf{a}\cdot(\mathbf{b}\times\mathbf{c}) = \det\begin{pmatrix} 1 & -1 & 1\\ 2 & 0 & -2\\ -1 & 1 & 2\end{pmatrix} = 1(0 + 2) + 1(4 - 2) + 1(2 - 0) = 6 \ne 0.$$

Therefore, the points are not in the same plane. □

EXAMPLE 11.17. Let $\mathbf{u}_1 = (1, 2, 3)$, $\mathbf{u}_2 = (2, 1, -6)$, and $\mathbf{u}_3 = (1, 1, -1)$. Can the vector $\mathbf{a} = (1, 1, 1)$ be represented as a linear combination of the vectors $\mathbf{u}_1$, $\mathbf{u}_2$, and $\mathbf{u}_3$?

SOLUTION: Any vector in space is a linear combination of $\mathbf{u}_1$, $\mathbf{u}_2$, and $\mathbf{u}_3$ if they form a basis. Let us verify first whether or not they form a basis. By Corollary 11.5,

$$\mathbf{u}_1\cdot(\mathbf{u}_2\times\mathbf{u}_3) = \det\begin{pmatrix} 1 & 2 & 3\\ 2 & 1 & -6\\ 1 & 1 & -1\end{pmatrix} = 1(-1 + 6) - 2(-2 + 6) + 3(2 - 1) = 0.$$

Therefore, these vectors do not form a basis and are coplanar. Note that $\mathbf{u}_3 = \tfrac{1}{3}(\mathbf{u}_1 + \mathbf{u}_2)$. If the vector $\mathbf{a}$ lies in the same plane as the vectors $\mathbf{u}_1$, $\mathbf{u}_2$, and $\mathbf{u}_3$, then it is a linear combination of any two nonparallel vectors, say, $\mathbf{u}_1$ and $\mathbf{u}_3$. Since the following triple product does not vanish,

$$\mathbf{a}\cdot(\mathbf{u}_1\times\mathbf{u}_3) = \det\begin{pmatrix} 1 & 1 & 1\\ 1 & 2 & 3\\ 1 & 1 & -1\end{pmatrix} = 1(-2 - 3) - 1(-1 - 3) + 1(1 - 2) = -2,$$

the vector $\mathbf{a}$ is not in the plane in which the vectors $\mathbf{u}_1$, $\mathbf{u}_2$, and $\mathbf{u}_3$ lie, and therefore $\mathbf{a}$ is not a linear combination of them. □

75.2. Right- and Left-Handed Coordinate Systems. Consider a rectangular box whose sides are parallel to three given unit vectors $\hat{\mathbf{e}}_i$, $i = 1, 2, 3$. Any vector in space can be viewed as the diagonal of one such box and therefore is uniquely expanded into the sum $\mathbf{r} = x\hat{\mathbf{e}}_1 + y\hat{\mathbf{e}}_2 + z\hat{\mathbf{e}}_3$, where the ordered triple of numbers $(x, y, z)$ is determined by scalar projections of $\mathbf{r}$ onto $\hat{\mathbf{e}}_i$. Thus, with any triple of mutually orthogonal unit vectors one can associate a rectangular coordinate system. The vector $\hat{\mathbf{e}}_1\times\hat{\mathbf{e}}_2$ must be parallel to $\hat{\mathbf{e}}_3$ because the latter is orthogonal to both $\hat{\mathbf{e}}_1$ and $\hat{\mathbf{e}}_2$. Furthermore, owing to the orthogonality of $\hat{\mathbf{e}}_1$ and $\hat{\mathbf{e}}_2$, $\|\hat{\mathbf{e}}_1\times\hat{\mathbf{e}}_2\| = \|\hat{\mathbf{e}}_1\|\|\hat{\mathbf{e}}_2\| = 1$ and hence $\hat{\mathbf{e}}_1\times\hat{\mathbf{e}}_2 = \pm\hat{\mathbf{e}}_3$. Consequently, $\hat{\mathbf{e}}_3\cdot(\hat{\mathbf{e}}_1\times\hat{\mathbf{e}}_2) = \hat{\mathbf{e}}_1\cdot(\hat{\mathbf{e}}_2\times\hat{\mathbf{e}}_3) = \pm 1$ or, owing to the mutual orthogonality of the vectors, $\hat{\mathbf{e}}_2\times\hat{\mathbf{e}}_3 = \pm\hat{\mathbf{e}}_1$.
A coordinate system is called right-handed if $\hat{\mathbf{e}}_1\cdot(\hat{\mathbf{e}}_2\times\hat{\mathbf{e}}_3) = 1$, and a coordinate system for which $\hat{\mathbf{e}}_1\cdot(\hat{\mathbf{e}}_2\times\hat{\mathbf{e}}_3) = -1$ is called left-handed. A right-handed system can be visualized as follows. With the thumb, index, and middle fingers of the right hand at right angles to each other (with the index finger pointed straight), the middle finger points in the direction of $\hat{\mathbf{e}}_1 = \hat{\mathbf{e}}_2\times\hat{\mathbf{e}}_3$ when the thumb represents $\hat{\mathbf{e}}_2$ and the index finger represents $\hat{\mathbf{e}}_3$. A left-handed system is obtained by the reflection $\hat{\mathbf{e}}_1\to-\hat{\mathbf{e}}_1$ and therefore is visualized by the fingers of the left hand in the same way. Since the dot product cannot be changed by rotations and translations in space, the handedness of the coordinate system does not change under simultaneous rotations and translations of the triple of vectors $\hat{\mathbf{e}}_i$ (three mutually orthogonal fingers of the left hand cannot be made to point in the same directions as the corresponding fingers of the right hand by any rotation of the hand). The reflection of all three vectors $\hat{\mathbf{e}}_i\to-\hat{\mathbf{e}}_i$ or just one of them turns a right-handed system into a left-handed one and vice versa. A mirror reflection of a right-handed system is a left-handed one. The coordinate system associated with the standard basis $\hat{\mathbf{e}}_1 = (1, 0, 0)$, $\hat{\mathbf{e}}_2 = (0, 1, 0)$, and $\hat{\mathbf{e}}_3 = (0, 0, 1)$ is right-handed because $\hat{\mathbf{e}}_1\cdot(\hat{\mathbf{e}}_2\times\hat{\mathbf{e}}_3) = 1$.

75.3. Distances Between Lines and Planes. If the lines or planes in space are not intersecting, then how can one find the distance between them? This question can be answered using the geometrical properties of the triple and cross products (Theorems 11.4 and 11.5). Let $S_1$ and $S_2$ be two sets of points in space. Let a point $A_1$ belong to $S_1$, let a point $A_2$ belong to $S_2$, and let $|A_1A_2|$ be the distance between them.

DEFINITION 11.14. (Distance Between Sets in Space). The distance $D$ between two sets of points in space, $S_1$ and $S_2$, is the largest number that is less than or equal to all the numbers $|A_1A_2|$ when the point $A_1$ ranges over $S_1$ and the point $A_2$ ranges over $S_2$.
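For two finite sets of points, the largest lower bound in Definition 11.14 is simply the smallest pairwise distance. The following is a minimal sketch of that special case only; the function name `set_distance` and the sample points are illustrative assumptions, not part of the text, and the sketch says nothing about infinite sets, where the smallest distance need not be attained.

```python
import math
from itertools import product

def set_distance(s1, s2):
    """Smallest pairwise distance between two finite collections of points.

    Hypothetical helper: for finite sets the infimum of |A1 A2| is a minimum.
    """
    return min(math.dist(p, q) for p, q in product(s1, s2))

# Illustrative data (assumed, not from the text).
S1 = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
S2 = [(1.0, 2.0, 0.0), (4.0, 0.0, 0.0)]

D = set_distance(S1, S2)
print(D)  # the closest pair is (1, 0, 0) and (1, 2, 0)
```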
Naturally, if the sets have at least one common point, the distance between them vanishes. The distance between sets may vanish even if the sets have no common points. For example, let $S_1$ be an open interval $(0, 1)$ on, say, the $x$ axis, while $S_2$ is the interval $(1, 2)$ on the same axis. Clearly, the sets have no common points (the point $x = 1$ does not belong to either of them). The distance is the largest number $D$ such that $D\le|x_1 - x_2|$, where $0<x_1<1$ and $1<x_2<2$. The value of $|x_1 - x_2|>0$ can be made smaller than any preassigned positive number by taking $x_1$ and $x_2$ close enough to 1. Since the distance $D\ge 0$, the only possible value is $D = 0$. Intuitively, the sets are separated by a single point that is not an "extended" object, and hence the distance between them should vanish. In other words, there are situations in which no pair of points $A_1\in S_1$ and $A_2\in S_2$ attains the minimum of $|A_1A_2|$. Nevertheless, the distance between the sets is still well defined as the largest number that is less than or equal to all numbers $|A_1A_2|$. Such a number is called the infimum of the set of numbers $|A_1A_2|$ and denoted $\inf|A_1A_2|$. Thus,

$$D = \inf|A_1A_2|, \qquad A_1\in S_1,\ A_2\in S_2.$$

The notation $A_1\in S_1$ stands for "a point $A_1$ belongs to the set $S_1$," or simply "$A_1$ is an element of $S_1$." The definition is illustrated in Figure 11.13 (left panel).

FIGURE 11.13. Left: Distance between two point sets $S_1$ and $S_2$ defined as the largest number that is less than or equal to all distances $|A_1A_2|$, where $A_1$ ranges over all points in $S_1$ and $A_2$ ranges over all points in $S_2$. Right: Distance between two parallel planes (Corollary 11.6). Consider a parallelepiped whose opposite faces lie in the planes $\mathcal{P}_1$ and $\mathcal{P}_2$.
Then the distance $D$ between the planes is the height of the parallelepiped, which can be computed as the ratio $D = V/A$, where $V = |\mathbf{a}\cdot(\mathbf{b}\times\mathbf{c})|$ is the volume of the parallelepiped and $A = \|\mathbf{b}\times\mathbf{c}\|$ is the area of the face.

COROLLARY 11.6. (Distance Between Parallel Planes). The distance between parallel planes $\mathcal{P}_1$ and $\mathcal{P}_2$ is given by

$$D = \frac{|\overrightarrow{AP}\cdot(\overrightarrow{AB}\times\overrightarrow{AC})|}{\|\overrightarrow{AB}\times\overrightarrow{AC}\|},$$

where $A$, $B$, and $C$ are any three points in the plane $\mathcal{P}_1$ that are not on the same line, and $P$ is any point in the plane $\mathcal{P}_2$.

PROOF. Since the points $A$, $B$, and $C$ are not on the same line, the vectors $\mathbf{b} = \overrightarrow{AB}$ and $\mathbf{c} = \overrightarrow{AC}$ are not parallel, and their cross product is a vector perpendicular to the planes (see Figure 11.13, right panel). Consider the parallelepiped with adjacent sides $\mathbf{a} = \overrightarrow{AP}$, $\mathbf{b}$, and $\mathbf{c}$. Two of its faces, the parallelograms with adjacent sides $\mathbf{b}$ and $\mathbf{c}$, lie in the parallel planes, one in $\mathcal{P}_1$ and the other in $\mathcal{P}_2$. The distance between the planes is, by construction, the height of the parallelepiped, which is equal to $V/A_p$, where $A_p$ is the area of the face parallel to $\mathbf{b}$ and $\mathbf{c}$ and $V$ is the volume of the parallelepiped. The conclusion follows from the geometrical properties of the triple and cross products: $V = |\mathbf{a}\cdot(\mathbf{b}\times\mathbf{c})|$ and $A_p = \|\mathbf{b}\times\mathbf{c}\|$. □

Similarly, the distance between two parallel lines $\mathcal{L}_1$ and $\mathcal{L}_2$ can be determined. Two lines are parallel if they are not intersecting and lie in the same plane. Let $A$ and $B$ be any two distinct points on the line $\mathcal{L}_1$ and let $C$ be any point on the line $\mathcal{L}_2$. Consider the parallelogram with adjacent sides $\mathbf{a} = \overrightarrow{AB}$ and $\mathbf{b} = \overrightarrow{AC}$ as depicted in Figure 11.14 (left panel). The distance between the lines is the height of this parallelogram, which is $D = A_p/\|\mathbf{a}\|$, where $A_p = \|\mathbf{a}\times\mathbf{b}\|$ is the area of the parallelogram and $\|\mathbf{a}\|$ is the length of its base.

COROLLARY 11.7. (Distance Between Parallel Lines). The distance between two parallel lines $\mathcal{L}_1$ and $\mathcal{L}_2$ is

$$D = \frac{\|\overrightarrow{AB}\times\overrightarrow{AC}\|}{\|\overrightarrow{AB}\|},$$

where $A$ and $B$ are any two distinct points on the line $\mathcal{L}_1$ and $C$ is any point on the line $\mathcal{L}_2$.
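The formulas of Corollaries 11.6 and 11.7 translate directly into a short computation. The sketch below assumes that points are given as coordinate triples; the helper names (`sub`, `dot`, `cross`, `norm`, `plane_distance`, `parallel_line_distance`) and the sample points are illustrative choices, not notation from the text. The functions do not check that the planes or lines are actually parallel.

```python
import math

def sub(p, q):
    """Vector from point q to point p."""
    return tuple(pi - qi for pi, qi in zip(p, q))

def dot(u, v):
    return sum(ui * vi for ui, vi in zip(u, v))

def cross(u, v):
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])

def norm(u):
    return math.sqrt(dot(u, u))

def plane_distance(A, B, C, P):
    """Corollary 11.6: A, B, C are non-collinear points of the first plane,
    P is any point of the second (parallel) plane."""
    n = cross(sub(B, A), sub(C, A))          # normal to both planes
    return abs(dot(sub(P, A), n)) / norm(n)  # volume / face area

def parallel_line_distance(A, B, C):
    """Corollary 11.7: A, B are distinct points of the first line,
    C is any point of the second (parallel) line."""
    AB = sub(B, A)
    return norm(cross(AB, sub(C, A))) / norm(AB)  # area / base length

# Assumed sample data: the planes z = 0 and z = 3.
d_planes = plane_distance((0, 0, 0), (1, 0, 0), (0, 1, 0), (5, -2, 3))
# Assumed sample data: the x axis and the line y = 2, z = 0.
d_lines = parallel_line_distance((0, 0, 0), (2, 0, 0), (7, 2, 0))
print(d_planes, d_lines)
```

Dividing the triple product by the cross-product norm mirrors the "volume over base area" argument in the proof, so no plane equation is needed.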
DEFINITION 11.15. (Skew Lines). Two lines that are not intersecting and not parallel are called skew lines.

To determine the distance between skew lines $\mathcal{L}_1$ and $\mathcal{L}_2$, consider any two distinct points $A$ and $B$ on $\mathcal{L}_1$ and any two distinct points $C$ and $P$ on $\mathcal{L}_2$. Define the vectors $\mathbf{b} = \overrightarrow{AB}$ and $\mathbf{c} = \overrightarrow{CP}$ that are parallel to lines $\mathcal{L}_1$ and $\mathcal{L}_2$, respectively. Since the lines are not parallel, the cross product $\mathbf{b}\times\mathbf{c}$ does not vanish. The lines $\mathcal{L}_1$ and $\mathcal{L}_2$ lie in the parallel planes perpendicular to $\mathbf{b}\times\mathbf{c}$ (by the geometrical properties of the cross product, $\mathbf{b}\times\mathbf{c}$ is perpendicular to $\mathbf{b}$ and $\mathbf{c}$). The distance between the lines coincides with the distance between these parallel planes. Consider the parallelepiped with adjacent sides $\mathbf{a} = \overrightarrow{AC}$, $\mathbf{b}$, and $\mathbf{c}$ as shown in Figure 11.14 (right panel). The lines lie in the parallel planes that contain the faces of the parallelepiped parallel to the vectors $\mathbf{b}$ and $\mathbf{c}$. Thus, the distance between skew lines is the distance between the parallel planes containing them. By Corollary 11.6, this distance is $D = V/A_p$, where $V$ and $A_p = \|\mathbf{b}\times\mathbf{c}\|$ are, respectively, the volume of the parallelepiped and the area of its base.

FIGURE 11.14. Left: Distance between two parallel lines. Consider a parallelogram whose two parallel sides lie in the lines. Then the distance between the lines is the height of the parallelogram (Corollary 11.7). Right: Distance between skew lines. Consider a parallelepiped whose two nonparallel edges $AB$ and $CP$ in the opposite faces lie in the skew lines $\mathcal{L}_1$ and $\mathcal{L}_2$, respectively. Then the distance between the lines is the height of the parallelepiped, which can be computed as the ratio of the volume and the area of the face (Corollary 11.8).

COROLLARY 11.8. (Distance Between Skew Lines). The distance between two skew lines $\mathcal{L}_1$ and $\mathcal{L}_2$ is

$$D = \frac{|\overrightarrow{AC}\cdot(\overrightarrow{AB}\times\overrightarrow{CP})|}{\|\overrightarrow{AB}\times\overrightarrow{CP}\|},$$

where $A$ and $B$ are any two distinct points on $\mathcal{L}_1$, while $C$ and $P$ are any two distinct points on $\mathcal{L}_2$.
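The formula of Corollary 11.8 can be evaluated numerically as follows. This is a minimal sketch under the assumption that each line is given by two distinct points with coordinate triples; the function name `skew_line_distance`, its helpers, and the sample lines are illustrative and not taken from the text. The sketch rejects lines with parallel direction vectors, for which the formula is not defined.

```python
import math

def sub(p, q):
    """Vector from point q to point p."""
    return tuple(pi - qi for pi, qi in zip(p, q))

def dot(u, v):
    return sum(ui * vi for ui, vi in zip(u, v))

def cross(u, v):
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])

def skew_line_distance(A, B, C, P):
    """Corollary 11.8: line 1 passes through A and B, line 2 through C and P."""
    n = cross(sub(B, A), sub(P, C))   # AB x CP, normal to both lines
    area = math.sqrt(dot(n, n))       # area of the parallelepiped's base
    if area == 0.0:
        raise ValueError("direction vectors are parallel; formula does not apply")
    return abs(dot(sub(C, A), n)) / area  # volume / base area

# Assumed sample data: the x axis, and the line through (0, 0, 1)
# parallel to the y axis. These lines are skew.
D = skew_line_distance((0, 0, 0), (1, 0, 0), (0, 0, 1), (0, 1, 1))
print(D)
```

A zero return value with a nonvanishing cross product indicates that the two lines lie in one plane and therefore meet at a point.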
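As a consequence of the obtained distance formulas, the following criterion for mutual orientation of two lines in space holds.

COROLLARY 11.9. Let $\mathcal{L}_1$ be a line through $A$ and $B\ne A$, and let $\mathcal{L}_2$ be a line through $C$ and $P\ne C$. Then
(1) $\mathcal{L}_1$ and $\mathcal{L}_2$ are parallel if $\overrightarrow{AB}\times\overrightarrow{CP} = \mathbf{0}$
(2) $\mathcal{L}_1$ and $\mathcal{L}_2$ are skew if $\overrightarrow{AC}\cdot(\overrightarrow{AB}\times\overrightarrow{CP})\ne 0$
(3) $\mathcal{L}_1$ and $\mathcal{L}_2$ intersect if $\overrightarrow{AC}\cdot(\overrightarrow{AB}\times\overrightarrow{CP}) = 0$ and $\overrightarrow{AB}\times\overrightarrow{CP}\ne\mathbf{0}$
(4) $\mathcal{L}_1$ and $\mathcal{L}_2$ coincide if $\overrightarrow{AB}\times\overrightarrow{CP} = \mathbf{0}$ and $\overrightarrow{AC}\times\overrightarrow{AB} = \mathbf{0}$

For parallel lines $\mathcal{L}_1$ and $\mathcal{L}_2$, the vectors $\overrightarrow{AB}$ and $\overrightarrow{CP}$ are parallel, and hence their cross product vanishes. If $\overrightarrow{AC}\cdot(\overrightarrow{AB}\times\overrightarrow{CP})\ne 0$, then the lines are not parallel and lie in the parallel faces of a parallelepiped that has nonzero volume. Such lines must be skew. If $\overrightarrow{AB}\times\overrightarrow{CP}\ne\mathbf{0}$, the lines are not parallel. The additional condition $\overrightarrow{AC}\cdot(\overrightarrow{AB}\times\overrightarrow{CP}) = 0$ implies that the distance between them vanishes, and hence the lines must intersect at a point. Finally, when $\overrightarrow{AB}\times\overrightarrow{CP} = \mathbf{0}$, the triple product $\overrightarrow{AC}\cdot(\overrightarrow{AB}\times\overrightarrow{CP})$ vanishes automatically and cannot distinguish coinciding lines from distinct parallel ones. The additional condition $\overrightarrow{AC}\times\overrightarrow{AB} = \mathbf{0}$ implies, by Corollary 11.7, that the distance between the lines vanishes. So they must coincide.

EXAMPLE 11.18. Find the distance between the line through the points $A = (1, 1, 2)$ and $B = (1, 2, 3)$ and the line through $C = (1, 0, -1)$ and $P = (-1, 1, 2)$.

SOLUTION: Let $\mathbf{a} = \overrightarrow{AB} = (0, 1, 1)$ and $\mathbf{b} = \overrightarrow{CP} = (-2, 1, 3)$. Then $\mathbf{a}\times\mathbf{b} = (3 - 1, -(0 + 2), 0 + 2) = (2, -2, 2)\ne\mathbf{0}$. So the lines are not parallel by property (1) in Corollary 11.9. Put $\mathbf{c} = \overrightarrow{AC} = (0, -1, -3)$. Then

$$\mathbf{c}\cdot(\mathbf{a}\times\mathbf{b}) = (0, -1, -3)\cdot(2, -2, 2) = 0 + 2 - 6 = -4 \ne 0.$$

By property (2) in Corollary 11.9, the lines are skew. Next, $\|\mathbf{a}\times\mathbf{b}\| = \|(2, -2, 2)\| = \|2(1, -1, 1)\| = 2\|(1, -1, 1)\| = 2\sqrt{3}$. By Corollary 11.8, the distance between the lines is

$$D = \frac{|\mathbf{c}\cdot(\mathbf{a}\times\mathbf{b})|}{\|\mathbf{a}\times\mathbf{b}\|} = \frac{|-4|}{2\sqrt{3}} = \frac{2}{\sqrt{3}}.$$

□

75.4. Study Problems.

Problem 11.20. (Rotations in Space). Let $\mathbf{a} = (a_1, a_2, a_3)$ and $\mathbf{a}' = (a_1', a_2', a_3')$ be position vectors of a point relative to two coordinate systems related to one another by a rotation. As noted in Section 74.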
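1, the coordinates $a_i'$ and $a_i$ are related by a linear transformation that preserves the length,

$$a_i' = v_{i1}a_1 + v_{i2}a_2 + v_{i3}a_3, \quad i = 1, 2, 3, \qquad \|\mathbf{a}\| = \|\mathbf{a}'\|,$$

and excludes the reflections of all coordinate axes or just one of them. So a rotation is described by a $3\times 3$ matrix $V$ with elements $v_{ij}$. The vectors $\mathbf{v}_i = (v_{i1}, v_{i2}, v_{i3})$ and $\mathbf{w}_i = (v_{1i}, v_{2i}, v_{3i})$, $i = 1, 2, 3$, are the rows and columns of $V$, respectively. Show that the rows of $V$ are mutually orthonormal, the columns of $V$ are also mutually orthonormal, and the determinant of $V$ is unit:

$$(11.12)\qquad \mathbf{v}_i\cdot\mathbf{v}_j = \mathbf{w}_i\cdot\mathbf{w}_j = \begin{cases} 1 & \text{if } i = j\\ 0 & \text{if } i\ne j\end{cases} \qquad\text{and}\qquad \det V = 1.$$

In particular, show that the direct and inverse transformations of the coordinates under a rotation in space are:

$$(11.13)\qquad a_i' = \mathbf{v}_i\cdot\mathbf{a} \qquad\text{and}\qquad a_i = \mathbf{w}_i\cdot\mathbf{a}'.$$

How many independent parameters can the matrix $V$ have for a generic rotation in space?

Hint: If $\hat{\mathbf{u}}_i$ and $\hat{\mathbf{e}}_i$, $i = 1, 2, 3$, are orthonormal unit vectors (bases) associated with the rotated and original coordinate systems, show that the rows of $V$ determine the components of $\hat{\mathbf{u}}_i$ relative to the basis $\hat{\mathbf{e}}_i$, whereas the columns of $V$ determine the components of $\hat{\mathbf{e}}_i$ relative to the basis $\hat{\mathbf{u}}_i$. Use this observation to show that $\hat{\mathbf{u}}_i\cdot\hat{\mathbf{u}}_j = \mathbf{v}_i\cdot\mathbf{v}_j$, $\hat{\mathbf{e}}_i\cdot\hat{\mathbf{e}}_j = \mathbf{w}_i\cdot\mathbf{w}_j$, and

$$(11.14)\qquad \hat{\mathbf{u}}_1\cdot(\hat{\mathbf{u}}_2\times\hat{\mathbf{u}}_3) = \det V\ \hat{\mathbf{e}}_1\cdot(\hat{\mathbf{e}}_2\times\hat{\mathbf{e}}_3).$$

SOLUTION: A vector $\mathbf{a}$ can be expanded into the sum of mutually orthogonal vectors in each of the coordinate systems:

$$\mathbf{a} = a_1\hat{\mathbf{e}}_1 + a_2\hat{\mathbf{e}}_2 + a_3\hat{\mathbf{e}}_3 = a_1'\hat{\mathbf{u}}_1 + a_2'\hat{\mathbf{u}}_2 + a_3'\hat{\mathbf{u}}_3.$$

Note that $\|\mathbf{a}\| = \|\mathbf{a}'\|$ by the orthonormality of the bases $\hat{\mathbf{e}}_i$ and $\hat{\mathbf{u}}_i$. Let us take the dot product of both sides of this equality with $\hat{\mathbf{u}}_i$. Put $v_{ij} = \hat{\mathbf{u}}_i\cdot\hat{\mathbf{e}}_j$. Then

$$a_i' = \hat{\mathbf{u}}_i\cdot\mathbf{a} = \hat{\mathbf{u}}_i\cdot\hat{\mathbf{e}}_1a_1 + \hat{\mathbf{u}}_i\cdot\hat{\mathbf{e}}_2a_2 + \hat{\mathbf{u}}_i\cdot\hat{\mathbf{e}}_3a_3 = v_{i1}a_1 + v_{i2}a_2 + v_{i3}a_3 = \mathbf{v}_i\cdot\mathbf{a}.$$

Thus, the components of the rotation matrix $V$ are the dot products $v_{ij} = \hat{\mathbf{u}}_i\cdot\hat{\mathbf{e}}_j$. For a fixed $i$, the numbers $v_{ij}$ are scalar projections of $\hat{\mathbf{u}}_i$ onto $\hat{\mathbf{e}}_j$, $j = 1, 2, 3$, and hence are components of $\hat{\mathbf{u}}_i$ relative to the basis $\hat{\mathbf{e}}_j$, that is, $\hat{\mathbf{u}}_i = v_{i1}\hat{\mathbf{e}}_1 + v_{i2}\hat{\mathbf{e}}_2 + v_{i3}\hat{\mathbf{e}}_3$. It is then concluded that the $i$th row of $V$ coincides with the components of $\hat{\mathbf{u}}_i$ relative to the basis $\hat{\mathbf{e}}_j$.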
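Similarly,

$$a_j = \hat{\mathbf{e}}_j\cdot\mathbf{a} = \hat{\mathbf{e}}_j\cdot\hat{\mathbf{u}}_1a_1' + \hat{\mathbf{e}}_j\cdot\hat{\mathbf{u}}_2a_2' + \hat{\mathbf{e}}_j\cdot\hat{\mathbf{u}}_3a_3' = v_{1j}a_1' + v_{2j}a_2' + v_{3j}a_3' = \mathbf{w}_j\cdot\mathbf{a}'.$$

For a fixed $j$, the numbers $v_{ij}$ are scalar projections of $\hat{\mathbf{e}}_j$ onto $\hat{\mathbf{u}}_i$, $i = 1, 2, 3$, and hence are components of $\hat{\mathbf{e}}_j$ relative to the basis $\hat{\mathbf{u}}_i$, that is, $\hat{\mathbf{e}}_j = v_{1j}\hat{\mathbf{u}}_1 + v_{2j}\hat{\mathbf{u}}_2 + v_{3j}\hat{\mathbf{u}}_3$. Thus, the $j$th column of $V$ coincides with the components of $\hat{\mathbf{e}}_j$ relative to the basis $\hat{\mathbf{u}}_i$. This proves (11.13). Making use of the expansion of $\hat{\mathbf{u}}_i$ in the orthonormal basis $\hat{\mathbf{e}}_j$ and of the expansion of $\hat{\mathbf{e}}_i$ in the basis $\hat{\mathbf{u}}_j$, one obtains

$$\hat{\mathbf{u}}_i\cdot\hat{\mathbf{u}}_j = v_{i1}v_{j1} + v_{i2}v_{j2} + v_{i3}v_{j3} = \mathbf{v}_i\cdot\mathbf{v}_j,$$
$$\hat{\mathbf{e}}_i\cdot\hat{\mathbf{e}}_j = v_{1i}v_{1j} + v_{2i}v_{2j} + v_{3i}v_{3j} = \mathbf{w}_i\cdot\mathbf{w}_j.$$

The first relation in (11.12) follows from the orthonormality of the basis vectors $\hat{\mathbf{u}}_i$ and the basis vectors $\hat{\mathbf{e}}_i$. Next, consider the cross product

$$\hat{\mathbf{u}}_2\times\hat{\mathbf{u}}_3 = (v_{21}\hat{\mathbf{e}}_1 + v_{22}\hat{\mathbf{e}}_2 + v_{23}\hat{\mathbf{e}}_3)\times(v_{31}\hat{\mathbf{e}}_1 + v_{32}\hat{\mathbf{e}}_2 + v_{33}\hat{\mathbf{e}}_3)$$
$$= \det V_{11}\,(\hat{\mathbf{e}}_2\times\hat{\mathbf{e}}_3) - \det V_{12}\,(\hat{\mathbf{e}}_3\times\hat{\mathbf{e}}_1) + \det V_{13}\,(\hat{\mathbf{e}}_1\times\hat{\mathbf{e}}_2),$$
$$V_{11} = \begin{pmatrix} v_{22} & v_{23}\\ v_{32} & v_{33}\end{pmatrix}, \quad V_{12} = \begin{pmatrix} v_{21} & v_{23}\\ v_{31} & v_{33}\end{pmatrix}, \quad V_{13} = \begin{pmatrix} v_{21} & v_{22}\\ v_{31} & v_{32}\end{pmatrix},$$

where the skew symmetry of the cross product $\hat{\mathbf{e}}_i\times\hat{\mathbf{e}}_j = -\hat{\mathbf{e}}_j\times\hat{\mathbf{e}}_i$ and the definition of the determinant of a $2\times 2$ matrix have been used; the matrices $V_{1i}$, $i = 1, 2, 3$, are obtained by removing from $V$ the row and column that contain $v_{1i}$. Using the symmetry of the triple product under cyclic permutations of the vectors, one has

$$\hat{\mathbf{u}}_1\cdot(\hat{\mathbf{u}}_2\times\hat{\mathbf{u}}_3) = (v_{11}\det V_{11} - v_{12}\det V_{12} + v_{13}\det V_{13})\ \hat{\mathbf{e}}_1\cdot(\hat{\mathbf{e}}_2\times\hat{\mathbf{e}}_3).$$

Equation (11.14) follows from this relation and the definition of the determinant of a $3\times 3$ matrix. Now recall that the handedness of a coordinate system is preserved under rotations, $\hat{\mathbf{u}}_1\cdot(\hat{\mathbf{u}}_2\times\hat{\mathbf{u}}_3) = \hat{\mathbf{e}}_1\cdot(\hat{\mathbf{e}}_2\times\hat{\mathbf{e}}_3) = \pm 1$, and therefore $\det V = 1$. It is also worth noting that a combination of rotations and reflections is described by matrices $V$ whose rows and columns are orthonormal, but $\det V = \pm 1$. The handedness of a coordinate system is changed if $\det V = -1$.

The vector $\hat{\mathbf{u}}_3$ is determined by its three directional angles in the original coordinate system. Only two of these angles are independent.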
A rotation about the axis containing the vector $\hat{\mathbf{u}}_3$ does not affect $\hat{\mathbf{u}}_3$ and can be specified by a rotation angle in a plane perpendicular to the axis. This angle determines the vectors $\hat{\mathbf{u}}_1$ and $\hat{\mathbf{u}}_2$ relative to the original basis. So a general rotation matrix $V$ has three independent parameters. In particular, the matrix $V$ for counterclockwise rotations about the $z$ axis through an angle $\phi$ (see Study Problem 11.2) is

$$V = \begin{pmatrix} \cos\phi & \sin\phi & 0\\ -\sin\phi & \cos\phi & 0\\ 0 & 0 & 1\end{pmatrix}.$$

Relations (11.12) for $V$ and its rows and columns

$$\mathbf{v}_1 = (\cos\phi, \sin\phi, 0), \qquad \mathbf{w}_1 = (\cos\phi, -\sin\phi, 0),$$
$$\mathbf{v}_2 = (-\sin\phi, \cos\phi, 0), \qquad \mathbf{w}_2 = (\sin\phi, \cos\phi, 0),$$
$$\mathbf{v}_3 = (0, 0, 1), \qquad \mathbf{w}_3 = (0, 0, 1)$$

are easy to verify. The result of Study Problem 11.2 can be stated in the form (11.13), where $\mathbf{a} = (x, y, z)$ and $\mathbf{a}' = (x', y', z')$. □

Problem 11.21. Find the most general vector $\mathbf{r}$ that satisfies the equation $\mathbf{a}\cdot(\mathbf{r}\times\mathbf{b}) = 0$, where $\mathbf{a}$ and $\mathbf{b}$ are nonzero, nonparallel vectors.

SOLUTION: By the algebraic property of the triple product, $\mathbf{a}\cdot(\mathbf{r}\times\mathbf{b}) = \mathbf{r}\cdot(\mathbf{b}\times\mathbf{a}) = 0$. Hence, $\mathbf{r}\perp\mathbf{a}\times\mathbf{b}$. The vector $\mathbf{r}$ lies in the plane parallel to both $\mathbf{a}$ and $\mathbf{b}$ because $\mathbf{a}\times\mathbf{b}$ is orthogonal to these vectors. Any vector in the plane is a linear combination of any two nonparallel vectors in it: $\mathbf{r} = t\mathbf{a} + s\mathbf{b}$ for any real $t$ and $s$ (see Study Problem 11.6). □

Problem 11.22. (Volume of a Tetrahedron). A tetrahedron is a solid with four vertices and four triangular faces. Its volume is $V = \tfrac{1}{3}Ah$, where $h$ is the distance from a vertex to the opposite face and $A$ is the area of that face. Given coordinates of the vertices $B$, $C$, $D$, and $P$, express the volume of the tetrahedron through them.

SOLUTION: Put $\mathbf{b} = \overrightarrow{BC}$, $\mathbf{c} = \overrightarrow{BD}$, and $\mathbf{a} = \overrightarrow{BP}$. The area of the triangle $BCD$ is $A = \tfrac{1}{2}\|\mathbf{b}\times\mathbf{c}\|$. The distance from $P$ to the plane $\mathcal{P}_1$ containing the face $BCD$ is the distance between $\mathcal{P}_1$ and the parallel plane $\mathcal{P}_2$ through the vertex $P$. Hence, by Corollary 11.6,

$$V = \frac{1}{3}A\,\frac{|\mathbf{a}\cdot(\mathbf{b}\times\mathbf{c})|}{\|\mathbf{b}\times\mathbf{c}\|} = \frac{1}{6}|\mathbf{a}\cdot(\mathbf{b}\times\mathbf{c})|.$$

So the volume of a tetrahedron with adjacent sides $\mathbf{a}$, $\mathbf{b}$, and $\mathbf{c}$ is one-sixth the volume of the parallelepiped with the same adjacent sides. Note that the result does not depend on the choice of a vertex. Any vertex could have been chosen instead of $B$ in the above solution. □

Problem 11.23. (Systems of Linear Equations). Consider a system of linear equations for the variables $x$, $y$, and $z$:

$$a_1x + b_1y + c_1z = d_1,$$
$$a_2x + b_2y + c_2z = d_2,$$
$$a_3x + b_3y + c_3z = d_3.$$

Define vectors $\mathbf{a} = (a_1, a_2, a_3)$, $\mathbf{b} = (b_1, b_2, b_3)$, $\mathbf{c} = (c_1, c_2, c_3)$, and $\mathbf{d} = (d_1, d_2, d_3)$. Show that the system has a unique solution for any $\mathbf{d}$ if $\mathbf{a}\cdot(\mathbf{b}\times\mathbf{c})\ne 0$. If $\mathbf{a}\cdot(\mathbf{b}\times\mathbf{c}) = 0$, formulate conditions on $\mathbf{d}$ under which the system has a solution.

SOLUTION: The system of linear equations can be cast in the vector form

$$x\mathbf{a} + y\mathbf{b} + z\mathbf{c} = \mathbf{d}.$$

This equation states that a given vector $\mathbf{d}$ is a linear combination of three given vectors. In Study Problem 11.11, it was demonstrated that any vector in space can be uniquely represented as a linear combination of three non-coplanar vectors. So, by Corollary 11.4, the numbers $x$, $y$, and $z$ exist and are unique if $\mathbf{a}\cdot(\mathbf{b}\times\mathbf{c})\ne 0$. When $\mathbf{a}\cdot(\mathbf{b}\times\mathbf{c}) = 0$, the vectors $\mathbf{a}$, $\mathbf{b}$, and $\mathbf{c}$ lie in one plane. If $\mathbf{d}$ is not in this plane, the system cannot have a solution because $\mathbf{d}$ cannot be represented as a linear combination of vectors in this plane. Suppose that two of the vectors $\mathbf{a}$, $\mathbf{b}$, and $\mathbf{c}$ are not parallel. Then their cross product is orthogonal to the plane, and $\mathbf{d}$ must be orthogonal to the cross product in order to be in the plane. If, say, $\mathbf{a}\times\mathbf{b}\ne\mathbf{0}$, then the system has a solution if $\mathbf{d}\cdot(\mathbf{a}\times\mathbf{b}) = 0$. Finally, it is possible that all the vectors $\mathbf{a}$, $\mathbf{b}$, and $\mathbf{c}$ are parallel; that is, all pairwise cross products vanish. Then $\mathbf{d}$ must be parallel to them. If, say, $\mathbf{a}\ne\mathbf{0}$, then the system has a solution if $\mathbf{d}\times\mathbf{a} = \mathbf{0}$.

75.5. Exercises.
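(1) Find the triple products $\mathbf{a}\cdot(\mathbf{b}\times\mathbf{c})$, $\mathbf{b}\cdot(\mathbf{a}\times\mathbf{c})$, and $\mathbf{c}\cdot(\mathbf{a}\times\mathbf{b})$ if
(i) $\mathbf{a} = (1, -1, 2)$, $\mathbf{b} = (2, 1, 2)$, and $\mathbf{c} = (2, 1, 3)$
(ii) $\mathbf{a} = \mathbf{u}_1 + 2\mathbf{u}_2$, $\mathbf{b} = \mathbf{u}_1 - \mathbf{u}_2 + 2\mathbf{u}_3$, and $\mathbf{c} = \mathbf{u}_2 - 3\mathbf{u}_3$ if $\mathbf{u}_1\cdot(\mathbf{u}_2\times\mathbf{u}_3) = 2$
(2) Verify whether the vectors a =681 + 262 - e3, b =g26 - e2 + e3, and c - 381 + 82 - 283 are coplanar.
(3) Find the value of $s$, if any, for which the vectors $\mathbf{a} = (1, 2, 3)$, $\mathbf{b} = (-1, 0, 1)$, and $\mathbf{c} = (s, 1, 2s)$ are coplanar.
(4) Let $\mathbf{a} = (1, 2, 3)$, $\mathbf{b} = (2, 1, 0)$, and $\mathbf{c} = (3, 0, 1)$. Find the volume of the parallelepiped with adjacent sides $s\mathbf{a} + \mathbf{b}$, $\mathbf{c} - t\mathbf{b}$, and $\mathbf{a} - p\mathbf{c}$, where $s$, $t$, and $p$ are numbers such that $stp = 1$.
(5) Let the numbers $u$, $v$, and $w$ be such that $uvw = 1$ and $u^3 + v^3 + w^3 = 1$. Are the vectors $\mathbf{a} = u\hat{\mathbf{e}}_1 + v\hat{\mathbf{e}}_2 + w\hat{\mathbf{e}}_3$, $\mathbf{b} = v\hat{\mathbf{e}}_1 + w\hat{\mathbf{e}}_2 + u\hat{\mathbf{e}}_3$, and $\mathbf{c} = w\hat{\mathbf{e}}_1 + u\hat{\mathbf{e}}_2 + v\hat{\mathbf{e}}_3$ coplanar? If not, what is the volume of the parallelepiped with adjacent edges $\mathbf{a}$, $\mathbf{b}$, and $\mathbf{c}$?
(6) Determine whether the points $A = (1, 2, 3)$, $B = (1, 0, 1)$, $C = (-1, 1, 2)$, and $D = (-2, 1, 0)$ are in one plane and, if not, find the volume of the parallelepiped with adjacent edges $AB$, $AC$, and $AD$.
(7) Find:
(i) All values of $s$ at which the points $A(s, 0, s)$, $B(1, 0, 1)$, $C(s, s, 1)$, and $D(0, 1, 0)$ are in the same plane
(ii) All values of $s$ at which the volume of the parallelepiped with adjacent edges $AB$, $AC$, and $AD$ is 9 units
(8) Prove that
$$(\mathbf{a}\times\mathbf{b})\cdot(\mathbf{c}\times\mathbf{d}) = \det\begin{pmatrix} \mathbf{a}\cdot\mathbf{c} & \mathbf{b}\cdot\mathbf{c} \\ \mathbf{a}\cdot\mathbf{d} & \mathbf{b}\cdot\mathbf{d} \end{pmatrix}.$$
Hint: Use the invariance of the triple product under cyclic permutations of vectors in it and the bac - cab rule (11.9).
(9) Let P be a parallelepiped of volume $V$.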
Find:
(i) The volumes of all parallelepipeds whose adjacent edges are diagonals of the adjacent faces of P
(ii) The volumes of all parallelepipeds whose two adjacent edges are diagonals of two nonparallel faces of P, while the third adjacent edge is the diagonal of P
(10) Given two nonparallel vectors $\mathbf{a}$ and $\mathbf{b}$, find the most general vector $\mathbf{r}$ if
(i) $\mathbf{a}\cdot(\mathbf{r}\times\mathbf{b}) = 0$
(ii) $\mathbf{a}\cdot(\mathbf{r}\times\mathbf{b}) = 0$ and $\mathbf{b}\cdot\mathbf{r} = 0$
(11) Let a set $S_1$ be the circle $x^2 + y^2 = 1$ and let a set $S_2$ be the line through the points $(0, 2)$ and $(2, 0)$. What is the distance between the sets $S_1$ and $S_2$?
(12) Consider a plane through three points $A = (1, 2, 3)$, $B = (2, 3, 1)$, and $C = (3, 1, 2)$. Find the distance between the plane and a point $P$ obtained from $A$ by moving the latter 3 units of length along the vector $\mathbf{a} = (-1, 2, 2)$.
(13) Consider two lines. The first line passes through the points $(1, 2, 3)$ and $(2, -1, 1)$, while the other passes through the points $(-1, 3, 1)$ and $(1, 1, 3)$. Find the distance between the lines.
(14) Find the distance between the line through the points $(1, 2, 3)$ and $(2, 1, 4)$ and the plane through the points $(1, 1, 1)$, $(3, 1, 2)$, and $(1, 2, -1)$. Hint: If the line is not parallel to the plane, then they intersect and the distance is 0. So check first whether the line is parallel to the plane. How can this be done?
(15) Consider the line through the points $(1, 2, 3)$ and $(2, 1, 2)$. If a second line passes through the points $(1, 1, s)$ and $(2, -1, 0)$, find all values of $s$, if any, at which the distance between the lines is 9/2 units.
(16) Consider two parallel straight line segments in space. Formulate an algorithm to compute the distance between them if the coordinates of their endpoints are given.
In particular, find the distance between $AB$ and $CD$ if
(i) $A = (1, 1, 1)$, $B = (4, 1, 5)$, $C = (2, 3, 3)$, and $D = (5, 3, 7)$
(ii) $A = (1, 1, 1)$, $B = (4, 1, 5)$, $C = (3, 5, 5)$, and $D = (6, 5, 9)$
Note that this distance does not generally coincide with the distance between the parallel lines containing $AB$ and $CD$.
(17) Consider the parallelepiped with adjacent edges $AB$, $AC$, and $AD$, where $A = (3, 0, 1)$, $B = (-1, 2, 5)$, $C = (5, 1, -1)$, and $D = (0, 4, 2)$. Find the distances
(i) Between the edge $AB$ and all other edges parallel to it
(ii) Between the edge $AC$ and all other edges parallel to it
(iii) Between the edge $AD$ and all other edges parallel to it
(iv) Between all parallel planes containing the faces of the parallelepiped

76. Planes in Space

76.1. A Geometrical Description of a Plane in Space. Consider the coordinate plane $z = 0$. It contains the origin and all vectors that are orthogonal to the $z$ axis (all vectors that are orthogonal to $\hat{\mathbf{e}}_3$). Since the coordinate system can be arbitrarily chosen by translating the origin and rotating the coordinate axes, a plane in space is defined as a set of points whose position vectors relative to a particular point in the set are orthogonal to a given nonzero vector $\mathbf{n}$. The vector $\mathbf{n}$ is called a normal of the plane. Thus, the geometrical description of a plane $\mathcal{P}$ in space entails specifying a point $P_0$ that belongs to $\mathcal{P}$ and a normal $\mathbf{n}$ of $\mathcal{P}$.

76.2. An Algebraic Description of a Plane in Space. Let a plane $\mathcal{P}$ be defined by a point $P_0$ that belongs to it and a normal $\mathbf{n}$. In some coordinate system, the point $P_0$ has coordinates $(x_0, y_0, z_0)$ and the vector $\mathbf{n}$ is specified by its components $\mathbf{n} = (n_1, n_2, n_3)$. A generic point $P$ in space has coordinates $(x, y, z)$. An algebraic description of a plane amounts to specifying conditions on the variables $(x, y, z)$ such that the point $P(x, y, z)$ belongs to the plane $\mathcal{P}$.
Let $\mathbf{r}_0 = (x_0, y_0, z_0)$ and $\mathbf{r} = (x, y, z)$ be the position vectors of a particular point $P_0$ in the plane and a generic point $P$ in space, respectively. Then the position vector of $P$ relative to $P_0$ is $\overrightarrow{P_0P} = \mathbf{r} - \mathbf{r}_0 = (x - x_0, y - y_0, z - z_0)$. This vector lies in the plane $\mathcal{P}$ if it is orthogonal to the normal $\mathbf{n}$, according to the geometrical description of a plane (see Figure 11.15, left panel). The algebraic condition equivalent to the geometrical one, $\mathbf{n} \perp \overrightarrow{P_0P}$, reads $\mathbf{n}\cdot\overrightarrow{P_0P} = 0$. Thus, the following theorem has just been proved.

THEOREM 11.6. (Equation of a Plane). A point with coordinates $(x, y, z)$ belongs to a plane through a point $P_0(x_0, y_0, z_0)$ and normal to a vector $\mathbf{n} = (n_1, n_2, n_3)$ if
$$n_1(x - x_0) + n_2(y - y_0) + n_3(z - z_0) = 0 \quad\text{or}\quad \mathbf{n}\cdot\mathbf{r} = \mathbf{n}\cdot\mathbf{r}_0,$$
where $\mathbf{r}$ and $\mathbf{r}_0$ are position vectors of a generic point and a particular point $P_0$ in the plane.

Given a nonzero vector $\mathbf{n}$ and a number $d$, it is always possible to find a particular vector $\mathbf{r}_0$ such that $\mathbf{n}\cdot\mathbf{r}_0 = d$. Since at least one component of $\mathbf{n}$ does not vanish, say, $n_1 \neq 0$, one can take $\mathbf{r}_0 = (d/n_1, 0, 0)$. Therefore, a general solution of the linear equation $\mathbf{n}\cdot\mathbf{r} = d$ is a set of position vectors of all points of a plane that is orthogonal to $\mathbf{n}$. The number $d$ determines the position of the plane in space in the following way. Suppose that every point of the plane is displaced by a vector $\mathbf{a}$, that is, $\mathbf{r} \to \mathbf{r} + \mathbf{a}$. The equation of the displaced plane is $\mathbf{n}\cdot(\mathbf{r} + \mathbf{a}) = d$ or $\mathbf{n}\cdot\mathbf{r} = d - \mathbf{n}\cdot\mathbf{a}$. If $\mathbf{n}\cdot\mathbf{a} = 0$, each point of the plane is translated within the plane because $\mathbf{a}$ is orthogonal to $\mathbf{n}$. The plane as a point set does not change and neither does the number $d$. If the displacement vector $\mathbf{a}$ is not orthogonal to $\mathbf{n}$, then $d$ changes by the amount $-\mathbf{n}\cdot\mathbf{a} \neq 0$. Since every point of the original plane is translated by the same vector, the result of this transformation is a parallel plane. Variations of $d$ correspond to shifts of the plane parallel to itself along its normal (see Figure 11.15, right panel). Thus, the equations $\mathbf{n}\cdot\mathbf{r} = d_1$ and $\mathbf{n}\cdot\mathbf{r} = d_2$ describe two parallel planes.
The planes coincide if and only if $d_1 = d_2$. Consequently, two planes $\mathbf{n}_1\cdot\mathbf{r} = d_1$ and $\mathbf{n}_2\cdot\mathbf{r} = d_2$ are parallel if and only if their normals are parallel, that is, proportional: $\mathbf{n}_1 = s\mathbf{n}_2$ for some real $s \neq 0$. For example, the planes $x - 2y - z = 5$ and $-2x + 4y + 2z = 1$ are parallel because their normals, $\mathbf{n}_1 = (1, -2, -1)$ and $\mathbf{n}_2 = (-2, 4, 2)$, are proportional: $\mathbf{n}_2 = -2\mathbf{n}_1$.

FIGURE 11.15. Left: Algebraic description of a plane. If $\mathbf{r}_0$ is a position vector of a particular point in the plane and $\mathbf{r}$ is the position vector of a generic point in the plane, then the vector $\mathbf{r} - \mathbf{r}_0$ lies in the plane and is orthogonal to its normal, that is, $\mathbf{n}\cdot(\mathbf{r} - \mathbf{r}_0) = 0$. Right: Equations of parallel planes differ only by their constant terms. The difference of the constant terms determines the distance between the planes as stated in (11.17).

A normal to a given plane can always be obtained by taking the cross product of any two nonparallel vectors in the plane. Indeed, any vector in a plane is a linear combination of two nonparallel vectors $\mathbf{a}$ and $\mathbf{b}$ (Study Problem 11.6). The vector $\mathbf{n} = \mathbf{a}\times\mathbf{b}$ is orthogonal to both $\mathbf{a}$ and $\mathbf{b}$ and hence to any linear combination of them.

EXAMPLE 11.19. Find an equation of the plane through three given points $A(1, 1, 1)$, $B(2, 3, 0)$, and $C(-1, 0, 3)$.
SOLUTION: A plane is specified by a particular point $P_0$ in it and by a vector $\mathbf{n}$ normal to it. Three points in the plane are given, so any of them can be taken as $P_0$, for example, $P_0 = A$ or $(x_0, y_0, z_0) = (1, 1, 1)$. A vector normal to a plane can be found as the cross product of any two nonparallel vectors in that plane (see Figure 11.16, left panel). So put $\mathbf{a} = \overrightarrow{AB} = (1, 2, -1)$ and $\mathbf{b} = \overrightarrow{AC} = (-2, -1, 2)$. Then one can take $\mathbf{n} = \mathbf{a}\times\mathbf{b} = (3, 0, 3)$. An equation of the plane is $3(x - 1) + 0(y - 1) + 3(z - 1) = 0$, or $x + z = 2$. Since the equation does not contain the variable $y$, the plane is parallel to the $y$ axis.
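The construction used in Example 11.19 can be checked numerically. The following is a minimal Python sketch and not part of the original text; the helper names `sub`, `dot`, `cross`, and `plane_through_points` are illustrative assumptions. It builds a normal as the cross product of two edge vectors and then tests that each of the three given points satisfies the resulting equation n · r = d.

```python
def sub(p, q):
    """Componentwise difference p - q of two points or vectors."""
    return tuple(pi - qi for pi, qi in zip(p, q))

def dot(a, b):
    """Dot product of two vectors."""
    return sum(ai * bi for ai, bi in zip(a, b))

def cross(a, b):
    """Cross product of two vectors in space."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def plane_through_points(A, B, C):
    """Return (n, d) such that the plane is n . r = d (hypothetical helper)."""
    n = cross(sub(B, A), sub(C, A))   # normal = AB x AC
    if n == (0, 0, 0):
        raise ValueError("the three points are collinear")
    return n, dot(n, A)               # d = n . r0 with r0 the position vector of A

A, B, C = (1, 1, 1), (2, 3, 0), (-1, 0, 3)
n, d = plane_through_points(A, B, C)
on_plane = [dot(n, P) == d for P in (A, B, C)]
print(n, d, on_plane)   # (3, 0, 3) 6 [True, True, True]
```

Dividing the equation 3x + 3z = 6 by 3 gives x + z = 2. Substituting all three points back into the equation is a cheap safeguard against sign errors in the cross product.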
Note that if the $y$ component of $\mathbf{n}$ vanishes (i.e., there is no $y$ in the equation), then $\mathbf{n}$ is orthogonal to $\hat{\mathbf{e}}_2$ because $\mathbf{n}\cdot\hat{\mathbf{e}}_2 = 0$; that is, the $y$ axis is orthogonal to $\mathbf{n}$ and hence parallel to the plane. □

FIGURE 11.16. Left: Illustration to Example 11.19. The cross product of two nonparallel vectors in a plane is a normal of the plane. Right: Distance between a point $P_1$ and a plane. An illustration to the derivation of the distance formula (11.15). The segment $P_1B$ is parallel to the normal $\mathbf{n}$ so that the triangle $P_0P_1B$ is right-angled. Therefore, $D = |P_1B| = |P_0P_1|\cos\theta$.

DEFINITION 11.16. (Angle Between Two Planes). The angle between the normals of two planes is called the angle between the planes. If $\mathbf{n}_1$ and $\mathbf{n}_2$ are the normals, then the angle $\theta$ between them is determined by
$$\cos\theta = \frac{\mathbf{n}_1\cdot\mathbf{n}_2}{\|\mathbf{n}_1\|\,\|\mathbf{n}_2\|}.$$
Note that a plane as a point set in space is not changed if the direction of its normal is reversed (i.e., $\mathbf{n} \to -\mathbf{n}$). So the range of $\theta$ can always be restricted to the interval $[0, \pi/2]$. Indeed, if $\theta$ happens to be in the interval $[\pi/2, \pi]$ (i.e., $\cos\theta \le 0$), then the angle $\pi - \theta$ can also be viewed as the angle between the planes because one can always reverse the direction of one of the normals, $\mathbf{n}_1 \to -\mathbf{n}_1$ or $\mathbf{n}_2 \to -\mathbf{n}_2$, so that $\cos\theta \to -\cos\theta$. The angle between the planes is useful for determining their relative orientation. Two planes intersect if the angle between them is not 0. Two planes are parallel if the angle between them vanishes. The planes are perpendicular if their normals are orthogonal. For example, the planes $x + y + z = 1$ and $x + 2y - 3z = 4$ are perpendicular because their normals $\mathbf{n}_1 = (1, 1, 1)$ and $\mathbf{n}_2 = (1, 2, -3)$ are orthogonal: $\mathbf{n}_1\cdot\mathbf{n}_2 = 1 + 2 - 3 = 0$ (i.e., $\mathbf{n}_1 \perp \mathbf{n}_2$).

76.3. The Distance Between a Point and a Plane. Consider the plane through a point $P_0$ and normal to a vector $\mathbf{n}$. Let $P_1$ be a point in space. What is the distance between $P_1$ and the plane?
Let the angle between $\mathbf{n}$ and the vector $\overrightarrow{P_0P_1}$ be $\theta$ (see Figure 11.16, right panel). Then the distance in question is $D = \|\overrightarrow{P_0P_1}\|\cos\theta$ if $\theta \le \pi/2$ (the length of the straight line segment connecting $P_1$ and the plane along the normal $\mathbf{n}$). For $\theta > \pi/2$, $\cos\theta$ must be replaced by $-\cos\theta$ because $D \ge 0$. So
$$D = \|\overrightarrow{P_0P_1}\|\,|\cos\theta| = \frac{\|\mathbf{n}\|\,\|\overrightarrow{P_0P_1}\|\,|\cos\theta|}{\|\mathbf{n}\|} = \frac{|\mathbf{n}\cdot\overrightarrow{P_0P_1}|}{\|\mathbf{n}\|}. \qquad (11.15)$$
Let $\mathbf{r}_0$ and $\mathbf{r}_1$ be position vectors of $P_0$ and $P_1$, respectively. Then $\overrightarrow{P_0P_1} = \mathbf{r}_1 - \mathbf{r}_0$, and
$$D = \frac{|\mathbf{n}\cdot(\mathbf{r}_1 - \mathbf{r}_0)|}{\|\mathbf{n}\|} = \frac{|\mathbf{n}\cdot\mathbf{r}_1 - d|}{\|\mathbf{n}\|}, \qquad (11.16)$$
which is a bit more convenient than (11.15) if the plane is defined by an equation $\mathbf{n}\cdot\mathbf{r} = d$.

Distance Between Parallel Planes. Equation (11.16) allows us to obtain a simple formula for the distance between two parallel planes defined by the equations $\mathbf{n}\cdot\mathbf{r} = d_1$ and $\mathbf{n}\cdot\mathbf{r} = d_2$ (see Figure 11.15, right panel):
$$D = \frac{|d_2 - d_1|}{\|\mathbf{n}\|}. \qquad (11.17)$$
Indeed, the distance between two parallel planes is the distance between the first plane and a point $\mathbf{r}_2$ in the second plane. By (11.16), this distance is $D = |\mathbf{n}\cdot\mathbf{r}_2 - d_1|/\|\mathbf{n}\| = |d_2 - d_1|/\|\mathbf{n}\|$ because $\mathbf{n}\cdot\mathbf{r}_2 = d_2$ for any point in the second plane.

EXAMPLE 11.20. Find an equation of a plane that is parallel to the plane $2x - y + 2z = 2$ and at a distance of 3 units from it.
SOLUTION: There are a few ways to solve this problem. From the geometrical point of view, a plane is defined by a particular point in it and its normal. Since the planes are parallel, they must have the same normal $\mathbf{n} = (2, -1, 2)$. Note that the coefficients at the variables in the plane equation define the components of the normal vector. Therefore, the problem is reduced to finding a particular point. Let $P_0$ be a particular point on the given plane. Then a point on a parallel plane can be obtained from it by shifting $P_0$ by a distance of 3 units along the normal $\mathbf{n}$. If $\mathbf{r}_0$ is the position vector of $P_0$, then a point on a parallel plane has a position vector $\mathbf{r}_0 + s\mathbf{n}$, where the displacement vector $s\mathbf{n}$ must have a length of 3, or $\|s\mathbf{n}\| = |s|\,\|\mathbf{n}\| = 3|s| = 3$, and therefore $s = \pm 1$.
Naturally, there should be two planes parallel to the given one and at the same distance from it. To find a particular point on the given plane, one can set two coordinates to 0 and find the value of the third coordinate from the equation of the plane. Take, for instance, $P_0(1, 0, 0)$. Particular points on the parallel planes are $\mathbf{r}_0 + \mathbf{n} = (1, 0, 0) + (2, -1, 2) = (3, -1, 2)$ and, similarly, $\mathbf{r}_0 - \mathbf{n} = (-1, 1, -2)$. Using these points in the standard equation of a plane, the equations of two parallel planes are obtained: $2x - y + 2z = 11$ and $2x - y + 2z = -7$.
An alternative algebraic solution is based on the distance formula (11.17) for parallel planes. An equation of a plane parallel to the given one should have the form $2x - y + 2z = d$. The number $d$ is determined by the condition that $|d - 2|/\|\mathbf{n}\| = 3$ or $|d - 2| = 9$, or $d = \pm 9 + 2$. □

76.4. Study Problems.

Problem 11.24. Find an equation of the plane that is normal to a straight line segment $AB$ and bisects it if $A = (1, 1, 1)$ and $B = (-1, 3, 5)$.
SOLUTION: One has to find a particular point in the plane and its normal. Since $AB$ is perpendicular to the plane, $\mathbf{n} = \overrightarrow{AB} = (-2, 2, 4)$. The midpoint of the segment lies in the plane. Hence, $P_0(0, 2, 3)$ (the coordinates of the midpoint are the half-sums of the corresponding coordinates of the endpoints). The equation reads $-2x + 2(y - 2) + 4(z - 3) = 0$ or $-x + y + 2z = 8$. □

Problem 11.25. Find an equation of the plane through the point $P_0(1, 2, 3)$ that is perpendicular to the planes $x + y + z = 1$ and $x - y + 2z = 1$.
SOLUTION: One has to find a particular point in the plane and any vector orthogonal to it. The first part of the problem is easy to solve: $P_0$ is given. Let $\mathbf{n}$ be a normal of the plane in question. Then, from the geometrical description of a plane, it follows that $\mathbf{n} \perp \mathbf{n}_1 = (1, 1, 1)$ and $\mathbf{n} \perp \mathbf{n}_2 = (1, -1, 2)$, where $\mathbf{n}_1$ and $\mathbf{n}_2$ are normals of the given planes. So $\mathbf{n}$ is a vector orthogonal to two given vectors.
By the geometrical property of the cross product, such a vector can be constructed as $\mathbf{n} = \mathbf{n}_1\times\mathbf{n}_2 = (3, -1, -2)$. Hence, the equation reads $3(x - 1) - (y - 2) - 2(z - 3) = 0$ or $3x - y - 2z = -5$. □

Problem 11.26. Determine whether two planes $x + 2y - 2z = 1$ and $2x + 4y + 4z = 10$ are parallel and, if not, find the angle between them.
SOLUTION: The normals are $\mathbf{n}_1 = (1, 2, -2)$ and $\mathbf{n}_2 = (2, 4, 4) = 2(1, 2, 2)$. They are not proportional. Hence, the planes are not parallel. Since $\|\mathbf{n}_1\| = 3$, $\|\mathbf{n}_2\| = 6$, and $\mathbf{n}_1\cdot\mathbf{n}_2 = 2$, the angle is determined by $\cos\theta = 2/18 = 1/9$ or $\theta = \cos^{-1}(1/9)$. □

Problem 11.27. Find a family of all planes that contains the straight line segment $AB$ if $A = (1, 2, -1)$ and $B = (2, 4, 1)$.
SOLUTION: All the planes in question contain the point $A$. So it can be chosen as a particular point in every plane. Since the segment $AB$ lies in every plane of the family, the question amounts to describing all vectors orthogonal to $\mathbf{a} = \overrightarrow{AB} = (1, 2, 2)$ that determine the normals of the planes in the family. It is easy to find a particular vector orthogonal to $\mathbf{a}$. For example, $\mathbf{b} = (0, 1, -1)$ is orthogonal to $\mathbf{a}$ because $\mathbf{a}\cdot\mathbf{b} = 0$. Next, the vector $\mathbf{a}\times\mathbf{b} = (-4, 1, 1)$ is orthogonal to both $\mathbf{a}$ and $\mathbf{b}$. Any vector orthogonal to $\mathbf{a}$ lies in a plane orthogonal to $\mathbf{a}$ and hence must be a linear combination of any two nonparallel vectors in this plane. So the sought-after normals are all linear combinations of $\mathbf{b}$ and $\mathbf{c} = \mathbf{a}\times\mathbf{b}$. Since the length of each normal is irrelevant, the family of the planes is described by all unit vectors orthogonal to $\mathbf{a}$. Recall that any unit vector in a plane can be written in the form $\hat{\mathbf{n}}_\theta = \cos\theta\,\hat{\mathbf{u}}_1 + \sin\theta\,\hat{\mathbf{u}}_2$, where $\hat{\mathbf{u}}_{1,2}$ are two unit orthogonal vectors in the plane and $0 \le \theta < 2\pi$ (see Figure 11.5, right panel). So put $\hat{\mathbf{u}}_1 = \hat{\mathbf{b}} = \mathbf{b}/\|\mathbf{b}\|$ and $\hat{\mathbf{u}}_2 = \hat{\mathbf{c}} = \mathbf{c}/\|\mathbf{c}\|$, where $\|\mathbf{b}\| = \sqrt{2}$ and $\|\mathbf{c}\| = 3\sqrt{2}$. The family of the planes is described by the equations $\hat{\mathbf{n}}_\theta\cdot(\mathbf{r} - \mathbf{r}_0) = 0$, where $0 \le \theta < 2\pi$ and $\mathbf{r}_0 = (1, 2, -1)$ (the position vector of $A$). □

76.5. Exercises.
(1) Find an equation of the plane through the origin and parallel to the plane $2x - 2y + z = 4$. What is the distance between the two planes?
(2) Do the planes $2x + y - z = 1$ and $4x + 2y - 2z = 10$ intersect?
(3) Determine whether the planes $2x + y - z = 3$ and $x + y + z = 1$ are intersecting. If they are, find the angle between them.
(4) Consider a parallelepiped with one vertex at the origin $O$ at which the adjacent sides are the vectors $\mathbf{a} = (1, 2, 3)$, $\mathbf{b} = (2, 1, 1)$, and $\mathbf{c} = (-1, 0, 1)$. Let $OP$ be the largest diagonal of the parallelepiped. Find an equation of the planes that contain:
(i) The faces of the parallelepiped
(ii) The largest diagonal of the parallelepiped and the diagonal of each of three of its faces adjacent at $P$
(iii) Parallel diagonals in the opposite faces of the parallelepiped
(5) Find an equation of the plane with $x$ intercept $a$, $y$ intercept $b$, and $z$ intercept $c$. What is the distance between the origin and the plane?
(6) Find equations of the planes that are perpendicular to the line through $(1, -1, 1)$ and $(2, 0, 1)$ and that are at the distance 2 from the point $(1, 2, 3)$.
(7) Find an equation for the set of points that are equidistant from the points $(1, 2, 3)$ and $(-1, 2, 1)$. Give a geometrical description of the set.
(8) Find an equation of the plane that is perpendicular to the plane $x + y + z = 1$ and contains the line through the points $(1, 2, 3)$ and $(-1, 1, 0)$.
(9) To which of the planes $x + y + z = 1$ and $x + 2y - z = 2$ is the point $(1, 2, 3)$ the closest?
(10) Give a geometrical description of the following families of planes:
(i) $x + y + z = c$
(ii) $x + y + cz = 1$
(iii) $x\sin c + y\cos c + z = 1$
where $c$ is a parameter.
(11) Find values of $c$ for which the plane $x + y + cz = 1$ is closest to the point $P(1, 2, 1)$ and farthest from $P$.
(12) Consider three planes with normals $\mathbf{n}_1$, $\mathbf{n}_2$, and $\mathbf{n}_3$ such that each pair of the planes is intersecting.
Under what condition on the normals are the three lines of intersection parallel or even coincide?
(13) Find equations of all the planes that are perpendicular to the plane $x + y + z = 1$, have the angle $\pi/3$ with the plane $x + y = 1$, and pass through the point $(1, 1, 1)$.
(14) Let $\mathbf{a} = (1, 2, 3)$ and $\mathbf{b} = (1, 0, -1)$. Find an equation of the plane that contains the point $(1, 2, -1)$, the vector $\mathbf{a}$, and a vector orthogonal to both $\mathbf{a}$ and $\mathbf{b}$.
(15) Consider the plane $\mathcal{P}$ through three points $A(1, 1, 1)$, $B(2, 0, 1)$, and $C(-1, 3, 2)$. Find all the planes that contain the segment $AB$ and have the angle $\pi/3$ with the plane $\mathcal{P}$. Hint: See Study Problem 11.27.
(16) Find an equation of the plane that contains the line through $(1, 2, 3)$ and $(2, 1, 1)$ and cuts the sphere $x^2 + y^2 + z^2 - 2x + 4y - 6z = 0$ into two hemispheres.
(17) Find an equation of the plane that is tangent to the sphere $x^2 + y^2 + z^2 - 2x - 4y - 6z + 11 = 0$ at the point $(2, 1, 2)$. Hint: What is the angle between a line tangent to a circle at a point $P$ and the segment $OP$, where $O$ is the center of the circle? Extend this observation to a plane tangent to a sphere to determine a normal of the tangent plane.
(18) Consider a sphere of radius $R$ centered at the origin and two points $P_1$ and $P_2$ whose position vectors are $\mathbf{r}_1$ and $\mathbf{r}_2$. Suppose that $\|\mathbf{r}_1\| > R$ and $\|\mathbf{r}_2\| > R$ (the points are outside the sphere). Find the equation $\mathbf{n}\cdot\mathbf{r} = d$ of the plane through $P_1$ and $P_2$ whose distance from the sphere is maximal. What is the distance? Hint: Show first that a normal of the plane can always be written in the form $\mathbf{n} = \mathbf{r}_1 + c(\mathbf{r}_2 - \mathbf{r}_1)$. Then find a condition to determine the constant $c$.

77. Lines in Space

77.1. A Geometrical Description of a Line in Space. Consider the line that coincides with a coordinate axis of a rectangular system, say, the $x$ axis. Any point on it has the characteristic property that its position vector is proportional to the position vector of a particular point (e.g., to $\hat{\mathbf{e}}_1$).
Since the coordinate system can be arbitrarily chosen by translating the origin and rotating the coordinate axes, a line in space is defined as a set of points whose position vectors relative to a particular point in the set are parallel to a given nonzero vector $\mathbf{v}$. Thus, the geometrical description of a line $\mathcal{L}$ in space entails specifying a point $P_0$ that belongs to $\mathcal{L}$ and a vector $\mathbf{v}$ that is parallel to $\mathcal{L}$.

Remark. Consider two points in space. They can be connected by a path. Among all the continuous paths that connect the two points, there is a distinct one, namely, the one that has the smallest length. This path is called a straight line segment. A line in space can also be defined as a set of points in space such that the shortest path connecting any pair of points of the set belongs to it. This definition of the line is deeply rooted in the very structure of space itself. How can a line be realized in the space in which we live? One can use a piece of rope, as in the ancient world, or the "line of sight" (i.e., the path traveled by light from one point to another). Einstein's theory of gravity states that "straight lines" defined as trajectories traversed by light are not exactly the same as "straight lines" in a Euclidean space. So a Euclidean space may only be viewed as a mathematical approximation (or model) of our space. A good analogy would be to compare the shortest paths in a plane and on the surface of a sphere; they are not the same, as the latter are segments of circles and hence are "bent" or "curved." The concept of curvature of a path is discussed in the next chapter. The shortest path between two points in a space is called a geodesic (by analogy with the shortest path on the surface of the Earth).
The geodesics of a Euclidean space are straight lines and do not have curvature, whereas the geodesics of our space (i.e., the paths traversed by light) do have curvature that is determined by the distribution of gravitating masses (planets, stars, etc.). A deviation of the geodesics from straight lines near the surface of the Earth is very hard to notice. However, a deviation of the trajectory of light from a straight line has been observed for the light coming from a distant star to the Earth and passing near the Sun. Einstein's theory of general relativity asserts that a better model of our space is a Riemann space. A sufficiently small neighborhood in a Riemann space looks like a portion of a Euclidean space.

77.2. An Algebraic Description of a Line. In some coordinate system, a particular point of a line $\mathcal{L}$ has coordinates $P_0(x_0, y_0, z_0)$, and a vector parallel to $\mathcal{L}$ is defined by its components, $\mathbf{v} = (v_1, v_2, v_3)$. Let $\mathbf{r} = (x, y, z)$ be a position vector of a generic point $P$ of $\mathcal{L}$ and let $\mathbf{r}_0 = (x_0, y_0, z_0)$ be the position vector of $P_0$. Then the vector $\mathbf{r} - \mathbf{r}_0$ is the position vector of $P$ relative to $P_0$. By the geometrical description of the line, it must be parallel to $\mathbf{v}$. Since any two parallel vectors are proportional, a point $(x, y, z)$ belongs to $\mathcal{L}$ if and only if $\mathbf{r} - \mathbf{r}_0 = t\mathbf{v}$ for some real $t$ (see Figure 11.17, left panel).

FIGURE 11.17. Left: Algebraic description of a line $\mathcal{L}$ through $\mathbf{r}_0$ and parallel to a vector $\mathbf{v}$. If $\mathbf{r}_0$ and $\mathbf{r}$ are position vectors of particular and generic points of the line, then the vector $\mathbf{r} - \mathbf{r}_0$ is parallel to the line and hence must be proportional to a vector $\mathbf{v}$, that is, $\mathbf{r} - \mathbf{r}_0 = t\mathbf{v}$ for some real number $t$. Right: Distance between a point $P_1$ and a line $\mathcal{L}$ through a point $P_0$ and parallel to a vector $\mathbf{v}$. It is the height of the parallelogram whose adjacent sides are the vectors $\overrightarrow{P_0P_1}$ and $\mathbf{v}$.

THEOREM 11.7. (Equations of a Line).
The coordinates of the points of the line $\mathcal{L}$ through a point $P_0(x_0, y_0, z_0)$ and parallel to a vector $\mathbf{v} = (v_1, v_2, v_3)$ satisfy the vector equation
$$\mathbf{r} = \mathbf{r}_0 + t\mathbf{v}, \qquad -\infty < t < \infty \qquad (11.18)$$
[...] $\mathbf{v} \perp \mathbf{n} \iff \mathbf{v}\cdot\mathbf{n} = 0$. If a line and a plane are not parallel, they must intersect. In this case, there should exist a particular value of the parameter $t$ for which the position vector $\mathbf{r}_t = \mathbf{r}_0 + t\mathbf{v}$ of a point of $\mathcal{L}$ also satisfies an equation of the plane $\mathbf{r}\cdot\mathbf{n} = d$ (see Figure 11.18, right panel). The value of the parameter $t$ that corresponds to the point of intersection is determined by the equation
$$\mathbf{r}_t\cdot\mathbf{n} = d \quad\Longrightarrow\quad \mathbf{r}_0\cdot\mathbf{n} + t\,\mathbf{v}\cdot\mathbf{n} = d \quad\Longrightarrow\quad t = \frac{d - \mathbf{r}_0\cdot\mathbf{n}}{\mathbf{v}\cdot\mathbf{n}}.$$
The position vector of the point of intersection is found by substituting this value of $t$ into the vector equation of the line $\mathbf{r}_t = \mathbf{r}_0 + t\mathbf{v}$.

EXAMPLE 11.23. A point object is traveling along the line $x - 1 = y/2 = (z + 1)/2$ with a constant speed $v = 6$ meters per second. If all coordinates are measured in meters and the initial position vector of the object is $\mathbf{r}_0 = (1, 0, -1)$, when does it reach the plane $2x + y + z = 13$? What is the distance traveled by the object?
SOLUTION: Parametric equations of the line are $x = 1 + s$, $y = 2s$, $z = -1 + 2s$. The value of the parameter $s$ at which the line intersects the plane is determined by the substitution of these equations into the equation of the plane:
$$2(1 + s) + 2s + (-1 + 2s) = 13 \quad\Longrightarrow\quad 6s = 12 \quad\Longrightarrow\quad s = 2.$$
So the position vector of the point of intersection is $\mathbf{r} = (3, 4, 3)$. The distance between it and the initial point is $D = \|\mathbf{r} - \mathbf{r}_0\| = \|(2, 4, 4)\| = \|2(1, 2, 2)\| = 6$ meters, and the travel time is $T = D/v = 1$ sec. □

Remark. In this example, the parameter $s$ does not coincide with the physical time. If an object travels with a constant speed $v$ along the line through $\mathbf{r}_0$ and parallel to a unit vector $\hat{\mathbf{v}}$, then its velocity vector is $\mathbf{v} = v\hat{\mathbf{v}}$ and its position vector is $\mathbf{r} = \mathbf{r}_0 + \mathbf{v}t$, where $t$ is the physical time.
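The difference between the line parameter and the physical time can be seen numerically on the data of Example 11.23. The following is a minimal Python sketch and not part of the original text; the function name `line_plane_intersection` and the variable names are illustrative assumptions. It solves r0 · n + t v · n = d for the parameter and then converts the traveled distance into a travel time.

```python
import math

def dot(a, b):
    """Dot product of two vectors."""
    return sum(ai * bi for ai, bi in zip(a, b))

def line_plane_intersection(r0, v, n, d):
    """Intersect the line r = r0 + t v with the plane n . r = d.

    Returns (t, point), or None if v . n = 0 (line parallel to the plane).
    Hypothetical helper written for this illustration.
    """
    vn = dot(v, n)
    if vn == 0:
        return None
    t = (d - dot(r0, n)) / vn
    point = tuple(r + t * c for r, c in zip(r0, v))
    return t, point

r0 = (1, 0, -1)          # initial position, meters
v = (1, 2, 2)            # direction of the line (not a unit vector)
n, d = (2, 1, 1), 13     # plane 2x + y + z = 13
speed = 6.0              # meters per second

s, point = line_plane_intersection(r0, v, n, d)
distance = math.dist(point, r0)     # length of the displacement vector
travel_time = distance / speed
print(s, point, distance, travel_time)   # 2.0 (3.0, 4.0, 3.0) 6.0 1.0
```

The parameter value is 2 while the travel time is 1 second, because the direction vector (1, 2, 2) has length 3 rather than the speed 6.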
Indeed, the vector $\mathbf{r} - \mathbf{r}_0$ is the displacement vector of the object along its trajectory, and hence its length determines the distance traveled by the object: $\|\mathbf{r} - \mathbf{r}_0\| = \|\mathbf{v}t\| = vt$, which shows that the parameter $t \ge 0$ is the travel time.

EXAMPLE 11.24. Find an equation of the plane $\mathcal{P}$ that is perpendicular to the plane $\mathcal{P}_1$, $x + y - z = 1$, and contains the line $x - 1 = y/2 = z + 1$.
SOLUTION: The plane $\mathcal{P}$ must be parallel to the line ($\mathcal{P}$ contains it) and to the normal $\mathbf{n}_1 = (1, 1, -1)$ of $\mathcal{P}_1$ (as $\mathcal{P} \perp \mathcal{P}_1$). So the normal $\mathbf{n}$ of $\mathcal{P}$ is orthogonal to both $\mathbf{n}_1$ and the vector $\mathbf{v} = (1, 2, 1)$ that is parallel to the line. Therefore, one can take $\mathbf{n} = \mathbf{n}_1\times\mathbf{v} = (3, -2, 1)$. The line lies in $\mathcal{P}$, and therefore any of its points can be taken as a particular point of $\mathcal{P}$, for example, $P_0(1, 0, -1)$. An equation of $\mathcal{P}$ reads $3(x - 1) - 2y + (z + 1) = 0$ or $3x - 2y + z = 2$. □

EXAMPLE 11.25. Find the planes that are perpendicular to the line $x = y/2 = -z/2$ and have the distance 3 from the point $(-1, -2, 2)$ on the line.
SOLUTION: The line is parallel to the vector $\mathbf{v} = (1, 2, -2)$. So the planes have the same normal $\mathbf{n} = \mathbf{v}$. Particular points in the planes are the points of intersection of the line with the planes. These points are at the distance 3 from $\mathbf{r}_0 = (-1, -2, 2)$, and their position vectors $\mathbf{r}$ should satisfy the condition $\|\mathbf{r} - \mathbf{r}_0\| = 3$. On the other hand, by the vector equation of the line, $\mathbf{r} = \mathbf{r}_0 + t\mathbf{v}$ and hence $\|t\mathbf{v}\| = 3$ or $3|t| = 3$ or $t = \pm 1$. So the position vectors of particular points in the planes are $\mathbf{r} = \mathbf{r}_0 \pm \mathbf{v}$, or $\mathbf{r} = (0, 0, 0)$ and $\mathbf{r} = (-2, -4, 4)$. Equations of the planes are $x + 2y - 2z = 0$ and $(x + 2) + 2(y + 4) - 2(z - 4) = 0$ or $x + 2y - 2z = -18$. □

77.5. Study Problems.

Problem 11.28. Let $\mathcal{L}_1$ be the line through $P_1(1, 1, 1)$ and parallel to $\mathbf{v}_1 = (1, 2, -1)$ and let $\mathcal{L}_2$ be the line through $P_2(4, 0, -2)$ and parallel to $\mathbf{v}_2 = (2, 1, 0)$. Determine whether the lines are parallel, intersecting, or skew and find the line $\mathcal{L}$ that is perpendicular to both $\mathcal{L}_1$ and $\mathcal{L}_2$ and intersects them.
SOLUTION: The vectors $\mathbf{v}_1$ and $\mathbf{v}_2$ are not proportional, and hence the lines are not parallel. One has $\mathbf{r}_{12} = \overrightarrow{P_1P_2} = (3, -1, -3)$ and $\mathbf{v}_1\times\mathbf{v}_2 = (1, -2, -3)$. Therefore, $\mathbf{r}_{12}\cdot(\mathbf{v}_1\times\mathbf{v}_2) = 14 \neq 0$, and the lines are skew by Corollary 11.10. Let $\mathbf{r}_t = \mathbf{r}_1 + t\mathbf{v}_1$ be a position vector of a point of $\mathcal{L}_1$ and let $\mathbf{r}_s = \mathbf{r}_2 + s\mathbf{v}_2$ be a position vector of a point of $\mathcal{L}_2$ as shown in Figure 11.19 (left panel). The line $\mathcal{L}$ is orthogonal to both vectors $\mathbf{v}_1$ and $\mathbf{v}_2$. As it intersects the lines $\mathcal{L}_1$ and $\mathcal{L}_2$, there should exist a pair of values $(t, s)$ of the parameters at which the vector $\mathbf{r}_s - \mathbf{r}_t$ is parallel to $\mathcal{L}$; that is, the vector $\mathbf{r}_s - \mathbf{r}_t$ becomes orthogonal to $\mathbf{v}_1$ and $\mathbf{v}_2$. The corresponding algebraic conditions are
$$\mathbf{r}_s - \mathbf{r}_t \perp \mathbf{v}_1 \iff (\mathbf{r}_s - \mathbf{r}_t)\cdot\mathbf{v}_1 = 4 + 4s - 6t = 0,$$
$$\mathbf{r}_s - \mathbf{r}_t \perp \mathbf{v}_2 \iff (\mathbf{r}_s - \mathbf{r}_t)\cdot\mathbf{v}_2 = 5 + 5s - 4t = 0.$$
This system has the solution $t = 0$ and $s = -1$. Thus, the points with the position vectors $\mathbf{r}_{t=0} = \mathbf{r}_1$ and $\mathbf{r}_{s=-1} = \mathbf{r}_2 - \mathbf{v}_2 = (2, -1, -2)$ belong to $\mathcal{L}$. So the vector $\mathbf{v} = \mathbf{r}_{s=-1} - \mathbf{r}_{t=0} = (1, -2, -3)$ is parallel to $\mathcal{L}$; as expected, it is parallel to $\mathbf{v}_1\times\mathbf{v}_2$. Taking a particular point of $\mathcal{L}$ to be $P_1$ (whose position vector is $\mathbf{r}_1$), the parametric equations read $x = 1 + t$, $y = 1 - 2t$, and $z = 1 - 3t$. □

FIGURE 11.19. Left: Illustration to Study Problem 11.28. The vectors $\mathbf{r}_t$ and $\mathbf{r}_s$ trace out the two given skew lines $\mathcal{L}_1$ and $\mathcal{L}_2$, respectively. There are particular values of $t$ and $s$ at which the distance $\|\mathbf{r}_t - \mathbf{r}_s\|$ becomes minimal. Therefore, the line $\mathcal{L}$ through such points $\mathbf{r}_t$ and $\mathbf{r}_s$ is perpendicular to both $\mathcal{L}_1$ and $\mathcal{L}_2$. Right: Intersection of a line $\mathcal{L}$ and a sphere $S$. An illustration to Study Problem 11.29. The terminal point of the vector $\mathbf{r}_t$ traverses the line as $t$ ranges over all real numbers. If the line intersects the sphere, then there should exist a particular value of $t$ at which the components of the vector $\mathbf{r}_t$ satisfy the equation of the sphere. This equation is quadratic in $t$, and hence it can have two distinct real roots, or one multiple real root, or no real roots. These three cases correspond to two, one, or no points of intersection. The existence of just one point of intersection means that the line is tangent to the sphere.

Problem 11.29. Consider a line through the origin that is parallel to the vector $\mathbf{v} = (1, 1, 1)$. Find the part of this line that lies inside the sphere $x^2 + y^2 + z^2 - x - 2y - 3z = 9$.
SOLUTION: The parametric equations of the line are $x = t$, $y = t$, $z = t$. If the line intersects the sphere, then there should exist particular values of $t$ at which the coordinates of a point of the line also satisfy the sphere equation (see Figure 11.19, right panel). In general, parametric equations of a line are linear in $t$, while a sphere equation is quadratic in the coordinates. Therefore, the equation that determines the values of $t$ corresponding to the points of intersection is quadratic. A quadratic equation has two, one, or no real solutions. Accordingly, these cases correspond to two, one, and no points of intersection, respectively. In our case, $3t^2 - 6t = 9$ or $t^2 - 2t = 3$, and hence $t = -1$ and $t = 3$. The points of intersection are $(-1, -1, -1)$ and $(3, 3, 3)$. The line segment connecting them can be described by the parametric equations $x = t$, $y = t$, and $z = t$, where $-1 \le t \le 3$. □

77.6. Exercises.
(1) Find parametric equations of the line through the point $(1, 2, 3)$ and perpendicular to the plane $x + y + 2z = 1$. Find the point of intersection of the line and the plane.
(2) Find parametric and symmetric equations of the line of intersection of the planes $x + y + z = 1$ and $2x - 2y + z = 1$.
(3) Is the line through the points $(1, 2, 3)$ and $(2, -1, 1)$ perpendicular to the line through the points $(0, 1, -1)$ and $(1, 0, 2)$? Are the lines intersecting? If so, find the point of intersection.
(4) Determine whether the lines $x = 1 + 2t$, $y = 3t$, $z = 2 - t$ and $x + 1 = y - 4 = (z - 1)/3$ are parallel, skew, or intersecting. If they intersect, find the point of intersection.
(5) Find the vector equation of the straight line segment from the point $(1, 2, 3)$ to the point $(-1, 1, 2)$.
(6) Let $\mathbf{r}_1$ and $\mathbf{r}_2$ be position vectors of two points in space. Find the vector equation of the straight line segment from $\mathbf{r}_1$ to $\mathbf{r}_2$.
(7) Consider the plane $x + y - z = 0$ and a point $P = (1, 1, 2)$ in it. Find parametric equations of the lines through the origin that lie in the plane and are at a distance of 1 unit from $P$. Hint: A vector parallel to these lines can be taken in the form $\mathbf{v} = (1, c, 1 + c)$, where $c$ is to be determined. Explain why!
(8) Find parametric, symmetric, and vector equations of the line through $(0, 1, 2)$ that is perpendicular to the line $x = 1 + t$, $y = -1 + t$, $z = 2 - 2t$ and parallel to the plane $x + 2y + z = 3$.
(9) Find parametric equations of the line that is parallel to $\mathbf{v} = (2, -1, 2)$ and goes through the center of the sphere $x^2 + y^2 + z^2 = 2x + 6z - 6$. Restrict the range of the parameter to describe the part of the line that is inside the sphere.
(10) Let the line $\mathcal{L}_1$ pass through the point $A(1, 1, 0)$ parallel to the vector $\mathbf{v} = (1, -1, 2)$ and let the line $\mathcal{L}_2$ pass through the point $B(2, 0, 2)$ parallel to the vector $\mathbf{w} = (-1, 1, 2)$. Show that the lines are intersecting. Find the point $C$ of intersection and parametric equations of the line $\mathcal{L}_3$ through $C$ that is perpendicular to $\mathcal{L}_1$ and $\mathcal{L}_2$.
(11) Find parametric equations of the line through $(1, 2, 5)$ that is perpendicular to the line $x - 1 = 1 - y = z$ and intersects this line.
(12) Find parametric equations of the line that bisects the angle of the triangle $ABC$ at the vertex $A$ if $A = (1, 1, 1)$, $B = (2, -1, 3)$, and $C = (1, 4, -3)$. Hint: See exercise 12 in Section 73.7.
(13) Find the distance between the lines $x = y = z$ and $x + 1 = y/2 = z/3$.
(14) A small meteor moves with speed $v$ in the direction of a unit vector $\hat{\mathbf{n}}$.
If the meteor passed the point $\mathbf{r}_0$, find the condition on $\hat{\mathbf{n}}$ such that the meteor hits an asteroid of the shape of a sphere of radius $R$ centered at the point $\mathbf{r}_1$. Determine the position vector of the impact point.
(15) A projectile is fired in the direction $\mathbf{v} = (1, 2, 3)$ from the point $(1, 1, 1)$. Let the target be a disk of radius $R$ centered at $(2, 3, 6)$ in the plane $2x - 3y + 4z = 19$. If the trajectory of the projectile is a straight line, determine whether it hits the target in the two cases $R = 2$ and $R = 3$.
(16) Consider a triangle $ABC$ where $A = (1, 1, 1)$, $B = (3, 1, -1)$, and $C = (1, 3, 1)$. Find the area of a polygon $DPQB$ where the vertices $D$ and $Q$ are the midpoints of $CB$ and $AB$, respectively, and the vertex $P$ is the intersection of the segments $CQ$ and $AD$.

78. Quadric Surfaces

DEFINITION 11.17. (Quadric Surface). The set of points whose coordinates in a rectangular coordinate system satisfy the equation

$$Ax^2 + By^2 + Cz^2 + pxy + qxz + vyz + \alpha x + \beta y + \gamma z + D = 0,$$

where $A$, $B$, $C$, $p$, $q$, $v$, $\alpha$, $\beta$, $\gamma$, and $D$ are real numbers, is called a quadric surface.

The equation that defines quadric surfaces is the most general equation quadratic in all the coordinates. This is why surfaces defined by it are called quadric. A sphere provides a simple example of a quadric surface: $x^2 + y^2 + z^2 - R^2 = 0$, that is, $A = B = C = 1$, $p = q = v = 0$, $\alpha = \beta = \gamma = 0$, and $D = -R^2$, where $R$ is the radius of the sphere. If $B = C = 1$, $\alpha = -1$, while the other constants vanish, the quadratic equation $x = y^2 + z^2$ defines a circular paraboloid whose symmetry axis is the $x$ axis. On the other hand, if $A = B = 1$, $\gamma = -1$, while the other constants vanish, the equation $z = x^2 + y^2$ also defines a paraboloid that can be obtained from the former one by a rotation about the $y$ axis through the angle $\pi/2$ under which $(x, y, z) \to (z, y, -x)$ so that $x = y^2 + z^2 \to z = y^2 + x^2$. Thus, there are quadric surfaces of the same shape described by different equations. The task here is to classify all the shapes of quadric surfaces.
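The coordinate substitution $(x, y, z) \to (z, y, -x)$ mentioned above can be spot-checked numerically. The following is a minimal Python sketch, not part of the original text; the function names `F` and `G` and the sample points are illustrative assumptions. It only verifies that substituting $(z, y, -x)$ into the left side of $x - y^2 - z^2 = 0$ reproduces the left side of $z - x^2 - y^2 = 0$.

```python
def F(x, y, z):
    """Left-hand side of x - y^2 - z^2 = 0 (paraboloid about the x axis)."""
    return x - y * y - z * z


def G(x, y, z):
    """Left-hand side of z - x^2 - y^2 = 0 (paraboloid about the z axis)."""
    return z - x * x - y * y


# Arbitrary sample points; they need not lie on either surface,
# because the two left-hand sides must agree identically.
samples = [(0.5, -1.0, 2.0), (3.0, 0.25, -1.5), (-2.0, 1.0, 0.0)]

# Substituting (x, y, z) -> (z, y, -x) into F must reproduce G.
same = all(abs(F(z, y, -x) - G(x, y, z)) < 1e-12 for (x, y, z) in samples)
print(same)
```

A check at a few points is of course not a proof; the identity itself follows from one line of algebra, $z - y^2 - (-x)^2 = z - x^2 - y^2$.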
The shape does not change under its rigid rotations and translations. On the other hand, the equation that describes the shape would change under translations and rotations of the coordinate system. The freedom in choosing the coordinate system can be used to simplify the equation for a quadric surface and obtain a classification of different shapes described by it.

78.1. Quadric Cylinders. Consider first a simpler problem in which the equation of a quadric surface does not contain one of the coordinates, say, $z$ (i.e., $C = q = v = \gamma = 0$). Then the set

$$S = \{(x, y, z) \mid Ax^2 + By^2 + pxy + \alpha x + \beta y + D = 0\}$$

has the same curve as its trace in every horizontal plane $z = \text{const}$. For example, if $A = B = 1$, $p = \alpha = \beta = 0$, and $D = -R^2$, the cross section of the surface $S$ by any horizontal plane is a circle $x^2 + y^2 = R^2$. So the surface $S$ is a cylinder of radius $R$ that is swept by the circle when the latter is shifted up and down parallel to the $z$ axis. Similarly, a general cylindrical shape is obtained by shifting a curve in the $xy$ plane up and down parallel to the $z$ axis.

THEOREM 11.8. (Classification of Quadric Cylinders). A general equation for quadric cylinders

$$S = \{(x, y, z) \mid Ax^2 + By^2 + pxy + \alpha x + \beta y + D = 0\}$$

can be brought to one of the standard forms

$$A'x^2 + B'y^2 + D' = 0 \quad \text{or} \quad A'x^2 + \beta' y = 0$$

by rotation and translation of the coordinate system, provided $A$, $B$, and $p$ do not vanish simultaneously. In particular, these forms define the quadric cylinders:

$$y - ax^2 = 0 \quad \text{(parabolic cylinder)}, \quad \beta' \neq 0,$$
$$\frac{x^2}{a^2} + \frac{y^2}{b^2} = 1 \quad \text{(elliptic cylinder)}, \quad \frac{A'}{D'} < 0, \ \frac{B'}{D'} < 0,$$
$$\frac{x^2}{a^2} - \frac{y^2}{b^2} = 1 \quad \text{(hyperbolic cylinder)}, \quad A'B' < 0, \ D' \neq 0.$$

The shapes of quadric cylinders are shown in Figure 11.20. Other than quadric cylinders, the standard equations may define planes or a line for some particular values of the constants $A'$, $B'$, $D'$, and $\beta'$. For example, for $A' = -B' = 1$ and $D' = 0$, the equation $x^2 - y^2 = 0$ defines two planes $x \pm y = 0$. For $A' = B' = 1$ and $D' = 0$, the equation $x^2 + y^2 = 0$ defines the line $x = y = 0$ (the $z$ axis).

FIGURE 11.20. Left: Parabolic cylinder. The cross section by any horizontal plane $z = \text{const}$ is a parabola $y = ax^2$. Middle: An elliptic cylinder. The cross section by any horizontal plane $z = \text{const}$ is an ellipse $x^2/a^2 + y^2/b^2 = 1$. Right: A hyperbolic cylinder. The cross section by any horizontal plane $z = \text{const}$ is a hyperbola $x^2/a^2 - y^2/b^2 = 1$.

Proof of Theorem 11.8. Let $(x, y)$ be coordinates in the coordinate system obtained by a rotation through an angle $\phi$. The equation of $S$ in the new coordinate system is obtained by the transformation

$$(x, y) \to (x\cos\phi - y\sin\phi,\ y\cos\phi + x\sin\phi)$$

according to Study Problem 11.2. The angle $\phi$ can be chosen so that the equation for $S$ does not contain the "mixed" term $xy$. Indeed, consider the transformation of quadratic terms in the equation for $S$:

$$x^2 \to \cos^2\phi\, x^2 + \sin^2\phi\, y^2 - 2\sin\phi\cos\phi\, xy = \tfrac12(1 + \cos(2\phi))x^2 + \tfrac12(1 - \cos(2\phi))y^2 - \sin(2\phi)\,xy,$$
$$y^2 \to \sin^2\phi\, x^2 + \cos^2\phi\, y^2 + 2\sin\phi\cos\phi\, xy = \tfrac12(1 - \cos(2\phi))x^2 + \tfrac12(1 + \cos(2\phi))y^2 + \sin(2\phi)\,xy,$$
$$xy \to \sin\phi\cos\phi\,(x^2 - y^2) + (\cos^2\phi - \sin^2\phi)\,xy = \tfrac12\sin(2\phi)(x^2 - y^2) + \cos(2\phi)\,xy.$$

After the transformation, the coefficient at $xy$ becomes

$$p \to p' = (B - A)\sin(2\phi) + p\cos(2\phi).$$

The angle $\phi$ is set so that $p' = 0$ or

$$\tan(2\phi) = \frac{p}{A - B} \quad \text{and} \quad \phi = \frac{\pi}{4} \ \text{if} \ A = B. \tag{11.20}$$

Similarly, the coefficients $A$ and $B$ (the factors at $x^2$ and $y^2$) and $\alpha$ and $\beta$ (the factors at $x$ and $y$) become

$$A \to A' = \tfrac12\left[A + B + (A - B)\cos(2\phi) + p\sin(2\phi)\right],$$
$$B \to B' = \tfrac12\left[A + B - (A - B)\cos(2\phi) - p\sin(2\phi)\right],$$
$$\alpha \to \alpha' = \alpha\cos\phi + \beta\sin\phi, \qquad \beta \to \beta' = \beta\cos\phi - \alpha\sin\phi,$$

where $\phi$ satisfies (11.20). Depending on the values of $A$, $B$, and $p$, the following three cases can occur.

First, $A' = B' = 0$, which is only possible if $A = B = p = 0$. Note that the combination $Ax^2 + By^2 + pxy$ becomes $A'x^2 + B'y^2 + p'xy$ in a rotated coordinate system. If $A' = B' = p' = 0$ for a particular $\phi$ (chosen to make $p' = 0$), then this combination should be identically 0 in any other coordinate system obtained by rotation.
In this case, $S$ is defined by the equation $\alpha x + \beta y + D = 0$, which is a plane parallel to the $z$ axis.

Second, only one of $A'$ and $B'$ vanishes. For establishing the shape, it is irrelevant how the horizontal and vertical coordinates in the plane are called. So, without loss of generality, put $B' = 0$. In this case, the equation for $S$ assumes the form $A'x^2 + \alpha' x + \beta' y + D = 0$ or, by completing the squares,

$$A'(x - x_0)^2 + \beta'(y - y_0) = 0, \qquad x_0 = -\frac{\alpha'}{2A'}, \qquad \beta' y_0 = \frac{\alpha'^2}{4A'} - D.$$

After the translation of the coordinate system $x \to x + x_0$ and $y \to y + y_0$, the equation is reduced to $A'x^2 + \beta' y = 0$. If $\beta' \neq 0$, it defines a parabola $y - ax^2 = 0$, where $a = -A'/\beta'$.

Third, neither $A'$ nor $B'$ vanishes. Then, after the completion of squares, the equation $A'x^2 + B'y^2 + \alpha' x + \beta' y + D = 0$ has the form

$$A'(x - x_0)^2 + B'(y - y_0)^2 + D' = 0,$$

where $x_0 = -\alpha'/(2A')$, $y_0 = -\beta'/(2B')$, and $D' = D + \tfrac12(\alpha' x_0 + \beta' y_0)$. Finally, after the translation of the origin to the point $(x_0, y_0)$, the equation becomes $A'x^2 + B'y^2 + D' = 0$. If $D' = 0$, then this equation defines two straight lines $y = \pm mx$, where $m = (-A'/B')^{1/2}$, provided $A'$ and $B'$ have opposite signs (otherwise, the equation has the only solution $x = y = 0$, which is a line in space). If $D' \neq 0$, then the equation can be written as $(-A'/D')x^2 + (-B'/D')y^2 = 1$. One can always assume that $A'/D' < 0$. Note that the rotation of the coordinate system through the angle $\pi/2$ swaps the axes, $(x, y) \to (y, -x)$, which can be used to reverse the sign of $A'/D'$. Now put $-A'/D' = 1/a^2$ and $-B'/D' = \pm 1/b^2$ (depending on whether $B'/D'$ is negative or positive) so that the equation becomes

$$\frac{x^2}{a^2} \pm \frac{y^2}{b^2} = 1.$$

When the plus is taken, this equation defines an ellipse. When the minus is taken, this equation defines a hyperbola. □

78.2. Classification of General Quadric Surfaces. The classification of general quadric surfaces can be carried out in the same way. The general quadratic equation can be written in the new coordinate system that is obtained by a translation (11.1) and a rotation (11.13).
The rotational freedom (three parameters) can be used to eliminate the "mixed" terms: $p \to p' = 0$, $q \to q' = 0$, and $v \to v' = 0$. After this rotation, the linear terms are eliminated by a suitable translation, provided $A'$, $B'$, and $C'$ do not vanish. The corresponding technicalities can be carried out best by linear algebra methods. So the final result is given without a proof.

THEOREM 11.9. (Classification of Quadric Surfaces). By rotation and translation of a coordinate system, a general equation for quadric surfaces can be brought into one of the standard forms:

$$A'x^2 + B'y^2 + C'z^2 + D' = 0 \quad \text{or} \quad A'x^2 + B'y^2 + \gamma' z = 0.$$

In particular, the standard forms describe quadric cylinders and the following six surfaces:

$$\frac{x^2}{a^2} + \frac{y^2}{b^2} + \frac{z^2}{c^2} = 1 \quad \text{(ellipsoid)},$$
$$\frac{z^2}{c^2} = \frac{x^2}{a^2} + \frac{y^2}{b^2} \quad \text{(elliptic double cone)},$$
$$\frac{x^2}{a^2} + \frac{y^2}{b^2} - \frac{z^2}{c^2} = 1 \quad \text{(hyperboloid of one sheet)},$$
$$-\frac{x^2}{a^2} - \frac{y^2}{b^2} + \frac{z^2}{c^2} = 1 \quad \text{(hyperboloid of two sheets)},$$
$$\frac{z}{c} = \frac{x^2}{a^2} + \frac{y^2}{b^2} \quad \text{(elliptic paraboloid)},$$
$$\frac{z}{c} = \frac{x^2}{a^2} - \frac{y^2}{b^2} \quad \text{(hyperbolic paraboloid)}.$$

FIGURE 11.21. Left: An ellipsoid. A cross section by any coordinate plane is an ellipse. Right: An elliptic double cone. A cross section by a horizontal plane $z = \text{const}$ is an ellipse. A cross section by any vertical plane through the $z$ axis is two lines through the origin.

The six shapes are the counterparts in three dimensions of the conic sections in the plane discussed in Calculus II. Other than quadric cylinders or the above six shapes, a quadratic equation may also define planes and lines for particular values of its parameters.

78.3. Visualization of Quadric Surfaces. The shape of a quadric surface can be understood by studying intersections of the surface with the planes $x = x_0$, $y = y_0$, and $z = z_0$, which are parallel to the coordinate planes. These intersections are also called traces.

An Ellipsoid. If $a^2 = b^2 = c^2 = R^2$, then the ellipsoid becomes a sphere of radius $R$. So, intuitively, an ellipsoid is a sphere "stretched" along the coordinate axes (see Figure 11.21, left panel).
Traces of an ellipsoid in the planes $x = x_0$, $|x_0| < a$, are ellipses

$$\frac{y^2}{b^2} + \frac{z^2}{c^2} = k \quad \text{or} \quad \frac{y^2}{(b\sqrt{k})^2} + \frac{z^2}{(c\sqrt{k})^2} = 1, \qquad k = 1 - \frac{x_0^2}{a^2} > 0.$$

As the plane $x = x_0$ gets closer to $x = a$ or $x = -a$, $k$ becomes smaller and so does the ellipse because its major axes $b\sqrt{k}$ and $c\sqrt{k}$ decrease. Apparently, the traces in the planes $x = \pm a$ consist of a single point $(\pm a, 0, 0)$, and there is no trace in any plane $x = x_0$ if $|x_0| > a$. Traces in the planes $y = y_0$ and $z = z_0$ are also ellipses and exist only if $|y_0| < b$ and $|z_0| < c$. Thus, the characteristic geometrical property of an ellipsoid is that its traces are ellipses.

FIGURE 11.22. Left: A hyperboloid of one sheet. A cross section by a horizontal plane $z = \text{const}$ is an ellipse. A cross section by a vertical plane $x = \text{const}$ or $y = \text{const}$ is a hyperbola. Right: A hyperboloid of two sheets. A nonempty cross section by a horizontal plane is an ellipse. A cross section by a vertical plane is a hyperbola.

FIGURE 11.23. Left: An elliptic paraboloid. A nonempty cross section by a horizontal plane is an ellipse. A cross section by a vertical plane is a parabola. Right: A hyperbolic paraboloid (a "saddle"). A cross section by a horizontal plane is a hyperbola. A cross section by a vertical plane is a parabola.

A Paraboloid. Suppose $c > 0$. Then the paraboloid lies above the $xy$ plane because it has no trace in any horizontal plane below the $xy$ plane, $z = z_0 < 0$. In the $xy$ plane, its trace contains just the origin. Horizontal traces (in the planes $z = z_0$) of the paraboloid are ellipses:

$$\frac{x^2}{a^2} + \frac{y^2}{b^2} = k \quad \text{or} \quad \frac{x^2}{(a\sqrt{k})^2} + \frac{y^2}{(b\sqrt{k})^2} = 1, \qquad k = \frac{z_0}{c} > 0.$$

The ellipses become wider as $z_0$ gets larger because their major axes $a\sqrt{k}$ and $b\sqrt{k}$ grow with increasing $k$. Vertical traces (traces in the planes $x = x_0$ and $y = y_0$) are parabolas:

$$z - kc = \frac{c}{b^2}\,y^2, \quad k = \frac{x_0^2}{a^2} \qquad \text{and} \qquad z - kc = \frac{c}{a^2}\,x^2, \quad k = \frac{y_0^2}{b^2}.$$

Similarly, a paraboloid with $c < 0$ lies below the $xy$ plane.
So the characteristic geometrical property of a paraboloid is that its horizontal traces are ellipses, while its vertical ones are parabolas (see Figure 11.23, left panel). If $a = b$, the paraboloid is also called a circular paraboloid because its horizontal traces are circles.

A Double Cone. The horizontal traces are ellipses:

$$\frac{x^2}{a^2} + \frac{y^2}{b^2} = k^2 \quad \text{or} \quad \frac{x^2}{(ak)^2} + \frac{y^2}{(bk)^2} = 1, \qquad k = \frac{z_0}{c}.$$

So as $|z_0|$ grows, that is, as the horizontal plane moves away from the $xy$ plane ($z = 0$), the ellipses become wider. In the $xy$ plane, the cone has a trace that consists of a single point (the origin). The vertical traces in the planes $x = 0$ and $y = 0$ are pairs of lines $z = \pm(c/b)y$ and $z = \pm(c/a)x$, respectively. Furthermore, the trace in any plane that contains the $z$ axis is also a pair of straight lines. Indeed, take parametric equations of a line in the $xy$ plane through the origin, $x = v_1 t$, $y = v_2 t$. Then the $z$ coordinate of any point of the trace of the cone in the plane that contains the $z$ axis and this line satisfies the equation $z^2/c^2 = [(v_1/a)^2 + (v_2/b)^2]t^2$ or $z = \pm v_3 t$, where $v_3 = c\sqrt{(v_1/a)^2 + (v_2/b)^2}$. So the points of intersection, $x = v_1 t$, $y = v_2 t$, $z = \pm v_3 t$ for all real $t$, form two straight lines through the origin.

Given an ellipse in a plane, consider a line through the center of the ellipse that is perpendicular to the plane. Fix a point $P$ on this line that does not coincide with the point of intersection of the line and the plane. Then a double cone is the surface that contains all lines through $P$ and points of the ellipse. The point $P$ is called the vertex of the cone. So the characteristic geometrical property of a cone is that horizontal traces are ellipses; its vertical traces in planes through the axis of the cone are straight lines (see Figure 11.21, right panel). Vertical traces in the planes $x = x_0 \neq 0$ and $y = y_0 \neq 0$ are hyperbolas $y^2/b^2 - z^2/c^2 = k$, where $k = -x_0^2/a^2$, and $x^2/a^2 - z^2/c^2 = k$, where $k = -y_0^2/b^2$.
Recall in this regard conic sections studied in Calculus II. If $a = b$, the cone is called a circular cone. In this case, vertical traces in the planes containing the cone axis are a pair of lines with the same slope that is determined by the angle $\phi$ between the axis of the cone and any of these lines: $c/b = c/a = \cot\phi$. The equation of a circular double cone can be written as

$$z^2 = \cot^2(\phi)\,(x^2 + y^2), \qquad 0 < \phi < \pi/2.$$

The equation for an upper or lower cone of the double circular cone is $z = \pm\cot(\phi)\sqrt{x^2 + y^2}$.

A Hyperbolic Paraboloid. The horizontal traces are hyperbolas:

$$\frac{x^2}{a^2} - \frac{y^2}{b^2} = k, \qquad k = \frac{z_0}{c}.$$

Suppose $c > 0$. If $z_0 > 0$ (horizontal planes above the $xy$ plane), then $k > 0$. In this case, the hyperbolas are symmetric about the $x$ axis, and their branches lie either in $x > 0$ or in $x < 0$ (i.e., they do not intersect the $y$ axis) because $x^2/a^2 = y^2/b^2 + k > 0$ ($x$ does not vanish for any $y$). If $z_0 < 0$, then $k < 0$. In this case, the hyperbolas are symmetric about the $y$ axis, and their branches lie either in $y > 0$ or in $y < 0$ (i.e., they do not intersect the $x$ axis) because $y^2/b^2 = x^2/a^2 - k > 0$ ($y$ cannot vanish for any $x$). Vertical traces in the planes $x = x_0$ and $y = y_0$ are downward and upward parabolas, respectively:

$$z - z_0 = -\frac{c}{b^2}\,y^2, \quad z_0 = \frac{c\,x_0^2}{a^2} \qquad \text{and} \qquad z - z_0 = \frac{c}{a^2}\,x^2, \quad z_0 = -\frac{c\,y_0^2}{b^2}.$$

Take the parabolic trace in the $zx$ plane, $z = (c/a^2)x^2$ (i.e., in the plane $y = y_0 = 0$). The traces in the perpendicular planes $x = x_0$ are parabolas whose vertices are $(x_0, 0, z_0)$, where $z_0 = (c/a^2)x_0^2$, and hence lie on the parabola $z = (c/a^2)x^2$ in the $zx$ plane. This observation suggests that the hyperbolic paraboloid is swept by the parabola in the $zy$ plane, $z = -(c/b^2)y^2$, when the latter is moved parallel to itself so that its vertex remains on the parabola $z = (c/a^2)x^2$ in the perpendicular plane. The obtained surface has the characteristic shape of a "saddle" (see Figure 11.23, right panel).

A Hyperboloid of One Sheet.
Traces in horizontal planes $z = z_0$ are ellipses:

$$\frac{x^2}{a^2} + \frac{y^2}{b^2} = k^2 \quad \text{or} \quad \frac{x^2}{(ka)^2} + \frac{y^2}{(kb)^2} = 1, \qquad k = \sqrt{1 + z_0^2/c^2} \ge 1.$$

The ellipse is the smallest in the $xy$ plane ($z_0 = 0$ and $k = 1$). The major axes of the ellipse, $ka$ and $kb$, grow as the horizontal plane gets away from the $xy$ plane because $k$ increases. The surface looks like a tube with ever-expanding elliptic cross section. The vertical cross sections of the "tube" by the planes $x = 0$ and $y = 0$ are hyperbolas:

$$\frac{y^2}{b^2} - \frac{z^2}{c^2} = 1 \qquad \text{and} \qquad \frac{x^2}{a^2} - \frac{z^2}{c^2} = 1.$$

So the characteristic geometrical property of a hyperboloid of one sheet is that its horizontal traces are ellipses and its vertical traces are hyperbolas (see Figure 11.22, left panel).

A Hyperboloid of Two Sheets. A distinctive feature of this surface is that it consists of two sheets (see Figure 11.22, right panel). Indeed, the trace in the plane $z = z_0$ satisfies the equation

$$\frac{x^2}{a^2} + \frac{y^2}{b^2} = \frac{z_0^2}{c^2} - 1,$$

which has no solution if $z_0^2/c^2 - 1 < 0$ or $-c < z_0 < c$. So one sheet lies above the plane $z = c$, and the other lies below the plane $z = -c$. Horizontal traces in the planes $z = z_0 > c$ or $z = z_0 < -c$ are ellipses whose major axes increase with increasing $|z_0|$. The upper sheet touches the plane $z = c$ at the point $(0, 0, c)$, while the lower sheet touches the plane $z = -c$ at the point $(0, 0, -c)$. These points are called vertices of a hyperboloid of two sheets. Vertical traces in the planes $x = 0$ and $y = 0$ are hyperbolas:

$$\frac{z^2}{c^2} - \frac{y^2}{b^2} = 1 \qquad \text{and} \qquad \frac{z^2}{c^2} - \frac{x^2}{a^2} = 1.$$

Thus, the characteristic geometrical properties of hyperboloids of one sheet and two sheets are similar, apart from the fact that the latter one consists of two sheets. Also, in the asymptotic region $|z| \gg c$, the hyperboloids approach the surface of the double cone. Indeed, in this case, $z^2/c^2 \gg 1$, and hence the equations $x^2/a^2 + y^2/b^2 = \pm 1 + z^2/c^2$ can be well approximated by the double-cone equation ($\pm 1$ can be neglected on the right side of the equations).
In the region $z > 0$, the hyperboloid of one sheet approaches the double cone from below, while the hyperboloid of two sheets approaches it from above. For $z < 0$, the converse holds. In other words, the hyperboloid of two sheets lies "inside" the cone, while the hyperboloid of one sheet lies "outside" it.

78.4. Shifted Quadric Surfaces. If the origin of a coordinate system is shifted to a point $(x_0, y_0, z_0)$ without any rotation of the coordinate axes, then the coordinates of a point in space are translated, $(x, y, z) \to (x - x_0, y - y_0, z - z_0)$. Therefore, any equation of the form $f(x, y, z) = 0$ becomes $f(x - x_0, y - y_0, z - z_0) = 0$ in the new coordinate system. If the equation $f(x, y, z) = 0$ defines a surface in space, then the equation $f(x - x_0, y - y_0, z - z_0) = 0$ defines the very same surface that has been translated as a whole (each point of the surface is shifted by the same vector $(x_0, y_0, z_0)$). For example, the equation

$$\frac{(x - x_0)^2}{a^2} + \frac{(y - y_0)^2}{b^2} = \frac{(z - z_0)^2}{c^2}$$

describes a double elliptic cone whose axis is parallel to the $z$ axis and whose vertex is at $(x_0, y_0, z_0)$. Equations of shifted quadric surfaces can be reduced to the standard form by completing the squares.

EXAMPLE 11.26. Classify the quadric surface $9x^2 + 36y^2 + 4z^2 - 18x + 72y + 16z + 25 = 0$.

SOLUTION: Let us complete the squares for each of the variables:

$$9x^2 - 18x = 9(x^2 - 2x) = 9[(x - 1)^2 - 1] = 9(x - 1)^2 - 9,$$
$$36y^2 + 72y = 36(y^2 + 2y) = 36[(y + 1)^2 - 1] = 36(y + 1)^2 - 36,$$
$$4z^2 + 16z = 4(z^2 + 4z) = 4[(z + 2)^2 - 4] = 4(z + 2)^2 - 16.$$

The equation becomes $9(x - 1)^2 + 36(y + 1)^2 + 4(z + 2)^2 = 36$, and, by dividing it by 36, the standard form is obtained:

$$\frac{(x - 1)^2}{4} + (y + 1)^2 + \frac{(z + 2)^2}{9} = 1.$$

This equation describes an ellipsoid with the center at $(1, -1, -2)$ and semiaxes $a = 2$, $b = 1$, and $c = 3$. □

EXAMPLE 11.27. Classify the surface $x^2 + 2y^2 - 4y - 2z = 0$.

SOLUTION: By completing the squares, $2y^2 - 4y = 2(y^2 - 2y) = 2[(y - 1)^2 - 1]$, the equation can be written in the form

$$x^2 + 2(y - 1)^2 - 2 - 2z = 0 \quad \text{or} \quad z + 1 = \frac{x^2}{2} + (y - 1)^2,$$

which is an elliptic paraboloid with the vertex at $(0, 1, -1)$ because it is obtained from the standard equation $z = x^2/2 + y^2$ by the shift of the coordinate system $(x, y, z) \to (x, y - 1, z + 1)$. □

EXAMPLE 11.28. Classify the surface $x^2 - 4y^2 + 4z^2 - 2x + 8z + 1 = 0$.

SOLUTION: By completing the squares, the equation is transformed to

$$(x - 1)^2 - 1 - 4y^2 + 4(z + 1)^2 - 4 + 1 = 0 \quad \text{or} \quad \frac{(x - 1)^2}{4} + (z + 1)^2 - y^2 = 1,$$

which is a hyperboloid of one sheet whose axis is the line through $(1, 0, -1)$ that is parallel to the $y$ axis. □

EXAMPLE 11.29. Use an appropriate rotation in the $xy$ plane to reduce the equation $z = 2xy$ to the standard form and classify the surface.

SOLUTION: Let $(x', y')$ be coordinates in the coordinate system rotated through the angle $\phi$ as depicted in Figure 11.3 (right panel). In Study Problem 11.2, the old coordinates $(x, y)$ are expressed via the new ones $(x', y')$: $x = x'\cos\phi - y'\sin\phi$, $y = y'\cos\phi + x'\sin\phi$. In the new coordinate system, the equation

$$z = 2xy = 2x'^2\cos\phi\sin\phi - 2y'^2\cos\phi\sin\phi + 2x'y'(\cos^2\phi - \sin^2\phi)$$

would have the standard form if the coefficient at $x'y'$ vanishes. So put $\phi = \pi/4$. Then $2\sin\phi\cos\phi = \sin(2\phi) = 1$ and $z = x'^2 - y'^2$, which is the hyperbolic paraboloid. □

78.5. Study Problems.

Problem 11.30. Classify the quadric surface $3x^2 + 3z^2 - 2xz = 4$.

SOLUTION: The equation does not contain one variable (the $y$ coordinate). The surface is a cylinder parallel to the $y$ axis. To determine the type of cylinder, consider a rotation of the coordinate system in the $xz$ plane and choose the rotation angle so that the coefficient at the $xz$ term vanishes in the transformed equation. According to (11.20), $A = B = 3$ and $p = -2$, and hence $\phi = \pi/4$. Then $A' = (A + B - p)/2 = 4$ and $B' = (A + B + p)/2 = 2$ (the two values are interchanged if the rotation is taken in the opposite sense, which does not affect the shape). So, in the new coordinates, the equation becomes $4x^2 + 2z^2 = 4$ or $x^2 + z^2/2 = 1$, which is an ellipse with semiaxes $a = 1$ and $b = \sqrt{2}$.
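The rotation step used in this solution can be checked numerically. The following is a minimal Python sketch, not part of the original text, that applies (11.20) and the formulas for $A'$, $B'$, and $p'$ from the proof of Theorem 11.8; the function name `remove_mixed_term` is an illustrative assumption. With $\phi = \pi/4$ taken literally from those formulas, the two rotated coefficients come out as 2 and 4, that is, the same pair of values up to interchanging the rotated axes.

```python
import math


def remove_mixed_term(A, B, p):
    """Rotate A*x^2 + B*y^2 + p*x*y so that the mixed term vanishes.

    Uses tan(2*phi) = p / (A - B), with phi = pi/4 when A == B, and
    returns (phi, A_new, B_new, p_new) from the rotation formulas.
    """
    phi = math.pi / 4 if A == B else 0.5 * math.atan(p / (A - B))
    c2, s2 = math.cos(2 * phi), math.sin(2 * phi)
    A_new = 0.5 * (A + B + (A - B) * c2 + p * s2)
    B_new = 0.5 * (A + B - (A - B) * c2 - p * s2)
    p_new = (B - A) * s2 + p * c2  # should vanish up to rounding
    return phi, A_new, B_new, p_new


# Quadratic part of 3x^2 + 3z^2 - 2xz = 4 (z plays the role of y here).
phi, A1, B1, p1 = remove_mixed_term(3.0, 3.0, -2.0)

# Semiaxes of A1*x^2 + B1*z^2 = 4, listed in increasing order.
semiaxes = sorted(math.sqrt(4.0 / coeff) for coeff in (A1, B1))
print(phi, A1, B1, p1, semiaxes)
```

Both rotated coefficients are positive and the right side is positive, so the trace is an ellipse; the computed semiaxes are 1 and approximately 1.414, in agreement with $a = 1$ and $b = \sqrt{2}$.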
The surface is an elliptic cylinder. □

Problem 11.31. Classify the quadric surface $x^2 - 2x + y + z = 0$.

SOLUTION: By completing the squares, the equation can be transformed into the form $(x - 1)^2 + (y - 1) + z = 0$. After shifting the origin to the point $(1, 1, 0)$, the equation becomes $x^2 + y + z = 0$. Consider rotations of the coordinate system about the $x$ axis: $y \to \cos\phi\, y + \sin\phi\, z$, $z \to \cos\phi\, z - \sin\phi\, y$. Under this rotation, $y + z \to (\cos\phi - \sin\phi)y + (\sin\phi + \cos\phi)z$. Therefore, for $\phi = -\pi/4$, the equation assumes one of the standard forms $x^2 + \sqrt{2}\,y = 0$, which corresponds to a parabolic cylinder. □

Problem 11.32. Classify the quadric surface $x^2 + z^2 - 2x + 2z - y = 0$.

SOLUTION: By completing the squares, the equation can be transformed into the form $(x - 1)^2 + (z + 1)^2 - (y + 2) = 0$. The latter can be brought into one of the standard forms by shifting the origin to the point $(1, -2, -1)$: $x^2 + z^2 = y$, which is a circular paraboloid. Its symmetry axis is parallel to the $y$ axis (the line of intersection of the planes $x = 1$ and $z = -1$), and its vertex is $(1, -2, -1)$. □

FIGURE 11.24. An illustration to Study Problem 11.33. The vector $\hat{\mathbf{n}}_\theta$ rotates about the vertical line so that the line through $(1, 2, 0)$ and parallel to $\mathbf{v}_\theta$ sweeps a double cone with the vertex at $(1, 2, 0)$.

Problem 11.33. Sketch and/or describe the set of points in space formed by a family of lines through the point $(1, 2, 0)$ and parallel to $\mathbf{v}_\theta = (\cos\theta, \sin\theta, 1)$, where $\theta \in [0, 2\pi]$ labels lines in the family.

SOLUTION: The parametric equations of each line are $x = 1 + t\cos\theta$, $y = 2 + t\sin\theta$, and $z = t$. Therefore, $(x - 1)^2 + (y - 2)^2 = z^2$ for all values of $t$ and $\theta$. Thus, the lines form a double cone whose axis is parallel to the $z$ axis and whose vertex is $(1, 2, 0)$. Alternatively, one could notice that the vector $\mathbf{v}_\theta$ rotates about the $z$ axis as $\theta$ changes.
Indeed, put $\mathbf{v}_\theta = \hat{\mathbf{n}}_\theta + \hat{\mathbf{e}}_z$, where $\hat{\mathbf{n}}_\theta = (\cos\theta, \sin\theta, 0)$ is the unit vector in the $xy$ plane as shown in Figure 11.24. It rotates as $\theta$ changes, making a full turn as $\theta$ increases from 0 to $2\pi$. So the set in question can be obtained by rotating a particular line, say, the one corresponding to $\theta = 0$, about the vertical line through $(1, 2, 0)$. The line sweeps the double cone. □

78.6. Exercises.
(1) Use traces to sketch and identify each of the following surfaces:
(i) $y^2 = x^2 + 9z^2$
(ii) $y = x^2 - z^2$
(iii) $4x^2 + 2y^2 + z^2 = 4$
(iv) $x^2 - y^2 + z^2 = -1$
(v) $y^2 + 4z^2 = 16$
(vi) $x^2 - y^2 + z^2 = 1$
(vii) $x^2 + 4y^2 - 9z^2 + 1 = 0$
(viii) $x^2 + z = 0$
(ix) $x^2 + 9y^2 + z = 0$
(x) $y^2 - 4z^2 = 16$
(2) Reduce each of the following equations to one of the standard forms, classify the surface, and sketch it:
(i) $x^2 + y^2 + 4z^2 - 2x + 4y = 0$
(ii) $x^2 - y^2 + z^2 + 2x - 2y + 4z + 2 = 0$
(iii) $x^2 + 4y^2 - 6x + z = 0$
(iv) $y^2 - 4z^2 + 2y - 16z = 0$
(v) $x^2 - y^2 + z^2 - 2x + 2y = 0$
(3) Use rotations in the appropriate coordinate plane to reduce each of the following equations to one of the standard forms and classify the surface:
(i) $6xy + x^2 + y^2 = 1$
(ii) $3y^2 + 3z^2 - 2yz = 1$
(iii) $x - yz = 0$
(iv) $xy - z^2 = 0$
(v) $2xz + x^2 - y^2 = 0$
(4) Find an equation for the surface obtained by rotating the line $y = 2x$ about the $y$ axis. Classify the surface.
(5) Find an equation for the surface obtained by rotating the curve $y = 1 + z^2$ about the $y$ axis. Classify the surface.
(6) Find equations for the family of surfaces obtained by rotating the curves $x^2 - 4y^2 = k$ about the $y$ axis, where $k$ is real. Classify the surfaces.
(7) Find an equation for the surface consisting of all points that are equidistant from the point $(1, 1, 1)$ and the plane $z = 2$.
(8) Sketch the solid region bounded by the surface $z = x^2 + y^2$ from below and by $x^2 + y^2 + z^2 - 2z = 0$ from above.
(9) Sketch the solid region that is bounded by the surfaces $y = 2 - x^2 - z^2$ and $y = x^2 + z^2 - 2$ and lies inside the cylinder $x^2 + z^2 = 1$.
(10) Sketch the solid region bounded by the surfaces $x^2 + y^2 = R^2$ and $x^2 + z^2 = R^2$.
(11) Find an equation for the surface consisting of all points $P$ for which the distance from $P$ to the $y$ axis is twice the distance from $P$ to the $zx$ plane. Identify the surface.
(12) Show that if the point $(a, b, c)$ lies on the hyperbolic paraboloid $z = y^2 - x^2$, then the lines through $(a, b, c)$ and parallel to $\mathbf{v} = (1, 1, 2(b - a))$ and $\mathbf{u} = (1, -1, -2(b + a))$ both lie entirely on this paraboloid. Deduce from this result that the hyperbolic paraboloid can be generated by the motion of a straight line. Show that hyperboloids of one sheet, cones, and cylinders can also be obtained by the motion of a straight line. Remark. The fact that hyperboloids of one sheet are generated by the motion of a straight line is used to produce gear transmissions. The cogs of the gears are the generating lines of the hyperboloids.
(13) Find an equation for the cylinder of radius $R$ whose axis goes through the origin and is parallel to a vector $\mathbf{v}$.
(14) Show that the curve of intersection of the surfaces $x^2 - 2y^2 + 3z^2 - 2x + y - z = 1$ and $2x^2 - 4y^2 + 6z^2 + x - y + 2z = 4$ lies in a plane.
(15) What are the curves that bound the projections of the ellipsoid $x^2 + y^2 + z^2 - xy = 1$ on the coordinate planes?

CHAPTER 12
Vector Functions

79. Curves in Space and Vector Functions

To describe the motion of a pointlike object in space, its position vector must be specified at every moment of time. A vector is defined by three components in a coordinate system. Therefore, the motion of the object can be described by an ordered triple of real-valued functions of time. This observation leads to the concept of vector-valued functions of a real variable.

DEFINITION 12.1. (Vector Function). Let $D$ be a set of real numbers. A vector function $\mathbf{r}(t)$ of a real variable $t$ is a rule that assigns a vector to every value of $t$ from $D$. The set $D$ is called the domain of the vector function.
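As a minimal sketch of this definition (Python is used purely for illustration and is not part of the original text), a vector function can be modeled as a rule that returns a triple of components, together with a test that tells whether a value of $t$ belongs to the domain $D$. The particular rule $\mathbf{r}(t) = (\sqrt{1 - t}, \ln t, t^2)$ and the names `r` and `in_domain` are illustrative assumptions.

```python
import math


def in_domain(t):
    """True if all three components of r can be evaluated at t.

    sqrt(1 - t) needs t <= 1, ln(t) needs t > 0, and t**2 imposes
    no restriction, so the domain is the interval (0, 1].
    """
    return 0.0 < t <= 1.0


def r(t):
    """Vector function r(t) = (sqrt(1 - t), ln t, t^2) as a triple."""
    if not in_domain(t):
        raise ValueError("t is outside the domain (0, 1]")
    return (math.sqrt(1.0 - t), math.log(t), t * t)


for t in (-1.0, 0.0, 0.5, 1.0, 2.0):
    print(t, r(t) if in_domain(t) else "not in D")
```

The domain test is simply the intersection of the conditions under which each component makes sense, which is how the domain of a vector function given by an algebraic rule is usually found.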
The most commonly used rules to define a vector function are algebraic rules that specify components of a vector function in a coordinate system as functions of a real variable: $\mathbf{r}(t) = (x(t), y(t), z(t))$. For example,

$$\mathbf{r}(t) = (\sqrt{1 - t},\ \ln(t),\ t^2) \quad \text{or} \quad x(t) = \sqrt{1 - t}, \quad y(t) = \ln(t), \quad z(t) = t^2.$$

Unless specified otherwise, the domain of the vector function is the set $D$ of all values of $t$ at which the algebraic rule makes sense; that is, all three components can be computed for any $t$ from $D$. In the above example, the domain of $x(t)$ is $-\infty < t \le 1$, the domain of $y(t)$ is $0 < t < \infty$, and the domain of $z(t)$ is $-\infty < t < \infty$. The domain of the vector function is the intersection of the domains of its components: $D = (0, 1]$.

Suppose that the components of a vector function $\mathbf{r}(t)$ are continuous functions on an interval $D = I = [a, b]$. Consider all vectors $\mathbf{r}(t)$, as $t$ ranges over $I$, positioned so that their initial points are at a fixed point (e.g., the origin of a coordinate system). Then the terminal points of the vectors $\mathbf{r}(t)$ form a curve in space as depicted in Figure 12.1 (left panel). The simplest example is provided by the motion along a straight line, which is described by a linear vector function $\mathbf{r}(t) = \mathbf{r}_0 + t\mathbf{v}$.

FIGURE 12.1. Left: The terminal point of a vector $\mathbf{r}(t)$ whose components are continuous functions of $t$ traces out a curve in space. Right: Graphing a space curve. Draw a curve in the $xy$ plane defined by the parametric equations $x = x(t)$, $y = y(t)$. It is traced out by the vector $\mathbf{R}(t) = (x(t), y(t), 0)$. This planar curve defines a cylindrical surface in space in which the space curve in question lies. The space curve is obtained by raising or lowering the points of the planar curve along the surface by the amount $z(t)$, that is, $\mathbf{r}(t) = \mathbf{R}(t) + \hat{\mathbf{e}}_3 z(t)$. In other words, the graph $z = z(t)$ is wrapped around the cylindrical surface.
Thus, the range of a vector function defines a curve in space, and a graph of a vector function is a curve in space.

79.1. Graphing Space Curves. To visualize the shape of a curve $C$ traced out by a vector function, it is convenient to think about $\mathbf{r}(t)$ as a trajectory of motion. The position of a particle in space may be determined by its position in a plane and its height relative to that plane. For example, this plane can be chosen to be the $xy$ plane. Then

$$\mathbf{r}(t) = (x(t), y(t), z(t)) = (x(t), y(t), 0) + (0, 0, z(t)) = \mathbf{R}(t) + z(t)\hat{\mathbf{e}}_3.$$

Consider the curve defined by the parametric equations $x = x(t)$, $y = y(t)$ in the $xy$ plane. One can mark a few points along the curve corresponding to particular values of $t$, say, $P_n$ with coordinates $(x(t_n), y(t_n))$, $n = 1, 2, \ldots, N$. Then the corresponding points of the curve $C$ are obtained from them by moving the points $P_n$ along the direction normal to the plane (i.e., along the $z$ axis in this case) by the amount $z(t_n)$; that is, $P_n$ goes up if $z(t_n) > 0$ or down if $z(t_n) < 0$. In other words, as a particle moves along the curve $x = x(t)$, $y = y(t)$, it ascends or descends according to the corresponding value of $z(t)$.

The curve can also be visualized by using a piece of paper. Consider a general cylinder with the horizontal trace being the curve $x = x(t)$, $y = y(t)$, like a wall of the shape defined by this curve. Then make a graph of the function $z(t)$ on a piece of paper (wallpaper) and glue it to the wall so that the $t$ axis of the graph is glued to the curve $x = x(t)$, $y = y(t)$ while each point $t$ on the $t$ axis coincides with the corresponding point $(x(t), y(t))$ of the curve. After such a procedure, the graph of $z(t)$ along the wall would coincide with the curve $C$ traced out by $\mathbf{r}(t)$. The procedure is illustrated in Figure 12.1 (right panel).

EXAMPLE 12.1. Graph the vector function $\mathbf{r} = (\cos t, \sin t, t)$, where $t$ ranges over the real line.
SOLUTION: It is convenient to represent $\mathbf{r}(t)$ as the sum of a vector in the xy plane and a vector parallel to the z axis. In the xy plane, the curve $x = \cos t$, $y = \sin t$ is the circle of unit radius traced out counterclockwise so that the point $(1, 0, 0)$ corresponds to $t = 0$. The circular motion is periodic with period $2\pi$. The height $z(t) = t$ rises linearly as the point moves along the circle. Starting from $(1, 0, 0)$, the curve makes one turn on the surface of the cylinder of unit radius, climbing up by $2\pi$. Think of a piece of paper with a straight line depicted on it that is wrapped around the cylinder. Thus, the curve traced by $\mathbf{r}(t)$ lies on the surface of a cylinder of unit radius and periodically winds about it, climbing by $2\pi$ per turn. Such a curve is called a helix. The procedure is shown in Figure 12.2. □

FIGURE 12.2. Graphing a helix. Left: The curve $\mathbf{R}(t) = \langle \cos t, \sin t, 0\rangle$ is a circle of unit radius, traced out counterclockwise. So the helix lies on the cylinder of unit radius whose symmetry axis is the z axis. Middle: The graph $z = z(t) = t$ is a straight line that defines the height of helix points relative to the circle traced out by $\mathbf{R}(t)$. Right: The graph of the helix $\mathbf{r}(t) = \mathbf{R}(t) + z(t)\hat{\mathbf{e}}_3$. As $\mathbf{R}(t)$ traverses the circle, the height $z(t) = t$ rises linearly. So the helix can be viewed as a straight line wrapped around the cylinder.

79.2. Limits and Continuity of Vector Functions.

DEFINITION 12.2. (Limit of a Vector Function). A vector $\mathbf{r}_0$ is called the limit of a vector function $\mathbf{r}(t)$ as $t \to t_0$ if
$$\lim_{t\to t_0} \|\mathbf{r}(t) - \mathbf{r}_0\| = 0 ;$$
the limit is denoted as $\lim_{t\to t_0} \mathbf{r}(t) = \mathbf{r}_0$. The left and right limits, $\lim_{t\to t_0^-} \mathbf{r}(t)$ and $\lim_{t\to t_0^+} \mathbf{r}(t)$, are defined similarly.

This definition says that the length or norm of the vector $\mathbf{r}(t) - \mathbf{r}_0$ approaches 0 as $t$ tends to $t_0$. The norm of a vector vanishes if and only if the vector is the zero vector. Therefore, the following theorem holds.

THEOREM 12.1.
(Limit of a Vector Function). Let $\mathbf{r}(t) = \langle x(t), y(t), z(t)\rangle$ and let $\mathbf{r}_0 = \langle x_0, y_0, z_0\rangle$. Then the limit of a vector function exists if and only if the limits of its components exist:
$$\lim_{t\to t_0} \mathbf{r}(t) = \mathbf{r}_0 \iff \lim_{t\to t_0} x(t) = x_0 , \quad \lim_{t\to t_0} y(t) = y_0 , \quad \lim_{t\to t_0} z(t) = z_0 .$$
This theorem reduces the problem of finding the limit of a vector function to the problem of finding the limits of three ordinary functions.

EXAMPLE 12.2. Let $\mathbf{r}(t) = \langle \sin(t)/t ,\ t\ln t ,\ (e^t - 1 - t)/t^2\rangle$. Find the limit of $\mathbf{r}(t)$ as $t \to 0^+$ or show that it does not exist.

SOLUTION: The existence of the limits of the components of the given vector function can be investigated by l'Hospital's rule:
$$\lim_{t\to 0^+} \frac{\sin t}{t} = \lim_{t\to 0^+} \frac{(\sin t)'}{(t)'} = \lim_{t\to 0^+} \frac{\cos t}{1} = 1,$$
$$\lim_{t\to 0^+} t\ln t = \lim_{t\to 0^+} \frac{\ln t}{t^{-1}} = \lim_{t\to 0^+} \frac{(\ln t)'}{(t^{-1})'} = \lim_{t\to 0^+} \frac{t^{-1}}{-t^{-2}} = -\lim_{t\to 0^+} t = 0,$$
$$\lim_{t\to 0^+} \frac{e^t - 1 - t}{t^2} = \lim_{t\to 0^+} \frac{e^t - 1}{2t} = \lim_{t\to 0^+} \frac{e^t}{2} = \frac{1}{2},$$
where l'Hospital's rule has been used twice to calculate the last limit. Therefore, $\lim_{t\to 0^+} \mathbf{r}(t) = \langle 1, 0, 1/2\rangle$. □

DEFINITION 12.3. (Continuity of a Vector Function). A vector function $\mathbf{r}(t)$, $t \in [a, b]$, is said to be continuous at $t = t_0 \in [a, b]$ if
$$\lim_{t\to t_0} \mathbf{r}(t) = \mathbf{r}(t_0) .$$
A vector function $\mathbf{r}(t)$ is continuous in the interval $[a, b]$ if it is continuous at every point of $[a, b]$.

By Theorem 12.1, a vector function is continuous if and only if all its components are continuous functions.

EXAMPLE 12.3. Let $\mathbf{r}(t) = \langle \sin(2t)/t ,\ t^2 ,\ e^t\rangle$ for all $t \ne 0$ and $\mathbf{r}(0) = \langle 1, 0, 1\rangle$. Determine whether this vector function is continuous.

SOLUTION: The components $y(t) = t^2$ and $z(t) = e^t$ are continuous for all real $t$, and $y(0) = 0$ and $z(0) = 1$. The component $x(t) = \sin(2t)/t$ is continuous for all $t \ne 0$ because the ratio of two continuous functions is continuous wherever the denominator does not vanish. By l'Hospital's rule,
$$\lim_{t\to 0} x(t) = \lim_{t\to 0} \frac{\sin(2t)}{t} = \lim_{t\to 0} \frac{2\cos(2t)}{1} = 2 \quad\Longrightarrow\quad \lim_{t\to 0} x(t) \ne x(0) = 1;$$
that is, $x(t)$ is not continuous at $t = 0$. Thus, $\mathbf{r}(t)$ is continuous everywhere except at $t = 0$. □

79.3.
Space Curves and Continuous Vector Functions. A curve connecting two points in space as a point set can be obtained as a continuous transformation (or a deformation without breaking) of a straight line segment in space. Conversely, every such space curve can be continuously deformed to a straight line segment. So a curve connecting two points in space is a continuous deformation of a straight line segment, and this deformation has a continuous inverse. A straight line segment can be viewed as an interval $a \le t \le b$ (a set of real numbers between $a$ and $b$). Its continuous deformation can be described by a continuous vector function $\mathbf{r}(t)$ on $[a, b]$. So the range of a continuous vector function defines a curve in space.

Conversely, given a curve $C$ as a point set in space, one might ask the question: What is a vector function that traces out a given curve in space? The answer to this question is not unique. For example, a line $\mathcal{L}$ as a point set in space is defined by a particular point of it and a vector $\mathbf{v}$ parallel to it. If $\mathbf{r}_1$ and $\mathbf{r}_2$ are position vectors of two particular points of $\mathcal{L}$, then both vector functions $\mathbf{r}_1(t) = \mathbf{r}_1 + t\mathbf{v}$ and $\mathbf{r}_2(t) = \mathbf{r}_2 - 2t\mathbf{v}$ trace out the same line $\mathcal{L}$ because the vectors $-2\mathbf{v}$ and $\mathbf{v}$ are parallel.

The following, more sophisticated example is also of interest. Suppose one wants to find a vector function that traces out a semicircle of radius $R$. Let the semicircle be positioned in the upper part of the xy plane: $x^2 + y^2 = R^2$ and $y \ge 0$. The following three vector functions trace out the semicircle:
$$\mathbf{r}_1(t) = \langle t, \sqrt{R^2 - t^2}, 0\rangle , \quad -R \le t \le R ,$$
$$\mathbf{r}_2(t) = \langle R\cos t, R\sin t, 0\rangle , \quad 0 \le t \le \pi ,$$
$$\mathbf{r}_3(t) = \langle -R\cos t, R\sin t, 0\rangle , \quad 0 \le t \le \pi .$$
This is easy to see by noting that the y components are nonnegative in the specified intervals and the norm of these vector functions is constant for any value of $t$: $\|\mathbf{r}_i(t)\|^2 = R^2$ or $x_i^2(t) + y_i^2(t) = R^2$, where $i = 1, 2, 3$.
The latter means that the endpoints of the vectors $\mathbf{r}_i(t)$ always remain on the circle of radius $R$. It can therefore be concluded that there are many vector functions whose ranges define the same curve in space.

Another observation is that there are vector functions that trace out the same curve in opposite directions as $t$ increases from its smallest value $a$ to its largest value $b$. In the above example, the vector function $\mathbf{r}_2(t)$ traces out the semicircle counterclockwise, while the functions $\mathbf{r}_1(t)$ and $\mathbf{r}_3(t)$ do so clockwise. So a vector function defines the orientation of a curve. However, this notion of the orientation of a curve should be regarded with caution because a vector function may traverse its range (or a part of it) several times. For example, the vector function $\mathbf{r}(t) = \langle R\cos t, R|\sin t|, 0\rangle$ traces out the semicircle twice, back and forth, when $t$ ranges from 0 to $2\pi$. The vector function $\mathbf{r}(t) = \langle t^2, t^2, t^2\rangle$ is continuous on the interval $[-1, 1]$ and traces out the straight line segment, $x = y = z$, between the points $(0, 0, 0)$ and $(1, 1, 1)$ twice.

To emphasize the noted differences between space curves as point sets and continuous vector functions, the notion of a parametric curve is introduced.

DEFINITION 12.4. (Parametric Curve). A continuous vector function on an interval is called a parametric curve.

If a continuous vector function $\mathbf{r}(t) = \langle x(t), y(t), z(t)\rangle$, $a \le t \le b$, establishes a one-to-one correspondence between an interval $[a, b]$ and a space curve $C$, then the vector function is also called a parameterization of the curve $C$, the equations $x = x(t)$, $y = y(t)$, and $z = z(t)$ are called parametric equations of $C$, and $t$ is called a parameter. As noted, a parameterization of a given space curve is not unique, and there are different parametric equations that describe the very same space curve.

A curve is said to be simple if, loosely speaking, it does not intersect itself.
To make this notion precise, it is rephrased in terms of parametric curves. A parametric curve $\mathbf{r}(t)$ is called simple on a closed interval $[a, b]$ if $\mathbf{r}(t_1) \ne \mathbf{r}(t_2)$ whenever $t_1$ and $t_2$ lie in $[a, b]$ and $t_1 < t_2$, except possibly when both $t_1 = a$ and $t_2 = b$. A simple parametric curve is a parametric curve that is simple on every closed interval $[a, b]$ contained in its domain. A point set $C$ is a simple curve if there is a simple parametric curve whose range is $C$. A parametric curve is closed if $\mathbf{r}(a) = \mathbf{r}(b)$. A simple parametric curve is always oriented.

EXAMPLE 12.4. Find linear vector functions that orient the straight line segment between $\mathbf{r}_1 = \langle 1, 2, 3\rangle$ and $\mathbf{r}_2 = \langle 2, 0, 1\rangle$ from $\mathbf{r}_1$ to $\mathbf{r}_2$ and from $\mathbf{r}_2$ to $\mathbf{r}_1$.

SOLUTION: The vector $\mathbf{r}_2 - \mathbf{r}_1 = \langle 1, -2, -2\rangle$ is parallel to the line segment. So the vector equation $\mathbf{r}(t) = \mathbf{r}_1 + t(\mathbf{r}_2 - \mathbf{r}_1)$ describes the line that contains the segment in question. The vector $\mathbf{r}_2 - \mathbf{r}_1$ is directed from $\mathbf{r}_1$ to $\mathbf{r}_2$. Therefore, when $t$ increases from $t = 0$, the terminal point of $\mathbf{r}(t)$ goes along the line from $\mathbf{r}_1$ toward $\mathbf{r}_2$, reaching the latter at $t = 1$. Thus, the segment is traversed from $\mathbf{r}_1$ to $\mathbf{r}_2$ by the vector function
$$\mathbf{r}(t) = \mathbf{r}_1 + t(\mathbf{r}_2 - \mathbf{r}_1) = \langle 1 + t, 2 - 2t, 3 - 2t\rangle , \quad 0 \le t \le 1 .$$
[...]

[...] 0, it looks like an unwinding spiral (bottom). Right: For $t > 0$, the curve is traversed by the point moving along the spiral while rising linearly above the xy plane with the distance traveled along the spiral. It can be viewed as a straight line wrapped around the cone $x^2 + y^2 = z^2$.

system of three equations
$$\mathbf{r}_1(t) = \mathbf{r}_2(s) \iff \begin{cases} x_1(t) = x_2(s) \\ y_1(t) = y_2(s) \\ z_1(t) = z_2(s) \end{cases} \iff \begin{cases} t^2 = 8 - 4s \\ t = 2s \\ t^2 + 2t - 8 = s^2 + s - 2 \end{cases}$$
Substituting the second equation $t = 2s$ into the first equation, one finds that $(2s)^2 = 8 - 4s$, whose solutions are $s = -2$ and $s = 1$. One has yet to verify that the third equation holds for the pairs $(t, s) = (-4, -2)$ and $(t, s) = (2, 1)$ (otherwise, the z components do not match). A simple calculation shows that indeed both pairs satisfy the equation.
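The verification of the third equation can be sketched in a few lines of Python. The component functions below are an assumption read off from the displayed system of three equations ($x_1 = t^2$, $y_1 = t$, $z_1 = t^2 + 2t - 8$ and $x_2 = 8 - 4s$, $y_2 = 2s$, $z_2 = s^2 + s - 2$); the names `r1`, `r2`, and `intersection_pairs` are illustrative only.

```python
# Components inferred from the displayed system of three equations
# (an assumption; they are not quoted from a problem statement).
def r1(t):
    return (t**2, t, t**2 + 2*t - 8)

def r2(s):
    return (8 - 4*s, 2*s, s**2 + s - 2)

def intersection_pairs():
    """Return the pairs (t, s) with r1(t) == r2(s).

    The first two equations give t = 2s and (2s)^2 = 8 - 4s,
    i.e. s^2 + s - 2 = (s + 2)(s - 1) = 0. Each candidate is kept
    only if all three components agree (this checks the z equation).
    """
    pairs = []
    for s in (-2, 1):
        t = 2 * s
        if r1(t) == r2(s):
            pairs.append((t, s))
    return pairs

for t, s in intersection_pairs():
    print((t, s), r1(t))
```

Integer arithmetic keeps the comparison exact, so no tolerance is needed when comparing the two position vectors.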
So the position vectors of the points of intersection are $\mathbf{r}_1(-4) = \mathbf{r}_2(-2) = \langle 16, -4, 0\rangle$ and $\mathbf{r}_1(2) = \mathbf{r}_2(1) = \langle 4, 2, 0\rangle$. Although the curves along which the particles travel intersect, this does not mean that the particles would necessarily collide because they may not arrive at a point of intersection at the same moment of time, just like two cars traveling along intersecting streets may or may not collide at the street intersection. The collision condition is more restrictive, $\mathbf{r}_1(t) = \mathbf{r}_2(t)$ (i.e., the time $t$ must satisfy three conditions). For the problem at hand, these conditions cannot be fulfilled for any $t$ because, among all the solutions of $\mathbf{r}_1(t) = \mathbf{r}_2(s)$, there is no solution for which $t = s$. Thus, the particles do not collide. □

Problem 12.6. Find a vector function that traces out the curve of intersection of the paraboloid $z = x^2 + y^2$ and the plane $2x + 2y + z = 2$ counterclockwise as viewed from the top of the z axis.

SOLUTION: One has to find the components $x(t)$, $y(t)$, and $z(t)$ such that they satisfy the equations of the paraboloid and plane simultaneously for all values of $t$. This ensures that the endpoint of the vector $\mathbf{r}(t)$ remains on both surfaces, that is, traces out their curve of intersection (see Figure 12.5). Consider first the motion in the xy plane. Solving the plane equation for $z$, $z = 2 - 2x - 2y$, and substituting the solution into the paraboloid equation, one finds $2 - 2x - 2y = x^2 + y^2$. After completing the squares, this equation becomes $4 = (x + 1)^2 + (y + 1)^2$, which describes a circle of radius 2 centered at $(-1, -1)$. By construction, this circle is the vertical projection of the curve of intersection onto the xy plane (the plane $P_0$ in Figure 12.5). Its parametric equations may be chosen as $x = x(t) = -1 + 2\cos t$, $y = y(t) = -1 + 2\sin t$.
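As $t$ increases from 0 to $2\pi$, the circle is traced out counterclockwise as required (the clockwise orientation can be obtained, e.g., by reversing the sign of $\sin t$). The height along the curve of intersection relative to the xy plane is $z(t) = 2 - 2x(t) - 2y(t)$. Thus,
$$\mathbf{r}(t) = \langle -1 + 2\cos t,\ -1 + 2\sin t,\ 6 - 4\cos t - 4\sin t\rangle ,$$
where $t \in [0, 2\pi]$. □

FIGURE 12.5. Illustration to Study Problem 12.6. The curve is an intersection of the paraboloid and the plane $P$. It is traversed by the point moving counterclockwise about the circle in the xy plane (indicated by $P_0$) and rising so that it remains on the paraboloid.

Problem 12.7. Let $\mathbf{v}(t) \to \mathbf{v}_0$ and $\mathbf{u}(t) \to \mathbf{u}_0$ as $t \to t_0$. Prove the limit law for vector functions: $\lim_{t\to t_0} (\mathbf{v}(t)\cdot\mathbf{u}(t)) = \mathbf{v}_0\cdot\mathbf{u}_0$ using only Definition 12.2. Then prove this law using Theorem 12.1 and basic limit laws for ordinary functions.

SOLUTION: The idea is similar to the proof of the basic limit laws for ordinary functions given in Calculus I. One has to find an upper bound for $|\mathbf{v}\cdot\mathbf{u} - \mathbf{v}_0\cdot\mathbf{u}_0|$ in terms of $\|\mathbf{v} - \mathbf{v}_0\|$ and $\|\mathbf{u} - \mathbf{u}_0\|$. By Definition 12.2, the latter quantities converge to 0 as $t \to t_0$. The conclusion should follow from the squeeze principle. Consider the identities:
$$\mathbf{v}\cdot\mathbf{u} - \mathbf{v}_0\cdot\mathbf{u}_0 = (\mathbf{v} - \mathbf{v}_0)\cdot\mathbf{u} + \mathbf{v}_0\cdot\mathbf{u} - \mathbf{v}_0\cdot\mathbf{u}_0 = (\mathbf{v} - \mathbf{v}_0)\cdot\mathbf{u} + \mathbf{v}_0\cdot(\mathbf{u} - \mathbf{u}_0)$$
$$= (\mathbf{v} - \mathbf{v}_0)\cdot(\mathbf{u} - \mathbf{u}_0) + (\mathbf{v} - \mathbf{v}_0)\cdot\mathbf{u}_0 + \mathbf{v}_0\cdot(\mathbf{u} - \mathbf{u}_0).$$
It follows from the inequality $0 \le |a + b| \le |a| + |b|$ and the Cauchy-Schwarz inequality (Theorem 11.2), $|\mathbf{a}\cdot\mathbf{b}| \le \|\mathbf{a}\|\|\mathbf{b}\|$, that
$$0 \le |\mathbf{v}\cdot\mathbf{u} - \mathbf{v}_0\cdot\mathbf{u}_0| \le \|\mathbf{v} - \mathbf{v}_0\|\|\mathbf{u} - \mathbf{u}_0\| + \|\mathbf{v} - \mathbf{v}_0\|\|\mathbf{u}_0\| + \|\mathbf{v}_0\|\|\mathbf{u} - \mathbf{u}_0\| .$$
The right side converges to 0 as $t \to t_0$, and hence so does $|\mathbf{v}\cdot\mathbf{u} - \mathbf{v}_0\cdot\mathbf{u}_0|$ by the squeeze principle. Alternatively, by Theorem 12.1, the components converge, $v_i(t) \to v_{0i}$ and $u_i(t) \to u_{0i}$ for $i = 1, 2, 3$, so that
$$\lim_{t\to t_0} (\mathbf{v}\cdot\mathbf{u}) = \lim_{t\to t_0} (v_1u_1 + v_2u_2 + v_3u_3) = v_{01}u_{01} + v_{02}u_{02} + v_{03}u_{03} = \mathbf{v}_0\cdot\mathbf{u}_0 ,$$
where the basic limit laws for ordinary functions have been used. □

79.5. Exercises.
(1) Find the domain of each of the following vector functions:
(iii) $\mathbf{r}(t) = \langle \sqrt{9 - t^2}, \ln t, \cos t\rangle$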
(iv) $\mathbf{r}(t) = \langle \ln(9 - t^2), \ln|t|, (1 + t)/(2 + t)\rangle$
(v) r (t) = (Vt -1, ln t, 21 -t)
(2) Find each of the following limits or show that it does not exist:
(i) $\lim_{t\to 1} \langle t,\ 2 - t - t^2,\ 1/(t^2 - 2)\rangle$
(ii) $\lim_{t\to 1} \langle \sqrt{t},\ 2 - t - t^2,\ 1/(t^2 - 1)\rangle$
(iii) $\lim_{t\to 0} \langle e^t,\ \sin t,\ t/(1 - t)\rangle$
(iv) $\lim_{t\to\infty} \langle e^{-t},\ 1/t^2,\ 4\rangle$
(v) $\lim_{t\to\infty} \langle e^{-t},\ (1 - t^2)/t^2,\ t/(\sqrt{t} + t)\rangle$
(vi) limt--_00(2, t2,13t
(vii) $\lim_{t\to 0^+} \langle (e^{2t} - 1)/t,\ (\sqrt{1 + t} - 1)/t,\ t\ln t\rangle$
(viii) $\lim_{t\to 0} \langle \sin^2(2t)/t^2,\ t^2 + 2,\ (\cos t - 1)/t^2\rangle$
(ix) $\lim_{t\to 0} \langle (e^{2t} - t)/t,\ t\cot t,\ 1 + t\rangle$
(x) $\lim_{t\to\infty} \langle e^{2t}/\cosh^2 t,\ t^{2012}e^{-t},\ e^{-2t}\sinh^2 t\rangle$
(3) Sketch each of the following curves and identify the direction in which the curve is traced out as the parameter $t$ increases:
(i) $\mathbf{r}(t) = \langle t, \cos(3t), \sin(3t)\rangle$
(ii) $\mathbf{r}(t) = \langle 2\sin(5t), 4, 3\cos(5t)\rangle$
(iii) $\mathbf{r}(t) = \langle 2t\sin t, 3t\cos t, t\rangle$
(iv) $\mathbf{r}(t) = \langle \sin t, \cos t, \ln t\rangle$
(v) $\mathbf{r}(t) = \langle t, 1 - t, (t - 1)^2\rangle$
(vi) $\mathbf{r}(t) = \langle t^2, t, \sin^2(\pi t)\rangle$
(vii) $\mathbf{r}(t) = \langle \sin t, \sin t, \sqrt{2}\cos t\rangle$
(4) Two objects are said to collide if they are at the same position at the same time. Two trajectories are said to intersect if they have common points. Let $t$ be the physical time. Let two objects travel along the space curves $\mathbf{r}_1(t) = \langle t, t^2, t^3\rangle$ and $\mathbf{r}_2(t) = \langle 1 + 2t, 1 + 6t, 1 + 14t\rangle$. Do the objects collide? Do their trajectories intersect? If so, find the collision and intersection points.
(5) Find two vector functions that traverse a given curve $C$ in the opposite directions if $C$ is the curve of intersection of two surfaces:
(i) $y = x^2$ and $z = 1$
(ii) $x = \sin y$ and $z = x$
(iii) $x^2 + y^2 = 9$ and $z = xy$
(iv) $x^2 + y^2 = z^2$ and $x + y + z = 1$
(v) $z = x^2 + y^2$ and $y = x^2$
(vi) $x^2/4 + y^2/9 = 1$ and $x + y + z = 1$
(vii) $x^2/2 + y^2/2 + z^2/9 = 1$ and $x - y = 0$
(viii) $x^2 + y^2 - 2x = 0$ and $z = x^2 + y^2$
(6) Specify the parts of the curve $\mathbf{r}(t) = \langle \sin t, \cos t, 4\sin^2 t\rangle$ that lie above the plane $z = 1$.
(7) Find the values of the parameters $a$ and $b$ at which the curve $\mathbf{r}(t) = \langle 1 + at^2, b - t, t^3\rangle$ passes through the point $(1, 2, 8)$.
(8) Find the values of $a$, $b$, and $c$, if any, at which each of the following vector functions is continuous: $\mathbf{r}(0) = \langle a, b, c\rangle$ and, for $t \ne 0$,
(i) $\mathbf{r}(t) = \langle t, \cos^2 t, 1 + t + t^2\rangle$
(ii) $\mathbf{r}(t) = \langle t, \cos^2 t, 1 + t^2\rangle$
(iii) $\mathbf{r}(t) = \langle t, \cos^2 t, \ln|t|\rangle$
(iv) $\mathbf{r}(t) = \langle \sin(2t)/t, \sinh(3t)/t, t\ln|t|\rangle$
(v) r(t) = (tcot(2t), t1/31ln ,t2 _
(9) Suppose that the limits $\lim_{t\to a} \mathbf{v}(t)$ and $\lim_{t\to a} \mathbf{u}(t)$ exist. Prove the basic laws of limits for the following vector functions:
$$\lim_{t\to a} (\mathbf{v}(t) + \mathbf{u}(t)) = \lim_{t\to a} \mathbf{v}(t) + \lim_{t\to a} \mathbf{u}(t),$$
$$\lim_{t\to a} (s\mathbf{v}(t)) = s\lim_{t\to a} \mathbf{v}(t),$$
$$\lim_{t\to a} (\mathbf{v}(t)\cdot\mathbf{u}(t)) = \lim_{t\to a} \mathbf{v}(t)\cdot\lim_{t\to a} \mathbf{u}(t),$$
$$\lim_{t\to a} (\mathbf{v}(t)\times\mathbf{u}(t)) = \lim_{t\to a} \mathbf{v}(t)\times\lim_{t\to a} \mathbf{u}(t).$$
(10) Prove the last limit law in Exercise 9 directly from Definition 12.2, that is, without using Theorem 12.1. Hint: See Study Problem 12.7.
(11) Let
$$\mathbf{v}(t) = \langle (e^{2t} - 1)/t,\ (\sqrt{1 + t} - 1)/t,\ t\ln|t|\rangle ,$$
$$\mathbf{u}(t) = \langle \sin^2(2t)/t^2,\ t^2 + 2,\ (\cos t - 1)/t^2\rangle ,$$
$$\mathbf{w}(t) = \langle t^{2/3},\ 2/(1 - t),\ 1 + t - t^2 + t^3\rangle .$$
Use the basic laws of limits established in Exercise 9 to find
(i) $\lim_{t\to 0} (2\mathbf{v}(t) - \mathbf{u}(t) + \mathbf{w}(t))$
(ii) $\lim_{t\to 0} (\mathbf{v}(t)\cdot\mathbf{u}(t))$
(iii) $\lim_{t\to 0} (\mathbf{v}(t)\times\mathbf{u}(t))$
(iv) $\lim_{t\to 0} [\mathbf{w}(t)\cdot(\mathbf{v}(t)\times\mathbf{u}(t))]$
(v) $\lim_{t\to 0} [\mathbf{w}(t)\times(\mathbf{v}(t)\times\mathbf{u}(t))]$
(vi) $\lim_{t\to 0} [\mathbf{w}(t)\times(\mathbf{v}(t)\times\mathbf{u}(t)) + \mathbf{v}(t)\times(\mathbf{u}(t)\times\mathbf{w}(t)) + \mathbf{u}(t)\times(\mathbf{w}(t)\times\mathbf{v}(t))]$
(12) Suppose that the vector function $\mathbf{v}(t)\times\mathbf{u}(t)$ is continuous. Does this imply that both vector functions $\mathbf{v}(t)$ and $\mathbf{u}(t)$ are continuous? Support your arguments by examples.
(13) Suppose that the vector functions $\mathbf{v}(t)\times\mathbf{u}(t)$ and $\mathbf{v}(t)$ are continuous. Does this imply that the vector function $\mathbf{u}(t)$ is continuous? Support your arguments by examples.

80. Differentiation of Vector Functions

DEFINITION 12.5. (Derivative of a Vector Function). Suppose a vector function $\mathbf{r}(t)$ is defined on an interval $[a, b]$ and $t_0 \in [a, b]$.
If the limit
$$\lim_{h\to 0} \frac{\mathbf{r}(t_0 + h) - \mathbf{r}(t_0)}{h} = \mathbf{r}'(t_0) = \frac{d\mathbf{r}}{dt}(t_0)$$
exists, then it is called the derivative of a vector function $\mathbf{r}(t)$ at $t = t_0$, and $\mathbf{r}(t)$ is said to be differentiable at $t_0$. For $t_0 = a$ or $t_0 = b$, the limit is understood as the right ($h > 0$) or left ($h < 0$) limit, respectively. If the derivative exists for all points in $[a, b]$, then the vector function $\mathbf{r}(t)$ is said to be differentiable on $[a, b]$.

It follows from Theorem 12.1 that a vector function is differentiable if and only if all its components are differentiable:
$$\mathbf{r}'(t) = \lim_{h\to 0} \left\langle \frac{x(t + h) - x(t)}{h},\ \frac{y(t + h) - y(t)}{h},\ \frac{z(t + h) - z(t)}{h} \right\rangle = \langle x'(t), y'(t), z'(t)\rangle . \tag{12.1}$$
For example,
$$\mathbf{r}(t) = \langle \sin(2t), t^2 - t, e^{-3t}\rangle \quad\Longrightarrow\quad \mathbf{r}'(t) = \langle 2\cos(2t), 2t - 1, -3e^{-3t}\rangle .$$

DEFINITION 12.6. (Continuously Differentiable Vector Function). If the derivative $\mathbf{r}'(t)$ is a continuous vector function on an interval $[a, b]$, then the vector function $\mathbf{r}(t)$ is said to be continuously differentiable on $[a, b]$.

Higher-order derivatives are defined similarly: the second derivative is the derivative of $\mathbf{r}'(t)$, $\mathbf{r}''(t) = (\mathbf{r}'(t))'$, the third derivative is the derivative of $\mathbf{r}''(t)$, $\mathbf{r}'''(t) = (\mathbf{r}''(t))'$, and $\mathbf{r}^{(n)}(t) = (\mathbf{r}^{(n-1)}(t))'$, provided they exist.

80.1. Differentiation Rules. The following rules of differentiation of vector functions can be deduced from (12.1).

THEOREM 12.2. (Differentiation Rules). Suppose $\mathbf{u}(t)$ and $\mathbf{v}(t)$ are differentiable vector functions and $f(t)$ is a real-valued differentiable function. Then
$$\frac{d}{dt}\left[\mathbf{v}(t) + \mathbf{u}(t)\right] = \mathbf{v}'(t) + \mathbf{u}'(t),$$
$$\frac{d}{dt}\left[f(t)\mathbf{v}(t)\right] = f'(t)\mathbf{v}(t) + f(t)\mathbf{v}'(t),$$
$$\frac{d}{dt}\left[\mathbf{v}(t)\cdot\mathbf{u}(t)\right] = \mathbf{v}'(t)\cdot\mathbf{u}(t) + \mathbf{v}(t)\cdot\mathbf{u}'(t),$$
$$\frac{d}{dt}\left[\mathbf{v}(t)\times\mathbf{u}(t)\right] = \mathbf{v}'(t)\times\mathbf{u}(t) + \mathbf{v}(t)\times\mathbf{u}'(t),$$
$$\frac{d}{dt}\left[\mathbf{v}(f(t))\right] = f'(t)\mathbf{v}'(f(t)).$$
The proof is based on a straightforward use of the rule (12.1) and basic rules of differentiation for ordinary functions and is left as an exercise to the reader.

EXAMPLE 12.5.
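Find the first and second derivatives of the vector function $\mathbf{r}(t) = (\mathbf{a} + t^2\mathbf{b})\times(\mathbf{c} - t\mathbf{d})$, where $\mathbf{a}$, $\mathbf{b}$, $\mathbf{c}$, and $\mathbf{d}$ are constant vectors.

SOLUTION: By the product rule,
$$\mathbf{r}'(t) = (\mathbf{a} + t^2\mathbf{b})'\times(\mathbf{c} - t\mathbf{d}) + (\mathbf{a} + t^2\mathbf{b})\times(\mathbf{c} - t\mathbf{d})' = 2t\mathbf{b}\times(\mathbf{c} - t\mathbf{d}) - (\mathbf{a} + t^2\mathbf{b})\times\mathbf{d},$$
$$\mathbf{r}''(t) = (2t\mathbf{b})'\times(\mathbf{c} - t\mathbf{d}) + 2t\mathbf{b}\times(\mathbf{c} - t\mathbf{d})' - (\mathbf{a} + t^2\mathbf{b})'\times\mathbf{d} = 2\mathbf{b}\times(\mathbf{c} - t\mathbf{d}) - 2t\mathbf{b}\times\mathbf{d} - 2t\mathbf{b}\times\mathbf{d} = 2\mathbf{b}\times\mathbf{c} - 6t\mathbf{b}\times\mathbf{d}.$$
Alternatively, the cross product can be calculated first and then differentiated:
$$\mathbf{r}(t) = \mathbf{a}\times\mathbf{c} - t\mathbf{a}\times\mathbf{d} + t^2\mathbf{b}\times\mathbf{c} - t^3\mathbf{b}\times\mathbf{d},$$
$$\mathbf{r}'(t) = -\mathbf{a}\times\mathbf{d} + 2t\mathbf{b}\times\mathbf{c} - 3t^2\mathbf{b}\times\mathbf{d},$$
$$\mathbf{r}''(t) = 2\mathbf{b}\times\mathbf{c} - 6t\mathbf{b}\times\mathbf{d}.$$

80.2. Differential of a Vector Function. If $\mathbf{r}(t)$ is differentiable, then
$$\Delta\mathbf{r}(t) = \mathbf{r}(t + \Delta t) - \mathbf{r}(t) = \mathbf{r}'(t)\,\Delta t + \mathbf{u}(\Delta t)\,\Delta t, \tag{12.2}$$
where $\mathbf{u}(\Delta t) \to \mathbf{0}$ as $\Delta t \to 0$. Indeed, by the definition of the derivative, $\mathbf{u}(\Delta t) = \Delta\mathbf{r}/\Delta t - \mathbf{r}'(t) \to \mathbf{0}$ as $\Delta t \to 0$. Therefore, the components of the difference $\Delta\mathbf{r} - \mathbf{r}'\,\Delta t$ converge to 0 faster than $\Delta t$.

Suppose that $\mathbf{r}'(t_0)$ does not vanish. Consider a linear vector function $\mathbf{L}(t)$ with the property $\mathbf{L}(t_0) = \mathbf{r}(t_0)$. Its general form is $\mathbf{L}(t) = \mathbf{r}(t_0) + \mathbf{v}(t - t_0)$, where $\mathbf{v}$ is a constant vector. For $t$ close to $t_0$, $\mathbf{L}(t)$ is a linear approximation of $\mathbf{r}(t)$ in the sense that the approximation error $\|\mathbf{r}(t) - \mathbf{L}(t)\|$ becomes smaller with decreasing $|t - t_0|$. It follows from (12.2) that
$$\mathbf{r}(t) - \mathbf{L}(t) = (\mathbf{r}'(t_0) - \mathbf{v})\,\Delta t + \mathbf{u}(\Delta t)\,\Delta t, \quad \Delta t = t - t_0 .$$
By the triangle inequality $\big|\|\mathbf{a}\| - \|\mathbf{b}\|\big| \le \|\mathbf{a} + \mathbf{b}\| \le \|\mathbf{a}\| + \|\mathbf{b}\|$, the approximation error is bounded as
$$\|\mathbf{r}'(t_0) - \mathbf{v}\| - \|\mathbf{u}(\Delta t)\| \le \frac{\|\mathbf{r}(t) - \mathbf{L}(t)\|}{|\Delta t|} \le \|\mathbf{r}'(t_0) - \mathbf{v}\| + \|\mathbf{u}(\Delta t)\| .$$
If $\|\mathbf{r}'(t_0) - \mathbf{v}\| \ne 0$ or $\mathbf{v} \ne \mathbf{r}'(t_0)$, then $\|\mathbf{u}(\Delta t)\| \ll \|\mathbf{r}'(t_0) - \mathbf{v}\|$ for a sufficiently small $\Delta t$ because $\|\mathbf{u}(\Delta t)\|$ converges to 0 as $\Delta t \to 0$ (the sign $\ll$ means "much smaller than"). Therefore, the approximation error decreases linearly with decreasing $\Delta t$: $\|\mathbf{r}(t) - \mathbf{L}(t)\| \approx \|\mathbf{r}'(t_0) - \mathbf{v}\|\,|\Delta t|$. When $\mathbf{v} = \mathbf{r}'(t_0)$, the approximation error decreases faster than $\Delta t$:
$$\frac{\|\mathbf{r}(t) - \mathbf{L}(t)\|}{|\Delta t|} = \|\mathbf{u}(\Delta t)\| \to 0 \quad\text{as}\quad \Delta t \to 0 .$$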
Thus, the linear vector function $\mathbf{L}(t) = \mathbf{r}(t_0) + \mathbf{r}'(t_0)(t - t_0)$ is the best linear approximation of $\mathbf{r}(t)$ near $t = t_0$. Provided the derivative does not vanish, $\mathbf{r}'(t_0) \ne \mathbf{0}$, the linear vector function $\mathbf{L}(t)$ defines a line passing through the point $\mathbf{r}(t_0)$. This line is called the tangent line to the curve traced out by $\mathbf{r}(t)$ at the point $\mathbf{r}(t_0)$.

The analogy can be made with the tangent line to the graph $y = f(x)$ at a point $(x_0, y_0)$, where $y_0 = f(x_0)$. The equation of the tangent line is $y = y_0 + f'(x_0)(x - x_0)$ (recall Calculus I). The graph is a curve in the xy plane whose parametric equations are $x = t$, $y = f(t)$ or, in the vector form, $\mathbf{r}(t) = \langle t, f(t)\rangle$. The parametric equations of the tangent line can therefore be written in the form $x = t = x_0 + (t - t_0)$, $y = y_0 + f'(t_0)(t - t_0)$, where $x_0 = t_0$. Put $\mathbf{r}_0 = \langle x_0, y_0\rangle$. Then the tangent line is traversed by the linear vector function $\mathbf{L}(t) = \mathbf{r}_0 + \mathbf{r}'(t_0)(t - t_0)$ because $\mathbf{r}'(t_0) = \langle 1, f'(t_0)\rangle$.

DEFINITION 12.7. (Differential of a Vector Function). Let $\mathbf{r}(t)$ be a differentiable vector function. Then the vector
$$d\mathbf{r}(t) = \mathbf{r}'(t)\,dt$$
is called the differential of $\mathbf{r}(t)$.

In particular, the derivative is the ratio of the differentials, $\mathbf{r}'(t) = d\mathbf{r}/dt$. Recall that the differential $dt$ is an independent variable that describes infinitesimal variations of $t$ such that higher powers of $dt$ can be neglected. In this sense, the definition of the differential is the linearization of (12.2) in $dt = \Delta t$. At any particular $t = t_0$, the differential $d\mathbf{r}(t_0) = \mathbf{r}'(t_0)\,dt \ne \mathbf{0}$ defines the tangent line $\mathbf{L}(t) = \mathbf{r}(t_0) + d\mathbf{r}(t_0) = \mathbf{r}(t_0) + \mathbf{r}'(t_0)\,dt$, $t = t_0 + dt$. Thus, the differential $d\mathbf{r}(t)$ at a point of the curve $\mathbf{r}(t)$ is the increment of the position vector along the line tangent to the curve at that point.

80.3. Geometrical Significance of the Derivative. Consider a vector function that traces out a line parallel to a vector $\mathbf{v}$, $\mathbf{r}(t) = \mathbf{r}_0 + t\mathbf{v}$. Then $\mathbf{r}'(t) = \mathbf{v}$; that is, the derivative is a vector parallel or tangent to the line.
This observation is of a general nature; that is, the vector $\mathbf{r}'(t_0)$ is tangent to the curve traced out by $\mathbf{r}(t)$ at the point whose position vector is $\mathbf{r}(t_0)$. Let $P_0$ and $P_h$ have position vectors $\mathbf{r}(t_0)$ and $\mathbf{r}(t_0 + h)$. Then $\overrightarrow{P_0P_h} = \mathbf{r}(t_0 + h) - \mathbf{r}(t_0)$ is a secant vector. As $h \to 0$, the direction of $\overrightarrow{P_0P_h}$ approaches the direction of the tangent line, as depicted in Figure 12.6. On the other hand, it follows from (12.2) that, for small enough $h = dt$, $\overrightarrow{P_0P_h} \approx d\mathbf{r}(t_0) = \mathbf{r}'(t_0)h$, and therefore the tangent line is parallel to $\mathbf{r}'(t_0)$. The direction of the tangent vector also defines the orientation of the curve, that is, the direction in which the curve is traced out by $\mathbf{r}(t)$.

FIGURE 12.6. Left: A secant line through two points of the curve, $P_0$ and $P_h$. As $h$ gets smaller, the direction of the vector $\overrightarrow{P_0P_h} = \mathbf{r}(t_0 + h) - \mathbf{r}(t_0)$ becomes closer to the tangent to the curve at $P_0$. Right: The derivative $\mathbf{r}'(t)$ defines a tangent vector to the curve at the point with the position vector $\mathbf{r}(t)$. It also specifies the direction in which $\mathbf{r}(t)$ traverses the curve with increasing $t$. $\hat{\mathbf{T}}(t)$ is the unit tangent vector.

EXAMPLE 12.6. Find the line tangent to the curve $\mathbf{r}(t) = \langle 2t, t^2 - 1, t^3 + 2t\rangle$ at the point $P_0(2, 0, 3)$.

SOLUTION: By the geometrical property of the derivative, a vector parallel to the line is $\mathbf{v} = \mathbf{r}'(t_0)$, where $t_0$ is the value of the parameter $t$ at which $\mathbf{r}(t_0) = \langle 2, 0, 3\rangle$ is the position vector of $P_0$. Therefore, $t_0 = 1$. Then $\mathbf{v} = \mathbf{r}'(1) = \langle 2, 2t, 3t^2 + 2\rangle\big|_{t=1} = \langle 2, 2, 5\rangle$. Parametric equations of the line through $P_0$ and parallel to $\mathbf{v}$ are $x = 2 + 2t$, $y = 2t$, $z = 3 + 5t$. □

If the derivative $\mathbf{r}'(t)$ exists and does not vanish, then, at any point of the curve traced out by $\mathbf{r}(t)$, a unit tangent vector can be defined by
$$\hat{\mathbf{T}}(t) = \frac{\mathbf{r}'(t)}{\|\mathbf{r}'(t)\|} .$$
In Section 79.3, spatial curves were identified with continuous vector functions. Intuitively, a smooth curve as a point set in space should have a unit tangent vector that is continuous along the curve.
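As a numerical companion to Example 12.6 and the unit tangent vector defined above, the following Python sketch evaluates the point, the tangent direction, and the unit tangent at $t_0 = 1$. The helper names (`r`, `r_prime`, `unit_tangent`, `tangent_line`) are illustrative, and the third component of the curve is assumed to be $t^3 + 2t$, the reading consistent with $\mathbf{r}'(1) = \langle 2, 2, 5\rangle$ and $P_0(2, 0, 3)$.

```python
import math

def r(t):
    # Curve of Example 12.6; third component assumed to be t^3 + 2t.
    return (2*t, t**2 - 1, t**3 + 2*t)

def r_prime(t):
    # Componentwise derivative of r(t), as in rule (12.1).
    return (2, 2*t, 3*t**2 + 2)

def unit_tangent(t):
    # T(t) = r'(t) / ||r'(t)||, defined only where r'(t) does not vanish.
    v = r_prime(t)
    norm = math.sqrt(sum(c * c for c in v))
    if norm == 0:
        raise ValueError("r'(t) vanishes; the unit tangent is undefined")
    return tuple(c / norm for c in v)

def tangent_line(t0, s):
    # Point on the tangent line L = r(t0) + s * r'(t0).
    p, v = r(t0), r_prime(t0)
    return tuple(pi + s * vi for pi, vi in zip(p, v))

t0 = 1
print("point:", r(t0))
print("direction:", r_prime(t0))
print("unit tangent:", unit_tangent(t0))
print("line at s = 0.5:", tangent_line(t0, 0.5))
```

The parameter `s` along the tangent line is kept distinct from the curve parameter to avoid confusing the two roles that $t$ plays in the parametric equations of the example.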
Recall also that, for any curve as a point set in space, there are many vector functions whose range coincides with the curve.

DEFINITION 12.8. (Smooth Curve). A point set $C$ in space is called a smooth curve if there is a simple, continuously differentiable parametric curve $\mathbf{r}(t)$ whose range coincides with $C$ and whose derivative does not vanish.

A smooth parametric curve $\mathbf{r}(t)$ is oriented by the direction of the unit tangent vector $\hat{\mathbf{T}}(t)$. Note that if $\mathbf{r}'(t)$ is continuous and never $\mathbf{0}$, then $\hat{\mathbf{T}}(t)$ is continuous. In particular, with the definition above, a smooth curve does indeed have a continuous unit tangent vector. Therefore, if a curve does not have a continuous unit tangent vector, it cannot be smooth. This enables us to conclude that some curves are not smooth, based on properties deduced from a single parameterization. This is important because one cannot possibly test all parameterizations to see whether one of them meets the conditions in Definition 12.8.

Consider the planar curve $\mathbf{r}(t) = \langle t^3, t^2, 0\rangle$. The vector function is differentiable everywhere, $\mathbf{r}'(t) = \langle 3t^2, 2t, 0\rangle$, and the derivative vanishes at the origin, $\mathbf{r}'(0) = \mathbf{0}$. The unit tangent vector $\hat{\mathbf{T}}(t)$ is not defined at $t = 0$. Solving the equation $x = t^3$ for $t$, $t = x^{1/3}$, and substituting the latter into $y = t^2$, it is concluded that the curve traversed by $\mathbf{r}(t)$ is the graph $y = x^{2/3}$, which has a cusp at $x = 0$. The curve is not smooth at the origin. The tangent line is the vertical line $x = 0$ because $y'(x) = (2/3)x^{-1/3} \to \pm\infty$ as $x \to 0^\pm$. The graph lies in the upper half-plane $y \ge 0$ and approaches the y axis, forming a hornlike shape at the origin.

A cusp does not necessarily occur at a point where the derivative $\mathbf{r}'(t)$ vanishes. For example, consider $\mathbf{r}(t) = \langle t^3, t^5, 0\rangle$ such that $\mathbf{r}'(0) = \mathbf{0}$. This vector function traces out the graph $y = x^{5/3}$, which has no cusp at $x = 0$ (it has an inflection point at $x = 0$).
There is another vector function $\mathbf{R}(s) = \langle s, s^{5/3}, 0\rangle$ that traces out the same graph, but $\mathbf{R}'(0) = \langle 1, 0, 0\rangle \ne \mathbf{0}$, and the curve is smooth. So the vanishing of the derivative is merely associated with a poor choice of the vector function. Note that $\mathbf{r}(t) = \mathbf{R}(s)$ identically if $s = t^3$. By the chain rule, $\frac{d}{dt}\mathbf{r}(t) = \frac{d}{dt}\mathbf{R}(s) = \mathbf{R}'(s)(ds/dt)$. This shows that, even if $\mathbf{R}'(s)$ never vanishes, the derivative $\mathbf{r}'(t)$ can vanish, provided $ds/dt$ vanishes at some point, which is indeed the case in the considered example as $ds/dt = 3t^2$ vanishes at $t = 0$.

EXAMPLE 12.7. Determine whether the cycloid $C$ parameterized by $x = a(t - \sin t)$, $y = a(1 - \cos t)$ is smooth, where $a > 0$ is a parameter. If it is not smooth at particular points, investigate its behavior near those points.

SOLUTION: Following the remark after Definition 12.8, the existence of a continuous unit tangent vector has to be verified. Let $\mathbf{r}(t) = \langle x(t), y(t)\rangle$. Since $x'(t) = a(1 - \cos t) \ge 0$ for all $t$, and $x'(t) = 0$ only when $t$ is a multiple of $2\pi$, $x(t)$ is monotonically increasing. In particular, $x(t)$ is one-to-one, so $C$ is simple. Since $y'(t) = a\sin t$, the derivatives $x'(t)$ and $y'(t)$ vanish simultaneously if and only if $t = 2\pi n$ for some integer $n$. Thus, $\mathbf{r}'(t) \ne \mathbf{0}$ unless $t = 2\pi n$, so $C$ is smooth except possibly at the points $\mathbf{r}(2\pi n) = \langle 2\pi na, 0\rangle$; that is, the portion of $C$ between two consecutive such points is smooth, but it is not yet known whether $C$ is smooth at those points. Since
$$\|\mathbf{r}'(t)\| = a\sqrt{2(1 - \cos t)} = a\sqrt{4\sin^2(t/2)} = 2a|\sin(t/2)|,$$
the components of the unit tangent vector for $t \ne 2\pi n$ are
$$\hat{T}_1(t) = \frac{x'(t)}{\|\mathbf{r}'(t)\|} = |\sin(t/2)| , \qquad \hat{T}_2(t) = \frac{y'(t)}{\|\mathbf{r}'(t)\|} = \frac{\sin t}{2|\sin(t/2)|} .$$
Owing to the periodicity of the sine and cosine functions, it is sufficient to investigate the point corresponding to $t = 0$. If there exists a continuous unit tangent vector, then the limit $\lim_{t\to 0} \hat{\mathbf{T}}(t)$ should exist and be the unit tangent vector at the point corresponding to $t = 0$. By Theorem 12.1, the limits of the components $\hat{T}_1(t)$ and $\hat{T}_2(t)$ should exist as $t \to 0$.
Evidently, $\hat{T}_1(t) \to 0$ as $t \to 0$, but the limit $\lim_{t\to 0} \hat{T}_2(t)$ does not exist. Indeed, by l'Hospital's rule the left and right limits are different:
$$\lim_{t\to 0^+} \hat{T}_2(t) = \lim_{t\to 0^+} \frac{\sin t}{2\sin(t/2)} = \lim_{t\to 0^+} \frac{\cos t}{\cos(t/2)} = 1,$$
$$\lim_{t\to 0^-} \hat{T}_2(t) = \lim_{t\to 0^-} \frac{\sin t}{-2\sin(t/2)} = -\lim_{t\to 0^-} \frac{\cos t}{\cos(t/2)} = -1.$$
Therefore $\hat{\mathbf{T}}(t) \to \langle 0, 1\rangle$ as $t \to 0^+$, but $\hat{\mathbf{T}}(t) \to \langle 0, -1\rangle$ as $t \to 0^-$. Thus $\hat{\mathbf{T}}(t)$ cannot be continuously extended across the point $(0, 0)$, so $C$ is not smooth there (as well as at $(2\pi na, 0)$), and, in fact, has a cusp there. □

A local behavior of the cycloid near $(0, 0)$ may be investigated as follows. Using the Taylor polynomial approximation near $t = 0$, $\sin t \approx t - t^3/6$ and $\cos t \approx 1 - t^2/2$, the cycloid is approximated by the curve $x = at^3/6$, $y = at^2/2$. Expressing $t = (6x/a)^{1/3}$ and substituting it into the other equation, it is concluded that $y = cx^{2/3}$, where $c = (9a/2)^{1/3}$. This curve has a cusp at $x = 0$ as noted above.

80.4. Study Problem.

Problem 12.8. Prove that, for any smooth curve on a sphere, a tangent vector at any point $P$ is orthogonal to the vector from the sphere center to $P$.

SOLUTION: Let $\mathbf{r}_0$ be the position vector of the center of a sphere of radius $R$. The position vector $\mathbf{r}$ of any point of the sphere satisfies the equation $\|\mathbf{r} - \mathbf{r}_0\| = R$ or $(\mathbf{r} - \mathbf{r}_0)\cdot(\mathbf{r} - \mathbf{r}_0) = R^2$ (because $\|\mathbf{a}\|^2 = \mathbf{a}\cdot\mathbf{a}$ for any vector $\mathbf{a}$). Let $\mathbf{r}(t)$ be a vector function that traces out a curve on the sphere. Then, for all values of $t$, $(\mathbf{r}(t) - \mathbf{r}_0)\cdot(\mathbf{r}(t) - \mathbf{r}_0) = R^2$. Differentiating both sides of the latter relation, one infers
$$\mathbf{r}'(t)\cdot(\mathbf{r}(t) - \mathbf{r}_0) = 0 \iff \mathbf{r}'(t) \perp \mathbf{r}(t) - \mathbf{r}_0 .$$
If $\mathbf{r}(t)$ is the position vector of $P$ and $O$ is the center of the sphere, then $\overrightarrow{OP} = \mathbf{r}(t) - \mathbf{r}_0$, and hence the tangent vector $\mathbf{r}'(t)$ at $P$ is orthogonal to $\overrightarrow{OP}$ for any $t$ or at any point $P$ of the curve. □

80.5. Exercises.
(1) Find the derivatives and differentials of each of the following vector functions:
(ii) $\mathbf{r}(t) = \langle \cos t, \sin^2(t), t^2\rangle$
(iv) r (t) = (-3't - 2, v/~t2 -_4, t)
(v) r(t) = a+ bt2 - ceL
(vi) r(t) =ta x (b - ce)
(2) Sketch the curve traversed by the vector function $\mathbf{r}(t) = \langle 2, t - 1, t^2 + 1\rangle$. Indicate the direction in which the curve is traversed by $\mathbf{r}(t)$ with increasing $t$. Sketch the position vectors $\mathbf{r}(0)$, $\mathbf{r}(1)$, $\mathbf{r}(2)$ and the vectors $\mathbf{r}'(0)$, $\mathbf{r}'(1)$, $\mathbf{r}'(2)$. Repeat the procedure for the vector function $\mathbf{R}(t) = \mathbf{r}(-t) = \langle 2, -t - 1, t^2 + 1\rangle$ for $t = -2, -1, 0$.
(3) Determine if the curve traced out by each of the following vector functions is smooth for a specified interval of the parameter. If the curve is not smooth at a particular point, graph it near that point.
(i) $\mathbf{r}(t) = \langle t, t^2, t^3\rangle$, $0 \le t \le 1$
(ii) $\mathbf{r}(t) = \langle t^2, t^3, 2\rangle$, $-1 \le t \le 1$
(iii) r(t) = (ti/3, tst), $-1 \le t \le 1$
(iv) $\mathbf{r}(t) = \langle t^5, t^3, t^4\rangle$, $-1 \le t \le 1$
(v) $\mathbf{r}(t) = \langle \sin^3 t, 1, t^2\rangle$, $-\pi/2 \le t \le \pi/2$
(4) Find the parametric equations of the tangent line to each of the following curves at a specified point:
(i) $\mathbf{r}(t) = \langle t^2 - t, t^3/3, 2t\rangle$, $P_0 = (6, 9, 6)$
(ii) $\mathbf{r}(t) = \langle \ln t, 2\sqrt{t}, t^2\rangle$, $P_0 = (0, 2, 1)$
(5) Find the unit tangent vector to the curve traversed by the specified vector function at the given point $P_0$:
(i) $\mathbf{r}(t) = \langle 2t + 1, 2\tan^{-1} t, e^{-t}\rangle$, $P_0(1, 0, 1)$
(ii) $\mathbf{r}(t) = \langle \cos(\omega t), \cos(3\omega t), \sin(\omega t)\rangle$, $P_0(1/2, -1, \sqrt{3}/2)$
(6) Find $\mathbf{r}'(t)\cdot\mathbf{r}''(t)$ and $\mathbf{r}'(t)\times\mathbf{r}''(t)$ if $\mathbf{r}(t) = \langle t, t^2 - 1, t^3 + 2\rangle$.
(7) Is there a point on the curve $\mathbf{r}(t) = \langle t^2 - t, t^3/3, 2t\rangle$ at which the tangent line is parallel to the vector $\mathbf{v} = \langle -5/2, 2, 1\rangle$? If so, find the point.
(8) Let $\mathbf{r}(t) = \langle e^t, 2\cos t, \sin(2t)\rangle$. Use the best linear approximation $\mathbf{L}(t)$ near $t = 0$ to estimate $\mathbf{r}(0.2)$. Use a calculator to assess the accuracy $\|\mathbf{r}(0.2) - \mathbf{L}(0.2)\|$ of the estimate. Repeat the procedure for $\mathbf{r}(0.7)$ and $\mathbf{r}(1.2)$. Compare the errors in all three cases.
(9) Find the point of intersection of the plane $y + z = 3$ and the curve $\mathbf{r}(t) = \langle \ln t, t^2, 2t\rangle$. Find the angle between the normal of the plane and the tangent line to the curve at the point of intersection.
(10) Does the curve $\mathbf{r}(t) = (2t^2, 2t, 2 - t^2)$ intersect the plane $x + y + z = -3$? If not, find a point on the curve that is closest to the plane. What is the distance between the curve and the plane? Hint: Express the distance between a point on the curve and the plane as a function of $t$, then solve the extreme value problem.

(11) Find the point of intersection of the two curves $\mathbf{r}_1(t) = (t, 1 - t, 3 + t^2)$ and $\mathbf{r}_2(s) = (3 - s, s - 2, s^2)$. If the angle at which two curves intersect is defined as the angle between their tangent lines at the point of intersection, find the angle at which the above two curves intersect.

(12) State the condition under which the tangent lines to the curve $\mathbf{r}(t)$ at two distinct points $\mathbf{r}(t_1)$ and $\mathbf{r}(t_2)$ are intersecting, or skew, or parallel. Let $\mathbf{r}(t) = (2\sin(\pi t), \cos(\pi t), \sin(\pi t))$, $t_1 = 0$, and $t_2 = 1/2$. Determine whether the tangent lines at these points are intersecting and, if so, find the point of intersection.

(13) Suppose a smooth curve $\mathbf{r}(t)$ does not intersect a plane through a point $P_0$ and orthogonal to a vector $\mathbf{n}$. What is the angle between $\mathbf{n}$ and the tangent line to the curve at the point that is the closest to the plane?

(14) Suppose $\mathbf{r}(t)$ is twice differentiable. Show that $(\mathbf{r}(t)\times\mathbf{r}'(t))' = \mathbf{r}(t)\times\mathbf{r}''(t)$.

(15) Suppose that $\mathbf{r}(t)$ is differentiable three times. Show that $[\mathbf{r}(t)\cdot(\mathbf{r}'(t)\times\mathbf{r}''(t))]' = \mathbf{r}(t)\cdot(\mathbf{r}'(t)\times\mathbf{r}'''(t))$.

(16) Let $\mathbf{r}(t)$ be a differentiable vector function. Show that $(\|\mathbf{r}(t)\|)' = \mathbf{r}(t)\cdot\mathbf{r}'(t)/\|\mathbf{r}(t)\|$.

(17) A space warship can fire a laser cannon forward along the tangent line to its trajectory. If the trajectory is traversed by the vector function $\mathbf{r}(t) = (t, t, t^2 + 4)$ in the direction of increasing $t$ and the target is the sphere $x^2 + y^2 + z^2 = 1$, find the part of the trajectory in which the laser cannon can hit the target.

(18) A plane normal to a curve at a point $P_0$ is the plane through $P_0$ whose normal is tangent to the curve at $P_0$.
For each of the following curves, find suitable parametric equations, the tangent line, and the normal plane at a specified point:
(i) $y = x$, $z = x^2$, $P_0 = (1, 1, 1)$
(ii) $x^2 + z^2 = 10$, $y^2 + z^2 = 10$, $P_0 = (1, 1, 3)$
(iii) $x^2 + y^2 + z^2 = 6$, $x + y + z = 0$, $P_0 = (1, -2, 1)$

(19) Show that tangent lines to a circular helix have a constant angle with the axis of the helix.

(20) Consider a line through the origin. Any such line sweeps a circular cone when rotated about the $z$ axis and, for this reason, is called a generating line of a cone. Prove that the curve $\mathbf{r}(t) = (e^t\cos t, e^t\sin t, e^t)$ intersects all generating lines of the cone $x^2 + y^2 = z^2$ at the same angle. Hint: Show that parametric equations of a generating line are $x = s\cos\theta$, $y = s\sin\theta$, $z = s$. Define the points of intersection of the line and the curve and find the angle at which they intersect.

81. Integration of Vector Functions

DEFINITION 12.9 (Definite Integral of a Vector Function). Let $\mathbf{r}(t)$ be defined on the interval $[a, b]$. The vector whose components are the definite integrals of the corresponding components of $\mathbf{r}(t) = (x(t), y(t), z(t))$ is called the definite integral of $\mathbf{r}(t)$ over the interval $[a, b]$ and denoted as

$$\int_a^b \mathbf{r}(t)\,dt = \left(\int_a^b x(t)\,dt,\ \int_a^b y(t)\,dt,\ \int_a^b z(t)\,dt\right). \tag{12.3}$$

If the integral (12.3) exists, then $\mathbf{r}(t)$ is said to be integrable on $[a, b]$.

By this definition, a vector function is integrable if and only if all its components are integrable functions. Recall that a continuous real-valued function is integrable. Therefore, the following theorem holds.

THEOREM 12.3. If a vector function is continuous on the interval $[a, b]$, then it is integrable on $[a, b]$.

EXAMPLE 12.8. Find the integral of $\mathbf{r}(t) = (t/\pi, \sin t, \cos t)$ over the interval $[0, \pi]$.

SOLUTION: The components of $\mathbf{r}(t)$ are continuous on $[0, \pi]$. Therefore, by the fundamental theorem of calculus,

$$\int_0^\pi \mathbf{r}(t)\,dt = \left(\int_0^\pi \frac{t}{\pi}\,dt,\ \int_0^\pi \sin t\,dt,\ \int_0^\pi \cos t\,dt\right) = \left(\frac{t^2}{2\pi}\Big|_0^\pi,\ -\cos t\Big|_0^\pi,\ \sin t\Big|_0^\pi\right) = (\pi/2, 2, 0).$$

□

DEFINITION 12.10 (Indefinite Integral of a Vector Function).
A vector function $\mathbf{R}(t)$ is called an indefinite integral or an antiderivative of $\mathbf{r}(t)$ if $\mathbf{R}'(t) = \mathbf{r}(t)$.

Let $\mathbf{R}(t) = (X(t), Y(t), Z(t))$ and $\mathbf{r}(t) = (x(t), y(t), z(t))$. Then, according to (12.1), the functions $X(t)$, $Y(t)$, and $Z(t)$ are antiderivatives of $x(t)$, $y(t)$, and $z(t)$, respectively,

$$X(t) = \int x(t)\,dt + c_1, \quad Y(t) = \int y(t)\,dt + c_2, \quad Z(t) = \int z(t)\,dt + c_3,$$

where $c_1$, $c_2$, and $c_3$ are constants. The latter relations can be combined into a single vector relation:

$$\mathbf{R}(t) = \int \mathbf{r}(t)\,dt + \mathbf{c},$$

where $\mathbf{c}$ is an arbitrary constant vector. Recall that, for a function $x(t)$ continuous on $[a, b]$, a particular antiderivative is given by

$$X(t) = \int_a^t x(u)\,du, \quad a \le t \le b.$$

Therefore, a particular antiderivative of a continuous vector function $\mathbf{r}(t)$ is

$$\mathbf{R}(t) = \int_a^t \mathbf{r}(u)\,du, \quad a \le t \le b.$$

The vector function $\mathbf{R}(t)$ is differentiable on $(a, b)$ and satisfies the condition $\mathbf{R}(a) = \mathbf{0}$. A general antiderivative is obtained by adding a constant vector, $\mathbf{R}(t) \to \mathbf{R}(t) + \mathbf{c}$. This observation allows us to extend the fundamental theorem of calculus to vector functions.

THEOREM 12.4 (Fundamental Theorem of Calculus for Vector Functions). If $\mathbf{r}(t)$ is continuous on $[a, b]$, then

$$\int_a^b \mathbf{r}(t)\,dt = \mathbf{R}(b) - \mathbf{R}(a),$$

where $\mathbf{R}(t)$ is any antiderivative of $\mathbf{r}(t)$, that is, a vector function such that $\mathbf{R}'(t) = \mathbf{r}(t)$.

EXAMPLE 12.9. Find $\mathbf{r}(t)$ if $\mathbf{r}'(t) = (2t, 1, 6t^2)$ and $\mathbf{r}(1) = (2, 1, 0)$.

SOLUTION: Taking the antiderivative of $\mathbf{r}'(t)$, one finds

$$\mathbf{r}(t) = \int (2t, 1, 6t^2)\,dt + \mathbf{c} = (t^2, t, 2t^3) + \mathbf{c}.$$

The constant vector $\mathbf{c}$ is determined by the condition $\mathbf{r}(1) = (2, 1, 0)$, which gives $(1, 1, 2) + \mathbf{c} = (2, 1, 0)$. Hence, $\mathbf{c} = (2, 1, 0) - (1, 1, 2) = (1, 0, -2)$ and $\mathbf{r}(t) = (t^2 + 1, t, 2t^3 - 2)$. □

In general, the solution of the equation $\mathbf{r}'(t) = \mathbf{v}(t)$ satisfying the condition $\mathbf{r}(t_0) = \mathbf{r}_0$ can be written in the form

$$\mathbf{r}'(t) = \mathbf{v}(t) \ \text{ and } \ \mathbf{r}(t_0) = \mathbf{r}_0 \quad\Longrightarrow\quad \mathbf{r}(t) = \mathbf{r}_0 + \int_{t_0}^t \mathbf{v}(u)\,du$$

if $\mathbf{v}(t)$ is a continuous vector function.
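The general formula for $\mathbf{r}(t)$ can be illustrated numerically. The following is a minimal sketch, not part of the original text: it approximates the integral componentwise by the composite Simpson rule and uses the data of Example 12.9 as input. The helper names `integrate_vector`, `position`, and `velocity` are hypothetical and chosen only for this illustration.

```python
from typing import Callable, Sequence, Tuple

Vector = Tuple[float, float, float]


def integrate_vector(v: Callable[[float], Sequence[float]],
                     t0: float, t: float, steps: int = 1000) -> Vector:
    """Componentwise composite Simpson approximation of the integral of v over [t0, t]."""
    if steps % 2:
        steps += 1  # Simpson's rule needs an even number of subintervals
    h = (t - t0) / steps
    total = [0.0, 0.0, 0.0]
    for k in range(steps + 1):
        # Simpson weights: 1 at the endpoints, 4 at odd nodes, 2 at even interior nodes
        weight = 1 if k in (0, steps) else (4 if k % 2 else 2)
        value = v(t0 + k * h)
        for i in range(3):
            total[i] += weight * value[i]
    return (total[0] * h / 3, total[1] * h / 3, total[2] * h / 3)


def position(v: Callable[[float], Sequence[float]],
             t0: float, r0: Sequence[float], t: float) -> Vector:
    """r(t) = r0 + integral of v(u) du from t0 to t."""
    integral = integrate_vector(v, t0, t)
    return (r0[0] + integral[0], r0[1] + integral[1], r0[2] + integral[2])


def velocity(t: float) -> Vector:
    # Data of Example 12.9: r'(t) = (2t, 1, 6t^2)
    return (2.0 * t, 1.0, 6.0 * t ** 2)


# Initial condition of Example 12.9: r(1) = (2, 1, 0); evaluate the position at t = 2
r_at_2 = position(velocity, 1.0, (2.0, 1.0, 0.0), 2.0)
print(r_at_2)
```

At $t = t_0$ the integral vanishes, so `position` returns the initial vector $\mathbf{r}_0$, in agreement with the initial condition.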
As noted above, if the integrand is a continuous function, then the derivative of the integral with respect to its upper limit is the value of the integrand at that limit. Therefore,

$$\mathbf{r}'(t) = \frac{d}{dt}\int_{t_0}^t \mathbf{v}(u)\,du = \mathbf{v}(t),$$

and hence $\mathbf{r}(t)$ is an antiderivative of $\mathbf{v}(t)$. When $t = t_0$, the integral vanishes and $\mathbf{r}(t_0) = \mathbf{r}_0$ as required.

81.1. Applications to Mechanics. Let $\mathbf{r}(t)$ be the position vector of a particle as a function of time $t$. The first derivative $\mathbf{r}'(t) = \mathbf{v}(t)$ is called the velocity of the particle. The magnitude of the velocity vector $v(t) = \|\mathbf{v}(t)\|$ is called the speed. The speed of a car is the number shown on the speedometer. The velocity defines the direction in which the particle travels and the instantaneous rate at which it moves in that direction. The second derivative $\mathbf{r}''(t) = \mathbf{v}'(t) = \mathbf{a}(t)$ is called the acceleration. If $m$ is the mass of a particle and $\mathbf{F}$ is the force acting on the particle, then, according to Newton's second law, the acceleration and force are related as

$$\mathbf{F} = m\mathbf{a}.$$

If the time is measured in seconds, the length in meters, and the mass in kilograms, then the force is given in newtons, $1\ \mathrm{N} = 1\ \mathrm{kg\cdot m/s^2}$.

If the force is known as a vector function of time, then Newton's second law determines a particle's trajectory. The problem of finding the trajectory amounts to reconstructing the vector function $\mathbf{r}(t)$ if its second derivative $\mathbf{r}''(t) = (1/m)\mathbf{F}(t)$ is known; that is, $\mathbf{r}(t)$ is given by the second antiderivative of $(1/m)\mathbf{F}(t)$. Indeed, the velocity $\mathbf{v}(t)$ is an antiderivative of $(1/m)\mathbf{F}(t)$, and the position vector $\mathbf{r}(t)$ is an antiderivative of the velocity $\mathbf{v}(t)$. As shown in the previous section, an antiderivative is not unique, unless its value at a particular point is specified. So the trajectory of motion is uniquely determined by Newton's equation, provided the position and velocity vectors are specified at a particular moment of time, for example, $\mathbf{r}(t_0) = \mathbf{r}_0$ and $\mathbf{v}(t_0) = \mathbf{v}_0$.
The latter conditions are called initial conditions. Given the initial conditions, the trajectory of motion is uniquely defined by the relations:

$$\mathbf{v}(t) = \mathbf{v}_0 + \frac{1}{m}\int_{t_0}^t \mathbf{F}(u)\,du, \qquad \mathbf{r}(t) = \mathbf{r}_0 + \int_{t_0}^t \mathbf{v}(u)\,du$$

if the force is a continuous vector function of time.

Remark. If the force is a function of a particle's position, then Newton's equation becomes a system of ordinary differential equations, that is, a set of relations between the components of the vector function, its derivatives, and time.

EXAMPLE 12.10 (Motion Under a Constant Force). Prove that the trajectory of motion under a constant force is a parabola if the initial velocity is not parallel to the force.

SOLUTION: Let $\mathbf{F}$ be a constant force. Without loss of generality, the initial conditions can be set at $t = 0$, $\mathbf{r}(0) = \mathbf{r}_0$, and $\mathbf{v}(0) = \mathbf{v}_0$. Then

$$\mathbf{v}(t) = \mathbf{v}_0 + \frac{1}{m}\int_0^t \mathbf{F}\,du = \mathbf{v}_0 + \frac{t}{m}\mathbf{F},$$

$$\mathbf{r}(t) = \mathbf{r}_0 + \int_0^t \mathbf{v}(u)\,du = \mathbf{r}_0 + t\mathbf{v}_0 + \frac{t^2}{2m}\mathbf{F}.$$

If the vectors $\mathbf{v}_0$ and $\mathbf{F}$ are parallel, then they are proportional, $\mathbf{v}_0 = c\mathbf{F}$. In this particular case, the trajectory $\mathbf{r}(t) = \mathbf{r}_0 + (ct + t^2/(2m))\mathbf{F} = \mathbf{r}_0 + s\mathbf{F}$ lies in the straight line through $\mathbf{r}_0$ and parallel to $\mathbf{F}$. The parameter $s = ct + t^2/(2m)$ defines the position of the particle on the line as a function of time. Otherwise, the vector $\mathbf{r}(t) - \mathbf{r}_0$ is a linear combination of two nonparallel vectors $\mathbf{v}_0$ and $\mathbf{F}$ and hence must be orthogonal to $\mathbf{n} = \mathbf{v}_0\times\mathbf{F}$ by the geometrical property of the cross product. Therefore, the particle remains in the plane through $\mathbf{r}_0$ that is parallel to $\mathbf{F}$ and $\mathbf{v}_0$ or, equivalently, orthogonal to $\mathbf{n}$, that is, $(\mathbf{r}(t) - \mathbf{r}_0)\cdot\mathbf{n} = 0$ (see Figure 12.7, left panel). The shape of a space curve does not depend on the choice of the coordinate system. Let us choose the coordinate system such that the origin is at the initial position $\mathbf{r}_0$ and the plane in which the trajectory lies coincides with the $zy$ plane so that $\mathbf{F}$ is parallel to the $z$ axis. In this coordinate system, $\mathbf{r}_0 = \mathbf{0}$, $\mathbf{F} = (0, 0, -F)$, and $\mathbf{v}_0 = (0, v_{0y}, v_{0z})$.
The parametric equations of the trajectory of motion assume the form $x = 0$, $y = v_{0y}t$, and $z = v_{0z}t - t^2F/(2m)$. The substitution of $t = y/v_{0y}$ into the latter equation yields $z = ay^2 + by$, where $a = -F/(2mv_{0y}^2)$ and $b = v_{0z}/v_{0y}$, which defines a parabola in the $zy$ plane (note that $v_{0y} \ne 0$ because $\mathbf{v}_0$ is not parallel to $\mathbf{F}$). Thus, the trajectory of motion under a constant force is a parabola through the point $\mathbf{r}_0$ that lies in the plane containing the force and initial velocity vectors $\mathbf{F}$ and $\mathbf{v}_0$. The parabola is concave in the direction of the force. In Figure 12.7, the force vector points downward and the trajectory is concave downward. □

81.2. Motion Under a Constant Gravitational Force. The magnitude of the gravitational force that acts on an object of mass $m$ near the surface of the Earth is $mg$, where $g \approx 9.8\ \mathrm{m/s^2}$ is a constant called the acceleration of free fall. According to Example 12.10, any projectile fired from some point follows a parabolic trajectory. This fact allows one to predict the exact positions of the projectile and, in particular, the point at which it impacts the ground. In practice, the initial speed $v_0$ of the projectile and the angle of elevation $\theta$ at which the projectile is fired are known (see Figure 12.7, right panel).

FIGURE 12.7. Left: Motion under a constant force $\mathbf{F}$. The trajectory is a parabola that lies in the plane through the initial point of the motion $\mathbf{r}_0$ and orthogonal to the vector $\mathbf{n} = \mathbf{v}_0\times\mathbf{F}$, where the initial velocity $\mathbf{v}_0$ is assumed to be nonparallel to the force $\mathbf{F}$. Right: Motion of a projectile thrown at an angle $\theta$ and an initial height $h$. The trajectory is a parabola. The point of impact defines the range $L(\theta)$.

Some practical questions are: At what elevation angle is the maximal range reached? At what elevation angle does the range attain a specified value (e.g., to hit a target)?
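These questions can also be explored numerically. The sketch below is an illustration rather than part of the original text. It assumes the parabolic trajectory of Example 12.10 under the downward force of magnitude $mg$, namely $y = tv_0\cos\theta$ and $z = h + tv_0\sin\theta - gt^2/2$, takes the landing time as the positive root of $z(t) = 0$, and searches for the best elevation angle by brute force. The function names and the sample values of $v_0$ and $h$ are hypothetical.

```python
import math

G = 9.8  # free-fall acceleration in m/s^2, the value quoted in the text


def landing_time(v0: float, theta: float, h: float = 0.0, g: float = G) -> float:
    """Positive root of h + t*v0*sin(theta) - g*t**2/2 = 0."""
    vz = v0 * math.sin(theta)
    return (vz + math.sqrt(vz * vz + 2.0 * g * h)) / g


def projectile_range(v0: float, theta: float, h: float = 0.0, g: float = G) -> float:
    """Horizontal distance traveled until the projectile lands."""
    return v0 * math.cos(theta) * landing_time(v0, theta, h, g)


def best_angle(v0: float, h: float = 0.0, g: float = G, samples: int = 9000) -> float:
    """Brute-force search for the elevation angle in (0, pi/2) with the largest range."""
    step = (math.pi / 2) / samples
    angles = [(k + 0.5) * step for k in range(samples)]
    return max(angles, key=lambda a: projectile_range(v0, a, h, g))


# Hypothetical sample data: initial speed 100 m/s, fired from the ground and from a 50 m hill
print(math.degrees(best_angle(100.0)))
print(math.degrees(best_angle(100.0, h=50.0)))
```

A grid search is used here only because it needs no calculus; it trades accuracy for simplicity and is limited by the grid spacing.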
To answer these and related questions, choose the coordinate system such that the $z$ axis is directed upward from the ground and the parabolic trajectory lies in the $zy$ plane. The projectile is fired from the point $(0, 0, h)$, where $h$ is the initial elevation of the projectile above the ground (firing from a hill). In the notation of Example 12.10, $\mathbf{F} = (0, 0, -mg)$, that is, $F = mg$ (the $z$ component of the force is negative because the gravitational force is directed toward the ground, while the $z$ axis points upward), $v_{0y} = v_0\cos\theta$, and $v_{0z} = v_0\sin\theta$. The trajectory is

$$y = tv_0\cos\theta, \qquad z = h + tv_0\sin\theta - \frac{1}{2}gt^2, \qquad t \ge 0.$$

It is interesting to note that the trajectory is independent of the mass of the projectile. Light and heavy projectiles would follow the same parabolic trajectory, provided they are fired from the same position, at the same speed, and at the same angle of elevation. The height of the projectile relative to the ground is given by $z(t)$. The horizontal displacement is $y(t)$. Let $t_L > 0$ be the moment of time when the projectile lands; that is, when $t = t_L$, the height vanishes, $z(t_L) = 0$. The positive solution of this equation is

$$t_L = \frac{v_0\sin\theta + \sqrt{v_0^2\sin^2\theta + 2gh}}{g}.$$

The distance $L$ traveled by the projectile in the horizontal direction until it lands is the range:

$$L = y(t_L) = t_Lv_0\cos\theta.$$

For example, if the projectile is fired from the ground, $h = 0$, then $t_L = 2v_0\sin\theta/g$ and the range is $L = v_0^2\sin(2\theta)/g$. The range attains its maximal value $v_0^2/g$ when the projectile is fired at an angle of elevation $\theta = \pi/4$. The angle of elevation at which the projectile hits a target at a given range $L = L_0$ is $\theta = (1/2)\sin^{-1}(L_0g/v_0^2)$. For $h \ne 0$, the angle at which $L = L(\theta)$ attains its maximal value can be found by solving the equation $L'(\theta) = 0$, which defines critical points of the function $L(\theta)$. The angle of elevation at which the projectile hits a target at a given range is found by solving the equation $L(\theta) = L_0$. The technicalities are left to the reader.

Remark.
In reality, the trajectory of a projectile deviates from a parabola because there is an additional force acting on a projectile moving in the atmosphere, the friction force. The friction force depends on the velocity of the projectile. So a more accurate analysis of the projectile motion in the atmosphere requires methods of ordinary differential equations.

81.3. Study Problems.

Problem 12.9. The acceleration of a particle is $\mathbf{a} = (2, 6t, 0)$. Find the position vector of the particle and its velocity in 2 units of time $t$ if the particle was initially at the point $(-1, -4, 1)$ and had the velocity $(0, 2, 1)$.

SOLUTION: The velocity vector is $\mathbf{v}(t) = \int \mathbf{a}(t)\,dt + \mathbf{c} = (2t, 3t^2, 0) + \mathbf{c}$. The constant vector $\mathbf{c}$ is fixed by the initial condition $\mathbf{v}(0) = (0, 2, 1)$, which yields $\mathbf{c} = (0, 2, 1)$. Thus, $\mathbf{v}(t) = (2t, 3t^2 + 2, 1)$ and $\mathbf{v}(2) = (4, 14, 1)$. The position vector is $\mathbf{r}(t) = \int \mathbf{v}(t)\,dt + \mathbf{c} = (t^2, t^3 + 2t, t) + \mathbf{c}$. Here the constant vector $\mathbf{c}$ is determined by the initial condition $\mathbf{r}(0) = (-1, -4, 1)$, which yields $\mathbf{c} = (-1, -4, 1)$. Thus, $\mathbf{r}(t) = (t^2 - 1, t^3 + 2t - 4, t + 1)$ and $\mathbf{r}(2) = (3, 8, 3)$. □

Problem 12.10. Show that if the velocity and position vectors of a particle remain orthogonal during the motion, then the trajectory lies on a sphere.

SOLUTION: If $\mathbf{v}(t) = \mathbf{r}'(t)$ and $\mathbf{r}(t)$ are orthogonal, then $\mathbf{r}'(t)\cdot\mathbf{r}(t) = 0$ for all $t$. Since $(\mathbf{r}\cdot\mathbf{r})' = \mathbf{r}'\cdot\mathbf{r} + \mathbf{r}\cdot\mathbf{r}' = 2\mathbf{r}'\cdot\mathbf{r} = 0$, one concludes that $\mathbf{r}(t)\cdot\mathbf{r}(t) = R^2 = \text{const}$ or $\|\mathbf{r}(t)\| = R$ for all $t$; that is, the particle remains at a fixed distance $R$ from the origin all the time. □

Problem 12.11. A charged particle moving in a magnetic field $\mathbf{B}$ is subject to the Lorentz force $\mathbf{F} = (e/c)\mathbf{v}\times\mathbf{B}$, where $e$ is the electric charge of the particle and $c$ is the speed of light in vacuum. Assume that the magnetic field is a constant vector parallel to the $z$ axis and the initial velocity is $\mathbf{v}(0) = (v_\perp, 0, v_\parallel)$.
Show that the trajectory is a helix:

$$\mathbf{r}(t) = (R\sin(\omega t), R\cos(\omega t), v_\parallel t), \qquad \omega = \frac{eB}{mc}, \qquad R = \frac{v_\perp}{\omega},$$

where $B = \|\mathbf{B}\|$ is the magnitude of the magnetic field and $m$ is the particle mass.

SOLUTION: Newton's second law reads

$$m\mathbf{v}' = \frac{e}{c}\,\mathbf{v}\times\mathbf{B}.$$

Put $\mathbf{B} = (0, 0, B)$. Then

$$\mathbf{v} = \mathbf{r}' = (\omega R\cos(\omega t), -\omega R\sin(\omega t), v_\parallel),$$
$$\mathbf{v}\times\mathbf{B} = (-\omega RB\sin(\omega t), -\omega RB\cos(\omega t), 0),$$
$$\mathbf{v}' = (-\omega^2R\sin(\omega t), -\omega^2R\cos(\omega t), 0).$$

The substitution of these relations into Newton's second law yields $m\omega^2R = eBR\omega/c$ and hence $\omega = (eB)/(mc)$. Since $\mathbf{v}(0) = (\omega R, 0, v_\parallel) = (v_\perp, 0, v_\parallel)$, it follows that $R = v_\perp/\omega$. □

Remark. The rate at which the helix rises along the magnetic field is determined by the magnitude (speed) of the initial velocity component $v_\parallel$ parallel to the magnetic field, whereas the radius of the helix is determined by the magnitude of the initial velocity component $v_\perp$ perpendicular to the magnetic field. A particle makes one full turn about the magnetic field in time $T = 2\pi/\omega = 2\pi mc/(eB)$; that is, the larger the magnetic field, the faster the particle rotates about it.

81.4. Exercises.

(1) Find the indefinite and definite integrals over specified intervals for each of the following functions:
(i) $\mathbf{r}(t) = (1, 2t, 3t^2)$, $0 \le t \le 2$
(ii) $\mathbf{r}(t) = (\sin t, t^3, \cos t)$, $-\pi \le t \le \pi$
(iii) $\mathbf{r}(t) = (t^2, t\sqrt{t} - 1, \sqrt{t})$, $0 \le t \le 1$
(iv) $\mathbf{r}(t) = (t\ln t, t^2, e^{2t})$, $0 \le t \le 1$
(v) $\mathbf{r}(t) = (2\sin t\cos t, 3\sin t\cos^2 t, 3\sin^2 t\cos t)$, $0 \le t \le \pi/2$
(vi) $\mathbf{r}(t) = \mathbf{a} + \cos(t)\mathbf{b}$, $0 \le t \le \pi$
(vii) $\mathbf{r}(t) = \mathbf{a}\times(\mathbf{u}'(t) + \mathbf{b})$, $0 \le t \le 1$, if $\mathbf{u}(0) = \mathbf{a}$ and $\mathbf{u}(1) = \mathbf{a} - \mathbf{b}$

(2) Find $\mathbf{r}(t)$ if the derivatives $\mathbf{r}'(t)$ and $\mathbf{r}(t_0)$ are given:
(i) $\mathbf{r}'(t) = (1, 2t, 3t^2)$, $\mathbf{r}(0) = (1, 2, 3)$
(ii) $\mathbf{r}'(t) = (t - 1, t^2, \sqrt{t})$, $\mathbf{r}(1) = (1, 0, 1)$
(iii) $\mathbf{r}'(t) = (\sin(2t), 2\cos t, \sin^2 t)$, $\mathbf{r}(\pi) = (1, 2, 3)$

(3) Find $\mathbf{r}(t)$ if
(i) $\mathbf{r}''(t) = (0, 2, 6t)$, $\mathbf{r}(0) = (1, 2, 3)$, $\mathbf{r}'(0) = (1, 0, -1)$
(ii) $\mathbf{r}''(t) = (t^{1/3}, t^{1/2}, 6t)$, $\mathbf{r}(1) = (1, 0, -1)$, $\mathbf{r}'(0) = (1, 2, 0)$
(iii) $\mathbf{r}''(t) = (-\sin t, \cos t, 1/t)$, $\mathbf{r}(\pi) = (1, -1, 0)$, $\mathbf{r}'(\pi) = (-1, 0, 2)$
(iv) $\mathbf{r}''(t) = (0, 2, 6t)$, $\mathbf{r}(0) = (1, 2, 3)$, $\mathbf{r}(1) = (1, 0, -1)$

(4) Solve the equation $\mathbf{r}''(t) = \mathbf{a}$, where $\mathbf{a}$ is a constant vector, if $\mathbf{r}(0) = \mathbf{b}$ and $\mathbf{r}(t_0) = \mathbf{c}$ for some $t = t_0 \ne 0$.

(5) Find the most general vector function whose $n$th derivative vanishes, $\mathbf{r}^{(n)}(t) = \mathbf{0}$.

(6) Show that a continuously differentiable vector function $\mathbf{r}(t)$ satisfying the equation $\mathbf{r}'(t)\times\mathbf{r}(t) = \mathbf{0}$ traverses a straight line (or a part of it).

(7) If a particle was initially at point $(1, 2, 1)$ and had velocity $\mathbf{v} = (0, 1, -1)$, find the position vector of the particle after it has been moving with acceleration $\mathbf{a}(t) = (1, 0, t)$ for 2 units of time.

(8) A particle of unit mass moves under a constant force $\mathbf{F}$. If a particle was initially at the point $\mathbf{r}_0$ and passed through the point $\mathbf{r}_1$ after 2 units of time, find the initial velocity of the particle. What was the velocity of the particle when it passed through $\mathbf{r}_1$?

(9) A particle of mass 1 kg was initially at rest. Then during 2 seconds a constant force of magnitude 3 N was applied to the particle in the direction of $(1, 2, 2)$. How far is the particle from its initial position in 4 seconds?

(10) The position vector of a particle is $\mathbf{r}(t) = (t^2, 5t, t^2 - 16t)$. Find $\mathbf{r}(t)$ when the speed of the particle is maximal.

(11) A projectile is fired at an initial speed of 400 m/s and at an angle of elevation of $30^\circ$. Find the range of the projectile, the maximum height reached, and the speed at impact.

(12) A ball of mass $m$ is thrown southward into the air at an initial speed of $v_0$ at an angle of $\theta$ to the ground.
An east wind applies a steady force of magnitude $F$ to the ball in a westerly direction. Find the trajectory of the ball. Where does the ball land and at what speed? Find the deviation of the impact point from the impact point $A$ when no wind is present. Is there any way to correct the direction in which the ball is thrown so that the ball still hits $A$?

(13) A rocket burns its onboard fuel while moving through space. Let $\mathbf{v}(t)$ and $m(t)$ be the velocity and mass of the rocket at time $t$. It can be shown that the force exerted by the rocket jet engines is $m'(t)\mathbf{v}_g$, where $\mathbf{v}_g$ is the velocity of the exhaust gases relative to the rocket. Show that $\mathbf{v}(t) = \mathbf{v}(0) - \ln(m(0)/m(t))\mathbf{v}_g$. The rocket is to accelerate in a straight line from rest to twice the speed of its own exhaust gases. What fraction of its initial mass would the rocket have to burn as fuel?

(14) The acceleration of a projectile is $\mathbf{a}(t) = (0, 2, 6t)$. The projectile is shot from $(0, 0, 0)$ with an initial velocity $\mathbf{v}(0) = (1, -2, -10)$. It is supposed to destroy a target located at $(2, 0, -12)$. The target can be destroyed if the projectile's speed is at least 3.1 at impact. Will the target be destroyed?

82. Arc Length of a Curve

Let a vector function $\mathbf{r}(t)$, $a \le t \le b$, traverse a space curve $C$. Consider a partition of the interval $[a, b]$, $a = t_0 < t_1 < t_2 < \cdots < t_{N-1} < t_N = b$. This partition induces a partition of the curve, which is a collection of points of $C$, $P_k$, $k = 0, 1, \ldots, N$, whose position vectors are $\mathbf{r}(t_k)$. In particular, $P_0$ and $P_N$ are the endpoints of the curve (see Figure 12.8, left panel). Let $D_N = \max_k(t_k - t_{k-1})$ be the maximal length among all the partition intervals. A partition is said to be refined if $D_{N'} < D_N$ for $N' > N$. Under a refinement of a partition, $D_N \to 0$ as $N \to \infty$. A refinement is obtained by adding a partition point in each partition interval whose length is $D_N$ (at least one such interval is always present).

DEFINITION 12.11 (Arc Length of a Curve).
Let $\mathbf{r}(t)$, $a \le t \le b$, be a vector function traversing a curve $C$. Let a collection of points $P_k$ be a partition of $C$, $k = 0, 1, \ldots, N$, and let $|P_{k-1}P_k|$ be the distance between two neighboring partition points. The arc length of a curve $C$ is the limit

$$L = \lim_{N\to\infty}\sum_{k=1}^N |P_{k-1}P_k|,$$

where the partition is refined as $N \to \infty$, provided it exists and is independent of the choice of partition. If $L < \infty$, the curve is called measurable or rectifiable.

The geometrical meaning of this definition is rather simple. Here the sum of $|P_{k-1}P_k|$ is the length of a polygonal path with vertices at $P_0, P_1, \ldots, P_N$ in this order. As the partition becomes finer and finer, this polygonal path approaches the curve more and more closely (see Figure 12.8, left panel). In certain cases, the arc length is given by the Riemann integral.

FIGURE 12.8. Left: The arc length of a curve is defined as the limit of the sequence of lengths of polygonal paths through partition points of the curve. Right: Natural parameterization of a curve. Given a point $A$ of the curve, the arc length $s$ is counted from it to any point $P$ of the curve. The position vector of $P$ is a vector $\mathbf{R}(s)$. If the curve is traced out by another vector function $\mathbf{r}(t)$, then there is a relation $s = s(t)$ such that $\mathbf{r}(t) = \mathbf{R}(s(t))$.

THEOREM 12.5 (Arc Length of a Curve). Let $C$ be a curve traced out by a continuously differentiable vector function $\mathbf{r}(t)$, which defines a one-to-one correspondence between points of $C$ and the interval $t \in [a, b]$. Then

$$L = \int_a^b \|\mathbf{r}'(t)\|\,dt.$$

PROOF. Owing to the one-to-one correspondence between $[a, b]$ and $C$, given a partition $t_k$ of $[a, b]$ such that $t_0 = a < t_1 < \cdots < t_{N-1} < t_N = b$, there is a unique polygonal path with vertices $P_k$ on $C$ whose length is

$$\sum_{k=1}^N |P_{k-1}P_k| = \sum_{k=1}^N \|\mathbf{r}_k - \mathbf{r}_{k-1}\|,$$

where $\mathbf{r}_k = \mathbf{r}(t_k)$. Put $\Delta t_k = t_k - t_{k-1} > 0$, $k = 1, 2, \ldots, N$.
Under a refinement of the partition, DN maxk Atk - 0 as N -- oc and therefore Atk - 0 for all k as N - oc. Let r'_1 = r'(tk_1). The differentiability of r(t) implies that rk - rk r'_1 Atk + uk Atk,  130 12. VECTOR FUNCTIONS 130 12. VECTOR FUNCTIONS where u1k - 0 as atk - 0 for every k (cf. (12.2)). By the triangle inequality (11.7), ||r'g1||stk - ||uk||Atk ||r/ - rk_1| ||r' tk _1|| t + |ku|| tk. The lower and upper bounds for the length of the polygonal path are obtained by taking the sum over k in this inequality. Next, it is shown that these bounds converge to the Riemann integral of ||r'(t)|| over [a, b], and the assertion follows from the squeeze principle. By the continuity of the derivative, the function ||r'(t) is continu- ous and hence integrable. Therefore, its Riemann sum converges: N b Z r'g1 Ztk||t /||r'(t)| dt as N noo. k=1 Ja Put maxk | k MN (the largest ||ukJ| for a given partition size N). Then N N Z 11ku|Atk L. EXAMPLE 12.11. Find the arc length of the curve r(t) = (t2, 2t, ln t), 1 0. It is positive because r'(t) / 0 for a smooth curve. The existence of the inverse function s(t) is guaranteed by the inverse function theorem proved in Calculus I: THEOREM 12.6. (Inverse Function Theorem). Let s(t), a < t < b, have a continuous derivative such that s'(t) > 0 for a < t < b. Then there exists an inverse differentiable function t = t(s), c < s < d, and t'(s) = 1/s'(t), where t = t(s) on the right side. Thus, the condition s'(t) = ||r'(t)|| > 0 guarantees the existence of a one-to-one correspondence between the variables s and t and the existence of the differentiable inverse function t = t(s). Let r(t) (x(t), y(t), z(t)) be parametric equations of a smooth curve C. Then the parametric equations of C in the natural parameterization have the form R(s) = (x(t(s)), y(t(s)), z(t(s))). EXAMPLE 12.13. 
Reparameterize the helix from Example 12.12, $\mathbf{r}(t) = (R\cos t, R\sin t, th/(2\pi))$, with respect to the arc length measured from the point $(R, 0, 0)$ in the direction of increasing $t$.

SOLUTION: The point $(R, 0, 0)$ corresponds to $t = 0$. Then

$$s(t) = \int_0^t \|\mathbf{r}'(u)\|\,du = \int_0^t \frac{L}{2\pi}\,du = \frac{Lt}{2\pi} \quad\Longrightarrow\quad t(s) = \frac{2\pi s}{L},$$

where $L = \sqrt{(2\pi R)^2 + h^2}$ is the arc length of one turn of the helix (see Example 12.12). Therefore,

$$\mathbf{R}(s) = \mathbf{r}(t(s)) = (R\cos(2\pi s/L), R\sin(2\pi s/L), hs/L).$$

In particular, $\mathbf{R}(0) = (R, 0, 0)$ and $\mathbf{R}(L) = (R, 0, h)$ are the position vectors of the endpoints of one turn of the helix as required. □

EXAMPLE 12.14. Find the coordinates of a point $P$ that is $5\pi/3$ units of length away from the point $(4, 0, 0)$ along the helix $\mathbf{r}(t) = (4\cos(\pi t), 4\sin(\pi t), 3\pi t)$.

SOLUTION: If $\mathbf{R}(s)$ is the natural parameterization of the helix where $s$ is counted from the point $(4, 0, 0)$, then the position vector of the point in question is given by $\mathbf{R}(5\pi/3)$. Thus, the first task is to find $\mathbf{R}(s)$. One has

$$\mathbf{r}'(u) = (-4\pi\sin(\pi u), 4\pi\cos(\pi u), 3\pi) \quad\Longrightarrow\quad \|\mathbf{r}'(u)\| = 5\pi.$$

The initial point of the helix corresponds to $t = 0$. So the arc length counted from $(4, 0, 0)$ as a function of $t$ is

$$s(t) = \int_0^t \|\mathbf{r}'(u)\|\,du = \int_0^t 5\pi\,du = 5\pi t \quad\Longrightarrow\quad t(s) = \frac{s}{5\pi}.$$

The natural parameterization reads

$$\mathbf{R}(s) = \mathbf{r}(t(s)) = (4\cos(s/5), 4\sin(s/5), 3s/5).$$

The position vector of $P$ is $\mathbf{R}(5\pi/3) = (2, 2\sqrt{3}, \pi)$. However, this is not a complete answer to the problem because there are two points of the helix at the specified distance from $(4, 0, 0)$. One such point is upward along the helix, and the other is downward along it. Note that $s(t)$ defined above is the arc length parameter counted from $(4, 0, 0)$ in the direction of increasing $t$ (upward along the helix, $t > 0$). Accordingly, $s(t)$ can be counted in the direction of decreasing $t$ (downward along the helix, $t < 0$). In this case, $s(t) = -5\pi t > 0$. Hence, the position vector of the other point is $\mathbf{R}(-5\pi/3) = (2, -2\sqrt{3}, -\pi)$. □
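When the arc length $s(t)$ cannot be inverted in closed form, the same procedure can be carried out numerically. The sketch below is only an illustration under stated assumptions, not the method of the text: it estimates the speed by a central difference, integrates it with the midpoint rule, and inverts $s(t)$ by bisection. It is applied to the helix of Example 12.14 so that the output can be compared with the closed-form answer. The function names, step sizes, and the search interval $[0, 10]$ are hypothetical choices.

```python
import math


def helix(t):
    """r(t) = (4 cos(pi t), 4 sin(pi t), 3 pi t), the helix of Example 12.14."""
    return (4 * math.cos(math.pi * t), 4 * math.sin(math.pi * t), 3 * math.pi * t)


def speed(t, dt=1e-6):
    """Approximate ||r'(t)|| by a central difference."""
    return math.dist(helix(t - dt), helix(t + dt)) / (2 * dt)


def arc_length(t, steps=2000):
    """Approximate s(t), the integral of the speed over [0, t], by the midpoint rule."""
    h = t / steps
    return sum(speed((k + 0.5) * h) for k in range(steps)) * h


def parameter_at(s, lo=0.0, hi=10.0, tol=1e-10):
    """Invert s(t) by bisection; assumes 0 <= s <= arc_length(hi)."""
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if arc_length(mid) < s:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


# Parameter value and point at arc length 5*pi/3 from (4, 0, 0), in the direction of increasing t
t_star = parameter_at(5 * math.pi / 3)
point = helix(t_star)
print(t_star, point)
```

Bisection works here because $s(t)$ is increasing for a smooth curve, so the equation $s(t) = s_0$ has exactly one solution on the search interval.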
It follows from Theorem 12.6 that the derivative of a vector function that traverses a smooth curve $C$ with respect to the natural parameter, the arc length, is a unit tangent vector to the curve. Indeed, by the chain rule applied to the components of the vector function:

$$\frac{d\mathbf{r}(t)}{ds} = \left(\frac{dx(t)}{ds}, \frac{dy(t)}{ds}, \frac{dz(t)}{ds}\right) = (x'(t)t'(s), y'(t)t'(s), z'(t)t'(s))$$
$$= t'(s)\,(x'(t), y'(t), z'(t)) = \frac{1}{s'(t)}\,\mathbf{r}'(t) = \frac{1}{\|\mathbf{r}'(t)\|}\,\mathbf{r}'(t) = \hat{\mathbf{T}}(t).$$

Thus, for a natural parameterization $\mathbf{r}(s)$ of a smooth curve $C$, the derivative $\mathbf{r}'(s)$ is a unit tangent vector to $C$, $\|\mathbf{r}'(s)\| = 1$.

By definition, the arc length is independent of a parameterization of a space curve. For smooth curves, this can also be established through the change of variables in the integral that determines the arc length. Indeed, let $\mathbf{r}(t)$, $t \in [a, b]$, be a one-to-one continuously differentiable vector function that traces out a curve $C$ of length $L$. Consider the change of the integration variable $t = t(s)$, $s \in [0, L]$. Then, by the inverse function theorem, $s'(t) = \|\mathbf{r}'(t)\|$ and $ds = s'(t)\,dt = \|\mathbf{r}'(t)\|\,dt$. Thus,

$$L = \int_a^b \|\mathbf{r}'(t)\|\,dt = \int_0^L ds$$

for any parameterization of the curve $C$.

82.3. Exercises.

(1) Find the arc length of each of the following curves:
(i) $\mathbf{r}(t) = (3\cos t, 2t, 3\sin t)$, $-2 \le t \le 2$
(ii) $\mathbf{r}(t) = (2t, t^3/3, t^2)$, $0 \le t \le 1$
(iii) $\mathbf{r}(t) = (3t^2, 4t^{3/2}, 3t)$, $0 \le t \le 2$
(iv) $\mathbf{r}(t) = (e^t, \sqrt{2}\,t, e^{-t})$, $-1 \le t \le 1$
(v) $\mathbf{r}(t) = (\cosh t, \sinh t, t)$, $0 \le t \le 1$
(vi) $\mathbf{r}(t) = (\cos t + t\sin t, \sin t + t\cos t, t^2)$, $0 \le t \le 2$. Hint: Find the decomposition $\mathbf{r}(t) = \mathbf{v}(t) - t\mathbf{w}(t) + t^2\hat{\mathbf{e}}_3$, where $\mathbf{v}$, $\mathbf{w}$, and $\hat{\mathbf{e}}_3$ are mutually orthogonal and $\mathbf{v}'(t) = \mathbf{w}(t)$, $\mathbf{w}'(t) = -\mathbf{v}(t)$. Use the Pythagorean theorem to calculate $\|\mathbf{r}'(t)\|$.

(2) Find the arc length of the curve $\mathbf{r}(t) = (e^{-t}\cos t, e^{-t}\sin t, e^{-t})$, $0 \le t < \infty$. Hint: Put $\mathbf{r}(t) = e^{-t}\mathbf{u}(t)$, differentiate, show that $\mathbf{u}(t)$ is orthogonal to $\mathbf{u}'(t)$, and use the Pythagorean theorem to calculate $\|\mathbf{r}'(t)\|$.
(3) Find the are length of the portion of the helix r(t) = (cos t, sin t, t) that lies inside the sphere x2+ y2 + z2 = 2. (4) Find the are length of the portion of the curve r(t) (2t, 3t2, 3t3) that lies between the planes z = 3 and z = 24. (5) Find the are length of the portion of the curve r(t) (lnt, t2, 2t) that lies between the points of intersection of the curve with the plane y - 2z+3 =0. (6) Let C be the curve of intersection of the surfaces z2 = 2y and 3x = yz. Find the length of C from the origin to the point (36, 18, 6). (7) For each of the following curves defined by the given equations with a parameter a, find suitable parametric equations and evaluate the are length between a given point A and and a generic point B = (zo, Yo, zo): (i) y = a sin-1(c/a), z = (a/4) ln[(a - x)/(a + z)], A = (0, 0, 0) (ii) (x - y)2 = a(x + y), x2 -2 = 9z2/8, A = (0, 0, 0) Hint: Use the new variables = x + y and v = x - y to find the parametric equations. (iii) x2 + y2 = az, y/c = tan(z/a), A = (0, 0, 0) Hint: Use the polar coordinates in the zy plane to find the parametric equations. (iv) x2 + y2 + z2 = a2, c2 + y2 cosh(tan-1(y/)) = a, A = (a, 0, 0) Hint: Represent the second equation as a polar graph. (8) Reparameterize each of the following curves with respect to the arc length measure from the point where t =0 in the direction of increasing t: (i) r =(t, 1 - 2t, 5 + 3t) (ii) r =24iie1+ (n< - 1)e3  136 12. VECTOR FUNCTIONS 136 12. VECTOR FUNCTIONS (iii) r(t) = (cosh t, sinh t, t) (iv) x= a(t - sin t), y =a(1 - cost), a > 0 (9) A particle travels along a helix of radius R that rises h units of length per turn. Let the z axis be the symmetry axis of the helix. If a particle travels the distance 47R from the point (R, 0, 0), find the position vector of the particle. (10) A particle travels along a curve traversed by the vector function r(u) = (u, coshtu, sinhu) from the point (0, 1, 0) with a constant speed 2 m/s so that its x coordinate increases. 
Find the position of the particle in 1 second. (11) Let C be a smooth closed curve whose are length is L. Let r(t) be a vector function that traverses C only once for a < t < b. Prove that there is a number a < t* < b such that ||r'(t*)||I= L/(b - a). Hint: Recall the integral mean value theorem. (12) A particle travels in space a distance D in time T. Show that there is a moment of time 0 < t a2(t) + b2(t) =po by the Pythagorean theorem. Parametric equations of the circle can be taken in the form a(t) = -po cos t and b(t) = po sin t. The vector function that traces out the osculating circle is R(t) =ro + po (1 - cos t) N0 + po sint To, where t E [0, 27r]. The above choice of a(t) and b(t) has been made so that R(0) =r0.D  83. CURVATURE OF A SPACE CURVE 145 Problem 12.15. Consider a helix r(t) = (R cos(wt), R sin(wt), ht), where w and h are numerical parameters. The arc length of one turn of the helix is a function of the parameter w, L = L(w), and the curvature at any fixed point of the helix is also a function of w, K =im(w). Use only geometrical arguments (no calculus) to find the limits of L(w) and I(w) as w - 0o. SOLUTION: The vector function r(t) traces out one turn of the helix when t ranges over the period of cos(wt) or sin(wt) (i.e., over the interval of length 27/w). Thus, the helix rises by 2wh/w = H(w) along the z axis per each turn. When w - oc, the height H(w) tends to 0 so that each turn of the helix becomes closer and closer to a circle of radius R. Therefore, L(w) - 27R (the circumference) and mo(w) - 1/R (the curvature of the circle) as w - oo. A calculus approach requires a lot more work to establish this result: /n27/"' 2x L(w) = ||r'(t)|| dt (Rw)2 + h2 0w - 2w /R2 + (h/w)2 - 2wR, |r'(t) x r"(t)|| Rw2[(Rw)2 + h21/2 |r'(t)||3 [(Rw)2 + h23/2 R 1 R2 + (h/w)2 R as w - oo. 83.3. Exercises. 
(1) Find the curvature of each of the following curves as a function of the parameter and the curvature radius at a specified point $P$:
(i) $\mathbf{r}(t) = (t, 1 - t, t^2 + 1)$, $P(1, 0, 2)$
(ii) $\mathbf{r}(t) = (t^2, t, 1)$, $P(4, 2, 1)$
(iii) $y = \sin(x/2)$, $P(\pi, 1)$
(iv) $\mathbf{r}(t) = (4t^{3/2}, -t^2, t)$, $P(4, -1, 1)$
(v) $x = 1 + t^2$, $y = 2 + t^3$, $P(2, 1)$
(vi) $x = e^t\cos t$, $y = 0$, $z = e^t\sin t$, $P(1, 0, 0)$
(vii) $\mathbf{r}(t) = (\ln t, \sqrt{t}, t^2)$, $P(0, 1, 1)$
(viii) $\mathbf{r}(t) = (\cosh t, \sinh t, 2 + t)$, $P(1, 0, 2)$
(ix) $\mathbf{r}(t) = (e^t, \sqrt{2}\,t, e^{-t})$, $P(1, 0, 1)$
(x) $\mathbf{r}(t) = (\sin t - t\cos t, t^2, \cos t + t\sin t)$, $P(0, 0, 1)$
(2) Find the curvature of $\mathbf{r}(t) = (t, t^2/2, t^3/3)$ at the point of its intersection with the surface $z = 2xy + 1/3$.
(3) Find the maximal and minimal curvatures of the graph $y = \cos(ax)$ and the points at which they occur. Sketch the graph for $a = 1$ and mark the points of the maximal and minimal curvatures, local maxima and minima of $\cos x$, and the inflection points.
(4) Use a geometrical interpretation of the curvature to guess the point on the graphs $y = ax^2$ and $y = ax^4$ where the maximal curvature occurs. Then verify your guess by calculations.
(5) Let $f(x)$ be a twice continuously differentiable function and let $\kappa(x)$ be the curvature of the graph $y = f(x)$.
(i) Does $\kappa$ attain a local maximum value at every local minimum and maximum of $f$? If not, state an additional condition on $f$ under which the answer to this question is affirmative.
(ii) Prove that $\kappa = 0$ at inflection points of the graph.
(iii) Show by an example that the converse is not true, that is, that the vanishing of the curvature at $x = x_0$ does not imply that the point $(x_0, f(x_0))$ is an inflection point.
(6) Let $f$ be twice differentiable at $x_0$. Let $T_2(x)$ be its Taylor polynomial of the second order about $x = x_0$. Compare the curvatures of the graphs $y = f(x)$ and $y = T_2(x)$ at $x = x_0$.
(7) Find the equations of the osculating circles to the parabola $y = x^2$ at the points $(0, 0)$ and $(1, 1)$.
(8) Find the maximal and minimal curvature of the ellipse $x^2/a^2 + y^2/b^2 = 1$, $a > b$, and the points where they occur. Give the equations of the osculating circles at these points.
(9) Let $\mathbf{r}(t) = (t^3, t^2, 0)$. This curve is not smooth and has a cusp at $t = 0$. Find the curvature for $t \neq 0$ and investigate its limit as $t \to 0$.
(10) Find an equation of the osculating plane for each of the following curves at a specified point:
(i) $\mathbf{r}(t) = (4t^{3/2}, -t^2, t)$, $P(4, -1, 1)$
(ii) $\mathbf{r}(t) = (\ln t, \sqrt{t}, t^2)$, $P(0, 1, 1)$
(11) Find an equation for the osculating and normal planes for the curve $\mathbf{r}(t) = (\ln t, 2t, t^2)$ at the point $P_0$ of its intersection with the plane $y - z = 1$. A plane is normal to a curve at a point if the tangent to the curve at that point is normal to the plane.
(12) Is there a point on the curve $\mathbf{r}(t) = (t, t^2, t^3)$ where the osculating plane is parallel to the plane $3x + 3y + z = 1$?
(13) Prove that the trajectory of a particle has a constant curvature if the particle moves so that the magnitudes of its velocity and acceleration vectors are constant.
(14) Consider a graph $y = f(x)$ such that $f''(x_0) \neq 0$. At a point $(x_0, y_0)$ on the curve, where $y_0 = f(x_0)$, find the equation of the osculating circle in the form $(x - a)^2 + (y - b)^2 = R^2$. Hint: Show first that the vector $(1, f'(x_0))$ is tangent to the graph and a vector orthogonal to it is $(-f'(x_0), 1)$. Then consider two cases $f''(x_0) > 0$ and $f''(x_0) < 0$.
(15) Find the osculating circle for the cycloid $x = a(t - \sin t)$, $y = a(1 - \cos t)$ at the point $t = \pi/2$.
(16) Let a smooth curve $\mathbf{r} = \mathbf{r}(t)$ be planar and lie in the $xy$ plane. At a point $(x_0, y_0)$ on the curve, find the equation of the osculating circle in the form $(x - a)^2 + (y - b)^2 = R^2$. Hint: Use the result of Study Problem 12.14 to express the constants $a$, $b$, and $R$ via $x_0$, $y_0$, and the curvature at $(x_0, y_0)$.

84. Applications to Mechanics and Geometry

84.1. Tangential and Normal Accelerations.
Let $\mathbf{r}(t)$ be the trajectory of a particle ($t$ is time). Then $\mathbf{v}(t) = \mathbf{r}'(t)$ and $\mathbf{a}(t) = \mathbf{v}'(t)$ are the velocity and acceleration of the particle. The magnitude of the velocity vector is the speed, $v(t) = \|\mathbf{v}(t)\|$. If $\hat{\mathbf{T}}(t)$ is the unit tangent vector to the trajectory, then $\hat{\mathbf{T}}'(t)$ is orthogonal to it. The unit vector $\hat{\mathbf{N}}(t) = \hat{\mathbf{T}}'(t)/\|\hat{\mathbf{T}}'(t)\|$ is called a unit normal to the trajectory. In particular, the osculating plane at any point of the trajectory contains $\hat{\mathbf{T}}(t)$ and $\hat{\mathbf{N}}(t)$. The differentiation of the relation $\mathbf{v}(t) = v(t)\hat{\mathbf{T}}(t)$ (see (12.6)) shows that the acceleration always lies in the osculating plane:
$$\mathbf{a} = v'\hat{\mathbf{T}} + v\hat{\mathbf{T}}' = v'\hat{\mathbf{T}} + v\|\hat{\mathbf{T}}'\|\hat{\mathbf{N}}.$$
Furthermore, substituting the relations $\kappa = \|\hat{\mathbf{T}}'\|/v$ and $\rho = 1/\kappa$ into the latter equation, one finds (see Figure 12.11, left panel) that
$$\mathbf{a} = a_T\hat{\mathbf{T}} + a_N\hat{\mathbf{N}}, \qquad a_T = v' = \frac{\mathbf{v}\cdot\mathbf{a}}{v}, \qquad a_N = \kappa v^2 = \frac{v^2}{\rho} = \frac{\|\mathbf{v}\times\mathbf{a}\|}{v}.$$

DEFINITION 12.17. (Tangential and Normal Accelerations). Scalar projections $a_T$ and $a_N$ of the acceleration vector onto the unit tangent and normal vectors at any point of the trajectory of motion are called tangential and normal accelerations, respectively.

The tangential acceleration $a_T$ determines the rate of change of a particle's speed, while the normal acceleration appears only when the particle makes a "turn." In particular, a circular motion with a constant speed, $v = v_0$, has no tangential acceleration, $a_T = 0$, and

FIGURE 12.11. Left: Decomposition of the acceleration $\mathbf{a}$ of a particle into normal and tangential components. The tangential component $a_T$ is the scalar projection of $\mathbf{a}$ onto the unit tangent vector $\hat{\mathbf{T}}$. The normal component is the scalar projection of $\mathbf{a}$ onto the unit normal vector $\hat{\mathbf{N}}$. The vectors $\mathbf{r}$ and $\mathbf{v}$ are the position and velocity vectors of the particle. Right: The tangent, normal, and binormal vectors associated with a smooth curve. These vectors are mutually orthogonal and have unit length. The binormal is defined by $\hat{\mathbf{B}} = \hat{\mathbf{T}}\times\hat{\mathbf{N}}$.
The shape of the curve is uniquely determined by the orientation of the triple of vectors $\hat{\mathbf{T}}$, $\hat{\mathbf{N}}$, and $\hat{\mathbf{B}}$ as functions of the arc length parameter up to general rigid rotations and translations of the curve as a whole.

the normal acceleration is constant, $a_N = v_0^2/R$, where $R$ is the circle radius. Indeed, taking the derivative of the relation $\mathbf{v}\cdot\mathbf{v} = v_0^2$, it is concluded that $\mathbf{v}'\cdot\mathbf{v} = 0$ or $\mathbf{a}\cdot\mathbf{v} = 0$ or $a_T = 0$. Since the curvature of a circle is the reciprocal of its radius, $a_N = \kappa v_0^2 = v_0^2/R$.

To gain an intuitive understanding of the tangential and normal accelerations, consider a car moving along a road. The speed of the car can be changed by pressing the gas or brake pedals. When one of these pedals is suddenly pressed, one can feel a force along the direction of motion of the car (the tangential direction). The car speedometer also shows that the speed changes, indicating that this force is due to the acceleration along the road (i.e., the tangential acceleration $a_T = v' \neq 0$). When the car moves along a straight road with a constant speed, its acceleration is 0. When the road takes a turn, the steering wheel must be turned in order to keep the car on the road, while the car maintains a constant speed. In this case, one can feel a force normal to the road. It is larger for sharper turns (larger curvature or smaller curvature radius) and also grows when the same turn is passed with a greater speed. This force is due to the normal acceleration, $a_N = v^2/\rho$, and is called a centrifugal force. By Newton's law, its magnitude is
$$F = ma_N = \frac{mv^2}{\rho},$$
where $m$ is the mass of a moving object (e.g., a car). When making a turn, the car does not slide off the road as long as the friction force between the tires and the road compensates for the centrifugal force. The maximal friction force depends on the road and tire conditions (e.g., a wet road and worn tires substantially reduce the maximal friction force).
The centrifugal force is determined by the speed (the curvature of the road is fixed by the road shape). So, for a high enough speed, the centrifugal force can no longer be compensated for by the friction force and the car would skid off the road. For this reason, suggested speed limit signs are often placed at highway exits. If one drives a car on a highway exit with a speed twice as high as the suggested speed, the risk of skidding off the road is quadrupled, not doubled, because the normal acceleration $a_N = v^2/\rho$ quadruples when the speed $v$ is doubled.

EXAMPLE 12.19. A road has a parabolic shape, $y = x^2/(2R)$, where $(x, y)$ are coordinates of points of the road and $R$ is a constant (all measured in units of length, e.g., meters). A safety assessment requires that the normal acceleration on the road should not exceed a threshold value $a_m$ (e.g., meters per second squared) to avoid skidding off the road. If a car moves with a constant speed $v_0$ along the road, find the portion of the road where the car might skid off the road.
SOLUTION: The normal acceleration of the car as a function of position (not time!) is $a_N(x) = \kappa(x)v_0^2$. The curvature of the graph $y = x^2/(2R)$ is $\kappa(x) = (1/R)[1 + (x/R)^2]^{-3/2}$. The maximal curvature and hence the maximal normal acceleration are attained at $x = 0$. So, if the speed is such that $a_N(0) = v_0^2/R < a_m$, no accident can happen. Otherwise, the inequality $a_N(x) > a_m$ yields
$$\frac{v_0^2}{R}\,\frac{1}{[1 + (x/R)^2]^{3/2}} > a_m \quad\Longrightarrow\quad |x| < R\sqrt{\nu - 1}, \qquad \nu = \left(\frac{v_0^2}{Ra_m}\right)^{2/3}.$$
The constant $\nu = (a_N(0)/a_m)^{2/3}$ always exceeds 1 if $a_N(0) = v_0^2/R > a_m$. The car can skid off the road when moving on its part corresponding to the interval $-R(\nu - 1)^{1/2} < x < R(\nu - 1)^{1/2}$. Conversely, a suggested speed limit sign $v_0 < \sqrt{Ra_m}$ can be placed for a part of the road that contains $x = 0$. □

84.2. Frenet-Serret Formulas. The shape of a space curve as a point set is independent of a parameterization of the curve. A natural question arises: What parameters of the curve determine its shape?
Suppose the curve is smooth enough so that the unit tangent vector $\hat{\mathbf{T}}(s)$ and its derivative $\hat{\mathbf{T}}'(s)$ can be defined as functions of the arc length $s$ counted from an endpoint of the curve. Let $\hat{\mathbf{N}}(s)$ be the unit normal vector of the curve.

DEFINITION 12.18. (Binormal Vector). Let $\hat{\mathbf{T}}$ and $\hat{\mathbf{N}}$ be the unit tangent and normal vectors at a point of a curve. The unit vector $\hat{\mathbf{B}} = \hat{\mathbf{T}}\times\hat{\mathbf{N}}$ is called the binormal (unit) vector.

So, with every point of a smooth curve, one can associate a triple of mutually orthogonal unit vectors so that one of them is tangent to the curve while the other two span the plane normal to the tangent vector (normal to the curve). By a suitable rotation, the triple of vectors $\hat{\mathbf{T}}$, $\hat{\mathbf{N}}$, and $\hat{\mathbf{B}}$ can be oriented parallel to the axes of any given coordinate system, that is, parallel to $\hat{\mathbf{e}}_1$, $\hat{\mathbf{e}}_2$, and $\hat{\mathbf{e}}_3$, respectively. Indeed, $\hat{\mathbf{T}}$ and $\hat{\mathbf{N}}$ can always be made parallel to $\hat{\mathbf{e}}_1$ and $\hat{\mathbf{e}}_2$. Then, owing to the relation $\hat{\mathbf{e}}_1\times\hat{\mathbf{e}}_2 = \hat{\mathbf{e}}_3$, the binormal must be parallel to $\hat{\mathbf{e}}_3$. In other words, $\hat{\mathbf{T}}$, $\hat{\mathbf{N}}$, and $\hat{\mathbf{B}}$ define a right-handed coordinate system. The orientation of the unit tangent, normal, and binormal vectors relative to some coordinate system depends on the point of the curve. The triple of these vectors can only rotate as the point slides along the curve (the vectors are mutually orthogonal and unit at any point). Therefore, the rates with respect to the arc length at which these vectors change must be characteristic for the shape of the curve (see Figure 12.11, right panel).

By the definition of the curvature, $\hat{\mathbf{T}}'(s) = \kappa(s)\hat{\mathbf{N}}(s)$. Next, consider the rate:
$$\hat{\mathbf{B}}' = (\hat{\mathbf{T}}\times\hat{\mathbf{N}})' = \hat{\mathbf{T}}'\times\hat{\mathbf{N}} + \hat{\mathbf{T}}\times\hat{\mathbf{N}}' = \hat{\mathbf{T}}\times\hat{\mathbf{N}}'$$
because $\hat{\mathbf{T}}'(s)$ is parallel to $\hat{\mathbf{N}}(s)$. It follows from this equation that $\hat{\mathbf{B}}'$ is perpendicular to $\hat{\mathbf{T}}$, and, since $\hat{\mathbf{B}}$ is a unit vector, its derivative must also be perpendicular to $\hat{\mathbf{B}}$. Thus, $\hat{\mathbf{B}}'$ must be parallel to $\hat{\mathbf{N}}$. This conclusion establishes the existence of another scalar quantity that characterizes the curve shape.

DEFINITION 12.19. (Torsion of a Curve).
Let $\hat{\mathbf{N}}(s)$ and $\hat{\mathbf{B}}(s)$ be unit normal and binormal vectors of the curve as functions of the arc length $s$. Then
$$\frac{d\hat{\mathbf{B}}(s)}{ds} = -\tau(s)\hat{\mathbf{N}}(s),$$
where the number $\tau(s)$ is called the torsion of the curve.

By definition, the torsion is measured in units of a reciprocal length, just like the curvature, because the unit vectors $\hat{\mathbf{T}}$, $\hat{\mathbf{N}}$, and $\hat{\mathbf{B}}$ are dimensionless. At any point of a curve, the binormal $\hat{\mathbf{B}}$ is perpendicular to the osculating plane. So, if the curve is planar, then $\hat{\mathbf{B}}$ does not change along the curve, $\hat{\mathbf{B}}'(s) = 0$, because the osculating plane at any point coincides with the plane in which the curve lies. A planar curve has no torsion. Thus, the torsion is a local numerical characteristic that determines how fast the curve deviates from the osculating plane while bending in it with some curvature radius.

It follows from the relation $\hat{\mathbf{N}} = \hat{\mathbf{B}}\times\hat{\mathbf{T}}$ (compare $\hat{\mathbf{e}}_2 = \hat{\mathbf{e}}_3\times\hat{\mathbf{e}}_1$) that
$$\hat{\mathbf{N}}' = \hat{\mathbf{B}}'\times\hat{\mathbf{T}} + \hat{\mathbf{B}}\times\hat{\mathbf{T}}' = -\tau\,\hat{\mathbf{N}}\times\hat{\mathbf{T}} + \kappa\,\hat{\mathbf{B}}\times\hat{\mathbf{N}} = \tau\hat{\mathbf{B}} - \kappa\hat{\mathbf{T}},$$
where the definitions of the torsion and curvature have been used. The obtained rates of the unit vectors are known as the Frenet-Serret formulas or equations:
(12.9) $\hat{\mathbf{T}}'(s) = \kappa(s)\hat{\mathbf{N}}(s)$,
(12.10) $\hat{\mathbf{N}}'(s) = -\kappa(s)\hat{\mathbf{T}}(s) + \tau(s)\hat{\mathbf{B}}(s)$,
(12.11) $\hat{\mathbf{B}}'(s) = -\tau(s)\hat{\mathbf{N}}(s)$.
The Frenet-Serret equations form a system of differential equations for the components of $\hat{\mathbf{T}}(s)$, $\hat{\mathbf{N}}(s)$, and $\hat{\mathbf{B}}(s)$. If the curvature and torsion are continuous functions on an interval $0 < s < L$, then the system can be proved to have a unique solution on this interval for every given set of the vectors $\hat{\mathbf{T}}(0)$, $\hat{\mathbf{N}}(0)$, and $\hat{\mathbf{B}}(0)$ at an initial point of the curve. Given a coordinate system, the initial point of a curve is specified by a translation of the origin, and the orientation of $\hat{\mathbf{T}}(0)$, $\hat{\mathbf{N}}(0)$, and $\hat{\mathbf{B}}(0)$ is determined by a rotation of unit coordinate vectors. Therefore, the following assertion about the shape of a curve holds.

THEOREM 12.9. (Shape of a Smooth Curve in Space).
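Given the curvature and torsion as continuous functions along a curve, the curve is uniquely determined by them up to rigid rotations and translations of the curve as a whole.

A proof of Theorem 12.9 requires a proof of the uniqueness of a solution to the Frenet-Serret equations, which goes beyond the scope of this course. However, in some specific examples, the Frenet-Serret equations can be explicitly integrated. For example, consider curves with the vanishing curvature and torsion, $\kappa(s) = \tau(s) = 0$. Then $\hat{\mathbf{T}}(s) = \hat{\mathbf{T}}(0)$, $\hat{\mathbf{N}}(s) = \hat{\mathbf{N}}(0)$, and $\hat{\mathbf{B}}(s) = \hat{\mathbf{B}}(0)$. If $\mathbf{r}(s)$ is a natural parameterization of a curve, then $\mathbf{r}'(s) = \hat{\mathbf{T}}(s) = \hat{\mathbf{T}}(0)$. The integration of this equation yields $\mathbf{r}(s) = \mathbf{r}_0 + \hat{\mathbf{T}}(0)s$, where $\mathbf{r}_0$ is a constant vector, which is a straight line.

EXAMPLE 12.20. Use the Frenet-Serret equations to prove that a curve with a constant curvature $\kappa(s) = \kappa_0 \neq 0$ and zero torsion $\tau(s) = 0$ is a circle (or its portion) of radius $R = 1/\kappa_0$.
SOLUTION: A vector function $\mathbf{r}(s)$ that satisfies the Frenet-Serret equations is sought in the basis of the initial tangent, normal, and binormal vectors: $\hat{\mathbf{e}}_1 = \hat{\mathbf{T}}(0)$, $\hat{\mathbf{e}}_2 = \hat{\mathbf{N}}(0)$, and $\hat{\mathbf{e}}_3 = \hat{\mathbf{B}}(0)$. Since the torsion is 0, the binormal does not change along the curve, $\hat{\mathbf{B}}(s) = \hat{\mathbf{e}}_3$. The curve is planar and lies in a plane orthogonal to $\hat{\mathbf{e}}_3$. Any unit vector $\hat{\mathbf{T}}$ orthogonal to $\hat{\mathbf{e}}_3$ can be written as $\hat{\mathbf{T}} = \cos\varphi\,\hat{\mathbf{e}}_1 + \sin\varphi\,\hat{\mathbf{e}}_2$, where $\varphi = \varphi(s)$ such that $\varphi(0) = 0$. Owing to the relations $\hat{\mathbf{e}}_1\times\hat{\mathbf{e}}_2 = -\hat{\mathbf{e}}_2\times\hat{\mathbf{e}}_1 = \hat{\mathbf{e}}_3$, a unit vector $\hat{\mathbf{N}}$ orthogonal to $\hat{\mathbf{T}}$ such that $\hat{\mathbf{T}}\times\hat{\mathbf{N}} = \hat{\mathbf{B}} = \hat{\mathbf{e}}_3$ must have the form $\hat{\mathbf{N}} = -\sin\varphi\,\hat{\mathbf{e}}_1 + \cos\varphi\,\hat{\mathbf{e}}_2$. Equation (12.9) gives
$$\hat{\mathbf{T}}' = -\varphi'\sin\varphi\,\hat{\mathbf{e}}_1 + \varphi'\cos\varphi\,\hat{\mathbf{e}}_2 = \varphi'\hat{\mathbf{N}} = \kappa_0\hat{\mathbf{N}} \quad\Longrightarrow\quad \varphi'(s) = \kappa_0,$$
and therefore $\varphi(s) = \kappa_0 s$ because $\varphi(0) = 0$. For a natural parameterization of the curve, $\mathbf{r}'(s) = \hat{\mathbf{T}}(s)$. Hence,
$$\mathbf{r}'(s) = \cos(\kappa_0 s)\,\hat{\mathbf{e}}_1 + \sin(\kappa_0 s)\,\hat{\mathbf{e}}_2, \qquad \mathbf{r}(s) = \mathbf{r}_0 + \kappa_0^{-1}\sin(\kappa_0 s)\,\hat{\mathbf{e}}_1 - \kappa_0^{-1}\cos(\kappa_0 s)\,\hat{\mathbf{e}}_2,$$
where $\mathbf{r}_0$ is a constant vector. By the Pythagorean theorem, the distance between any point of the curve and a fixed point $\mathbf{r}_0$ is constant: $\|\mathbf{r}(s) - \mathbf{r}_0\|^2 = 1/\kappa_0^2 = R^2$.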
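Since the curve is planar, it is a circle (or its portion) of radius $R = 1/\kappa_0$. □

THEOREM 12.10. (Torsion of a Curve). Let $\mathbf{r}(t)$ be a three times differentiable vector function that traverses a smooth curve whose curvature does not vanish. Then the torsion of the curve is
$$\tau(t) = \frac{(\mathbf{r}'(t)\times\mathbf{r}''(t))\cdot\mathbf{r}'''(t)}{\|\mathbf{r}'(t)\times\mathbf{r}''(t)\|^2}.$$
PROOF. Put $\|\mathbf{r}'(t)\| = v(t)$ (if $s = s(t)$ is the arc length as a function of $t$, then $s' = v$). By (12.6) and the definition of the curvature,
(12.12) $\mathbf{r}'' = v'\hat{\mathbf{T}} + \kappa v^2\hat{\mathbf{N}}$,
and by (12.7) and the definition of the binormal,
(12.13) $\mathbf{r}'\times\mathbf{r}'' = v\hat{\mathbf{T}}\times\mathbf{r}'' = \kappa v^3\hat{\mathbf{B}}$.
Differentiation of both sides of (12.12) gives
$$\mathbf{r}''' = v''\hat{\mathbf{T}} + v'\hat{\mathbf{T}}' + (\kappa'v^2 + 2\kappa vv')\hat{\mathbf{N}} + \kappa v^2\hat{\mathbf{N}}'.$$
The derivatives $\hat{\mathbf{T}}'(t)$ and $\hat{\mathbf{N}}'(t)$ are found by making use of the differentiation rule $d/ds = (1/s'(t))(d/dt) = (1/v)(d/dt)$ in the Frenet-Serret equations (12.9) and (12.10):
$$\hat{\mathbf{T}}' = \kappa v\hat{\mathbf{N}}, \qquad \hat{\mathbf{N}}' = -\kappa v\hat{\mathbf{T}} + \tau v\hat{\mathbf{B}}.$$
Therefore,
(12.14) $\mathbf{r}''' = (v'' - \kappa^2v^3)\hat{\mathbf{T}} + (3\kappa vv' + \kappa'v^2)\hat{\mathbf{N}} + \kappa\tau v^3\hat{\mathbf{B}}$.
Since the tangent, normal, and binormal vectors are unit and orthogonal to each other, $(\mathbf{r}'\times\mathbf{r}'')\cdot\mathbf{r}''' = \kappa v^3\,\hat{\mathbf{B}}\cdot\mathbf{r}''' = \kappa^2v^6\tau$. Therefore,
$$\tau = \frac{(\mathbf{r}'\times\mathbf{r}'')\cdot\mathbf{r}'''}{\kappa^2v^6},$$
and the conclusion of the theorem follows from Theorem 12.7, $\kappa = \|\mathbf{r}'\times\mathbf{r}''\|/v^3$. □

Remark. Relation (12.13) shows that $\hat{\mathbf{B}}$ is the unit vector in the direction of $\mathbf{r}'\times\mathbf{r}''$. This observation offers a more convenient way for calculating the unit binormal vector than its definition. The unit tangent, normal, and binormal vectors at a particular point $\mathbf{r}(t_0)$ of the curve $\mathbf{r}(t)$ are
$$\hat{\mathbf{T}}(t_0) = \frac{\mathbf{r}'(t_0)}{\|\mathbf{r}'(t_0)\|}, \qquad \hat{\mathbf{B}}(t_0) = \frac{\mathbf{r}'(t_0)\times\mathbf{r}''(t_0)}{\|\mathbf{r}'(t_0)\times\mathbf{r}''(t_0)\|}, \qquad \hat{\mathbf{N}}(t_0) = \hat{\mathbf{B}}(t_0)\times\hat{\mathbf{T}}(t_0).$$

EXAMPLE 12.21. Find the unit tangent, normal, and binormal vectors and the torsion of the curve $\mathbf{r}(t) = (\ln t, t, t^2/2)$ at the point $(0, 1, 1/2)$.
SOLUTION: The point in question corresponds to $t = 1$. Therefore,
$$\mathbf{r}'(1) = (t^{-1}, 1, t)\big|_{t=1} = (1, 1, 1) \quad\Longrightarrow\quad \|\mathbf{r}'(1)\| = \sqrt{3},$$
$$\mathbf{r}''(1) = (-t^{-2}, 0, 1)\big|_{t=1} = (-1, 0, 1),$$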
$$\hat{\mathbf{T}}(1) = \frac{1}{\sqrt{3}}(1, 1, 1), \qquad \mathbf{r}'(1)\times\mathbf{r}''(1) = (1, -2, 1), \qquad \hat{\mathbf{B}}(1) = \frac{1}{\sqrt{6}}(1, -2, 1),$$
$$\hat{\mathbf{N}}(1) = \frac{1}{\sqrt{6}}\,\frac{1}{\sqrt{3}}\,(1, -2, 1)\times(1, 1, 1) = \frac{1}{3\sqrt{2}}(-3, 0, 3) = \frac{1}{\sqrt{2}}(-1, 0, 1),$$
$$\tau(1) = \frac{(\mathbf{r}'(1)\times\mathbf{r}''(1))\cdot\mathbf{r}'''(1)}{\|\mathbf{r}'(1)\times\mathbf{r}''(1)\|^2} = \frac{2}{6} = \frac{1}{3},$$
where $\mathbf{r}'''(1) = (2t^{-3}, 0, 0)\big|_{t=1} = (2, 0, 0)$.

84.3. Approximations of a Smooth Space Curve. A smooth curve $C$ has a unit tangent vector at a point $P$. So a small part of the curve (a part of a small arc length $\Delta s$) containing $P$ can be approximated by a piece of the tangent line of the same length $\Delta s$. If the curve $C$ has a nonzero curvature at $P$, then a better approximation can be obtained by a part of the osculating circle of arc length $\Delta s$ (see Study Problem 12.14). If the curve $C$ has a nonzero torsion at $P$, an even more accurate approximation is provided by a curve through $P$ that has the same unit tangent vector at $P$, and constant curvature and torsion equal to the curvature and torsion of the curve $C$ at $P$. By Theorem 12.9, such a curve is unique. As shown in Study Problem 12.18, it is a helix whose radius and length of each turn are uniquely determined by the curvature and torsion. These three successively more accurate approximations do not refer to any particular coordinate system or any particular parameterization of $C$ as the approximation curves are fully determined as the point sets in space by the geometrical invariants of the curve $C$ at $P$: the unit tangent vector, curvature, and torsion.

An analogy can be made with the Taylor polynomial approximation of a function at a particular point. The tangent line is the analog of the first-order Taylor polynomial (a linear approximation), the osculating circle is the analog of the second-order Taylor polynomial (a quadratic approximation), and the helix is the analog of the third-order Taylor polynomial (a cubic approximation). Given $\hat{\mathbf{T}}$, $\hat{\mathbf{N}}$, and $\hat{\mathbf{B}}$ of the curve $C$ at $P$, the Frenet-Serret equations can be used to obtain unique higher-order approximations of $C$ near $P$ by approximating the curvature $\kappa(s)$ and the torsion $\tau(s)$ of $C$ near $P$.
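The idea of recovering a curve from its curvature and torsion can be illustrated numerically. The following is only a minimal sketch, not a method taken from the text: it assumes a classical fourth-order Runge-Kutta scheme, an initial frame aligned with the coordinate axes, and a starting point at the origin. The function name `frenet_serret_curve` and its parameters are hypothetical.

```python
import math

def frenet_serret_curve(kappa, tau, length, steps=2000):
    """Integrate r' = T, T' = kappa N, N' = -kappa T + tau B, B' = -tau N.

    kappa, tau: functions of the arc length s (assumed continuous).
    Assumed initial data: r(0) = 0, T(0) = e1, N(0) = e2, B(0) = e3.
    Returns the list of points r(s) on a uniform grid in s.
    """
    def deriv(s, y):
        # y packs r, T, N, B as twelve consecutive numbers
        T, N, B = y[3:6], y[6:9], y[9:12]
        k, t = kappa(s), tau(s)
        dr = list(T)
        dT = [k * n for n in N]
        dN = [-k * a + t * b for a, b in zip(T, B)]
        dB = [-t * n for n in N]
        return dr + dT + dN + dB

    def shifted(y, k, h):
        return [a + h * b for a, b in zip(y, k)]

    h = length / steps
    y = [0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 1.0]
    s = 0.0
    points = [tuple(y[0:3])]
    for _ in range(steps):
        # one classical Runge-Kutta step of size h
        k1 = deriv(s, y)
        k2 = deriv(s + h / 2, shifted(y, k1, h / 2))
        k3 = deriv(s + h / 2, shifted(y, k2, h / 2))
        k4 = deriv(s + h, shifted(y, k3, h))
        y = [a + h / 6 * (b + 2 * c + 2 * d + e)
             for a, b, c, d, e in zip(y, k1, k2, k3, k4)]
        s += h
        points.append(tuple(y[0:3]))
    return points

# Constant curvature 2 and zero torsion: by Example 12.20 the points should
# lie on a circle of radius 1/2 centered at r(0) + N(0)/2 = (0, 1/2, 0).
circle = frenet_serret_curve(lambda s: 2.0, lambda s: 0.0, math.pi)
```

With constant nonzero curvature and torsion the same routine should trace a helix, in agreement with Study Problem 12.18; passing linear functions for `kappa` and `tau` gives the kind of refined approximation described in this subsection.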
The helix approximation uses the constant approximations of the curvature and torsion by their values at $P$. If, for example, the curvature and torsion of $C$ are known at two points near $P$, then $\kappa(s)$ and $\tau(s)$ can be approximated by linear functions near $P$ that attain the two known values. The corresponding (unique) solution of the Frenet-Serret equations would generally provide a more accurate approximation than the helix approximation.

84.4. Study Problems.

Problem 12.16. Find the position vector $\mathbf{r}(t)$ of a particle as a function of time $t$ if the particle moves clockwise along a circular path of radius $R$ in the $xy$ plane through $\mathbf{r}(0) = (R, 0, 0)$ with a constant speed $v_0$.
SOLUTION: For a circle of radius $R$ in the $xy$ plane through the point $(R, 0, 0)$, $\mathbf{r}(t) = (R\cos\varphi, R\sin\varphi, 0)$, where $\varphi = \varphi(t)$ such that $\varphi(0) = 0$. Then the velocity is $\mathbf{v}(t) = \mathbf{r}'(t) = \varphi'(-R\sin\varphi, R\cos\varphi, 0)$. Hence, the condition $\|\mathbf{v}(t)\| = v_0$ yields $R|\varphi'(t)| = v_0$ or $\varphi(t) = \pm(v_0/R)t$ and
$$\mathbf{r}(t) = (R\cos(\omega t), \pm R\sin(\omega t), 0),$$
where $\omega = v_0/R$ is the angular velocity. The second component must be taken with the minus sign because the particle revolves clockwise (the second component should become negative immediately after $t = 0$). □

Problem 12.17. Let the particle position vector as a function of time $t$ be $\mathbf{r}(t) = (\ln t, t^2, 2t)$, $t > 0$. Find the speed, tangential and normal accelerations, the unit tangent, normal, and binormal vectors, and the torsion of the trajectory at the point $P_0(0, 1, 2)$.
SOLUTION: By Example 12.16, the velocity and acceleration vectors at $P_0$ are $\mathbf{v} = (1, 2, 2)$ and $\mathbf{a} = (-1, 2, 0)$. So the speed is $v = \|\mathbf{v}\| = 3$. The tangential acceleration is $a_T = \mathbf{v}\cdot\mathbf{a}/v = 1$. As $\mathbf{v}\times\mathbf{a} = 2(-2, -1, 2)$, the normal acceleration is $a_N = \|\mathbf{v}\times\mathbf{a}\|/v = 6/3 = 2$. The unit tangent vector is $\hat{\mathbf{T}} = \mathbf{v}/v = (1/3)(1, 2, 2)$, and the unit binormal vector is $\hat{\mathbf{B}} = \mathbf{v}\times\mathbf{a}/\|\mathbf{v}\times\mathbf{a}\| = (1/3)(-2, -1, 2)$ as the unit vector along $\mathbf{v}\times\mathbf{a}$.
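Therefore, the unit normal vector is $\hat{\mathbf{N}} = \hat{\mathbf{B}}\times\hat{\mathbf{T}} = (1/18)(\mathbf{v}\times\mathbf{a})\times\mathbf{v} = (1/3)(-2, 2, -1)$. To find the torsion at $P_0$, the third derivative at $t = 1$ has to be calculated, $\mathbf{r}'''(1) = (2/t^3, 0, 0)\big|_{t=1} = (2, 0, 0) = \mathbf{b}$. Therefore, $\tau(1) = (\mathbf{v}\times\mathbf{a})\cdot\mathbf{b}/\|\mathbf{v}\times\mathbf{a}\|^2 = -8/36 = -2/9$. □

Problem 12.18. (Curves with Constant Curvature and Torsion). Prove that all curves with a constant curvature $\kappa(s) = \kappa_0 \neq 0$ and a constant torsion $\tau(s) = \tau_0 \neq 0$ are helices by integrating the Frenet-Serret equations.
SOLUTION: It follows from (12.9) and (12.11) that the vector $\mathbf{w} = \tau\hat{\mathbf{T}} + \kappa\hat{\mathbf{B}}$ does not change along the curve, $\mathbf{w}'(s) = 0$. Indeed, because $\kappa'(s) = \tau'(s) = 0$, one has $\mathbf{w}' = \tau\hat{\mathbf{T}}' + \kappa\hat{\mathbf{B}}' = (\tau\kappa - \kappa\tau)\hat{\mathbf{N}} = 0$. By the Pythagorean theorem, $\|\mathbf{w}\| = (\kappa_0^2 + \tau_0^2)^{1/2}$. Consider two new unit vectors orthogonal to $\hat{\mathbf{N}}$:
$$\hat{\mathbf{w}} = \frac{\mathbf{w}}{\|\mathbf{w}\|} = \sin\alpha\,\hat{\mathbf{T}} + \cos\alpha\,\hat{\mathbf{B}}, \qquad \hat{\mathbf{u}} = \cos\alpha\,\hat{\mathbf{T}} - \sin\alpha\,\hat{\mathbf{B}},$$
where $\cos\alpha = \kappa_0/\omega$, $\sin\alpha = \tau_0/\omega$, and $\omega = (\kappa_0^2 + \tau_0^2)^{1/2}$. By construction, the vectors $\hat{\mathbf{u}}$, $\hat{\mathbf{w}}$, and $\hat{\mathbf{N}}$ are mutually orthogonal unit vectors, which is easy to verify by calculating the corresponding dot products, $\hat{\mathbf{u}}\cdot\hat{\mathbf{u}} = \hat{\mathbf{w}}\cdot\hat{\mathbf{w}} = 1$ and $\hat{\mathbf{u}}\cdot\hat{\mathbf{w}} = 0$. Also, $\hat{\mathbf{w}}\times\hat{\mathbf{u}} = \cos^2\alpha\,\hat{\mathbf{B}}\times\hat{\mathbf{T}} - \sin^2\alpha\,\hat{\mathbf{T}}\times\hat{\mathbf{B}} = (\cos^2\alpha + \sin^2\alpha)\hat{\mathbf{N}} = \hat{\mathbf{N}}$. By differentiating the vector $\hat{\mathbf{u}}$ and using the Frenet-Serret equations, $\hat{\mathbf{u}}' = \cos\alpha\,\hat{\mathbf{T}}' - \sin\alpha\,\hat{\mathbf{B}}' = (\kappa_0\cos\alpha + \tau_0\sin\alpha)\hat{\mathbf{N}} = \omega\hat{\mathbf{N}}$.

Since $\hat{\mathbf{w}}(s) = \hat{\mathbf{w}}(0)$ is a constant unit vector, it is convenient to seek a solution in an orthonormal basis such that $\hat{\mathbf{e}}_3 = \hat{\mathbf{w}}(0)$ and $\hat{\mathbf{e}}_1\times\hat{\mathbf{e}}_2 = \hat{\mathbf{e}}_3$. In this basis, $\hat{\mathbf{u}} = \cos\varphi\,\hat{\mathbf{e}}_1 + \sin\varphi\,\hat{\mathbf{e}}_2$, where $\varphi = \varphi(s)$, as a unit vector in the plane orthogonal to $\hat{\mathbf{e}}_3$. The orientation of the basis vectors in the plane orthogonal to $\hat{\mathbf{e}}_3$ is defined up to a general rotation about $\hat{\mathbf{e}}_3$. This freedom is used to set $\hat{\mathbf{e}}_1 = \hat{\mathbf{u}}(0)$, which implies that the function $\varphi(s)$ satisfies the condition $\varphi(0) = 0$. Then the unit normal vector in this basis is $\hat{\mathbf{N}} = \hat{\mathbf{w}}\times\hat{\mathbf{u}} = \cos\varphi\,\hat{\mathbf{e}}_3\times\hat{\mathbf{e}}_1 + \sin\varphi\,\hat{\mathbf{e}}_3\times\hat{\mathbf{e}}_2 = -\sin\varphi\,\hat{\mathbf{e}}_1 + \cos\varphi\,\hat{\mathbf{e}}_2$ and $\hat{\mathbf{u}}' = -\varphi'\sin\varphi\,\hat{\mathbf{e}}_1 + \varphi'\cos\varphi\,\hat{\mathbf{e}}_2 = \varphi'\hat{\mathbf{N}}$. Hence, $\varphi'(s) = \omega$ or $\varphi(s) = \omega s$ owing to the condition $\varphi(0) = 0$.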
Expressing the vector $\hat{\mathbf{T}}$ via $\hat{\mathbf{u}}$ and $\hat{\mathbf{w}}$, $\hat{\mathbf{T}} = \cos\alpha\,\hat{\mathbf{u}} + \sin\alpha\,\hat{\mathbf{w}}$, one infers (compare Example 12.20)
$$\mathbf{r}'(s) = \hat{\mathbf{T}}(s) = \frac{\kappa_0}{\omega}\cos(\omega s)\,\hat{\mathbf{e}}_1 + \frac{\kappa_0}{\omega}\sin(\omega s)\,\hat{\mathbf{e}}_2 + \frac{\tau_0}{\omega}\,\hat{\mathbf{e}}_3,$$
where $\mathbf{r}(s)$ is a natural parameterization of the curve. The integration of this equation gives
$$\mathbf{r}(s) = \mathbf{r}_0 + R\sin(\omega s)\,\hat{\mathbf{e}}_1 - R\cos(\omega s)\,\hat{\mathbf{e}}_2 + hs\,\hat{\mathbf{e}}_3, \qquad R = \frac{\kappa_0}{\omega^2}, \quad h = \frac{\tau_0}{\omega}.$$
This is a helix of radius $R$ whose axis goes through the point $\mathbf{r}_0$ parallel to $\hat{\mathbf{e}}_3$; the helix climbs along its axis by $2\pi h/\omega$ per each turn. □

Problem 12.19. (Motion in a Constant Magnetic Field, Revisited). The force acting on a charged particle moving in the magnetic field $\mathbf{B}$ is given by $\mathbf{F} = (e/c)\mathbf{v}\times\mathbf{B}$, where $e$ is the electric charge of the particle, $c$ is the speed of light, and $\mathbf{v}$ is its velocity. Show that the trajectory of the particle in a constant magnetic field is a helix whose axis is parallel to the magnetic field.
SOLUTION: In contrast to Study Problem 12.11, here the shape of the trajectory is to be obtained directly from Newton's second law with arbitrary initial conditions. Choose the coordinate system so that the magnetic field is parallel to the $z$ axis, $\mathbf{B} = B\hat{\mathbf{e}}_3$, where $B$ is the magnitude of the magnetic field. Newton's law of motion, $m\mathbf{a} = \mathbf{F}$, where $m$ is the mass of the particle, determines the acceleration, $\mathbf{a} = \mu\mathbf{v}\times\mathbf{B} = \mu B\,\mathbf{v}\times\hat{\mathbf{e}}_3$, where $\mu = e/(mc)$. First, note that $v_3' = a_3 = \hat{\mathbf{e}}_3\cdot\mathbf{a} = 0$. Hence, $v_3 = v_\parallel = \text{const}$. Second, by the geometrical property of the cross product the acceleration and velocity remain orthogonal during the motion, and therefore the tangential acceleration vanishes, $a_T = \mathbf{v}\cdot\mathbf{a}/v = 0$. Hence, the speed of the particle is a constant of motion, $v = v_0$ (because $v' = a_T = 0$). Put $\mathbf{v} = \mathbf{v}_\perp + v_\parallel\hat{\mathbf{e}}_3$, where $\mathbf{v}_\perp$ is the projection of $\mathbf{v}$ onto the $xy$ plane. Since $\|\mathbf{v}\| = v_0$, the magnitude of $\mathbf{v}_\perp$ is also constant, $\|\mathbf{v}_\perp\| = v_\perp = (v_0^2 - v_\parallel^2)^{1/2}$.
The velocity vector can therefore be written in the form $\mathbf{v} = (v_\perp\cos\varphi, v_\perp\sin\varphi, v_\parallel)$, where the function $\varphi = \varphi(t)$ is to be determined by the equations of motion:
$$\mathbf{a} = \mu B\,\mathbf{v}\times\hat{\mathbf{e}}_3 = \mu B(v_\perp\sin\varphi, -v_\perp\cos\varphi, 0), \qquad \mathbf{a} = \mathbf{v}' = \varphi'(-v_\perp\sin\varphi, v_\perp\cos\varphi, 0).$$
It follows from the comparison of these expressions that $\varphi'(t) = -\mu B$ or $\varphi(t) = -\mu Bt + \varphi_0 = -\omega t + \varphi_0$, where $\omega = eB/(mc)$ is the so-called cyclotron frequency and the integration constant $\varphi_0$ is determined by the initial velocity: $\mathbf{v}(0) = (v_\perp\cos\varphi_0, v_\perp\sin\varphi_0, v_\parallel)$, that is, $\tan\varphi_0 = v_2(0)/v_1(0)$. Integration of the equation $\mathbf{r}'(t) = \mathbf{v}(t) = (v_\perp\cos(\omega t - \varphi_0), -v_\perp\sin(\omega t - \varphi_0), v_\parallel)$ yields the trajectory of motion:
$$\mathbf{r}(t) = \mathbf{r}_0 + (R\sin(\omega t - \varphi_0), R\cos(\omega t - \varphi_0), v_\parallel t),$$
where $R = v_\perp/\omega$. This equation describes a helix of radius $R$ whose axis goes through $\mathbf{r}_0$ parallel to the $z$ axis. So a charged particle moves along a helix that winds about force lines of the magnetic field. The particle revolves in the plane perpendicular to the magnetic field with frequency $\omega = eB/(mc)$. In each turn, the particle moves along the magnetic field a distance $h = 2\pi v_\parallel/\omega$. In particular, if the initial velocity is orthogonal to the magnetic field (i.e., $v_\parallel = 0$), then the trajectory is a circle of radius $R$.

The Polar Lights. The Sun produces a stream of charged particles (the solar wind). The magnetic field of the Earth plays the role of a shield from the solar wind as it traps the particles, forcing them to travel along its force lines that are arcs connecting the magnetic poles of the Earth (which approximately coincide with the south and north poles). As a result, the solar wind particles can penetrate the lower atmosphere only near the magnetic poles of the Earth, causing a spectacular phenomenon, the polar lights, by colliding with molecules of the oxygen and nitrogen in the atmosphere. □

Problem 12.20.
Suppose that the force acting on a particle of mass $m$ is proportional to the position vector of the particle (such forces are called central). Prove that the angular momentum of the particle, $\mathbf{L} = m\mathbf{r}\times\mathbf{v}$, is a constant of motion (i.e., $d\mathbf{L}/dt = 0$).
SOLUTION: Since a central force $\mathbf{F}$ is parallel to the position vector $\mathbf{r}$, their cross product vanishes, $\mathbf{r}\times\mathbf{F} = 0$. By Newton's second law, $m\mathbf{a} = \mathbf{F}$ and hence $m\mathbf{r}\times\mathbf{a} = 0$. Therefore,
$$\frac{d\mathbf{L}}{dt} = m(\mathbf{r}\times\mathbf{v})' = m(\mathbf{r}'\times\mathbf{v} + \mathbf{r}\times\mathbf{v}') = m\mathbf{r}\times\mathbf{a} = 0,$$
where $\mathbf{r}' = \mathbf{v}$, $\mathbf{v}' = \mathbf{a}$, and $\mathbf{v}\times\mathbf{v} = 0$ have been used. □

Problem 12.21. (Kepler's Laws of Planetary Motion). Newton's law of gravity states that two masses $m$ and $M$ at a distance $r$ are attracted by a force of magnitude $GmM/r^2$, where $G$ is the universal constant (called Newton's constant). Prove Kepler's laws of planetary motion:
1. A planet revolves around the Sun in an elliptical orbit with the Sun at one focus.
2. The line joining the Sun to a planet sweeps out equal areas in equal times.
3. The square of the period of revolution of a planet is proportional to the cube of the length of the major axis of its orbit.
SOLUTION: Let the Sun be at the origin of a coordinate system and let $\mathbf{r}$ be the position vector of a planet. The mass of the Sun is much larger than the mass of a planet; therefore, a displacement of the Sun due to the gravitational pull from a planet can be neglected (e.g., the Sun is about 332,946 times heavier than the Earth). Let $\hat{\mathbf{r}} = \mathbf{r}/r$ be the unit vector parallel to $\mathbf{r}$. Then the gravitational force is
$$\mathbf{F} = -\frac{GMm}{r^2}\,\hat{\mathbf{r}} = -\frac{GMm}{r^3}\,\mathbf{r},$$
where $M$ is the mass of the Sun and $m$ is the mass of a planet. The minus sign is necessary because an attractive force must be opposite to the position vector. By Newton's second law, the trajectory of a planet satisfies the equation $m\mathbf{a} = \mathbf{F}$ and hence
$$\mathbf{a} = -\frac{GM}{r^3}\,\mathbf{r}.$$
The gravitational force is a central force, and, by Study Problem 12.20, the vector $\mathbf{r}\times\mathbf{v} = \mathbf{l}$ is a constant of motion. One has $\mathbf{v} = \mathbf{r}' = (r\hat{\mathbf{r}})' = r'\hat{\mathbf{r}} + r\hat{\mathbf{r}}'$.
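Using this identity, the constant of motion can also be written as
$$\mathbf{l} = \mathbf{r}\times\mathbf{v} = r\hat{\mathbf{r}}\times\mathbf{v} = r(r'\hat{\mathbf{r}}\times\hat{\mathbf{r}} + r\hat{\mathbf{r}}\times\hat{\mathbf{r}}') = r^2(\hat{\mathbf{r}}\times\hat{\mathbf{r}}').$$
Using the rule for the double cross product (see Study Problem 11.17), one infers that
$$\mathbf{a}\times\mathbf{l} = -\frac{GM}{r^2}\,\hat{\mathbf{r}}\times\mathbf{l} = -GM\,\hat{\mathbf{r}}\times(\hat{\mathbf{r}}\times\hat{\mathbf{r}}') = GM\hat{\mathbf{r}}',$$
where $\hat{\mathbf{r}}\cdot\hat{\mathbf{r}} = 1$ has been used. On the other hand, $(\mathbf{v}\times\mathbf{l})' = \mathbf{v}'\times\mathbf{l} + \mathbf{v}\times\mathbf{l}' = \mathbf{a}\times\mathbf{l}$ because $\mathbf{l}' = 0$. It follows from these two equations that
(12.15) $(\mathbf{v}\times\mathbf{l})' = GM\hat{\mathbf{r}}' \quad\Longrightarrow\quad \mathbf{v}\times\mathbf{l} = GM\hat{\mathbf{r}} + \mathbf{c}$,
where $\mathbf{c}$ is a constant vector. The motion is characterized by two constant vectors $\mathbf{l}$ and $\mathbf{c}$. It occurs in the plane through the origin that is orthogonal to the constant vector $\mathbf{l}$ because $\mathbf{l} = \mathbf{r}\times\mathbf{v}$ must be orthogonal to $\mathbf{r}$. It also follows from (12.15) and $\mathbf{l}\cdot\mathbf{r} = 0$ that the constant vectors $\mathbf{l}$ and $\mathbf{c}$ are orthogonal because $\mathbf{l}\cdot\mathbf{c} = 0$. It is therefore convenient to choose the coordinate system so that $\mathbf{l}$ is parallel to the $z$ axis and $\mathbf{c}$ to the $x$ axis as shown in Figure 12.12 (left panel). The vector $\mathbf{r}$ lies in the $xy$ plane. Let $\theta$ be the polar angle of $\mathbf{r}$ (i.e., $\mathbf{r}\cdot\mathbf{c} = rc\cos\theta$, where $c = \|\mathbf{c}\|$ is the length of $\mathbf{c}$). Then
$$\mathbf{r}\cdot(\mathbf{v}\times\mathbf{l}) = \mathbf{r}\cdot(GM\hat{\mathbf{r}} + \mathbf{c}) = GMr + rc\cos\theta.$$
On the other hand, using a cyclic permutation in the triple product,
$$\mathbf{r}\cdot(\mathbf{v}\times\mathbf{l}) = \mathbf{l}\cdot(\mathbf{r}\times\mathbf{v}) = \mathbf{l}\cdot\mathbf{l} = l^2,$$
where $l = \|\mathbf{l}\|$ is the length of $\mathbf{l}$. The comparison of the last two equations yields the equation for the trajectory:
$$l^2 = r(GM + c\cos\theta) \quad\Longrightarrow\quad r = \frac{ed}{1 + e\cos\theta},$$
where $d = l^2/c$ and $e = c/(GM)$. This is the polar equation of a conic section with focus at the origin and eccentricity $e$ (see Calculus II).

FIGURE 12.12. Left: The setup of the coordinate system for the derivation of Kepler's first law. Right: An illustration to the derivation of Kepler's second law.

Thus, all possible trajectories of any massive body in a solar system are conic sections! This is a quite remarkable result. Parabolas and hyperbolas do not correspond to a periodic motion. So a planet must follow an elliptic trajectory with the Sun at one focus.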
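The two constants of motion used in this derivation can be checked numerically. The sketch below is an illustration under stated assumptions, not part of the original solution: it works in the orbital plane, uses units in which $GM = 1$, and integrates the equation of motion with a classical fourth-order Runge-Kutta scheme. The function names and the initial state are hypothetical.

```python
import math

GM = 1.0  # assumption: units chosen so that G*M = 1

def rk4_step(state, dt):
    """Advance (x, y, vx, vy) by one Runge-Kutta step for a = -GM r / r^3."""
    def f(s):
        x, y, vx, vy = s
        r3 = (x * x + y * y) ** 1.5
        return (vx, vy, -GM * x / r3, -GM * y / r3)

    def shifted(s, k, h):
        return tuple(a + h * b for a, b in zip(s, k))

    k1 = f(state)
    k2 = f(shifted(state, k1, dt / 2))
    k3 = f(shifted(state, k2, dt / 2))
    k4 = f(shifted(state, k3, dt))
    return tuple(s + dt / 6 * (a + 2 * b + 2 * c + d)
                 for s, a, b, c, d in zip(state, k1, k2, k3, k4))

def invariants(state):
    """Return l (z-component of r x v) and c = v x l - GM r_hat."""
    x, y, vx, vy = state
    r = math.hypot(x, y)
    l = x * vy - y * vx
    cx = vy * l - GM * x / r
    cy = -vx * l - GM * y / r
    return l, cx, cy

def orbit_check(state, dt=1e-3, steps=15000):
    """Largest drift of l and c, and of l^2 = GM r + r.c, along the orbit."""
    l0, cx0, cy0 = invariants(state)
    worst = 0.0
    for _ in range(steps):
        state = rk4_step(state, dt)
        x, y = state[0], state[1]
        l, cx, cy = invariants(state)
        r = math.hypot(x, y)
        residual = abs(l0 * l0 - (GM * r + x * cx0 + y * cy0))
        worst = max(worst, abs(l - l0), abs(cx - cx0), abs(cy - cy0), residual)
    eccentricity = math.hypot(cx0, cy0) / GM  # e = c/(GM)
    return worst, eccentricity

# Hypothetical initial data: start at (1, 0) with velocity (0, 1.2).
worst, e = orbit_check((1.0, 0.0, 0.0, 1.2))
```

For these initial data the eccentricity $e = c/(GM)$ is less than 1, so the orbit is an ellipse, and the drift `worst` should stay at the level of the integration error.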
All objects coming to the solar system from outer space (i.e., those that are not confined by the gravitational pull of the Sun) should follow either parabolic or hyperbolic trajectories.

To prove Kepler's second law, put $\hat{\mathbf{r}} = (\cos\theta, \sin\theta, 0)$ and hence $\hat{\mathbf{r}}' = (-\theta'\sin\theta, \theta'\cos\theta, 0)$. Therefore,
$$\mathbf{l} = r^2(\hat{\mathbf{r}}\times\hat{\mathbf{r}}') = (0, 0, r^2\theta') \quad\Longrightarrow\quad l = r^2\theta'.$$
The area of a sector with angle $d\theta$ swept by $\mathbf{r}$ is $dA = \frac{1}{2}r^2\,d\theta$ (see Calculus II; the area bounded by a polar graph $r = r(\theta)$). Hence,
$$\frac{dA}{dt} = \frac{1}{2}r^2\frac{d\theta}{dt} = \frac{l}{2}.$$
For any moments of time $t_1$ and $t_2$, the area of the sector between $\mathbf{r}(t_1)$ and $\mathbf{r}(t_2)$ is
$$A_{12} = \int_{t_1}^{t_2}\frac{dA}{dt}\,dt = \int_{t_1}^{t_2}\frac{l}{2}\,dt = \frac{l}{2}(t_2 - t_1).$$
Thus, the position vector $\mathbf{r}$ sweeps out equal areas in equal times (see Figure 12.12, right panel).

Kepler's third law follows from the last equation. Indeed, the entire area of the ellipse $A$ is swept when $t_2 - t_1 = T$ is the period of the motion. If the major and minor axes of the ellipse are $2a$ and $2b$, respectively, $a > b$, then $A = \pi ab = lT/2$ and $T = 2\pi ab/l$. Now recall that $ed = b^2/a$ for an elliptic conic section (see Calculus II) or $b^2 = eda = l^2a/(GM)$. Hence,
$$T^2 = \frac{4\pi^2a^2b^2}{l^2} = \frac{4\pi^2}{GM}\,a^3.$$
Note that the proportionality constant $4\pi^2/(GM)$ is independent of the mass of a planet; therefore, Kepler's laws are universal for all massive objects trapped by the Sun (planets, asteroids, and comets). □

84.5. Exercises.
(1) For each of the following trajectories of a particle, find the velocity, speed, and normal and tangential accelerations as functions of time and their values at a specified point $P$:
(i) $\mathbf{r}(t) = (t, 1 - t, t^2 + 1)$, $P(1, 0, 2)$
(ii) $\mathbf{r}(t) = (t^2, t, 1)$, $P(4, 2, 1)$
(iii) r(t) = (4t3/2, -t2t) P(,- ,1
(iv) $\mathbf{r}(t) = (\ln t, \sqrt{t}, t^2)$, $P(0, 1, 1)$
(v) $\mathbf{r}(t) = (\cosh t, \sinh t, 2 + t)$, $P(1, 0, 2)$
(vi) $\mathbf{r}(t) = (e^t, 2\sqrt{2}\,t, e^{-t})$, $P(1, 0, 1)$
(vii) $\mathbf{r}(t) = (\sin t - t\cos t, t^2, \cos t + t\sin t)$, $P(0, 0, 1)$
(2) Find the normal and tangential accelerations of a particle with the position vector $\mathbf{r}(t) = (t^2 + 1, t, t^2 - 1)$ when the particle is closest to the origin.
(3) Find the tangential and normal accelerations of a particle with the position vector $\mathbf{r}(t) = (R\sin(\omega t + \varphi_0), -R\cos(\omega t + \varphi_0), v_0t)$, where $R$, $\omega$, $\varphi_0$, and $v_0$ are constants (see Study Problem 12.19).
(4) The shape of a winding road can be approximated by the graph $y = L\cos(x/L)$, where the coordinates are in meters and $L = 40$ m. The condition of the road is such that if the normal acceleration of a car on it exceeds 10 m/s², the car may skid off the road. Recommend a speed limit for this portion of the road.
(5) A particle moves along the curve $y = x^2 + x^3$. If the acceleration of the particle at the point $(1, 2)$ is $\mathbf{a} = (3, -1)$, find its normal and tangential accelerations.
(6) Suppose that a particle moves so that its tangential acceleration $a_T$ is constant, while the normal acceleration $a_N$ remains 0. What is the trajectory of the particle?
(7) Suppose that a particle moves in a plane so that its tangential acceleration $a_T$ remains 0, while the normal acceleration $a_N$ is constant. What is the trajectory of the particle? Hint: Investigate the curvature of the trajectory.
(8) A race car moves with a constant speed $v_0$ along an elliptic track $x^2/a^2 + y^2/b^2 = 1$, $a > b$. Find the maximal and minimal values of the magnitude of its acceleration and the points where they occur.
(9) Does there exist a curve with zero curvature and nonzero torsion? Explain the answer.
(10) For each of the following curves, find the unit tangent, normal, and binormal vectors and the torsion at a specified point $P$:
(i) $\mathbf{r}(t) = (t, 1 - t, t^2 + 1)$, $P(1, 0, 2)$
(ii) $\mathbf{r}(t) = (t^3, t^2, 1)$, $P(8, 4, 1)$
(iii) r(t) = (4t3/2, -t2, ) ( ,- ,1
(iv) $\mathbf{r}(t) = (\ln t, \sqrt{t}, t^2)$, $P(0, 1, 1)$
(v) $\mathbf{r}(t) = (\cosh t, \sinh t, 2 + t)$, $P(1, 0, 2)$
(11) Let $\mathbf{r}(t) = (\cos t + t\sin t, \sin t + t\cos t, t^2)$. Find the speed, the tangential and normal accelerations, the curvature and torsion, and the unit tangent, normal, and binormal vectors as functions of time $t$. Hint: To simplify calculations, find the decomposition $\mathbf{r}(t) = \mathbf{v}(t) - t\mathbf{w}(t) + t^2\hat{\mathbf{e}}_3$, where $\mathbf{v}$, $\mathbf{w}$, and $\hat{\mathbf{e}}_3$ are mutually orthogonal unit vectors such that $\mathbf{v}'(t) = \mathbf{w}(t)$, $\mathbf{w}'(t) = -\mathbf{v}(t)$. Use the properties of the cross products of mutually orthogonal unit vectors.
(12) Let $C$ be the curve of intersection of an ellipsoid $x^2/a^2 + y^2/b^2 + z^2/c^2 = 1$ with the plane $2x - 2y + z = 0$. Find the torsion and the binormal $\mathbf{B}$ along $C$.

CHAPTER 13
Differentiation of Multivariable Functions

85. Functions of Several Variables

The concept of a function of several variables can be qualitatively understood from simple examples in everyday life. The temperature in a room may vary from point to point. A point in space can be defined by an ordered triple of numbers that are coordinates of the point in some coordinate system, say, $(x, y, z)$. Measurements of the temperature at every point from a set $D$ in space assign a real number $T$ (the temperature) to every point of $D$. The dependence of $T$ on coordinates of the point is indicated by writing $T = T(x, y, z)$. Similarly, the concentration of a chemical can depend on a point in space. In addition, if the chemical reacts with other chemicals, its concentration at a point may also change with time. In this case, the concentration $C = C(x, y, z, t)$ depends on four variables, three spatial coordinates and the time $t$.
In general, if the value of a quantity $f$ depends on values of several other quantities, say, $x_1, x_2, ..., x_n$, this dependence is indicated by writing $f = f(x_1, x_2, ..., x_n)$. In other words, $f = f(x_1, x_2, ..., x_n)$ indicates a rule that assigns a number $f$ to each ordered $n$-tuple of real numbers $(x_1, x_2, ..., x_n)$. Each number in the $n$-tuple may be of a different nature and measured in different units. In the above example, the concentration depends on ordered quadruples $(x, y, z, t)$, where $x$, $y$, and $z$ are the coordinates of a point in space (measured in units of length) and $t$ is time (measured in units of time). All ordered $n$-tuples form an $n$-dimensional Euclidean space, much like all ordered doublets $(x, y)$ form a plane, and all ordered triples $(x, y, z)$ form a space.

85.1. Euclidean Spaces.

With every ordered pair of numbers $(x, y)$, one can associate a point in a plane and its position vector relative to a fixed point $(0, 0)$ (the origin), $\mathbf{r} = (x, y)$. With every ordered triple of numbers $(x, y, z)$, one can associate a point in space and its position vector (again relative to the origin $(0, 0, 0)$), $\mathbf{r} = (x, y, z)$. So the plane can be viewed as the set of all two-component vectors; similarly, space is the set of all three-component vectors. From this point of view, the plane and space have characteristic common features. First, their elements are vectors. Second, they are closed relative to addition of vectors and multiplication of vectors by a real number; that is, if $\mathbf{a}$ and $\mathbf{b}$ are elements of space or a plane and $c$ is a real number, then $\mathbf{a} + \mathbf{b}$ and $c\mathbf{a}$ are also elements of space (ordered triples of numbers) or a plane (ordered pairs of numbers). Third, the norm or length of a vector $\|\mathbf{r}\|$ vanishes if and only if the vector has zero components. Consequently, two elements of space or a plane coincide if and only if the norm of their difference vanishes, that is, $\mathbf{a} = \mathbf{b} \Leftrightarrow \|\mathbf{a} - \mathbf{b}\| = 0$.
Finally, the dot product $\mathbf{a}\cdot\mathbf{b}$ of two elements is defined in the same way for two- or three-component vectors (plane or space) so that $\|\mathbf{a}\|^2 = \mathbf{a}\cdot\mathbf{a}$. Since points and vectors are described by the same mathematical object, an ordered triple (or pair) of numbers, it is not necessary to make a distinction between them. So, in what follows, the same notation is used for a point and a vector, for example, $\mathbf{r} = (x, y, z)$. These observations can be extended to ordered $n$-tuples for any $n$ and lead to the notion of a Euclidean space.

DEFINITION 13.1. (Euclidean Space).
For each positive integer $n$, consider the set of all ordered $n$-tuples of real numbers. For any two elements $\mathbf{a} = (a_1, a_2, ..., a_n)$ and $\mathbf{b} = (b_1, b_2, ..., b_n)$ and a number $c$, put

$$\mathbf{a} + \mathbf{b} = (a_1 + b_1, a_2 + b_2, ..., a_n + b_n),$$
$$c\,\mathbf{a} = (ca_1, ca_2, ..., ca_n),$$
$$\mathbf{a}\cdot\mathbf{b} = a_1b_1 + a_2b_2 + \cdots + a_nb_n,$$
$$\|\mathbf{a}\| = \sqrt{\mathbf{a}\cdot\mathbf{a}} = \sqrt{a_1^2 + a_2^2 + \cdots + a_n^2}.$$

The set of all ordered $n$-tuples in which the addition, the multiplication by a number, the dot product, and the norm are defined by these rules is called an $n$-dimensional Euclidean space.

Two points of a Euclidean space are said to coincide, $\mathbf{a} = \mathbf{b}$, if the corresponding components are equal, that is, $a_i = b_i$ for $i = 1, 2, ..., n$. It follows that $\mathbf{a} = \mathbf{b}$ if and only if $\|\mathbf{a} - \mathbf{b}\| = 0$. Indeed, by the definition of the norm, $\|\mathbf{c}\| = 0$ if and only if $\mathbf{c} = (0, 0, ..., 0)$. Put $\mathbf{c} = \mathbf{a} - \mathbf{b}$. Then $\|\mathbf{a} - \mathbf{b}\| = 0$ if and only if $\mathbf{a} = \mathbf{b}$. The number $\|\mathbf{a} - \mathbf{b}\|$ is called the distance between points $\mathbf{a}$ and $\mathbf{b}$ of a Euclidean space.

The dot product in a Euclidean space has the same geometrical properties as in two and three dimensions. The Cauchy-Schwarz inequality can be extended to any Euclidean space (cf. Theorem 11.2).

THEOREM 13.1. (Cauchy-Schwarz Inequality).

$$|\mathbf{a}\cdot\mathbf{b}| \le \|\mathbf{a}\|\,\|\mathbf{b}\|$$

for any vectors $\mathbf{a}$ and $\mathbf{b}$ in a Euclidean space, and the equality is reached if and only if $\mathbf{a} = t\mathbf{b}$ for some number $t$.

PROOF. Put $a = \|\mathbf{a}\|$ and $b = \|\mathbf{b}\|$, that is, $a^2 = \mathbf{a}\cdot\mathbf{a}$ and similarly for $b$.
If $b = 0$, then $\mathbf{b} = \mathbf{0}$, and the conclusion of the theorem holds. For $b \ne 0$ and any real variable $t$, $\|\mathbf{a} - t\mathbf{b}\|^2 = (\mathbf{a} - t\mathbf{b})\cdot(\mathbf{a} - t\mathbf{b}) \ge 0$. Therefore, $a^2 - 2tc + t^2b^2 \ge 0$, where $c = \mathbf{a}\cdot\mathbf{b}$. Completing the squares on the left side of this inequality,

$$\left(bt - \frac{c}{b}\right)^2 - \frac{c^2}{b^2} + a^2 \ge 0,$$

shows that the left side attains its absolute minimum when the expression in the parentheses vanishes, that is, at $t = c/b^2$. Since the inequality is valid for any $t$, it is satisfied for $t = c/b^2$, that is, $a^2 - c^2/b^2 \ge 0$ or $c^2 \le a^2b^2$ or $|c| \le ab$, which is the conclusion of the theorem. The inequality becomes an equality if and only if $\|\mathbf{a} - t\mathbf{b}\|^2 = 0$ and hence if and only if $\mathbf{a} = t\mathbf{b}$. □

It follows from the Cauchy-Schwarz inequality that $\mathbf{a}\cdot\mathbf{b} = s\|\mathbf{a}\|\|\mathbf{b}\|$, where $s$ is a number such that $|s| \le 1$. So one can always put $s = \cos\theta$, where $\theta \in [0, \pi]$. If $\theta = 0$, then $\mathbf{a} = t\mathbf{b}$ for some positive $t > 0$ (i.e., the vectors are parallel), and $\mathbf{a} = t\mathbf{b}$, $t < 0$, when $\theta = \pi$ (i.e., the vectors are antiparallel). The dot product vanishes when $\theta = \pi/2$. This allows one to define $\theta$ as the angle between two vectors in any Euclidean space: $\cos\theta = \mathbf{a}\cdot\mathbf{b}/(\|\mathbf{a}\|\|\mathbf{b}\|)$ much like in two and three dimensions. Consequently, the triangle inequality (11.7) holds in a Euclidean space of any dimension.

85.2. Real-Valued Functions of Several Variables.

DEFINITION 13.2. (Real-Valued Function of Several Variables).
Let $D$ be a set of ordered $n$-tuples of real numbers $(x_1, x_2, ..., x_n)$. A function $f$ of $n$ variables is a rule that assigns to each $n$-tuple in the set $D$ a unique real number denoted by $f(x_1, x_2, ..., x_n)$. The set $D$ is the domain of $f$, and its range is the set of values that $f$ takes on it, that is, $\{f(x_1, x_2, ..., x_n) \mid (x_1, x_2, ..., x_n) \in D\}$.

This definition is illustrated in Figure 13.1. The rule may be defined by different means. If $D$ is a finite set, a function $f$ can be defined by a table $(P_i, f(P_i))$, where $P_i \in D$, $i = 1, 2, ..., N$, are elements (ordered $n$-tuples) of $D$, and $f(P_i)$ is the value of $f$ at $P_i$.
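A function defined by a table on a finite domain can be sketched in code as a lookup from ordered $n$-tuples to numbers. The points and values below are hypothetical, chosen only for illustration; the sketch shows how the domain and the range of Definition 13.2 are read off from the table.

```python
# Hypothetical table (P_i, f(P_i)) on a finite domain D of ordered pairs.
table = {
    (0, 0): 1.5,
    (0, 1): 2.0,
    (1, 0): 2.0,
    (1, 1): -0.5,
}

def f(point):
    """Value of the table-defined function; it is undefined outside D."""
    if point not in table:
        raise ValueError(f"{point} is not in the domain D")
    return table[point]

domain = set(table)                 # the set D of ordered pairs
value_range = set(table.values())   # the set of values f takes on D

print(f((0, 1)))
print(sorted(domain))
print(sorted(value_range))
```

Two different points may share a value, so the range can have fewer elements than the domain.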
A function $f$ can be defined geometrically. For example, the height of a mountain relative to sea level is a function of its position on the globe. So the height is a function of two variables, the longitude and latitude. A function can be defined by an algebraic rule that prescribes algebraic operations to be carried out with real numbers in any $n$-tuple to obtain the value of the function. For example, $f(x, y, z) = x^2 - y + z^3$. The value of this function at $(1, 2, 3)$ is $f(1, 2, 3) = 1^2 - 2 + 3^3 = 26$. Unless specified otherwise, the domain $D$ of a function defined by an algebraic rule is the set of $n$-tuples for which the rule makes sense.

FIGURE 13.1. Left: A function $f$ of two variables is a rule that assigns a number $f(P)$ to every point $P$ of a planar region $D$. The set $R$ of all numbers $f(P)$ is the range of $f$. The region $D$ is the domain of $f$. Right: A function $f$ of three variables is a rule that assigns a number $f(P)$ to every point $P$ of a solid region $D$.

EXAMPLE 13.1. Find the domain and the range of the function of two variables $f(x, y) = \ln(1 - x^2 - y^2)$.

SOLUTION: The logarithm is defined for any strictly positive number. Therefore, the doublet $(x, y)$ must be such that $1 - x^2 - y^2 > 0$ or $x^2 + y^2 < 1$. Hence, $D = \{(x, y) \mid x^2 + y^2 < 1\}$. Since any doublet $(x, y)$ can be uniquely associated with a point on a plane, the set $D$ can be given a geometrical description as a disk of radius 1 whose boundary, the circle $x^2 + y^2 = 1$, is not included in $D$. For any point in the interior of the disk, the argument of the logarithm lies in the interval $0 < 1 - x^2 - y^2 \le 1$. So the range of $f$ is the set of values of the logarithm in the interval $(0, 1]$, which is $-\infty < f \le 0$. □

EXAMPLE 13.2. Find the domain and the range of the function of three variables $f(x, y, z) = x^2\sqrt{z - x^2 - y^2}$.

SOLUTION: The square root is defined only for nonnegative numbers.
Therefore, ordered triples $(x, y, z)$ must be such that $z - x^2 - y^2 \ge 0$, that is, $D = \{(x, y, z) \mid z \ge x^2 + y^2\}$. This set can be given a geometrical description as a point set in space because any triple can be associated with a unique point in space. The equation $z = x^2 + y^2$ describes a circular paraboloid. So the domain is the spatial (solid) region containing points that lie on or above the paraboloid. The function is nonnegative. By fixing $x$ and $y$ and increasing $z$, one can see that the value of $f$ can be any positive number. So the range is $0 \le f(x, y, z) < \infty$. □

FIGURE 13.2. Left: The graph of a function of two variables is the surface defined by the equation $z = f(x, y)$. It is obtained from the domain $D$ of $f$ by moving each point $(x, y, 0)$ in $D$ along the $z$ axis to the point $(x, y, f(x, y))$. Right: The graph of the function studied in Example 13.3.

In general, the domain of a function of $n$ variables is viewed as a subset of an $n$-dimensional Euclidean space. It is also convenient to adopt the vector notation of the argument:

$$f(x_1, x_2, ..., x_n) = f(\mathbf{r}), \qquad \mathbf{r} = (x_1, x_2, ..., x_n).$$

For example, the domain of the function

$$f(\mathbf{r}) = (1 - x_1^2 - x_2^2 - \cdots - x_n^2)^{1/2} = (1 - \|\mathbf{r}\|^2)^{1/2}$$

is the set of points in the $n$-dimensional Euclidean space whose distance from the origin (the zero vector) does not exceed 1, $D = \{\mathbf{r} \mid \|\mathbf{r}\| \le 1\}$; that is, it is an $n$-dimensional ball of radius 1. So the domain of a multivariable function defined by an algebraic rule can be described by conditions on the components (coordinates) of the ordered $n$-tuple $\mathbf{r}$ under which the rule makes sense.

85.3. The Graph of a Function of Two Variables.

The graph of a function of one variable $f(x)$ is the set of points of a plane $\{(x, y) \mid y = f(x)\}$. The domain $D$ is a set of points on the $x$ axis. The graph is obtained by moving a point of the domain parallel to the $y$ axis by an amount determined by the value of the function $y = f(x)$.
The graph provides a useful picture of the behavior of the function. The idea can be extended to functions of two variables.

DEFINITION 13.3. (Graph of a Function of Two Variables).
The graph of a function $f(x, y)$ with domain $D$ is the point set in space

$$\{(x, y, z) \mid z = f(x, y),\ (x, y) \in D\}.$$

The domain $D$ is a set of points in the $xy$ plane. The graph is then obtained by moving each point of $D$ parallel to the $z$ axis by an amount equal to the corresponding value of the function $z = f(x, y)$. If $D$ is a portion of the plane, then the graph of $f$ is generally a surface (see Figure 13.2, left panel). One can think of the graph as "mountains" of height $f(x, y)$ on the $xy$ plane.

EXAMPLE 13.3. Sketch the graph of the function $f(x, y) = \sqrt{1 - (x/2)^2 - (y/3)^2}$.

SOLUTION: The domain is the portion of the $xy$ plane $(x/2)^2 + (y/3)^2 \le 1$; that is, it is bounded by the ellipse with semiaxes 2 and 3. The graph is the surface defined by the equation $z = \sqrt{1 - (x/2)^2 - (y/3)^2}$. By squaring both sides of this equation, one finds $(x/2)^2 + (y/3)^2 + z^2 = 1$, which defines an ellipsoid. The graph is its upper portion with $z \ge 0$ as depicted in the right panel of Figure 13.2. □

The concept of the graph is obviously hard to extend to functions of more than two variables. The graph of a function of three variables would be a three-dimensional surface in four-dimensional space. So the qualitative behavior of a function of three variables should be studied by different graphical means.

85.4. Level Sets.

When visualizing the shape of quadric surfaces, the method of cross sections by coordinate planes has been helpful. It can also be applied to visualize the shape of the graph $z = f(x, y)$. In particular, consider the cross sections of the graph with horizontal planes $z = k$. The curve of intersection is defined by the equation $f(x, y) = k$.
Continuing the analogy that $f(x, y)$ defines the height of a mountain, a hiker traveling along the path $f(x, y) = k$ does not have to climb or descend as the height along the path remains constant.

DEFINITION 13.4. (Level Sets).
The level sets of a function $f$ are subsets of the domain of $f$ on which the function has a fixed value; that is, they are determined by the equation $f(\mathbf{r}) = k$, where $k$ is a number from the range of $f$.

For functions of two variables, the equation $f(x, y) = k$ generally defines a curve, but not necessarily so. For example, if $f(x, y) = x^2 + y^2$, then the equation $x^2 + y^2 = k$ defines concentric circles of radii $\sqrt{k}$ for any $k > 0$. However, for $k = 0$, the level set consists of a single point $(x, y) = (0, 0)$. If $f$ is a constant function on $D$, then it has just one level set that coincides with the entire domain $D$. A level set is called a level curve if the equation $f(x, y) = k$ defines a curve. Recall that a curve in a plane can be described by parametric equations $x = x(t)$, $y = y(t)$, where $x(t)$ and $y(t)$ are continuous functions on an interval $a \le t \le b$. Therefore, the equation $f(x, y) = k$ defines a curve if there exist continuous functions $x(t)$ and $y(t)$ such that $f(x(t), y(t)) = k$ for all values of $t$ from an interval. In general, a level set of a function may contain curves, isolated points, and even portions of the domain with nonzero area.

FIGURE 13.3. Left: Cross sections of the graph $z = f(x, y)$ by horizontal planes $z = k_i$, $i = 1, 2, 3$, are level curves $f(x, y) = k_i$ of the function $f$. Right: The contour map of the function $f$ consists of level curves $f(x, y) = k_i$. The number $k_i$ indicates the value of $f$ along each level curve.

DEFINITION 13.5. (Contour Map).
A collection of level curves is called a contour map of the function $f$.

The concepts of level curves and a contour map of a function of two variables are illustrated in Figure 13.3. The contour map of the function in Example 13.3 consists of ellipses.
Indeed, the range is the interval $[0, 1]$. For any $0 \le k < 1$, a level curve is an ellipse, $1 - (x/2)^2 - (y/3)^2 = k^2$ or $(x/a)^2 + (y/b)^2 = 1$, where $a = 2\sqrt{1 - k^2}$ and $b = 3\sqrt{1 - k^2}$. The level set for $k = 1$ consists of a single point, the origin.

A contour map is a useful tool for studying the qualitative behavior of a function. Consider the contour map that consists of level curves $C_i$, $i = 1, 2, ...$, $f(x, y) = k_i$, where $k_{i+1} - k_i = \Delta k$ is fixed. The values of the function along the neighboring curves $C_i$ and $C_{i+1}$ differ by $\Delta k$. So, in the region where the level curves are dense (close to one another), the function $f(x, y)$ changes rapidly. Indeed, let $P$ be a point of $C_i$ and let $\Delta s$ be the distance from $P$ to $C_{i+1}$ along the normal to $C_i$. Then the slope of the graph of $f$ or the rate of change of $f$ at $P$ in that direction is $\Delta k/\Delta s$. Thus, the closer the curves $C_i$ are to one another, the faster the function changes. Contour maps are used in topography to indicate the steepness of mountains on maps.

EXAMPLE 13.4. Describe the level sets (a contour map) of the function $f(x, y) = (a^2 - (x^2 + y^2/4))^2$.

SOLUTION: The function depends on the combination $u^2 = x^2 + y^2/4$, $f(x, y) = (a^2 - u^2)^2$, and therefore the level sets $f(x, y) = k \ge 0$ are ellipses. The level set $k = 0$ is the ellipse $x^2/a^2 + y^2/(2a)^2 = 1$. The level sets $0 < k < a^4$ contain two ellipses because the equation $(a^2 - u^2)^2 = k$ has two solutions in this case: $u^2 = a^2 \pm \sqrt{k} > 0$. For $k = a^4$, the level set consists of the ellipse $u^2 = 2a^2$ and the point $(x, y) = (0, 0)$. The level sets for $k > a^4$ are ellipses $u^2 = \sqrt{k} + a^2$. So the contour map contains the ellipse $x^2/a^2 + y^2/(2a)^2 = 1$ along which the function attains its absolute minimum $f(x, y) = 0$. As the value of $f$ increases, this ellipse splits into smaller and larger ellipses. At $f(x, y) = a^4$ (a local maximum of $f$ attained at the origin), the smaller ellipses collapse to a point and disappear, while the larger ellipses keep expanding in size.
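The case analysis in Example 13.4 can be reproduced with a short computation. The following is a minimal sketch under stated assumptions: the function name `level_set` and the convention of reporting each ellipse $x^2 + y^2/4 = s$ by its semiaxes $(\sqrt{s}, 2\sqrt{s})$ are illustrative choices, not part of the text. A pair of zero semiaxes stands for the single point $(0, 0)$.

```python
import math

def level_set(k, a):
    """Describe the level set f(x, y) = k of f = (a**2 - (x**2 + y**2/4))**2.

    Returns a sorted list of (semiaxis_x, semiaxis_y) pairs, one for each
    admissible value s = u**2 = x**2 + y**2/4; the pair (0.0, 0.0)
    represents the single point (0, 0).
    """
    if k < 0:
        return []                       # f is never negative
    root = math.sqrt(k)
    # (a^2 - u^2)^2 = k gives u^2 = a^2 + sqrt(k) or u^2 = a^2 - sqrt(k).
    candidates = {a**2 + root, a**2 - root}
    return sorted((math.sqrt(s), 2.0 * math.sqrt(s)) for s in candidates if s >= 0)

a = 1.0
for k in (0.0, 0.25, 1.0, 4.0):
    print(k, level_set(k, a))
```

With $a = 1$, the four sample values of $k$ correspond to the cases $k = 0$, $0 < k < a^4$, $k = a^4$, and $k > a^4$ of the solution.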
The graph of $f$ looks like a Mexican hat. □

85.5. Level Surfaces.

In contrast to the graph, the method of level curves uses only the domain of a function of two variables to study its behavior. Therefore, the concept of level sets can be useful to study the qualitative behavior of functions of three variables. In general, the equation $f(x, y, z) = k$ defines a surface in space, but not necessarily so as in the case of functions of two variables. The level sets of the function $f(x, y, z) = x^2 + y^2 + z^2$ are concentric spheres $x^2 + y^2 + z^2 = k$ for $k > 0$, but the level set for $k = 0$ contains just one point, the origin.

Intuitively, a surface in space can be obtained by a continuous deformation (without breaking) of a part of a plane, just like a curve is obtained by a continuous deformation of a line segment. Let $S$ be a nonempty point set in space. A neighborhood of a point $P$ of $S$ is a collection of all points of $S$ whose distance from $P$ is less than a number $\delta > 0$. In particular, a neighborhood of a point in a plane is a disk centered at that point, and the boundary circle does not belong to the neighborhood. If every point of a subset $D$ of a plane has a neighborhood that is contained in $D$, then the set $D$ is called open. In other words, for every point $P$ of an open region $D$ in a plane, there is a disk of a sufficiently small radius that is centered at $P$ and contained in $D$. A point set $S$ is a surface in space if every point of $S$ has a neighborhood that can be obtained by a continuous deformation (or a deformation without breaking) of an open set in a plane and this deformation has a continuous inverse. This is analogous to the definition of a curve as a point set in space given in Section 79.3.

When the level sets of a function of three variables are surfaces, they are called level surfaces. The shape of the level surfaces may be studied, for example, by the method of cross sections with coordinate planes.
A collection of level surfaces $S_i$, $f(x, y, z) = k_i$, $k_{i+1} - k_i = \Delta k$, $i = 1, 2, ...$, can be depicted in the domain of $f$. If $P_0$ is a point on $S_i$ and $P$ is the point on $S_{i+1}$ that is the closest to $P_0$, then the ratio $\Delta k/|P_0P|$ determines the maximal rate of change of $f$ at $P$. So the closer the level surfaces $S_i$ are to one another, the faster the function changes (see the right panel of Figure 13.4).

EXAMPLE 13.5. Sketch and/or describe the level surfaces of the function $f(x, y, z) = z/(1 + x^2 + y^2)$.

SOLUTION: The domain is the entire space, and the range contains all real numbers. The equation $f(x, y, z) = k$ can be written in the form $z - k = k(x^2 + y^2)$, which defines a circular paraboloid whose symmetry axis is the $z$ axis and whose vertex is at $(0, 0, k)$. For larger $k$, the paraboloid rises faster. For $k = 0$, the level surface is the $xy$ plane. For $k > 0$, the level surfaces are paraboloids above the $xy$ plane; that is, they are concave upward (see the right panel of Figure 13.4). For $k < 0$, the paraboloids are below the $xy$ plane (i.e., they are concave downward). □

85.6. Exercises.

(1) Find and sketch the domain of each of the following functions:
(i) f (X, y) =cz/y
(ii) $f(x, y) = x/(x^2 + y^2)$
(iii) $f(x, y) = x/(y^2 - 4x^2)$
(iv) $f(x, y) = \ln(9 - x^2 - (y/2)^2)$
(v) $f(x, y) = \sqrt{1 - (x/2)^2 - (y/3)^2}$
(vi) $f(x, y) = 4 - x^2 - y^2 + 2x\ln y$
(vii) $f(x, y) = 4 - x^2 - y^2 + x\ln y^2$
(viii) f~v, y) = 4 -2 _-y2 +ln(1 -cx2 - (y/2)2)
(ix) $f(x, y, z) = x/(yz)$
(x) f (cc,y, z) =c/(cc- y2 - z2)
(xi) $f(x, y, z) = \ln(1 - z + x^2 + y^2)$
(xii) f(, y, z) ccf2y2 -z2 +ln(1cc 2_y2 -z2)

FIGURE 13.4. Left: A level surface of a function $f$ of three variables is a surface in the domain of $f$ on which the function attains a constant value $k$; that is, it is defined by the equation $f(x, y, z) = k$. Here three level surfaces are depicted. Right: Level surfaces of the function studied in Example 13.5. Here $k_2 > k_1 > 0$ and $k_4 < k_3 < 0$.
The level surface $f(x, y, z) = 0$ is the $xy$ plane, $z = 0$.

(xiii) $f(t, \mathbf{r}) = (t^2 - \|\mathbf{r}\|^2)^{-1}$, $\mathbf{r} = (x_1, x_2, ..., x_n)$
(2) For each of the following functions, sketch the graph and a contour map:
(i) $f(x, y) = x^2 + 4y^2$
(ii) $f(x, y) = xy$
(iii) $f(x, y) = x^2 - y^2$
(iv) f (x, y) = z2 -+9y2
(v) $f(x, y) = \sin x$
(3) Describe and sketch the level sets of each of the following functions:
(i) $f(x, y, z) = x + 2y + 3z$
(ii) $f(x, y, z) = x^2 + 4y^2 + 9z^2$
(iii) $f(x, y, z) = z + x^2 + y^2$
(iv) $f(x, y, z) = x^2 + y^2 - z^2$
(v) $f(x, y, z) = \ln(x^2 + y^2 - z^2)$
(vi) $f(x, y, z) = \ln(z^2 - x^2 - y^2)$
(4) Sketch the level sets of each of the following functions. Here $\min(a, b)$ and $\max(a, b)$ denote the smallest number and the largest number of $a$ and $b$, respectively, and $\min(a, a) = \max(a, a) = a$.
(i) $f(x, y) = x + y$
(ii) $f(x, y) = |x| + |y| - |x + y|$
(iii) $f(x, y) = \min(x, y)$
(iv) $f(x, y) = \max(|x|, |y|)$
(v) $f(x, y) = \mathrm{sign}(\sin(x)\sin(y))$; here $\mathrm{sign}(a)$ is the sign function, it has the values 1 and $-1$ for positive and negative $a$, respectively
(vi) $f(x, y, z) = (x + y)^2 + z^2$
(vii) f(x,y) =tan-1 (2j_ a2), a > 0
(5) Explain how the graph $z = g(x, y)$ can be obtained from the graph of $f(x, y)$ if
(i) $g(x, y) = k + f(x, y)$, where $k$ is a constant
(ii) $g(x, y) = mf(x, y)$, where $m$ is a nonzero constant
(iii) $g(x, y) = f(x - a, y - b)$, where $a$ and $b$ are constants
(iv) $g(x, y) = f(px, qy)$, where $p$ and $q$ are nonzero constants
(6) Given a function $f$, sketch the graphs of $g(x, y)$ defined in exercise 5. Analyze carefully various cases for values of the constants, for example, $m > 0$, $m < 0$, $p > 1$, 0

0.
(8) Find f(x, y) if f (x + y, y/x) =-X2 -2.
(9) Let z = y+ f( /z -1). Find the functions z and f if z = x when y=1.
(10) Graph the function $F(t) = f(\cos t, \sin t)$, where f(x, y) = 1 if y ;> c and f(x, y) = 0 if y < c. Give a geometrical interpretation of the graph of $F$ via the intersection of two surfaces.
(11) Let $f(u)$ be a continuous function for all real $u$. Investigate the relation between the shape of the graph of $f$ and the shape of the following surfaces:
(i) $z = f(y - ax)$
(ii) z = f ( z2 + y2)
(iii) z = f (- 2 +Xy21
(iv) z= f (z/y)

86. Limits and Continuity

The function $f(x) = \sin(x)/x$ is defined for all reals except $x = 0$. So the domain $D$ of the function contains points arbitrarily close to the point $x = 0$, and therefore the limit of $f(x)$ can be studied as $x \to 0$. It is known (see Calculus I) that $\sin(x)/x \to 1$ as $x \to 0$. A similar question can be asked for functions of several variables. For example, the domain of the function $f(x, y) = \sin(x^2 + y^2)/(x^2 + y^2)$ is the entire plane except the point $(x, y) = (0, 0)$. In contrast to the one-dimensional case, the point $(x, y)$ may approach $(0, 0)$ along various paths. So the very notion that $(x, y)$ approaches $(0, 0)$ needs to be accurately defined.

As noted before, the domain of a function $f$ of several variables is a set in an $n$-dimensional Euclidean space. Two points $\mathbf{x} = (x_1, x_2, ..., x_n)$ and $\mathbf{y} = (y_1, y_2, ..., y_n)$ coincide if and only if the distance between them

$$\|\mathbf{x} - \mathbf{y}\| = \sqrt{(x_1 - y_1)^2 + (x_2 - y_2)^2 + \cdots + (x_n - y_n)^2}$$

vanishes.

DEFINITION 13.6. A point $\mathbf{r}$ is said to approach a fixed point $\mathbf{r}_0$ in a Euclidean space if the distance $\|\mathbf{r} - \mathbf{r}_0\|$ tends to 0. The limit $\|\mathbf{r} - \mathbf{r}_0\| \to 0$ is also denoted by $\mathbf{r} \to \mathbf{r}_0$.

In the above example, the limit $(x, y) \to (0, 0)$ means that $\sqrt{x^2 + y^2} \to 0$ or $x^2 + y^2 \to 0$. Therefore,

$$\frac{\sin(x^2 + y^2)}{x^2 + y^2} = \frac{\sin u}{u} \to 1 \quad\text{as}\quad x^2 + y^2 = u \to 0.$$

Note that here the limit point $(0, 0)$ can be approached from any direction in the plane. This is not always so.
For example, the domain of the function $f(x, y) = \sin(xy)/(\sqrt{x} + \sqrt{y})$ is the first quadrant, including its boundaries except the point $(0, 0)$. The points $(0, 0)$ and $(-1, -1)$ are not in the domain of the function. However, the limit of $f$ as $(x, y) \to (0, 0)$ can be defined, whereas the limit of $f$ as $(x, y) \to (-1, -1)$ does not make any sense. The difference between these two points is that any neighborhood of $(0, 0)$ contains points of the domain, while this is not so for $(-1, -1)$. So the limit can be defined only for some special class of points called limit points of a set $D$.

DEFINITION 13.7. (Limit Point of a Set).
A point $\mathbf{r}_0$ is said to be a limit point of a set $D$ if any open ball $N_\delta(\mathbf{r}_0) = \{\mathbf{r} \mid 0 < \|\mathbf{r} - \mathbf{r}_0\| < \delta\}$ (with the center $\mathbf{r}_0$ removed) contains a point of $D$.

A limit point $\mathbf{r}_0$ of $D$ may or may not be in $D$, but it can always be approached from within the set $D$ in the sense that $\mathbf{r} \to \mathbf{r}_0$ and $\mathbf{r} \in D$ because, no matter how small $\delta$ is, one can always find a point $\mathbf{r} \in D$ that does not coincide with $\mathbf{r}_0$ and whose distance from $\mathbf{r}_0$ is less than $\delta$. In other words, an intersection of any ball $N_\delta(\mathbf{r}_0)$ centered at a limit point of $D$ with the set $D$, denoted as $N_\delta(\mathbf{r}_0) \cap D$, is always nonempty. In the above example of $D$ being the first quadrant, the limit $(x, y) \to (0, 0)$ is understood as $x^2 + y^2 \to 0$ while $(x, y) \ne (0, 0)$ and $x \ge 0$, $y \ge 0$. The intersection $N_\delta \cap D$ is the part of the disk $0 < x^2 + y^2 < \delta^2$ that lies in the first quadrant.

86.1. Limits of Functions of Several Variables.

DEFINITION 13.8. (Limit of a Function of Several Variables).
Let $f$ be a function of several variables whose domain is a set $D$ in a Euclidean space. Let $\mathbf{r}_0$ be a limit point of $D$. Then the limit of $f(\mathbf{r})$ as $\mathbf{r} \to \mathbf{r}_0$ is said to be a number $f_0$ if, for every number $\varepsilon > 0$, there exists a corresponding number $\delta > 0$ such that if $\mathbf{r} \in D$ and $0 < \|\mathbf{r} - \mathbf{r}_0\| < \delta$, then $|f(\mathbf{r}) - f_0| < \varepsilon$. In this case, one writes

$$\lim_{\mathbf{r}\to\mathbf{r}_0} f(\mathbf{r}) = f_0.$$

The number $|f(\mathbf{r}) - f_0|$ determines the deviation of the value of $f$ from the number $f_0$. The existence of the limit means that no matter how small the number $\varepsilon$ is, there is a neighborhood $N_\delta(\mathbf{r}_0) \cap D$, which contains all points of $D$ whose distance from $\mathbf{r}_0$ is less than a number $\delta$ and in which the values of the function $f$ deviate from the limit value $f_0$ no more than $\varepsilon$, that is, $f_0 - \varepsilon < f(\mathbf{r})$ 0. To establish the existence of $\delta > 0$, note that the inequality $8R^3 < \varepsilon$ or $R < \sqrt[3]{\varepsilon}/2$ guarantees that $|f(\mathbf{r}) - f_0| < \varepsilon$. Therefore, for all points $\mathbf{r} \ne \mathbf{0}$ in the domain of the function for which $R < \delta = \sqrt[3]{\varepsilon}/2$, the function differs from 0 no more than $\varepsilon$. For example, put $\varepsilon = 10^{-6}$. Then, in the interior of a ball of radius $\delta = 0.005$, the values of the function can deviate from $f_0 = 0$ no more than $10^{-6}$. □

FIGURE 13.5. Left: An illustration of Definition 13.8 in the case of a function $f$ of two variables. Given a positive number $\varepsilon$, consider two horizontal planes $z = f_0 - \varepsilon$ and $z = f_0 + \varepsilon$. Then one can always find a corresponding number $\delta > 0$ and the disk $N_\delta$ centered at $\mathbf{r}_0$ such that the portion of the graph $z = f(\mathbf{r})$ above the intersection $N_\delta \cap D$ lies between the planes: $f_0 - \varepsilon < f(\mathbf{r}) < f_0 + \varepsilon$. The radius $\delta > 0$ of $N_\delta$ depends on the choice of $\varepsilon$ and, generally, on the limit point $\mathbf{r}_0$. Right: The independence of the limit of a path along which the limit point $\mathbf{r}_0$ is approached. For every path leading to $\mathbf{r}_0$, there is a part of it that lies in $N_\delta$. The values of $f$ along this part of the path deviate from $f_0$ no more than any preassigned number $\varepsilon > 0$.

The radius $\delta$ of a neighborhood in which a function $f$ deviates no more than $\varepsilon$ from the value of the limit depends on $\varepsilon$ and, in general, on the limit point $\mathbf{r}_0$.

EXAMPLE 13.7. Let $f(x, y) = xy$. Show that

$$\lim_{(x,y)\to(x_0,y_0)} f(x, y) = x_0y_0$$

for any point $(x_0, y_0)$.

SOLUTION: The distance between $\mathbf{r} = (x, y)$ and $\mathbf{r}_0 = (x_0, y_0)$ is $R = \sqrt{(x - x_0)^2 + (y - y_0)^2}$. Therefore, $|x - x_0| \le R$ and $|y - y_0| \le R$.
Consider the identity

$$xy - x_0y_0 = (x - x_0)(y - y_0) + x_0(y - y_0) + (x - x_0)y_0.$$

Put $a = (|x_0| + |y_0|)/2$. Then the deviation of $f$ from the limit value $f_0 = x_0y_0$ is bounded as

$$|f(x, y) - f_0| \le |x - x_0||y - y_0| + |x_0||y - y_0| + |x - x_0||y_0| \le R^2 + 2aR = (R + a)^2 - a^2.$$

Fix $\varepsilon > 0$ and assume that $R$ is such that $(R + a)^2 - a^2 < \varepsilon$ or $0 < R < \sqrt{\varepsilon + a^2} - a$. Therefore, the function $f$ deviates from $f_0$ no more than $\varepsilon$ in a neighborhood of $\mathbf{r}_0$ of radius $\delta = \sqrt{\varepsilon + a^2} - a$, which depends on $\varepsilon$ and the limit point $\mathbf{r}_0$. □

Remark. The definition of the limit guarantees that if the limit exists, then it does not depend on a path along which the limit point may be approached. Indeed, take any path that ends at the limit point $\mathbf{r}_0$ and fix $\varepsilon > 0$. Then, by the existence of the limit $f_0$, there is a ball of radius $\delta = \delta(\varepsilon, \mathbf{r}_0) > 0$ centered at $\mathbf{r}_0$ such that the values of $f$ lie in the interval $f_0 - \varepsilon < f(\mathbf{r}) < f_0 + \varepsilon$ for all points $\mathbf{r}$ in the ball and hence for all points of the portion of the path in the ball (see Figure 13.5, right panel). Since $\varepsilon$ can be chosen arbitrarily small, the limit along any path leading to $\mathbf{r}_0$ must be $f_0$. This is to be compared with the one-dimensional analog: if the limit of $f(x)$ exists as $x \to x_0$, then the right $x \to x_0^+$ and left $x \to x_0^-$ limits exist and are equal (and vice versa).

86.2. Properties of the Limit.

The basic properties of limits of functions of one variable discussed in Calculus I are extended to the case of functions of several variables.

THEOREM 13.2. (Properties of the Limit).
Let $f$ and $g$ be functions of several variables that have a common domain. Let $c$ be a number. Suppose that $\lim_{\mathbf{r}\to\mathbf{r}_0} f(\mathbf{r}) = f_0$ and $\lim_{\mathbf{r}\to\mathbf{r}_0} g(\mathbf{r}) = g_0$. Then the following properties hold:

$$\lim_{\mathbf{r}\to\mathbf{r}_0} (cf(\mathbf{r})) = c\lim_{\mathbf{r}\to\mathbf{r}_0} f(\mathbf{r}) = cf_0,$$
$$\lim_{\mathbf{r}\to\mathbf{r}_0} (g(\mathbf{r}) + f(\mathbf{r})) = \lim_{\mathbf{r}\to\mathbf{r}_0} g(\mathbf{r}) + \lim_{\mathbf{r}\to\mathbf{r}_0} f(\mathbf{r}) = g_0 + f_0,$$
$$\lim_{\mathbf{r}\to\mathbf{r}_0} (g(\mathbf{r})f(\mathbf{r})) = \lim_{\mathbf{r}\to\mathbf{r}_0} g(\mathbf{r})\,\lim_{\mathbf{r}\to\mathbf{r}_0} f(\mathbf{r}) = g_0f_0,$$
$$\lim_{\mathbf{r}\to\mathbf{r}_0} \frac{g(\mathbf{r})}{f(\mathbf{r})} = \frac{\lim_{\mathbf{r}\to\mathbf{r}_0} g(\mathbf{r})}{\lim_{\mathbf{r}\to\mathbf{r}_0} f(\mathbf{r})} = \frac{g_0}{f_0} \quad\text{if } f_0 \ne 0.$$

The proof of these properties follows the same line of reasoning as in the case of functions of one variable and is left to the reader as an exercise.

Squeeze Principle. The solution to Example 13.6 employs a rather general strategy to verify whether a particular number $f_0$ is the limit of $f(\mathbf{r})$ as $\mathbf{r} \to \mathbf{r}_0$.

THEOREM 13.3. (Squeeze Principle).
Let the functions of several variables $g$, $f$, and $h$ have a common domain $D$ and let $g(\mathbf{r}) \le f(\mathbf{r}) \le h(\mathbf{r})$ for any $\mathbf{r} \in D$. If the limits of $g(\mathbf{r})$ and $h(\mathbf{r})$ as $\mathbf{r} \to \mathbf{r}_0$ exist and equal a number $f_0$, then the limit of $f(\mathbf{r})$ as $\mathbf{r} \to \mathbf{r}_0$ exists and equals $f_0$, that is,

$$g(\mathbf{r}) \le f(\mathbf{r}) \le h(\mathbf{r}) \ \text{ and } \ \lim_{\mathbf{r}\to\mathbf{r}_0} g(\mathbf{r}) = \lim_{\mathbf{r}\to\mathbf{r}_0} h(\mathbf{r}) = f_0 \quad\Longrightarrow\quad \lim_{\mathbf{r}\to\mathbf{r}_0} f(\mathbf{r}) = f_0.$$

PROOF. From the hypothesis of the theorem, it follows that $0 \le f(\mathbf{r}) - g(\mathbf{r}) \le h(\mathbf{r}) - g(\mathbf{r})$. Put $F(\mathbf{r}) = f(\mathbf{r}) - g(\mathbf{r})$ and $H(\mathbf{r}) = h(\mathbf{r}) - g(\mathbf{r})$. Then $0 \le F(\mathbf{r}) \le H(\mathbf{r})$ implies $|F(\mathbf{r})| \le |H(\mathbf{r})|$ (the positivity of $F$ is essential for this conclusion). By the hypothesis of the theorem and the basic properties of the limit, $H(\mathbf{r}) = h(\mathbf{r}) - g(\mathbf{r}) \to f_0 - f_0 = 0$ as $\mathbf{r} \to \mathbf{r}_0$. Hence, for any $\varepsilon > 0$, there is a corresponding number $\delta$ such that $0 \le |F(\mathbf{r})| \le |H(\mathbf{r})|$ 0, there is an interval 0 < R < 8( ) in which h(R) 0, the corresponding number o is o = V//2.

86.3. Continuity of Functions of Several Variables.

Suppose that $\lim_{\mathbf{r}\to\mathbf{r}_0} f(\mathbf{r}) = f_0$. If the limit point $\mathbf{r}_0$ lies in the domain of the function $f$, then the function has a value $f(\mathbf{r}_0)$, which may or may not coincide with the limit value $f_0$. In fact, the limit value $f_0$ does not generally give any information about the possible value of the function at the limit point. For example, if $f(\mathbf{r}) = 1$ everywhere except one point $\mathbf{r}_0$ at which $f(\mathbf{r}_0) = c$, then, in every neighborhood $0 < \|\mathbf{r} - \mathbf{r}_0\| < \delta$, $f(\mathbf{r}) = 1$ and hence the limit of $f$ as $\mathbf{r} \to \mathbf{r}_0$ exists and equals $f_0 = 1$. When $c \ne 1$, the limit value does not coincide with the value of the function at the limit point.
The values of $f$ suffer a jump discontinuity when $\mathbf{r}$ reaches $\mathbf{r}_0$, and one says that $f$ is discontinuous at $\mathbf{r}_0$. A discontinuity also occurs when the limit of $f$ as $\mathbf{r} \to \mathbf{r}_0$ does not exist while $f$ has a value at the limit point.

DEFINITION 13.9. (Continuity).
A function $f$ of several variables with domain $D$ is said to be continuous at a point $\mathbf{r}_0 \in D$ if
$$\lim_{\mathbf{r}\to\mathbf{r}_0} f(\mathbf{r}) = f(\mathbf{r}_0) .$$
The function $f$ is said to be continuous on $D$ if it is continuous at every point of $D$.

EXAMPLE 13.9. Let $f(x, y) = 1$ if $y \ge x$ and let $f(x, y) = 0$ if $y < x$. Determine the region on which $f$ is continuous.

SOLUTION: The function is continuous at every point $(x_0, y_0)$ if $y_0 \neq x_0$. Indeed, if $y_0 > x_0$, then $f(x_0, y_0) = 1$. On the other hand, for every such point one can find a neighborhood $(x - x_0)^2 + (y - y_0)^2 < \delta^2$ (a disk of radius $\delta > 0$ centered at $(x_0, y_0)$) that lies in the region $y > x$. Therefore, $|f(\mathbf{r}) - f(\mathbf{r}_0)| = 1 - 1 = 0 < \varepsilon$ for any $\varepsilon > 0$ in this disk, that is, $\lim_{\mathbf{r}\to\mathbf{r}_0} f(\mathbf{r}) = f(\mathbf{r}_0) = 1$. The same line of reasoning applies to establish the continuity of $f$ at any point $(x_0, y_0)$, where $y_0 < x_0$. [...] ($y \ge x$), $f(\mathbf{r}) = 1$, whereas in the other part ($y < x$), $f(\mathbf{r}) = 0$. So, for $0 < \varepsilon < 1$, there is no disk of radius $\delta > 0$ in which $|f(\mathbf{r}) - f(\mathbf{r}_0)| = |f(\mathbf{r}) - 1|$ [...] $> 0$ vanishes at the origin $\mathbf{r}_0 = \mathbf{0}$, $f(\mathbf{r}_0) = 0$. Put $R = \sqrt{x_1^2 + x_2^2 + \cdots + x_n^2}$. Then $|x_i| \le R$ for any element of the $n$-tuple. Hence,
$$|f(\mathbf{r}) - f(\mathbf{r}_0)| = |x_1|^{k_1} |x_2|^{k_2} \cdots |x_n|^{k_n} \le R^{k_1 + k_2 + \cdots + k_n} = R^N \to 0$$
as $R \to 0$. By the squeeze principle, $f(\mathbf{r}) \to 0 = f(\mathbf{r}_0)$. The rational function $f(\mathbf{r})/g(\mathbf{r})$ is continuous as the ratio of two continuous functions if $g(\mathbf{r}) \neq 0$. □

THEOREM 13.6. (Continuity of a Composition).
Let $g(u)$ be continuous on the interval $u \in [a, b]$ and let $h$ be a function of several variables that is continuous on $D$ and has the range $[a, b]$. The composition $f(\mathbf{r}) = g(h(\mathbf{r}))$ is continuous on $D$.

The proof follows the same line of reasoning as in the case of the composition of two functions of one variable in Calculus I and is left to the reader as an exercise.
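The $\varepsilon$-$\delta$ reading of Definition 13.9 can be probed numerically. The following is a minimal sketch, not part of the text's argument: the helper `max_deviation` and the sampling grid are illustrative assumptions, and sampling can only suggest, never prove, continuity. It reports the largest sampled value of $|f(\mathbf{r}) - f(\mathbf{r}_0)|$ in a punctured disk of radius $\delta$ for the step function of Example 13.9 and for the composition $\sin(xy)$, which is continuous by Theorem 13.6.

```python
import math

def step(x, y):
    """Step function of Example 13.9: 1 where y >= x, 0 where y < x."""
    return 1.0 if y >= x else 0.0

def comp(x, y):
    """Composition g(h(x, y)) with g(u) = sin(u) and the polynomial h(x, y) = x*y."""
    return math.sin(x * y)

def max_deviation(f, x0, y0, delta, n_angles=360, n_radii=20):
    """Largest sampled |f(r) - f(r0)| over the punctured disk 0 < |r - r0| < delta.

    The polar sampling grid is a hypothetical choice made for illustration only.
    """
    f0 = f(x0, y0)
    worst = 0.0
    for i in range(n_angles):
        theta = 2.0 * math.pi * i / n_angles
        for j in range(1, n_radii + 1):
            rho = delta * j / (n_radii + 1)  # strictly between 0 and delta
            x = x0 + rho * math.cos(theta)
            y = y0 + rho * math.sin(theta)
            worst = max(worst, abs(f(x, y) - f0))
    return worst

# Off the line y = x the deviation is zero in a small enough disk.
print(max_deviation(step, 0.0, 1.0, 0.5))
# On the line y = x the deviation stays equal to 1 however small the disk is.
print([max_deviation(step, 1.0, 1.0, d) for d in (1e-1, 1e-3, 1e-6)])
# For the continuous composition the deviation shrinks together with delta.
print([max_deviation(comp, 1.0, 1.0, d) for d in (1e-1, 1e-2, 1e-3)])
```

In this sketch the deviation of the step function on the line $y = x$ does not decrease with $\delta$, which matches the conclusion that no disk works for $0 < \varepsilon < 1$, whereas the deviation of $\sin(xy)$ decreases with the radius.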
The basic functions studied in Calculus I, $\sin u$, $\cos u$, $e^u$, $\ln u$, and so on, are continuous on their domains. Hence, if $f(\mathbf{r})$ is a continuous function of several variables, the elementary functions whose argument is replaced by $f(\mathbf{r})$ are continuous functions. In combination with the properties of continuous functions, the composition rule defines a large class of continuous functions of several variables, which is sufficient for many practical applications.

EXAMPLE 13.10. Find the limit
$$\lim_{\mathbf{r}\to\mathbf{0}} \frac{e^{xz}\cos(xy + z^2)}{x + yz + 3xz^4 + (xyz - 2)^2} .$$

SOLUTION: The function is a ratio. The denominator is a polynomial and hence continuous. Its limit value is $(-2)^2 = 4 \neq 0$. The function $e^{xz}$ is a composition of the exponential $e^u$ and the polynomial $u = xz$. So it is continuous. Its value is 1 at the limit point. Similarly, $\cos(xy + z^2)$ is continuous as a composition of $\cos u$ and the polynomial $u = xy + z^2$. Its value is 1 at the limit point. The ratio of continuous functions is continuous and the limit is $1/4$. □

86.4. Exercises.

(1) Use the definition of the limit to verify each of the following limits (i.e., given $\varepsilon > 0$, find the corresponding $\delta(\varepsilon)$):
(i) $\lim_{\mathbf{r}\to\mathbf{0}} (x^3 - 4y^2x + 5y^3) = 0$
(ii) $\lim_{\mathbf{r}\to\mathbf{0}} \dfrac{x^3 - 4y^2x + 5y^3}{3x^2 + 4y^2} = 0$
(iii) $\lim_{\mathbf{r}\to\mathbf{0}} \dfrac{x^3 - 4y^4 + 5y^3x^2}{3x^2 + 4y^2} = 0$
(iv) $\lim_{\mathbf{r}\to\mathbf{0}} \dfrac{x^3 - 4y^2x + 5y^3}{3x^2 + 4y^2 + y^4} = 0$
(v) $\lim_{\mathbf{r}\to\mathbf{0}} \dfrac{3x^3 + 4y^4 - 5z^5}{x^2 + y^2 + z^2} = 0$

(2) Use the squeeze principle to prove the following limits and find a neighborhood of the limit point in which the deviation of the function from the limit value does not exceed a small given number $\varepsilon$:
(i) $\lim_{\mathbf{r}\to\mathbf{0}} y \sin(x/\sqrt{y}) = 0$
(ii) $\lim_{\mathbf{r}\to\mathbf{0}} [\ldots]\,[1 - \cos(y/x)] = 0$
(iii) $\lim_{\mathbf{r}\to\mathbf{0}} \cos(xy)\sin(4xy)/[\ldots] = 0$
Hint: $|\sin u| \le |u|$.

(3) Suppose that $\lim_{\mathbf{r}\to\mathbf{r}_0} f(\mathbf{r}) = 2$ and $\mathbf{r}_0$ is in the domain of $f$. If nothing else is known about the function, what can be said about the value $f(\mathbf{r}_0)$? If, in addition, $f$ is known to be continuous at $\mathbf{r}_0$, what can be said about the value $f(\mathbf{r}_0)$?
(4) Find the points of discontinuity of each of the following functions:
(i) $f(x, y) = yx/(x^2 + y^2)$ if $(x, y) \neq (0, 0)$ and $f(0, 0) = 1$
(ii) $f(x, y, z) = yxz/(x^2 + y^2 + z^2)$ if $(x, y, z) \neq (0, 0, 0)$ and $f(0, 0, 0) = 0$
(iii) $f(x, y) = \sin([\ldots])$
(iv) $f(x, y) = \cos([\ldots])/(x^2y^2 + 1)$
(v) $f(x, y) = (x^2 + y^2)\ln(x^2 + y^2)$ if $(x, y) \neq (0, 0)$ and $f(0, 0) = 0$
(vi) $f(x, y) = 1$ if either $x$ or $y$ is rational and $f(x, y) = 0$ elsewhere
(vii) $f(x, y) = (x^2 - y^2)/(x - y)$ if $x \neq y$ and $f(x, x) = 2x$
(viii) $f(x, y) = (x^2 - y^2)/(x - y)$ if $x \neq y$ and $f(x, x) = x$
(ix) $f(x, y, z) = 1/[\sin(x)\sin(z - y)]$
(x) $f(x, y) = \sin([\ldots])$

(5) Each of the following functions has the value at the origin $f(0, 0) = c$. Determine whether there is a particular value of $c$ at which the function is continuous at the origin if, for $(x, y) \neq (0, 0)$,
(i) $f(x, y) = \sin\big(1/(x^2 + y^2)\big)$
(ii) $f(x, y) = (x^2 + y^2)^{\nu} \sin\big(1/(x^2 + y^2)\big)$, $\nu > 0$
(iii) $f(x, y) = x^n y^m \sin\big(1/(x^2 + y^2)\big)$, $n \ge 0$, $m \ge 0$, and $n + m > 0$

(6) Use the properties of continuous functions to find the following limits:
(i) $\lim_{\mathbf{r}\to\mathbf{0}} \dfrac{(1 + x + yz^2)^{1/3}}{2 + 3x - 4y + 5z^2}$
(ii) $\lim_{\mathbf{r}\to\mathbf{0}} \sin([\ldots])$
(iii) $\lim_{\mathbf{r}\to\mathbf{0}} \dfrac{\sin^2(xy)}{\cos(x^2y)}$
(iv) $\lim_{\mathbf{r}\to\mathbf{0}} \big[e^{xyz} - 2\cos(yz) + 3\sin(xy)\big]$
(v) $\lim_{\mathbf{r}\to\mathbf{0}} \ln(1 + x^2 + y^2z^2)$

87. A General Strategy for Studying Limits

The definition of the limit gives only the criterion for whether a number $f_0$ is the limit of $f(\mathbf{r})$ as $\mathbf{r} \to \mathbf{r}_0$. In practice, however, a possible value of the limit is typically unknown. Some studies are needed to make an "educated" guess for a possible value of the limit. Here a procedure to study limits is outlined that might be helpful. In what follows, the limit point is often set to the origin $\mathbf{r}_0 = (0, 0, ..., 0)$. This is not a limitation because one can always translate the origin of the coordinate system to any particular point by shifting the values of the argument, for example,
$$\lim_{(x,y)\to(x_0,y_0)} f(x, y) = \lim_{(x,y)\to(0,0)} f(x + x_0, y + y_0) .$$

87.1. Step 1: Continuity Argument.
The simplest scenario in studying the limit happens when the function $f$ in question is continuous at the limit point:
$$\lim_{\mathbf{r}\to\mathbf{r}_0} f(\mathbf{r}) = f(\mathbf{r}_0) .$$
For example,
$$\lim_{(x,y)\to(1,2)} \frac{xy}{x^3 - y^2} = -\frac{2}{3}$$
because the function in question is a rational function that is continuous if $x^3 - y^2 \neq 0$. The latter is indeed the case for the limit point $(1, 2)$. If the continuity argument does not apply, then it is helpful to check the following.

87.2. Step 2: Composition Rule.

THEOREM 13.7. (Composition Rule for Limits).
Let $g(t)$ be a function continuous at $t_0$. Suppose that the function $f$ is the composition $f(\mathbf{r}) = g(h(\mathbf{r}))$ so that $\mathbf{r}_0$ is a limit point of the domain of $f$ and $h(\mathbf{r}) \to t_0$ as $\mathbf{r} \to \mathbf{r}_0$. Then
$$\lim_{\mathbf{r}\to\mathbf{r}_0} f(\mathbf{r}) = \lim_{t\to t_0} g(t) = g(t_0) .$$

The proof is omitted as it is similar to the proof of the composition rule for limits of single-variable functions given in Calculus I. The significance of this theorem is that, under its hypotheses, a tough problem of studying a multivariable limit is reduced to the problem of the limit of a function of a single argument. The latter problem can be studied by, for example, l'Hospital's rule. It must be emphasized that there is no analog of l'Hospital's rule for multivariable limits.

EXAMPLE 13.11. Find
$$\lim_{(x,y)\to(0,0)} \frac{\cos(xy) - 1}{x^2y^2} .$$

SOLUTION: The function in question is $g(t) = (\cos t - 1)/t^2$ for $t \neq 0$, where the argument $t$ is replaced by the function $h(x, y) = xy$. The function $h$ is a polynomial and hence continuous. In particular, $h(x, y) \to h(0, 0) = 0$ as $(x, y) \to (0, 0)$. The function $g(t)$ is continuous for all $t \neq 0$ and its value at $t = 0$ is not defined. Using l'Hospital's rule twice,
$$\lim_{t\to 0} \frac{\cos t - 1}{t^2} = \lim_{t\to 0} \frac{-\sin t}{2t} = \lim_{t\to 0} \frac{-\cos t}{2} = -\frac{1}{2} .$$
So, by setting $g(0) = -1/2$, the function $g(t)$ becomes continuous at $t = 0$, and the hypotheses of the composition rule are fulfilled. Therefore, the two-dimensional limit in question exists and equals $-1/2$. □

87.3. Step 3: Limits Along Curves.
Recall the following result about the limit of a function of one variable. The limit of $f(x)$ as $x \to x_0$ exists and equals $f_0$ if and only if the corresponding right and left limits of $f(x)$ exist and equal $f_0$:
$$\lim_{x\to x_0} f(x) = f_0 \quad \Longleftrightarrow \quad \lim_{x\to x_0^+} f(x) = \lim_{x\to x_0^-} f(x) = f_0 .$$
In other words, if the limit exists, it does not depend on the direction from which the limit point is approached. If the left and right limits exist but do not coincide, then the limit does not exist.

For functions of several variables, there are infinitely many paths along which the limit point can be approached. They include straight lines and paths of any other shape, in contrast to the one-variable case. Nevertheless, a similar result holds for multivariable limits (see the second remark at the end of Section 86.1); that is, if the limit exists, then it should not depend on the path along which the limit point may be approached.

DEFINITION 13.10. (Parametric Curve in a Euclidean Space).
A parametric curve in a Euclidean space is a set of points $\mathbf{r}(t) = (x_1(t), x_2(t), ..., x_n(t))$, where $x_i(t)$, $i = 1, 2, ..., n$, are continuous functions of a variable $t \in [a, b]$.

This is a natural generalization of the concept of a parametric curve in a plane or space as a vector function defined by the parametric equations $x_i = x_i(t)$, $i = 1, 2, ..., n$.

DEFINITION 13.11. (Limit Along a Curve).
Let $\mathbf{r}_0$ be a limit point of the domain $D$ of a function $f$. Let $\mathbf{r}(t) = (x_1(t), x_2(t), ..., x_n(t))$, $t > t_0$, be a parametric curve $C$ in $D$ such that $\mathbf{r}(t) \to \mathbf{r}_0$ as $t \to t_0^+$. Let $F(t) = f(\mathbf{r}(t))$, $t > t_0$, be the values of $f$ on the curve $C$. The limit
$$\lim_{t\to t_0^+} F(t) = \lim_{t\to t_0^+} f(x_1(t), x_2(t), ..., x_n(t))$$
is called the limit of $f$ along the curve $C$ if it exists.

Suppose that the limit of $f(\mathbf{r})$ as $\mathbf{r} \to \mathbf{r}_0$ exists and equals $f_0$. Let $C$ be a curve such that $\mathbf{r}(t) \to \mathbf{r}_0$ as $t \to t_0^+$. Fix $\varepsilon > 0$. By the existence of the limit, there is a neighborhood $N_\delta(\mathbf{r}_0) = \{\mathbf{r} \mid \mathbf{r} \in D,\ 0 < \|\mathbf{r} - \mathbf{r}_0\| < \delta\}$ in which the values of $f$ deviate from $f_0$ by less than $\varepsilon$, $|f(\mathbf{r}) - f_0| < \varepsilon$.
Since the curve $C$ leads to $\mathbf{r}_0$, there should be a portion of it that lies in $N_\delta(\mathbf{r}_0)$; that is, there is a number $\delta'$ such that $\|\mathbf{r}(t) - \mathbf{r}_0\| < \delta$ for all $t \in (t_0, t_0 + \delta')$, which is merely the definition of the limit $\mathbf{r}(t) \to \mathbf{r}_0$ as $t \to t_0^+$. Hence, for any $\varepsilon > 0$, the deviation of values of $f$ along the curve, $F(t) = f(\mathbf{r}(t))$, does not exceed $\varepsilon$: $|F(t) - f_0| < \varepsilon$ whenever $0 < |t - t_0| < \delta'$. By the definition of the one-variable limit, this implies that $F(t) \to f_0$ as $t \to t_0^+$ for any curve $C$ leading to $\mathbf{r}_0$. This proves the following.

THEOREM 13.8. (Independence of the Limit from a Curve Through the Limit Point).
If the limit of $f(\mathbf{r})$ exists as $\mathbf{r} \to \mathbf{r}_0$, then the limit of $f$ along any curve leading to $\mathbf{r}_0$ from within the domain of $f$ exists, and its value is independent of the curve.

An immediate consequence of this theorem is a useful criterion for the nonexistence of a multivariable limit.

COROLLARY 13.2. (Criterion for Nonexistence of the Limit).
Let $f$ be a function of several variables on $D$. If there is a curve $\mathbf{r}(t)$ in $D$ such that $\mathbf{r}(t) \to \mathbf{r}_0$ as $t \to t_0^+$ and the limit $\lim_{t\to t_0^+} f(\mathbf{r}(t))$ does not exist, then the multivariable limit $\lim_{\mathbf{r}\to\mathbf{r}_0} f(\mathbf{r})$ does not exist either. If there are two curves in $D$ leading to $\mathbf{r}_0$ such that the limits of $f$ along them exist but do not coincide, then the multivariable limit $\lim_{\mathbf{r}\to\mathbf{r}_0} f(\mathbf{r})$ does not exist.

Repeated Limits. Let $(x, y) \neq (0, 0)$. Consider a path $C_1$ that consists of two straight line segments $(x, y) \to (x, 0) \to (0, 0)$ and a path $C_2$ that consists of two straight line segments $(x, y) \to (0, y) \to (0, 0)$. Both paths connect $(x, y)$ with the origin. The limits along $C_1$ and $C_2$,
$$\lim_{x\to 0}\Big(\lim_{y\to 0} f(x, y)\Big) \quad \text{and} \quad \lim_{y\to 0}\Big(\lim_{x\to 0} f(x, y)\Big),$$
are called the repeated limits. If $C_1$ and $C_2$ are within the domain of $f$, then Theorem 13.8 and Corollary 13.2 establish the relations between the repeated limits and the two-variable limit $\lim_{(x,y)\to(0,0)} f(x, y)$.
In particular, suppose that $f(x, y) \to f(0, y)$ as $x \to 0$ and $f(x, y) \to f(x, 0)$ as $y \to 0$ (the function is continuous with respect to $x$ if $y$ is fixed and it is also continuous with respect to $y$ if $x$ is fixed). Then the repeated limits become
$$\lim_{y\to 0} f(0, y) \quad \text{and} \quad \lim_{x\to 0} f(x, 0) .$$
If at least one of them does not exist or they exist but are not equal, then, by Corollary 13.2, the two-variable limit does not exist. If they exist and are equal, then the two-variable limit may or may not exist. A further investigation is needed.

In general, the segment $(x, 0) \to (0, 0)$ or $(0, y) \to (0, 0)$ or both may not be in the domain of $f$, while the repeated limits still make sense (e.g., the function $f$ is defined only for strictly positive $x$ and $y$ so that the half-lines $x = 0$, $y > 0$ and $y = 0$, $x > 0$ are limit points of the domain). In this case, the hypotheses of Corollary 13.2 are not fulfilled, and, in particular, the nonexistence of the repeated limits does not imply the nonexistence of the two-variable limit. An example is provided in exercise 1, part (iii).

Limits Along Straight Lines. Let the limit point be the origin $\mathbf{r}_0 = (0, 0, ..., 0)$. The simplest curve leading to $\mathbf{r}_0$ is a straight line $x_i = v_i t$, where $t \to 0^+$ for some numbers $v_i$, $i = 1, 2, ..., n$, that do not vanish simultaneously. The limit of a function of several variables $f$ along a straight line, $\lim_{t\to 0^+} f(v_1t, v_2t, ..., v_nt)$, should exist and be the same for any choice of numbers $v_i$. For comparison, recall the vector equation of a straight line in space through the origin: $\mathbf{r} = t\mathbf{v}$, where $\mathbf{v}$ is a vector parallel to the line.

EXAMPLE 13.12. Investigate the two-variable limit
$$\lim_{(x,y)\to(0,0)} \frac{xy^3}{x^4 + 2y^4} .$$

SOLUTION: Consider the limits along straight lines $x = t$, $y = at$ (or $y = ax$, where $a$ is the slope) as $t \to 0^+$:
$$\lim_{t\to 0^+} f(t, at) = \lim_{t\to 0^+} \frac{a^3t^4}{t^4(1 + 2a^4)} = \frac{a^3}{1 + 2a^4} .$$
So the limit along a straight line depends on the slope of the line.
Therefore, the two-variable limit does not exist. □

EXAMPLE 13.13. Investigate the limit
$$\lim_{(x,y)\to(0,0)} \frac{\sin(\sqrt{xy})}{x + y} .$$

SOLUTION: The domain of the function consists of the first and third quadrants, as $xy \ge 0$, except the origin. Lines approaching $(0, 0)$ from within the domain are $x = t$, $y = at$, $a \ge 0$ and $t \to 0$. The line $x = 0$, $y = t$ also lies in the domain (the line with an infinite slope). The limit along a straight line approaching the origin from within the first quadrant is
$$\lim_{t\to 0^+} f(t, at) = \lim_{t\to 0^+} \frac{\sin(t\sqrt{a})}{t(1 + a)} = \lim_{t\to 0^+} \frac{\sqrt{a}\cos(t\sqrt{a})}{1 + a} = \frac{\sqrt{a}}{1 + a} ,$$
where l'Hospital's rule has been used to calculate the limit. The limit depends on the slope of the line, and hence the two-variable limit does not exist. □

Limits Along Power Curves (Optional). If the limit along straight lines exists and is independent of the choice of the line, the numerical value of this limit provides a desired "educated" guess for the actual multivariable limit. However, this has yet to be proved by means of either the definition of the multivariable limit or, for example, the squeeze principle. This comprises the last step of the analysis of limits (Step 4; see below).

The following should be stressed. If the limits along all straight lines happen to be the same number, this does not mean that the multivariable limit exists and equals that number because there might exist other curves through the limit point along which the limit attains a different value or does not even exist.

EXAMPLE 13.14. Investigate the limit
$$\lim_{(x,y)\to(0,0)} \frac{y^3}{x} .$$

SOLUTION: The domain of the function is the whole plane with the $y$ axis removed ($x \neq 0$). The limit along a straight line
$$\lim_{t\to 0^+} f(t, at) = \lim_{t\to 0^+} \frac{a^3t^3}{t} = a^3 \lim_{t\to 0^+} t^2 = 0$$
vanishes for any slope; that is, it is independent of the choice of the line. However, the two-variable limit does not exist! Consider the power curve $x = t$, $y = at^{1/3}$ approaching the origin as $t \to 0^+$.
The limit along this curve can attain any value by varying the parameter $a$:
$$\lim_{t\to 0^+} f(t, at^{1/3}) = \lim_{t\to 0^+} \frac{a^3t}{t} = a^3 .$$
Thus, the multivariable limit does not exist. □

In general, limits along power curves are convenient for studying limits of rational functions because the values of a rational function of several variables on a power curve are given by a rational function of the curve parameter $t$. One can then adjust, if possible, the power parameter of the curve so that the leading terms of the top and bottom power functions match in the limit $t \to 0^+$. For instance, in the example considered, put $x = t$ and $y = at^n$. Then $f(t, at^n) = (a^3t^{3n})/t$. The powers of the top and bottom functions in this ratio match if $3n = 1$; hence, for $n = 1/3$, the limit along the power curve depends on the parameter $a$ and can be any number.

87.4. Step 4: Using the Squeeze Principle. If Steps 1 and 2 do not apply to the multivariable limit in question, then an "educated" guess for a possible value of the limit is helpful. This is the outcome of Step 3. If limits along a family of curves (e.g., straight lines) happen to be the same number $f_0$, then this number is the sought-after "educated" guess. The definition of the multivariable limit or the squeeze principle can be used to prove or disprove that $f_0$ is the multivariable limit.

EXAMPLE 13.15. Find the limit or prove that it does not exist:
$$\lim_{(x,y)\to(0,0)} \frac{\sin(xy^2)}{x^2 + y^2} .$$

SOLUTION:
Step 1. The function is not defined at the origin. The continuity argument does not apply.
Step 2. No substitution exists to transform the two-variable limit to a one-variable limit.
Step 3. Put $(x, y) = (t, at)$, where $t \to 0^+$. The limit along straight lines
$$\lim_{t\to 0^+} f(t, at) = \lim_{t\to 0^+} \frac{\sin(a^2t^3)}{t^2(1 + a^2)} = \frac{1}{1 + a^2} \lim_{u\to 0^+} \frac{\sin(a^2u^{3/2})}{u} = \frac{1}{1 + a^2} \lim_{u\to 0^+} \frac{(3/2)a^2u^{1/2}\cos(a^2u^{3/2})}{1} = 0$$
vanishes (here the substitution $u = t^2$ and l'Hospital's rule have been used to calculate the limit).
Step 4.
If the two-variable limit exists, then it must be equal to 0. This can be verified by means of the simplified squeeze principle; that is, one has to verify that there exists $h(R)$ such that $|f(x, y) - f_0| = |f(x, y)| \le h(R) \to 0$ as $R = \sqrt{x^2 + y^2} \to 0$. A key technical trick here is the inequality $|\sin u| \le |u|$, which holds for any real $u$. One has
$$|f(x, y) - 0| = \frac{|\sin(xy^2)|}{x^2 + y^2} \le \frac{|xy^2|}{x^2 + y^2} \le \frac{R^3}{R^2} = R \to 0 ,$$
where the inequalities $|x| \le R$ and $|y| \le R$ have been used. [...] a similar approach exists. Let, for simplicity, $\mathbf{r}_0 = (0, 0, ..., 0)$. Put $x_i = Ru_i$, where the variables $u_i$ satisfy the condition $u_1^2 + u_2^2 + \cdots + u_n^2 = 1$. For $n = 2$, $u_1 = \cos\theta$ and $u_2 = \sin\theta$. For $n \ge 3$, the variables $u_i$ can be viewed as the directional cosines, that is, the cosines of the angles between $\mathbf{r}$ and unit vectors $\hat{\mathbf{e}}_i$ parallel to the coordinate axes, $u_i = \mathbf{r} \cdot \hat{\mathbf{e}}_i/\|\mathbf{r}\|$. Then one has to investigate whether there is $h(R)$ such that
$$|f(Ru_1, Ru_2, ..., Ru_n) - f_0| \le h(R) \to 0 , \quad R \to 0^+ .$$
This technical, often rather difficult, task may be accomplished using the inequalities $|u_i| \le 1$ and some specific properties of the function $f$. As noted, the variables $u_i$ are the directional cosines. They can also be expressed as trigonometric functions of the angles in the spherical coordinate system in an $n$-dimensional Euclidean space.

87.5. Infinite Limits and Limits at Infinity. Suppose that the limit of a multivariable function $f$ does not exist as $\mathbf{r} \to \mathbf{r}_0$. There are two particular cases of interest, when $f$ tends to either positive or negative infinity.

DEFINITION 13.12. (Infinite Limits).
The limit of $f(\mathbf{r})$ as $\mathbf{r} \to \mathbf{r}_0$ is said to be the positive infinity if, for any number $M > 0$, there exists a number $\delta > 0$ such that $f(\mathbf{r}) > M$ whenever $0 < \|\mathbf{r} - \mathbf{r}_0\| < \delta$. Similarly, the limit is said to be the negative infinity if, for any number $M < 0$, there exists a number $\delta > 0$ such that $f(\mathbf{r}) < M$ whenever $0 < \|\mathbf{r} - \mathbf{r}_0\| < \delta$. In these cases, one writes, respectively,
$$\lim_{\mathbf{r}\to\mathbf{r}_0} f(\mathbf{r}) = \infty \quad \text{and} \quad \lim_{\mathbf{r}\to\mathbf{r}_0} f(\mathbf{r}) = -\infty .$$

For example,
$$\lim_{\mathbf{r}\to\mathbf{0}} \frac{1}{x^2 + y^2} = \infty .$$
Indeed, put $R = \sqrt{x^2 + y^2}$. Then, for any $M > 0$, the inequality $f(\mathbf{r}) = 1/R^2 > M$ can be written in the form $R < 1/\sqrt{M}$. Therefore, the values of $f$ in the disk $0 < \|\mathbf{r}\| < \delta = 1/\sqrt{M}$ are larger than any preassigned positive number $M$.

Naturally, if the limit is infinite, the function $f$ approaches the infinite value along any curve that leads to the limit point. For example, the limit of $f(x, y) = y/(x^2 + y^2)$ as $(x, y) \to (0, 0)$ does not exist because, along straight lines $(x, y) = (t, at)$ approaching the origin when $t \to 0^+$, the function $f(t, at) = c/t$, where $c = a/(1 + a^2)$, tends to $+\infty$ if $a > 0$, to $-\infty$ if $a < 0$, and to 0 if $a = 0$. Restricting the domain to the half-plane $y > 0$ is not enough to make the limit infinite: along the parabola $(x, y) = (t, t^2)$, $f = 1/(1 + t^2) \to 1$. If, however, the domain of $f$ is restricted to the sector $y \ge |x|$, $y > 0$, then the limit exists and equals $\infty$. Indeed, $x^2 \le y^2$ in this sector, so $f(x, y) \ge y/(2y^2) = 1/(2y) \to \infty$ as $y \to 0^+$, and the conclusion follows from the squeeze principle.

For functions of one variable $x$, one can define the limits at infinity (i.e., when $x \to +\infty$ or $x \to -\infty$). Both the limits have a common property that the distance $|x|$ of the "infinite points" $\pm\infty$ from the origin $x = 0$ is infinite. Similarly, in a Euclidean space, the limit at infinity is defined in the sense that $\|\mathbf{r}\| \to \infty$. If $D$ is an unbounded region, then a neighborhood of the infinite point in $D$ consists of all points of $D$ whose distance from the origin exceeds a number $\delta$, $\|\mathbf{r}\| > \delta$. A smaller neighborhood is obtained by increasing $\delta$.

DEFINITION 13.13. (Limit at Infinity).
Let $f$ be a function on an unbounded region $D$. A number $f_0$ is the limit of a function $f$ at infinity,
$$\lim_{\mathbf{r}\to\infty} f(\mathbf{r}) = f_0 ,$$
if, for any number $\varepsilon > 0$, there exists $\delta > 0$ such that $|f(\mathbf{r}) - f_0| < \varepsilon$ whenever $\|\mathbf{r}\| > \delta$ in $D$.

Infinite limits at infinity can be defined similarly. The squeeze principle has a natural extension to the infinite limits and limits at infinity.
For example, if $g(\mathbf{r}) \le f(\mathbf{r})$ and $g(\mathbf{r}) \to \infty$ as $\mathbf{r} \to \mathbf{r}_0$ (or $\mathbf{r} \to \infty$), then $f(\mathbf{r}) \to \infty$.

87.6. Study Problems.

Problem 13.1. Find the limit $\lim_{\mathbf{r}\to\mathbf{r}_0} f(\mathbf{r})$ or show that it does not exist, where
$$f(\mathbf{r}) = f(x, y, z) = (x^2 + 2y^2 + 4z^2)\ln(x^2 + y^2 + z^2) , \quad \mathbf{r}_0 = (0, 0, 0) .$$

SOLUTION:
Step 1. The continuity argument does not apply because $f$ is not defined at $\mathbf{r}_0$.
Step 2. No substitution is possible to transform the limit to a one-variable limit.
Step 3. Put $\mathbf{r}(t) = (at, bt, ct)$ for some constants $a$, $b$, and $c$ that do not vanish simultaneously so that they define the direction of the line through the origin. Then $f(\mathbf{r}(t)) = At^2\ln(Bt^2) = 2At^2\ln t + At^2\ln B$, where $A = a^2 + 2b^2 + 4c^2$ and $B = a^2 + b^2 + c^2 > 0$. By l'Hospital's rule,
$$\lim_{t\to 0^+} t^2\ln t = \lim_{t\to 0^+} \frac{\ln t}{t^{-2}} = \lim_{t\to 0^+} \frac{t^{-1}}{-2t^{-3}} = -\frac{1}{2}\lim_{t\to 0^+} t^2 = 0 ,$$
and therefore $f(\mathbf{r}(t)) \to 0$ as $t \to 0^+$. So, if the limit exists, then it must be equal to 0.
Step 4. Put $R^2 = x^2 + y^2 + z^2$. Since the limit $R \to 0^+$ is of interest, one can always assume that $R < 1$ so that $\ln R^2 = 2\ln R < 0$. By making use of the inequalities $|x| \le R$, $|y| \le R$, and $|z| \le R$, one has $R^2 \le x^2 + 2y^2 + 4z^2 \le 7R^2$. By multiplying the latter inequality by $\ln R^2 < 0$, $R^2\ln R^2 \ge f(\mathbf{r}) \ge 7R^2\ln(R^2)$. Since $t\ln t \to 0$ as $t = R^2 \to 0^+$, the limit exists and equals 0 by the squeeze principle. □

Problem 13.2. Prove that the limit $\lim_{\mathbf{r}\to\mathbf{r}_0} f(\mathbf{r})$ exists, where
$$f(\mathbf{r}) = f(x, y) = \frac{1 - \cos(x^2y)}{x^2 + 2y^2} , \quad \mathbf{r}_0 = (0, 0) ,$$
and find a disk centered at $\mathbf{r}_0$ in which values of $f$ deviate from the limit by no more than $\varepsilon = 0.5 \times 10^{-4}$.

SOLUTION:
Step 1. The continuity argument does not apply because $f$ is not defined at $\mathbf{r}_0$.
Step 2. No substitution is possible to transform the limit to a one-variable limit.
Step 3. Put $\mathbf{r}(t) = (t, at)$.
Then
$$\lim_{t\to 0^+} f(\mathbf{r}(t)) = \lim_{t\to 0^+} \frac{1 - \cos(at^3)}{t^2(1 + 2a^2)} = \frac{1}{1 + 2a^2}\lim_{u\to 0^+} \frac{1 - \cos(au^{3/2})}{u} = \frac{1}{1 + 2a^2}\lim_{u\to 0^+} \frac{(3/2)au^{1/2}\sin(au^{3/2})}{1} = 0 ,$$
where the substitution $u = t^2$ and l'Hospital's rule have been used to evaluate the limit. Therefore, if the limit exists, it must be equal to 0.
Step 4. Note first that $1 - \cos u = 2\sin^2(u/2) \le u^2/2$, where the inequality $|\sin x| \le |x|$ has been used. Put $R^2 = x^2 + y^2$. Then, by making use of the above inequality with $u = x^2y$ together with $|x| \le R$ [...] small enough because $\tan(2\theta)$ is not bounded in this interval. Hence, for any $\varepsilon > 0$, there is no $\delta > 0$ such that $|f(\mathbf{r})| < \varepsilon$ whenever $\mathbf{r} \in D$ lies in the disk $0 < \|\mathbf{r} - \mathbf{r}_0\| < \delta$. Thus, the limit does not exist.
Step 3 (Optional). The nonexistence of the limit established in Step 4 implies that there should exist curves along which the limit differs from 0. It is instructive to demonstrate this explicitly. Any such curve should approach the origin from within one of the narrow sectors containing the lines $y = \pm x$ (where $\tan(2\theta)$ takes large values). So put, for example, $\mathbf{r}(t) = (t, t - at^n)$, where $n > 1$ and $a \neq 0$ is a number. Observe that the line $L(t) = (t, t)$ (or $y = x$) is tangent to the curve $\mathbf{r}(t)$ at the origin because $\mathbf{r}'(0) = (1, 1)$ for $n > 1$. The term $-at^n$ in $\mathbf{r}(t)$ models a small deviation of the curve from the line $y = x$ in the vicinity of which the function $f$ is expected to be unbounded. Then
$$f(\mathbf{r}(t)) = \frac{t^3 - at^{n+2}}{2at^{n+1} - a^2t^{2n}} .$$
This function tends to a number as $t \to 0^+$ if $n$ is chosen to match the leading (smallest) powers of the top and bottom of the ratio in this limit (i.e., $3 = n + 1$ or $n = 2$). Thus, for $n = 2$,
$$f(\mathbf{r}(t)) = \frac{t^3 - at^4}{2at^3 - a^2t^4} = \frac{1 - at}{2a - a^2t} \to \frac{1}{2a}$$
as $t \to 0^+$, and $f(\mathbf{r}(t))$ diverges for $n > 2$ in this limit. □

Problem 13.4. Find
$$\lim_{\mathbf{r}\to\infty} \frac{\ln(x^2 + y^4)}{x^2 + 2y^2}$$
or show that the limit does not exist.

SOLUTION:
Step 1. Does not apply.
Step 2. No substitution exists to reduce the limit to a one-variable limit.
Step 3.
Put $(x, y) = (t, at)$ and let $t \to \infty$. Then $\|\mathbf{r}\| \to \infty$ as $t \to \infty$. One has $f(t, at) = \ln(t^2 + a^4t^4)/(t^2 + 2a^2t^2)$. For large values of $t$, $\ln(t^2 + a^4t^4) \approx \ln(a^4t^4) = \ln(t^4) + \ln(a^4) \approx 4\ln t$ if $a \neq 0$, and $f(t, 0) = 2\ln t/t^2$. Therefore, $f(t, at)$ behaves as $\ln t/t^2 \to 0$ as $t \to \infty$ (by l'Hospital's rule). So the limit along all straight lines is 0.
Step 4. Put $R = \sqrt{x^2 + y^2}$ so that $|x| \le R$ [...]. The denominator of the ratio $f$ can be estimated from below: $x^2 + 2y^2 = x^2 + y^2 + y^2 = R^2 + y^2 \ge R^2$. Hence, for $R > 1$,
$$|f(x, y) - 0| \le \frac{\ln(4R^4)}{R^2} = \frac{4\ln R + \ln 4}{R^2} \to 0$$
as $R \to \infty$. Thus, by the squeeze principle the limit is indeed 0. □

87.7. Exercises.

(1) Prove the following statements:
(i) Let $f(x, y) = (x - y)/(x + y)$. Then
$$\lim_{x\to 0}\Big(\lim_{y\to 0} f(x, y)\Big) = 1 , \quad \lim_{y\to 0}\Big(\lim_{x\to 0} f(x, y)\Big) = -1 ,$$
but the limit of $f(x, y)$ as $(x, y) \to (0, 0)$ does not exist.
(ii) Let $f(x, y) = x^2y^2/(x^2y^2 + (x - y)^2)$. Then
$$\lim_{x\to 0}\Big(\lim_{y\to 0} f(x, y)\Big) = \lim_{y\to 0}\Big(\lim_{x\to 0} f(x, y)\Big) = 0 ,$$
but the limit of $f(x, y)$ as $(x, y) \to (0, 0)$ does not exist.
(iii) Let $f(x, y) = (x + y)\sin(1/x)\sin(1/y)$. Then the limits
$$\lim_{x\to 0}\Big(\lim_{y\to 0} f(x, y)\Big) \quad \text{and} \quad \lim_{y\to 0}\Big(\lim_{x\to 0} f(x, y)\Big)$$
do not exist, but the limit of $f(x, y)$ exists and equals 0 as $(x, y) \to (0, 0)$. Does the result contradict Theorem 13.8? Explain.

(2) Find each of the following limits or show that it does not exist:
(i) $\lim_{\mathbf{r}\to\mathbf{0}} \dfrac{\cos(xy + z)}{x^4 + y^2z^2 + 4}$
(ii) $\lim_{\mathbf{r}\to\mathbf{0}} \dfrac{\sin(xy) - xy}{(xy)^3}$
(iii) [...]
(iv) $\lim_{\mathbf{r}\to\mathbf{0}} \sin(xy^3)/[\ldots]$
(v) $\lim_{\mathbf{r}\to\mathbf{0}} \dfrac{x^3 + y^5}{x^2 + 2y^2}$
(vi) [...]
(vii) $\lim_{\mathbf{r}\to\mathbf{0}} [\ldots]/(x^2 + 2y^2)$
(viii) $\lim_{\mathbf{r}\to\mathbf{0}} [\ldots]/(x^2 + 2y^2)$
(ix) $\lim_{(x,y)\to(1,0)} \dfrac{\ln(x + e^y)}{\sqrt{x^2 + y^2}}$
(x) $\lim_{\mathbf{r}\to\mathbf{0}} (x^2 + y^2)^{x^2y^2}$
(xi) [...]
(xii) [...]
(xiii) [...]
(xiv) [...], $0 < b < a$

(3) Let $f(x, y) = (|x| + |y| - |x + y|)/(x^2 + y^2)^k$ if $x^2 + y^2 \neq 0$ and $f(0, 0) = c$. Find all values of constants $c$ and $k > 0$ at which the function is continuous at the origin.

(4) Let $f(x, y) = x^2y/(x^4 + y^2)$ if $x^2 + y^2 \neq 0$ and $f(0, 0) = 0$.
Show that $f$ is continuous along any straight line through the origin; that is, $F(t) = f(x(t), y(t))$ is continuous for all $t$, where $x(t) = t\cos\theta$, $y(t) = t\sin\theta$ for any fixed $\theta$, but $f$ is not continuous at $(0, 0)$. Hint: Investigate the limits of $f$ along power curves leading to the origin.

(5) Let $f(x, y)$ be continuous in a rectangle $a < x < b$, $c < y < d$. Let $g(x)$ be continuous on the interval $(a, b)$ and take values in $(c, d)$. Prove that the function $F(x) = f(x, g(x))$ is continuous on $(a, b)$.

(6) Investigate the limits of the function $f(x, y) = x^2e^{-(x^2 - y)}$ along the rays $x(t) = \cos(\theta)t$, $y(t) = \sin(\theta)t$ as $t \to \infty$ for all $0 \le \theta \le 2\pi$. Are the values of the function arbitrarily small for all $\|\mathbf{r}\| > \delta$ if $\delta$ is large enough? Does the limit $\lim_{\mathbf{r}\to\infty} f(x, y)$ exist?

(7) Find the limit or show that it does not exist:
(i) $\lim_{\mathbf{r}\to\mathbf{0}} \dfrac{\sin(x^2 + y^2 + z^2)}{x^4 + y^4 + z^4}$
(ii) $\lim_{\mathbf{r}\to\mathbf{0}} \dfrac{x^2 + 2y^2 + 3z^2}{x^4 + y^2z^4}$
(iii) $\lim_{\mathbf{r}\to\mathbf{0}} \dfrac{\ln(x^2y^2z^2)}{x^2 + y^2 + z^2}$
(iv) $\lim_{\mathbf{r}\to\mathbf{0}} \dfrac{e^{3x^2 + 2y^2 + z^2}}{(x^2 + 2y^2 + 3z^2)^{2012}}$
(v) $\lim_{\mathbf{r}\to\mathbf{0}} [\ldots]/(x^2 + y^2 + z^2)$
(vi) $\lim_{\mathbf{r}\to\mathbf{0}} [\ldots]/(x^2 + y^2 + z^2)$ if $z < 0$
(vii) $\lim_{\mathbf{r}\to\infty} \dfrac{x^2 + y^2}{x^2 + y^4}$
(viii) $\lim_{\mathbf{r}\to\mathbf{0}} \sin([\ldots])$
(ix) $\lim_{\mathbf{r}\to\infty} (x^2 + y^2)e^{-|x + y|}$
(x) $\lim_{\mathbf{r}\to\infty} [\ldots]$

(8) Find the repeated limits
$$\lim_{x\to 1}\Big(\lim_{y\to 0} \log_x(x + y)\Big) \quad \text{and} \quad \lim_{y\to 0}\Big(\lim_{x\to 1} \log_x(x + y)\Big) .$$
What can be said about the corresponding two-variable limit?

88. Partial Derivatives

The derivative $f'(x_0)$ of a function $f(x)$ at $x = x_0$ contains important information about the local behavior of the function near $x = x_0$. It defines the slope of the tangent line $L(x) = f(x_0) + f'(x_0)(x - x_0)$, and, for $x$ close enough to $x_0$, values of $f$ can be well approximated by the linearization $L(x)$, that is, $f(x) \approx L(x)$. In particular, if $f'(x_0) > 0$, $f$ increases near $x_0$, and, if $f'(x_0) < 0$, $f$ decreases near $x_0$. Furthermore, the second derivative $f''(x_0)$ supplies more information about $f$ near $x_0$, namely, its concavity.
It is therefore important to develop a similar concept for functions of several variables in order to study their local behavior. A significant difference is that, given a point in the domain, the rate of change is going to depend on the direction in which it is measured. For example, if $f(\mathbf{r})$ is the height of a hill as a function of position $\mathbf{r}$, then the slopes from west to east and from south to north may be different. This observation leads to the concept of partial derivatives. If $x$ and $y$ are the coordinates from west to east and from south to north, respectively, then the graph of $f$ is the surface $z = f(x, y)$. At a fixed point $\mathbf{r}_0 = (x_0, y_0)$, the height changes as $h(x) = f(x, y_0)$ along the west-east direction and as $g(y) = f(x_0, y)$ along the south-north direction. Their graphs are intersections of the surface $z = f(x, y)$ with the coordinate planes $x = x_0$ and $y = y_0$, that is, $z = f(x_0, y) = g(y)$ and $z = f(x, y_0) = h(x)$. The slope along the west-east direction is $h'(x_0)$, and the slope along the south-north direction is $g'(y_0)$. These slopes are called partial derivatives of $f$ and denoted as
$$\frac{\partial f}{\partial x}(x_0, y_0) = \frac{d}{dx} f(x, y_0)\Big|_{x = x_0} , \qquad \frac{\partial f}{\partial y}(x_0, y_0) = \frac{d}{dy} f(x_0, y)\Big|_{y = y_0} .$$
The partial derivatives are also denoted as
$$\frac{\partial f}{\partial x}(x_0, y_0) = f'_x(x_0, y_0) , \qquad \frac{\partial f}{\partial y}(x_0, y_0) = f'_y(x_0, y_0) .$$
The subscript of $f'$ indicates the variable with respect to which the derivative is calculated. The above analysis of the geometrical significance of partial derivatives is illustrated in Figure 13.6. The concept of partial derivatives can easily be extended to functions of more than two variables.

88.1. Partial Derivatives of a Function of Several Variables. Let $D$ be a subset of an $n$-dimensional Euclidean space.

DEFINITION 13.14. (Interior Point of a Set).
A point $\mathbf{r}_0$ is said to be an interior point of $D$ if there is an open ball $B_\delta(\mathbf{r}_0) = \{\mathbf{r} \mid \|\mathbf{r} - \mathbf{r}_0\| < \delta\}$ of radius $\delta$ that lies in $D$ (i.e., $B_\delta(\mathbf{r}_0) \subset D$).
In other words, $\mathbf{r}_0$ is an interior point of $D$ if there is a positive number $\delta > 0$ such that all points whose distance from $\mathbf{r}_0$ is less than $\delta$ also lie in $D$. For example, if $D$ is a set of points in a plane whose coordinates are integers, then $D$ has no interior points at all because the points of a disk of radius $0 < a < 1$ centered at any point $\mathbf{r}_0$ of $D$ do not belong to $D$ except $\mathbf{r}_0$. If $D = \{(x, y) \mid x^2 + y^2 \le 1\}$, then any point of $D$ that does not lie on the circle $x^2 + y^2 = 1$ is an interior point.

DEFINITION 13.15. (Open Sets). A set $D$ in a Euclidean space is said to be open if all points of $D$ are interior points of $D$.

FIGURE 13.6. Geometrical significance of partial derivatives. Left: The graph $z = f(x, y)$ and its cross sections by the coordinate planes $x = x_0$ and $y = y_0$. The point $Q = (x_0, y_0, 0)$ is in the domain of $f$ and the point $P = (x_0, y_0, f(x_0, y_0))$ lies on the graph. Middle: The cross section $z = f(x, y_0)$ of the graph in the plane $y = y_0$ and the tangent line to it at the point $P$. The slope $\tan\theta_x$ of the tangent line is determined by the partial derivative at the point $Q$: $\tan\theta_x = f'_x(x_0, y_0)$. Right: The cross section $z = f(x_0, y)$ of the graph in the plane $x = x_0$ and the tangent line to it at the point $P$. The slope $\tan\theta_y$ of the tangent line is determined by the partial derivative at the point $Q$: $\tan\theta_y = f'_y(x_0, y_0)$. Here $\theta_y < 0$ as it is counted clockwise.

An open set is an extension of the notion of an open interval $(a, b)$ to the multivariable case. In particular, the whole Euclidean space is open.

Recall that any vector in space may be written as a linear combination of three unit vectors, $\mathbf{r} = (x, y, z) = x\hat{\mathbf{e}}_1 + y\hat{\mathbf{e}}_2 + z\hat{\mathbf{e}}_3$, where $\hat{\mathbf{e}}_1 = (1, 0, 0)$, $\hat{\mathbf{e}}_2 = (0, 1, 0)$, and $\hat{\mathbf{e}}_3 = (0, 0, 1)$.
Similarly, using the rules for adding $n$-tuples and multiplying them by real numbers, one can write $\mathbf{r} = (x_1, x_2, ..., x_n) = x_1\hat{\mathbf{e}}_1 + x_2\hat{\mathbf{e}}_2 + \cdots + x_n\hat{\mathbf{e}}_n$, where $\hat{\mathbf{e}}_i$ is the $n$-tuple whose components are zeros except the $i$th one, which is equal to 1. Obviously, $\|\hat{\mathbf{e}}_i\| = 1$, $i = 1, 2, ..., n$.

DEFINITION 13.16. (Partial Derivatives at a Point). Let $f$ be a function of several variables $(x_1, x_2, ..., x_n)$. Let $D$ be the domain of $f$ and let $\mathbf{r}_0$ be an interior point of $D$. If the limit

$$f'_{x_i}(\mathbf{r}_0) = \lim_{h\to 0} \frac{f(\mathbf{r}_0 + h\hat{\mathbf{e}}_i) - f(\mathbf{r}_0)}{h}$$

exists, then it is called the partial derivative of $f$ with respect to $x_i$ at $\mathbf{r}_0$.

The reason the point $\mathbf{r}_0$ needs to be an interior point is simple. By the definition of the one-variable limit, $h$ can be negative or positive. So the points $\mathbf{r}_0 + h\hat{\mathbf{e}}_i$, $i = 1, 2, ..., n$, must be in the domain of the function because otherwise $f(\mathbf{r}_0 + h\hat{\mathbf{e}}_i)$ is not even defined. This is guaranteed if $\mathbf{r}_0$ is an interior point: there is a ball $B_\delta(\mathbf{r}_0)$ that lies in $D$, and the points $\mathbf{r}_0 + h\hat{\mathbf{e}}_i$ belong to this ball whenever $|h| < \delta$.

Remark. It is also common to omit "prime" in the notations for partial derivatives. For example, the partial derivative of $f$ with respect to $x$ is denoted as $f_x$. In what follows, the notation introduced in Definition 13.16 will be used.

Let $\mathbf{r}_0 = (a_1, a_2, ..., a_n)$, where $a_i$ are fixed numbers. Consider the function $F(x_i)$ of one variable $x_i$ ($i$ is fixed), which is obtained from $f(\mathbf{r})$ by fixing all the variables except the $i$th one (i.e., $x_j = a_j$ for all $j \ne i$). By the definition of the ordinary derivative, the partial derivative $f'_{x_i}(\mathbf{r}_0)$ exists if and only if the derivative $F'(a_i)$ exists because

$$f'_{x_i}(\mathbf{r}_0) = \lim_{h\to 0} \frac{F(a_i + h) - F(a_i)}{h} = \frac{dF(x_i)}{dx_i}\Big|_{x_i = a_i} \qquad (13.1)$$

just like in the case of two variables discussed at the beginning of this section. This rule is practical for calculating partial derivatives as it reduces the problem to computing ordinary derivatives.

EXAMPLE 13.16. Find the partial derivatives of $f(x, y, z) = x^3 - y^2 z$ at the point $(1, 2, 3)$.
SOLUTION: By the rule (13.1),

$$f'_x(1, 2, 3) = \frac{d}{dx} f(x, 2, 3)\Big|_{x=1} = \frac{d}{dx}(x^3 - 12)\Big|_{x=1} = 3,$$
$$f'_y(1, 2, 3) = \frac{d}{dy} f(1, y, 3)\Big|_{y=2} = \frac{d}{dy}(1 - 3y^2)\Big|_{y=2} = -12,$$
$$f'_z(1, 2, 3) = \frac{d}{dz} f(1, 2, z)\Big|_{z=3} = \frac{d}{dz}(1 - 4z)\Big|_{z=3} = -4.$$
□

Geometrical Significance of Partial Derivatives. From the rule (13.1), it follows that the partial derivative $f'_{x_i}(\mathbf{r}_0)$ defines the rate of change of the function $f$ when only the variable $x_i$ changes while the other variables are kept fixed. If, for instance, the function $f$ in Example 13.16 defines the temperature in degrees Celsius as a function of the position whose coordinates are given in meters, then, at the point $(1, 2, 3)$, the temperature increases at the rate of 3 degrees Celsius per meter in the direction of the $x$ axis, and it decreases at the rates of 12 and 4 degrees Celsius per meter in the direction of the $y$ and $z$ axes, respectively.

88.2. Partial Derivatives as Functions. Suppose that the partial derivatives of $f$ exist at all points of a set $D$. Then each partial derivative can be viewed as a function of several variables on $D$. These functions are denoted as $f'_{x_i}(\mathbf{r})$, where $\mathbf{r} \in D$. They can be found by the same rule (13.1) if, when differentiating with respect to $x_i$, all other variables are not set to any specific values but rather viewed as independent of $x_i$ (i.e., $dx_j/dx_i = 0$ for all $j \ne i$). This agreement is reflected by the notation

$$f'_{x_i}(x_1, x_2, ..., x_n) = \frac{\partial}{\partial x_i} f(x_1, x_2, ..., x_n);$$

that is, the symbol $\partial/\partial x_i$ means differentiation with respect to $x_i$ while regarding all other variables as numerical parameters independent of $x_i$.

EXAMPLE 13.17. Find $f'_x(x, y)$ and $f'_y(x, y)$ if $f(x, y) = x\sin(xy)$.

SOLUTION: Assuming first that $y$ is a numerical parameter independent of $x$, one obtains

$$f'_x(x, y) = \frac{\partial}{\partial x} f(x, y) = \Big(\frac{\partial}{\partial x} x\Big)\sin(xy) + x\frac{\partial}{\partial x}\sin(xy) = \sin(xy) + xy\cos(xy)$$

by the product rule for the derivative.
If now the variable $x$ is viewed as a numerical parameter independent of $y$, one obtains

$$f'_y(x, y) = \frac{\partial}{\partial y} f(x, y) = x\frac{\partial}{\partial y}\sin(xy) = x^2\cos(xy).$$

88.3. Basic Rules of Differentiation. Since a partial derivative is just an ordinary derivative with one additional agreement that all other variables are viewed as numerical parameters, the basic rules of differentiation apply to partial derivatives. Let $f$ and $g$ be functions of several variables and let $c$ be a number. Then

$$\frac{\partial}{\partial x_i}(cf) = c\frac{\partial f}{\partial x_i}, \qquad \frac{\partial}{\partial x_i}(f + g) = \frac{\partial f}{\partial x_i} + \frac{\partial g}{\partial x_i},$$
$$\frac{\partial}{\partial x_i}(fg) = \frac{\partial f}{\partial x_i}g + f\frac{\partial g}{\partial x_i}, \qquad \frac{\partial}{\partial x_i}\Big(\frac{f}{g}\Big) = \frac{\frac{\partial f}{\partial x_i}g - f\frac{\partial g}{\partial x_i}}{g^2}.$$

Let $h(u)$ be a differentiable function of one variable and let $g(\mathbf{r})$ be a function of several variables whose range lies in the domain of $h$. Then one can define the composition $f(\mathbf{r}) = h(g(\mathbf{r}))$. Assuming that the partial derivatives of $g$ exist, the chain rule holds:

$$\frac{\partial f}{\partial x_i} = h'(g)\frac{\partial g}{\partial x_i}. \qquad (13.2)$$

EXAMPLE 13.18. Find the partial derivatives of the function $f(\mathbf{r}) = \|\mathbf{r}\|^{-1}$, where $\mathbf{r} = (x_1, x_2, ..., x_n)$.

SOLUTION: Put $h(u) = u^{-1/2}$ and $g(\mathbf{r}) = x_1^2 + x_2^2 + \cdots + x_n^2 = \|\mathbf{r}\|^2$. Then $f(\mathbf{r}) = h(g(\mathbf{r}))$. Since $h'(u) = (-1/2)u^{-3/2}$ and $\partial g/\partial x_i = 2x_i$, the chain rule gives

$$\frac{\partial}{\partial x_i}\|\mathbf{r}\|^{-1} = -\frac{x_i}{\|\mathbf{r}\|^3}.$$
□

88.4. Exercises.

(1) Find the specified partial derivatives of each of the following functions:
(i) $f(x, y) = (x - y)/(x + y)$, $f'_x(1, 2)$, $f'_y(1, 2)$
(ii) $f(x, y, z) = (xy + z)/(z + y)$, $f'_x(1, 2, 3)$, $f'_y(1, 2, 3)$, $f'_z(1, 2, 3)$
(v) f (X, Y) =c + (y - 1) sin-1( x/y), f(1, 1), fyl(1, 1)
(vi) $f(x, y) = (x^3 + y^3)^{1/3}$, $f'_x(0, 0)$, $f'_y(0, 0)$
(vii) f~i,y) x-yI, f X/(0, 0), f(0,0)

(2) Find the partial derivatives of each of the following functions: (i) f(x,y)- (x+y2= & .

Thus, the conditions (13.4) on the functions $F_i$ must be fulfilled; otherwise, $f$ satisfying (13.3) does not exist. The conditions (13.4) are called integrability conditions for the system of equations (13.3).

EXAMPLE 13.21. Suppose that $f'_x(x, y) = 2x + y$ and $f'_y(x, y) = 2y - x$. Does such a function $f$ exist?

SOLUTION: The first partial derivatives of $f$, $F_1(x, y) = 2x + y$ and $F_2(x, y) = 2y - x$, are polynomials, and hence their derivatives are continuous in the entire plane.
In order for $f$ to exist, the integrability condition $\partial F_1/\partial y = \partial F_2/\partial x$ must hold in the entire plane. This is not so because $\partial F_1/\partial y = 1$, whereas $\partial F_2/\partial x = -1$. Thus, no such $f$ exists. □

Suppose now that the integrability conditions (13.4) are satisfied. How is a solution $f$ to (13.3) to be found? Evidently, one has to calculate an antiderivative of the partial derivative. In the one-variable case, an antiderivative is defined up to an additive constant. This is not so in the multivariable case. For example, let $f'_x(x, y) = 3x^2 y$. An antiderivative of $f'_x$ with respect to $x$ is a function whose partial derivative with respect to $x$ is $3x^2 y$. It is easy to verify that $x^3 y$ satisfies this requirement. It is obtained by taking an antiderivative of $3x^2 y$ with respect to $x$ while viewing $y$ as a numerical parameter independent of $x$. Just like in the one-variable case, one can always add a constant to an antiderivative, $x^3 y + c$, and obtain another solution. The key point to observe is that the integration constant may be a function of $y$! Indeed, $(x^3 y + g(y))'_x = 3x^2 y$. Thus, the general solution of $f'_x(x, y) = 3x^2 y$ is $f(x, y) = x^3 y + g(y)$ for some $g(y)$.

If, in addition, the other partial derivative $f'_y$ is given, then an explicit form of $g(y)$ can be found. Put, for example, $f'_y(x, y) = x^3 + 2y$. The integrability conditions are fulfilled: $(f'_x)'_y = (3x^2 y)'_y = 3x^2$ and $(f'_y)'_x = (x^3 + 2y)'_x = 3x^2$. So a function with the said partial derivatives does exist. The substitution of $f(x, y) = x^3 y + g(y)$ into the equation $f'_y = x^3 + 2y$ yields $x^3 + g'(y) = x^3 + 2y$ or $g'(y) = 2y$ and hence $g(y) = y^2 + c$. Note the cancellation of the $x^3$ term. This is a direct consequence of the fulfilled integrability condition. Had one tried to apply this procedure without checking the integrability conditions, one could have found that, in general, no such $g(y)$ exists. In Example 13.21, the equation $f'_x = 2x + y$ has a general solution $f(x, y) = x^2 + yx + g(y)$.
Its substitution into the second equation $f'_y = 2y - x$ yields $x + g'(y) = 2y - x$ or $g'(y) = 2y - 2x$. The derivative of $g(y)$ cannot depend on $x$ and hence no such $g(y)$ exists.

EXAMPLE 13.22. Find $f(x, y, z)$ if $f'_x = yz + 2x = F_1$, $f'_y = xz + 3y^2 = F_2$, and $f'_z = xy + 4z^3 = F_3$ or show that it does not exist.

SOLUTION: The integrability conditions $(F_1)'_y = (F_2)'_x$, $(F_1)'_z = (F_3)'_x$, and $(F_2)'_z = (F_3)'_y$ are satisfied (their verification is left to the reader). So $f$ exists. Taking the antiderivative with respect to $x$ in the first equation, one finds

$$f'_x = yz + 2x \implies f(x, y, z) = xyz + x^2 + g(y, z)$$

for some $g(y, z)$. The substitution of $f$ into the second equation yields

$$f'_y = xz + 3y^2 \implies xz + g'_y(y, z) = xz + 3y^2 \implies g'_y(y, z) = 3y^2 \implies g(y, z) = y^3 + h(z) \implies f(x, y, z) = xyz + x^2 + y^3 + h(z)$$

for some $h(z)$. The substitution of $f$ into the third equation yields

$$f'_z = xy + 4z^3 \implies xy + h'(z) = xy + 4z^3 \implies h'(z) = 4z^3 \implies h(z) = z^4 + c \implies f(x, y, z) = xyz + x^2 + y^3 + z^4 + c,$$

where $c$ is a constant. □

The procedure of reconstructing $f$ from its first partial derivatives as well as the integrability conditions (13.4) will be important when discussing conservative vector fields and the potential of a conservative vector field.

89.2. Partial Differential Equations. The relation between a function of several variables and its partial derivatives (of any order) is called a partial differential equation. Partial differential equations are a key tool to study various phenomena in nature. Many fundamental laws of nature can be stated in the form of partial differential equations.

Diffusion Equation. Let $n(\mathbf{r}, t)$, where $\mathbf{r} = (x, y, z)$ is the position vector in space and $t$ is time, be a concentration of a substance, say, in air or in water or even in a solid. Even if there is no macroscopic motion in the medium, the concentration changes with time due to thermal motion of the molecules. This process is known as diffusion.
In some simple situations, the rate at which the concentration changes with time at a point is

$$n'_t = k(n''_{xx} + n''_{yy} + n''_{zz}),$$

where the parameter $k$ is a diffusion constant. So the concentration as a function of the spatial position and time must satisfy the above partial differential equation.

Wave Equation. Sound in air is propagating disturbances of the air density. If $u(\mathbf{r}, t)$ is the deviation of the air density from its constant (nondisturbed) value $u_0$ at the spatial point $\mathbf{r} = (x, y, z)$ and at time $t$, then it can be shown that small disturbances $u/u_0 \ll 1$ satisfy the wave equation:

$$u''_{tt} = c^2(u''_{xx} + u''_{yy} + u''_{zz}),$$

where $c$ is the speed of sound in the air. Light is an electromagnetic wave. Its propagation is also described by the wave equation, where $c$ is the speed of light in vacuum (or in a medium, if light goes through a medium) and $u$ is the amplitude of electric or magnetic fields.

Laplace and Poisson Equations. The equation

$$u''_{xx} + u''_{yy} + u''_{zz} = f,$$

where $f$ is a given nonzero function of position $\mathbf{r} = (x, y, z)$ in space, is called the Poisson equation. In the special case when $f = 0$, this equation is known as the Laplace equation. The Poisson and Laplace equations are used to determine static electromagnetic fields created by static electric charges and currents.

EXAMPLE 13.23. Let $h(q)$ be a twice-differentiable function of a variable $q$. Show that $u(\mathbf{r}, t) = h(ct - \hat{\mathbf{n}}\cdot\mathbf{r})$ is a solution of the wave equation for any fixed unit vector $\hat{\mathbf{n}}$.

SOLUTION: Let $\hat{\mathbf{n}} = (n_1, n_2, n_3)$, where $n_1^2 + n_2^2 + n_3^2 = 1$ as $\hat{\mathbf{n}}$ is a unit vector. Put $q = ct - \hat{\mathbf{n}}\cdot\mathbf{r} = ct - n_1 x - n_2 y - n_3 z$. By the chain rule (13.2), $u'_t = q'_t h'(q)$ and similarly for the other partial derivatives. Therefore, $u'_t = ch'(q)$, $u''_{tt} = c^2 h''(q)$, $u'_x = -n_1 h'(q)$, $u''_{xx} = n_1^2 h''(q)$, and, in the same fashion, $u''_{yy} = n_2^2 h''(q)$ and $u''_{zz} = n_3^2 h''(q)$. Then $u''_{xx} + u''_{yy} + u''_{zz} = (n_1^2 + n_2^2 + n_3^2)h''(q) = h''(q)$, which coincides with $u''_{tt}/c^2$, meaning that the wave equation is satisfied for any $h$.
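The conclusion of Example 13.23 can be checked numerically. The sketch below approximates the second partial derivatives by central differences and evaluates the residual $u''_{tt}/c^2 - (u''_{xx} + u''_{yy} + u''_{zz})$ at one point. The profile $h = \sin$, the value $c = 2$, the unit vector $(1/3, 2/3, 2/3)$, the sample point, and the step size are all assumptions made for illustration, and the helper names are hypothetical.

```python
import math


def plane_wave(h, c, n):
    """Return u(x, y, z, t) = h(c*t - n.r) for a fixed vector n."""
    n1, n2, n3 = n

    def u(x, y, z, t):
        return h(c * t - n1 * x - n2 * y - n3 * z)

    return u


def second_derivative(func, point, index, step=1e-3):
    """Central-difference estimate of the second partial derivative of
    func with respect to the argument in position `index`."""
    forward = list(point)
    backward = list(point)
    forward[index] += step
    backward[index] -= step
    return (func(*forward) - 2.0 * func(*point) + func(*backward)) / step ** 2


def wave_residual(u, c, point):
    """Estimate u_tt / c^2 - (u_xx + u_yy + u_zz) at point = (x, y, z, t)."""
    laplacian = sum(second_derivative(u, point, i) for i in range(3))
    u_tt = second_derivative(u, point, 3)
    return u_tt / c ** 2 - laplacian


c = 2.0                                   # assumed wave speed
n = (1.0 / 3.0, 2.0 / 3.0, 2.0 / 3.0)     # unit vector: 1/9 + 4/9 + 4/9 = 1
u = plane_wave(math.sin, c, n)            # assumed profile h = sin
point = (0.3, -0.2, 0.5, 0.1)             # arbitrary sample (x, y, z, t)

residual = wave_residual(u, c, point)
print(abs(residual) < 1e-5)               # small, up to discretization error
```

If the vector is not of unit length, for instance $(1, 1, 1)$, the same residual is no longer small, which reflects the role of the condition $n_1^2 + n_2^2 + n_3^2 = 1$ in the solution.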
Consider the level surfaces of the solution of the wave equation discussed in this example. They correspond to a fixed value of $q = q_0$. So, for each moment of time $t$, the disturbance of the air density $u(\mathbf{r}, t)$ has a constant value $h(q_0)$ in the plane $\hat{\mathbf{n}}\cdot\mathbf{r} = ct - q_0 = d(t)$. All planes with different values of the parameter $d$ are parallel as they have the same normal vector $\hat{\mathbf{n}}$. Since here $d(t)$ is a function of time, the plane on which the air density has a fixed value moves along the vector $\hat{\mathbf{n}}$ at the rate $d'(t) = c$. Thus, a disturbance of the air density propagates with speed $c$. This is the reason that the constant $c$ in the wave equation is called the speed of sound. Evidently, the same line of reasoning applies to electromagnetic waves; that is, they move through space at the speed of light. The speed of sound in the air is about 342 meters per second, or about 768 mph. The speed of light is about $3\cdot 10^8$ meters per second, or about 186,000 miles per second. If a lightning strike occurs a mile away during a thunderstorm, it can be seen almost instantaneously, while the thunder will be heard about 5 seconds later. Conversely, if one sees a lightning strike and starts counting seconds until the thunder is heard, then one could estimate the distance to the lightning. The sound travels 1 mile in about 4.7 seconds.

89.3. Study Problems.

Problem 13.5. Find the value of a constant $a$ for which the function

$$u(\mathbf{r}, t) = t^{-3/2}e^{-ar^2/t}, \qquad r = \|\mathbf{r}\|,$$

satisfies the diffusion equation for all $t > 0$.

SOLUTION: Note that $u$ depends on the combination $r^2 = x^2 + y^2 + z^2$. To find the partial derivatives of $u$, it is convenient to use the chain rule:

$$\frac{\partial u}{\partial x} = \frac{\partial u}{\partial r^2}\frac{\partial r^2}{\partial x} = -\frac{a}{t}u\cdot 2x = -\frac{2ax}{t}u,$$
$$u''_{xx} = \frac{\partial}{\partial x}\Big(-\frac{2ax}{t}u\Big) = -\frac{2a}{t}u - \frac{2ax}{t}\frac{\partial u}{\partial x} = \Big(-\frac{2a}{t} + \frac{4a^2x^2}{t^2}\Big)u.$$

To obtain $u''_{yy}$ and $u''_{zz}$, note that $r^2$ is symmetric with respect to permutations of $x$, $y$, and $z$. Therefore, $u''_{yy}$ and $u''_{zz}$ are obtained from $u''_{xx}$ by replacing, in the latter, $x$ by $y$ and $x$ by $z$, respectively.
Hence, the right side of the diffusion equation reads

$$k(u''_{xx} + u''_{yy} + u''_{zz}) = \Big(-\frac{6ka}{t} + \frac{4ka^2r^2}{t^2}\Big)u.$$

Using the product rule to calculate the partial derivative with respect to time, one finds for the left side

$$u'_t = -\frac{3}{2}t^{-5/2}e^{-ar^2/t} + t^{-3/2}e^{-ar^2/t}\,\frac{ar^2}{t^2} = \Big(-\frac{3}{2t} + \frac{ar^2}{t^2}\Big)u.$$

Since both sides must be equal for all values of $t > 0$ and $r^2$, the comparison of the last two expressions yields two conditions: $6ka = 3/2$ (as the equality of the coefficients at $1/t$) and $a = 4ka^2$ (as the equality of the coefficients at $r^2/t^2$). The only common solution of these conditions is $a = 1/(4k)$. □

Problem 13.6. Consider the function

$$f(x, y) = \frac{x^3y - xy^3}{x^2 + y^2} \quad \text{if } (x, y) \ne (0, 0) \text{ and } f(0, 0) = 0.$$

Find $f'_x(x, y)$ and $f'_y(x, y)$ for $(x, y) \ne (0, 0)$. Use the rule (13.1) to find $f'_x(0, 0)$ and $f'_y(0, 0)$ and, thereby, to establish that $f'_x$ and $f'_y$ exist everywhere. Use the rule (13.1) again to show that $f''_{xy}(0, 0) = -1$ and $f''_{yx}(0, 0) = 1$, that is, $f''_{xy}(0, 0) \ne f''_{yx}(0, 0)$. Does this result contradict Clairaut's theorem?

SOLUTION: Using the quotient rule for differentiation, one finds

$$f'_x(x, y) = \frac{x^4y + 4x^2y^3 - y^5}{(x^2 + y^2)^2}, \qquad f'_y(x, y) = \frac{x^5 - 4x^3y^2 - xy^4}{(x^2 + y^2)^2}$$

if $(x, y) \ne (0, 0)$. Note that, owing to the symmetry $f(x, y) = -f(y, x)$, the partial derivative $f'_y$ is obtained from $f'_x$ by changing the sign of the latter and swapping $x$ and $y$. The partial derivatives at $(0, 0)$ are found by the rule (13.1):

$$f'_x(0, 0) = \frac{d}{dx}f(x, 0)\Big|_{x=0} = 0, \qquad f'_y(0, 0) = \frac{d}{dy}f(0, y)\Big|_{y=0} = 0.$$

The first-order partial derivatives are continuous functions (the proof is left to the reader as an exercise). Next, one has

$$f''_{xy}(0, 0) = \frac{d}{dy}f'_x(0, y)\Big|_{y=0} = \lim_{h\to 0}\frac{f'_x(0, h) - f'_x(0, 0)}{h} = \lim_{h\to 0}\frac{-h - 0}{h} = -1,$$
$$f''_{yx}(0, 0) = \frac{d}{dx}f'_y(x, 0)\Big|_{x=0} = \lim_{h\to 0}\frac{f'_y(h, 0) - f'_y(0, 0)}{h} = \lim_{h\to 0}\frac{h - 0}{h} = 1.$$

The result does not contradict Clairaut's theorem because $f''_{xy}(x, y)$ and $f''_{yx}(x, y)$ are not continuous at $(0, 0)$. By using the quotient rule to differentiate $f'_x(x, y)$ with respect to $y$, an explicit form of $f''_{xy}(x, y)$ for $(x, y) \ne (0, 0)$ can be obtained.
By taking the limit of $f''_{xy}(x, y)$ as $(x, y) \to (0, 0)$ along the straight line $(x, y) = (t, at)$, $t \to 0$, one infers that the limit depends on the slope $a$, and hence the two-dimensional limit does not exist, that is, $\lim_{(x,y)\to(0,0)} f''_{xy}(x, y) \ne f''_{xy}(0, 0) = -1$, and $f''_{xy}$ is not continuous at $(0, 0)$. The technical details are left to the reader. □

89.4. Exercises.

(1) Find all second partial derivatives of each of the following functions and verify Clairaut's theorem:
(i) $f(x, y) = \tan^{-1}(xy)$
(ii) $f(x, y, z) = x\sin(zy^2)$
(iii) $f(x, y, z) = x^3 + zy + z^2$
(iv) $f(x, y, z) = (x + y)/(x + 2z)$
(v) f (c,y) =cos-1( c/)
(vi) $f(x, y) = x^y$

(2) Explain without explicit calculation of higher-order partial derivatives that the hypothesis of Clairaut's theorem is satisfied for the following functions:
(i) $f(x, y, z) = \sin(x^2 + y - z)\cos(xy)$
(ii) $f(x, y) = \sin(x + y^2)/(x^2 + y^2)$, $x^2 + y^2 \ne 0$
(iii) $f(x, y, z) = e^{x^2yz}(y^2 + zx^4)$
(iv) $f(x, y) = \ln(1 + x^2 + y^4)/(x^2 - y^2)$, $x^2 \ne y^2$
(v) $f(x, y, z) = (x + yz^2 - xz^5)/(1 + x^2y^2z^4)$

(3) Find the indicated partial derivatives of each of the following functions:
(ii) $f(x, y, z) = x\cos(yx) + z^3$, fZ f/§llz yy
( 1) f( ,Y z) =51( y e o / 5 (
(iv) $f(x, y, z, t) = \sin(x + 2y + 3z - 4t)$, $f''''_{abcd}$, where $abcd$ denotes all permutations of $xyzt$
(v) $f(x, y) = e^{xy}(y^2 + x)$, $f''''_{abcd}$, where $abcd$ denotes all permutations of $xxxy$
(vi) f (x, y, z)= tan-11 x x z fab, where $abc$ denotes all permutations of $xyz$
(vii) $f(x, y, z, t) = \ln((x - y)^2 + (z - t)^2)^{-1/2}$, $f''''_{abcd}$, where $abcd$ are all permutations of $xyzt$
(viii) $f(x, y) = e^x\sin(y)$, aa4(0,0)

(4) Given partial derivatives, find the function or show that it does not exist:
(i) $f'_x = 3x^2y$, $f'_y = x^3 + 3y^2$
(ii) $f'_x = yz + 3x^2$, $f'_y = xz + 4y$, $f'_z = xy + 1$
(iv) f/1 = cy + z, fA1 -9/2 fz x + y
(v) $f'_x = \sin(xy) + xy\cos(xy)$, $f'_y = x^2\cos(xy) + 1$

(5) Verify that a given function is a solution of the indicated differential equation:
(i) $f(t, x) = A\sin(ct - x) + B\cos(ct + x)$, $c^{-2}f''_{tt} - f''_{xx} = 0$
(ii) $f(x, y, t) = g(ct - ax - by) + h(ct + ax + by)$, $f''_{tt} = c^2(f''_{xx} + f''_{yy})$ if $a^2 + b^2 = 1$ and $g$ and $h$ are twice differentiable functions

(6) Find a relation between the constants $a$, $b$, and $c$ such that the function $u(x, y, t) = \sin(ax + by + ct)$ satisfies the wave equation $u''_{tt} - u''_{xx} - u''_{yy} = 0$. Give a geometrical description of such a relation, for example, by setting values of $c$ on a vertical axis and the values of $a$ and $b$ on two horizontal axes.

(7) Let $f(x, y, z) = u(t)$, where $t = xyz$. Show that $f'''_{xyz} = F(t)$ and find $F(t)$.

(8) Find $(f'_x)^2 + (f'_y)^2 + (f'_z)^2$ and $f''_{xx} + f''_{yy} + f''_{zz}$ if
(i) $f = x^3 + y^3 + z^3 - 3xyz$
(ii) $f = (x^2 + y^2 + z^2)^{-1/2}$

(9) Let the action of $K$ on a function $f$ be defined by $Kf = xf'_x + yf'_y$. Find $Kf$, $K^2f = K(Kf)$, and $K^3f = K(K^2f)$ if
(i) $f = x/(x^2 + y^2)$
(ii) f = In x2 + y2

(10) Let $f(x, y) = xy/(x^2 + y^2)$ if $(x, y) \ne (0, 0)$ and f(0, 0). Does f"'(0,0) exist?

(11) If $f = f(x, y)$ and $g = g(x, y, z)$, solve the following equations: (i) f" = 0 (ii) f" = 0 (iii) T"f /&y" = 0 (iv) g""z = 0

(12) Find $f(x, y)$ that satisfies: (i) f' = X2 + 2y, f(X, x2) 1 (ii) f" = 4, f (x , 0) = 2, f'l(x, 0) = x (iii) f"11 = x + y, f (X, 0) = X, f (0, y) = y 2

90. Linearization of Multivariable Functions

A differentiable one-variable function $f(x)$ can be approximated near $x = x_0$ by its linearization $L(x) = f(x_0) + f'(x_0)(x - x_0)$ or the tangent line. Put $x = x_0 + \Delta x$.
Then, by the definition of the derivative $f'(x_0)$,

$$\lim_{\Delta x\to 0}\frac{f(x) - L(x)}{\Delta x} = \lim_{\Delta x\to 0}\frac{f(x_0 + \Delta x) - f(x_0)}{\Delta x} - f'(x_0) = 0.$$

This relation implies that the error of the linear approximation goes to 0 faster than the deviation $\Delta x = x - x_0$ of $x$ from $x_0$, that is,

$$f(x) = L(x) + \varepsilon(\Delta x)\,\Delta x, \quad \text{where } \varepsilon(\Delta x) \to 0 \text{ as } \Delta x \to 0. \qquad (13.5)$$

For example, if $f(x) = x^2$, then its linearization at $x = 1$ is $L(x) = 1 + 2(x - 1)$. It follows that $f(1 + \Delta x) - L(1 + \Delta x) = (\Delta x)^2$ or $\varepsilon(\Delta x) = \Delta x$. Conversely, consider a line through the point $(x_0, f(x_0))$ and assume that the condition (13.5) holds. If $n$ is the slope of the line, then $L(x) = f(x_0) + n(x - x_0) = f(x_0) + n\Delta x$ and

$$\lim_{\Delta x\to 0}\frac{f(x) - L(x)}{\Delta x} = \lim_{\Delta x\to 0}\frac{f(x_0 + \Delta x) - f(x_0)}{\Delta x} - n = 0.$$

By the definition of the derivative $f'(x_0)$, the existence of this limit implies the existence of $f'(x_0)$ and the equality $n = f'(x_0)$. Thus, among all linear approximations of $f$ near $x_0$, only the line with the slope $n = f'(x_0)$ is a good approximation in the sense that the error of the approximation decreases faster than $\Delta x$ with decreasing $\Delta x$, and the very existence of a good linear approximation at $x = x_0$ is equivalent to differentiability of $f$ at $x_0$. For example, the function $f(x) = |x|$ is not differentiable at $x = 0$. A good linear approximation does not exist at $x_0 = 0$. Indeed, here $\Delta x = x$ and $L(x) = nx$. Hence, $(f(x) - L(x))/\Delta x = (|x| - nx)/x = |x|/x - n$, and no number $n$ exists at which this difference vanishes in the limit $x \to 0$.

90.1. Differentiability of Multivariable Functions. Consider a function of two variables $f(x, y)$ and a point $(x_0, y_0)$ in its domain. The most general linear function $L(x, y)$ with the property $L(x_0, y_0) = f(x_0, y_0)$ reads $L(x, y) = f(x_0, y_0) + n_1(x - x_0) + n_2(y - y_0)$, where $n_1$ and $n_2$ are arbitrary numbers. It defines a linear approximation to $f(x, y)$ near $(x_0, y_0)$ in the sense that $L(x_0, y_0) = f(x_0, y_0)$.
More generally, given a multivariable function $f(\mathbf{r})$, a linear function $L(\mathbf{r}) = f(\mathbf{r}_0) + \mathbf{n}\cdot(\mathbf{r} - \mathbf{r}_0)$ is said to be a linear approximation to $f$ near $\mathbf{r}_0$ in the sense that $L(\mathbf{r}_0) = f(\mathbf{r}_0)$. The dot product is defined in an $m$-dimensional Euclidean space if $f$ is a function of $m$ variables. The vector $\mathbf{n}$ is an arbitrary vector so that $L(\mathbf{r})$ is the most general linear function satisfying the condition $L(\mathbf{r}_0) = f(\mathbf{r}_0)$. Note that in the case of two variables $x_1 = x$ and $x_2 = y$, $\mathbf{n} = (n_1, n_2)$ and $\mathbf{r} - \mathbf{r}_0 = (x - x_0, y - y_0)$ so that $\mathbf{n}\cdot(\mathbf{r} - \mathbf{r}_0) = n_1(x - x_0) + n_2(y - y_0)$.

DEFINITION 13.17. (Differentiable Functions). The function $f$ of several variables $\mathbf{r} = (x_1, x_2, ..., x_m)$ on an open set $D$ is said to be differentiable at a point $\mathbf{r}_0 \in D$ if there exists a good linear approximation $L(\mathbf{r})$, i.e., a linear approximation $L(\mathbf{r})$ for which

$$\lim_{\mathbf{r}\to\mathbf{r}_0}\frac{f(\mathbf{r}) - L(\mathbf{r})}{\|\mathbf{r} - \mathbf{r}_0\|} = 0. \qquad (13.6)$$

If $f$ is differentiable at all points of $D$, then $f$ is said to be differentiable on $D$.

By this definition, the differentiability of a function is independent of the coordinate system chosen to label points of $D$ (a linear function remains linear under general rotations and translations of the coordinate system and the distance $\|\mathbf{r} - \mathbf{r}_0\|$ is also invariant under these transformations). For functions of a single variable $f(x)$, the existence of a linear approximation at $x_0$ with the property (13.6) is equivalent to the existence of the derivative $f'(x_0)$. Indeed, put $x - x_0 = \Delta x$. Then $[f(x) - L(x)]/|\Delta x| = \pm[f(x) - L(x)]/\Delta x$ for all $\Delta x \ne 0$. Therefore, the condition (13.6) is equivalent to (13.5), which, in turn, is equivalent to the existence of $f'(x_0)$ as argued above.

THEOREM 13.10. A linear approximation $L$ to a multivariable function $f$ near a point $\mathbf{r}_0$ that satisfies the property (13.6) is unique if it exists.

PROOF. Let $L_1(\mathbf{r}) = f(\mathbf{r}_0) + \mathbf{n}_1\cdot(\mathbf{r} - \mathbf{r}_0)$ and $L_2(\mathbf{r}) = f(\mathbf{r}_0) + \mathbf{n}_2\cdot(\mathbf{r} - \mathbf{r}_0)$ be two linear approximations that satisfy the condition (13.6) for which $\mathbf{n}_2 \ne \mathbf{n}_1$.
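Making use of the identity $L_2(\mathbf{r}) - L_1(\mathbf{r}) = [f(\mathbf{r}) - L_1(\mathbf{r})] - [f(\mathbf{r}) - L_2(\mathbf{r})]$, it is concluded that

$$\lim_{\mathbf{r}\to\mathbf{r}_0}\frac{L_2(\mathbf{r}) - L_1(\mathbf{r})}{\|\mathbf{r} - \mathbf{r}_0\|} = \lim_{\mathbf{r}\to\mathbf{r}_0}\frac{f(\mathbf{r}) - L_1(\mathbf{r})}{\|\mathbf{r} - \mathbf{r}_0\|} - \lim_{\mathbf{r}\to\mathbf{r}_0}\frac{f(\mathbf{r}) - L_2(\mathbf{r})}{\|\mathbf{r} - \mathbf{r}_0\|} = 0.$$

Note that owing to the existence of the limit (13.6) for both linear functions $L_1$ and $L_2$, the limit of the difference equals the difference of the limits (the basic law of limits). On the other hand, $L_2(\mathbf{r}) - L_1(\mathbf{r}) = (\mathbf{n}_2 - \mathbf{n}_1)\cdot(\mathbf{r} - \mathbf{r}_0)$. Put $\mathbf{n} = \mathbf{n}_2 - \mathbf{n}_1$. By assumption, $\|\mathbf{n}\| \ne 0$. Then

$$0 = \lim_{\mathbf{r}\to\mathbf{r}_0}\frac{L_2(\mathbf{r}) - L_1(\mathbf{r})}{\|\mathbf{r} - \mathbf{r}_0\|} = \lim_{\mathbf{r}\to\mathbf{r}_0}\frac{\mathbf{n}\cdot(\mathbf{r} - \mathbf{r}_0)}{\|\mathbf{r} - \mathbf{r}_0\|} = \lim_{\mathbf{r}\to\mathbf{0}}\frac{\mathbf{n}\cdot\mathbf{r}}{\|\mathbf{r}\|}.$$

If a multivariable limit exists, then its value does not depend on a path along which the limit point is approached. In particular, take the straight line parallel to $\mathbf{n}$, $\mathbf{r} = \mathbf{n}t$, $t \to 0^+$, in the above relation. Then along this line, $\mathbf{r}/\|\mathbf{r}\| = \mathbf{n}/\|\mathbf{n}\|$ and hence

$$0 = \lim_{t\to 0^+}\frac{\mathbf{n}\cdot\mathbf{n}}{\|\mathbf{n}\|} = \|\mathbf{n}\| \implies \mathbf{n} = \mathbf{0},$$

which is a contradiction. Thus, a good linear approximation is unique if it exists, $L_1(\mathbf{r}) = L_2(\mathbf{r})$. □

90.2. Differentiability and Partial Derivatives. In the one-variable case, a function $f(x)$ is differentiable at $x_0$ if and only if it has the derivative $f'(x_0)$. Also, the existence of the derivative at $x_0$ implies continuity at $x_0$ (recall Calculus I). In the multivariable case, the relations between differentiability, continuity, and the existence of partial derivatives are more subtle.

THEOREM 13.11. (Properties of Differentiable Functions). If $f$ is differentiable at a point $\mathbf{r}_0$, then it is continuous at $\mathbf{r}_0$ and its partial derivatives exist at $\mathbf{r}_0$.

PROOF. The property (13.6) requires that the error of the linear approximation decrease faster than the distance $\|\mathbf{r} - \mathbf{r}_0\|$:

$$f(\mathbf{r}) = L(\mathbf{r}) + \varepsilon(\mathbf{r})\|\mathbf{r} - \mathbf{r}_0\|, \quad \text{where } \varepsilon(\mathbf{r}) \to 0 \text{ as } \mathbf{r} \to \mathbf{r}_0. \qquad (13.7)$$

A linear function is continuous (it is a polynomial of degree 1). Therefore, $L(\mathbf{r}) \to L(\mathbf{r}_0) = f(\mathbf{r}_0)$ as $\mathbf{r} \to \mathbf{r}_0$. By taking the limit $\mathbf{r} \to \mathbf{r}_0$ in (13.7), it is concluded that $f(\mathbf{r}) \to f(\mathbf{r}_0)$. Hence, $f$ is continuous at $\mathbf{r}_0$.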
If the multivariable limit (13.6) exists, then it does not depend on a path approaching the limit point. In particular, take a straight line parallel to the $j$th coordinate axis. If $\hat{\mathbf{e}}_j$ is the unit vector parallel to this axis, then a vector equation of the line is $\mathbf{r} = \mathbf{r}(t) = \mathbf{r}_0 + t\hat{\mathbf{e}}_j$. Then $\|\mathbf{r} - \mathbf{r}_0\| = |t| \to 0$ as $t \to 0$ along the line, and $f(\mathbf{r}(t)) - L(\mathbf{r}(t)) = f(\mathbf{r}_0 + t\hat{\mathbf{e}}_j) - f(\mathbf{r}_0) - \mathbf{n}\cdot\hat{\mathbf{e}}_j\,t = f(\mathbf{r}_0 + t\hat{\mathbf{e}}_j) - f(\mathbf{r}_0) - n_j t$, where $n_j$ is the $j$th component of the vector $\mathbf{n}$. By the same reasoning as in the one-variable case with $\Delta x = t$ (given after Definition 13.17), the condition (13.6) implies

$$\lim_{t\to 0}\frac{f(\mathbf{r}_0 + t\hat{\mathbf{e}}_j) - f(\mathbf{r}_0)}{t} - n_j = 0 \implies n_j = f'_{x_j}(\mathbf{r}_0)$$

according to Definition 13.16 of partial derivatives at a point. The existence of partial derivatives is guaranteed by the existence of the limit (13.6). □

The following important remarks are in order. In contrast to the one-variable case, the existence of partial derivatives at a point does not generally imply continuity at that point.

EXAMPLE 13.24. Consider the function

$$f(x, y) = \begin{cases} \dfrac{xy}{x^2 + y^2} & \text{if } (x, y) \ne (0, 0), \\ 0 & \text{if } (x, y) = (0, 0). \end{cases}$$

Show that this function is not continuous at $(0, 0)$, but that the partial derivatives $f'_x(0, 0)$ and $f'_y(0, 0)$ exist.

SOLUTION: In order to check the continuity, one has to calculate the limit $\lim_{(x,y)\to(0,0)} f(x, y)$. If it exists and equals $f(0, 0) = 0$, then the function is continuous at $(0, 0)$. This limit does not exist. Along lines $(x, y) = (t, at)$, the function has constant value $f(t, at) = at^2/(t^2 + a^2t^2) = a/(1 + a^2)$ and hence, if $a \ne 0$, does not approach $f(0, 0) = 0$ as $t \to 0^+$. To find the partial derivatives in question, note that $f(x, 0) = 0$ for all $x$, which implies that its rate along the $x$ axis vanishes, $f'_x(x, 0) = 0$. Similarly, the function vanishes on the $y$ axis, $f(0, y) = 0$, and hence $f'_y(0, y) = 0$. In particular, the partial derivatives exist at the origin, $f'_x(0, 0) = f'_y(0, 0) = 0$.
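The behavior found in Example 13.24 can also be observed numerically. The following minimal sketch evaluates the function along a few lines $(x, y) = (t, at)$ and computes the difference quotients of Definition 13.16 at the origin; the sample slopes and step sizes are arbitrary choices, and the helper name `partial_at_origin` is hypothetical.

```python
def f(x, y):
    """The function of Example 13.24."""
    if x == 0.0 and y == 0.0:
        return 0.0
    return x * y / (x ** 2 + y ** 2)


def partial_at_origin(func, index, h):
    """Difference quotient (func(h * e_i) - func(0, 0)) / h at the origin."""
    point = [0.0, 0.0]
    point[index] = h
    return (func(*point) - func(0.0, 0.0)) / h


# Along the line (x, y) = (t, a*t) the values stay at a / (1 + a^2) as t -> 0+,
# so different slopes a give different limiting values.
for a in (0.0, 1.0, 2.0):
    values = [f(t, a * t) for t in (1e-1, 1e-4, 1e-8)]
    print(a, values, a / (1 + a ** 2))

# The difference quotients at the origin vanish for every h != 0,
# so both partial derivatives at (0, 0) exist and equal 0.
quotients = [partial_at_origin(f, i, h) for i in (0, 1) for h in (1e-3, -1e-6)]
print(quotients)
```

The two printouts show side by side that the partial derivatives only probe the function along the coordinate axes, whereas continuity involves all paths to the origin.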
This example shows that both the continuity of a function and the existence of its partial derivatives at a point are necessary conditions for differentiability of the function at that point. In contrast to the one-variable case, they are not sufficient; that is, the converse of Theorem 13.11 is false. A good linear approximation in the sense of (13.6) (or (13.7)) may not exist even if a function is continuous and has partial derivatives at a point.

EXAMPLE 13.25. Let

$$f(x, y) = \begin{cases} \dfrac{xy}{\sqrt{x^2 + y^2}} & \text{if } (x, y) \ne (0, 0), \\ 0 & \text{if } (x, y) = (0, 0). \end{cases}$$

Show that $f$ is continuous at $(0, 0)$ and has the partial derivatives $f'_x(0, 0)$ and $f'_y(0, 0)$, but it is not differentiable at $(0, 0)$.

SOLUTION: The continuity is verified by the squeeze principle. Put $r = \sqrt{x^2 + y^2}$. Then $|xy| = |x||y|$ … $\lim_{(x,y)\to(0,0)} f'_x(x, y)$ does not exist, which means that the partial derivative $f'_x(x, y)$ is not continuous at the origin. Owing to the symmetry $f(x, y) = f(y, x)$, the same conclusion holds for $f'_y(x, y)$. □

90.5. Exercises.

(1) Let $f(x, y) = xy^2$ if $(x, y) \ne (0, 0)$ and $f(0, 0) = 1$ and let g(cc,y)= cc I+ v/y. Are the functions $f$ and $g$ differentiable at $(0, 0)$?

(2) Let $f(x, y) = xy^2/(x^2 + y^2)$ if $x^2 + y^2 \ne 0$ and $f(0, 0) = 0$. Show that $f$ is continuous and has bounded partial derivatives $f'_x$ and $f'_y$, but it is not differentiable at $(0, 0)$. Investigate the continuity of the partial derivatives near $(0, 0)$.

(3) Show that the function f (x, y) = y is continuous at $(0, 0)$ and has the partial derivatives $f'_x(0, 0)$ and $f'_y(0, 0)$, but it is not differentiable at $(0, 0)$. Investigate the continuity of the partial derivatives $f'_x$ and $f'_y$ near the origin.

(4) Let $f(x, y) = x^3/(x^2 + y^2)$ if $(x, y) \ne (0, 0)$ and $f(0, 0) = 0$. Show that $f$ is continuous, has partial derivatives at $(0, 0)$, but is not differentiable at $(0, 0)$.
(5) Find the domain in which the following functions are differentiable: (i) f (x, y) = y/ (ii) f (x, y) =xy (iii) f(x, y, z)= sin(xy + z)ezY (iv) f (X, y, z)= =v 2 + y2 - z2 (v) f (r) = ln(1 - |r||), where r = (zi, X2, ..., gym) (vi) f (X, y) = (z__3 (vii) f (r) = e-1/llr|P if r / 0 and f (0) = 0, where r= (zi, x2, ..., xzm) Hint: Show fe ,(0) = 0. Investigate the continuity of the partial deriva- tives. (6) The line through a point Po of a surface perpendicular to the tan- gent plane at Po is called the normal line. Find an equation of the tangent plane and symmetric equations of the normal line to each of the following surfaces at the specified point: (i) z = X2 + 3y - y4x, (1,2,-1) (ii) z = /3y, (1,4,2) (iii) z = y ln(x2 - 3y), (2, 1, 0) (iv) y = tan-1(zz2 _1(i,-) (v) x = z cos~y - z), (1, 1,1) (vi) z = y + ln(z/z), (1, 1, 1) (7) Find the linearization of each of the following functions at the specified point: (i) f (x, y) _= g, (0, 0) (ii) f(x, y, z) = z1/3 /c + cos2(y), (0, 0, 1) (iii) f(r) =sin(n - r), r =ro, where n is a fixed vector orthogonal to ro and r =(zci, xc2, ..., ccm) (8) Use the linearization to approximate the following numbers. Then use a calculator to find the numbers. Compare the results.  91. CHAIN RULES AND IMPLICIT DIFFERENTIATION 221 (i) /20 - 7x2 - x2, where (x, y) = (1.08, 1.95) (ii) zy2z3, where (x, y, z) = (1.002, 2.003, 3.004) (iii) 3(1.03)2 0.98 V(1.05)2 (iv) (0.97)1.05 (9) Consider the equation f(x, y, z) = 0 that has a root z = z(x, y) for every fixed pair (x, y). Suppose that f(xo, yo, zo) = 0 and f is differentiable at (xo,yo,zo) so that f (xo,yo,zo) / 0. If L(x, y, z) is the linearization of f at (zo, yo, zo), the equation L(x, y, z) = 0 is a called a linearization of the equation f(x, y, z) = 0. Its solution determines an approximation to the root z = z(x, y) near (zo, yo). Find this approximation, and use the result to solve the equation yz ln(1 + zz) - x ln(1 + zy) = 0 for z = z(x, y) near the point (1, 1, 1). 
In particular, estimate the root $z$ at $(x, y) = (0.8, 1.1)$.
(10) Suppose that a function $f(x, y)$ is continuous with respect to $x$ at each fixed $y$ and has a bounded partial derivative $f'_y(x, y)$, that is, $|f'_y(x, y)| \le M$ for some $M > 0$ and all $(x, y)$. Prove that $f$ is continuous.

91. Chain Rules and Implicit Differentiation

91.1. Chain Rules. Consider the function $f(x, y) = x^3 + xy^2$ whose domain is the entire plane. Points of the plane can be labeled in a different way. For example, the polar coordinates $x = r\cos\theta$, $y = r\sin\theta$ may be viewed as a rule that assigns an ordered pair $(x, y)$ to an ordered pair $(r, \theta)$. Using this rule, the function can be expressed in the new variables as
$$f(r\cos\theta, r\sin\theta) = r^3\cos\theta = F(r, \theta).$$
One can compute the rates of change of $f$ with respect to the new variables:
$$\frac{\partial f}{\partial r} = \frac{\partial F}{\partial r} = 3r^2\cos\theta, \qquad \frac{\partial f}{\partial \theta} = \frac{\partial F}{\partial \theta} = -r^3\sin\theta.$$
Alternatively, these rates can be computed as
$$\frac{\partial f}{\partial r} = \frac{\partial f}{\partial x}\frac{\partial x}{\partial r} + \frac{\partial f}{\partial y}\frac{\partial y}{\partial r} = (3x^2 + y^2)\cos\theta + 2xy\sin\theta = 3r^2\cos\theta,$$
$$\frac{\partial f}{\partial \theta} = \frac{\partial f}{\partial x}\frac{\partial x}{\partial \theta} + \frac{\partial f}{\partial y}\frac{\partial y}{\partial \theta} = -(3x^2 + y^2)\,r\sin\theta + 2xy\,r\cos\theta = -r^3\sin\theta,$$
where $x$ and $y$ have been expressed in the polar coordinates to obtain the final expressions. The latter relations are an example of a chain rule for functions of two variables. Suppose that the rates $f'_x(x_0, y_0)$ and $f'_y(x_0, y_0)$ are known at a particular point $(x_0, y_0)$. Then, by using the chain rule, an explicit form of the function $f$ in the new variables is not required to find its rates with respect to the new variables because the rates $x'_r$, $x'_\theta$, $y'_r$, and $y'_\theta$ at $(r_0, \theta_0)$ corresponding to $(x_0, y_0)$ can easily be computed.

On the other hand, consider the function $f(x, y) = y^3/(x^2 + y^2)$ if $(x, y) \ne (0, 0)$ and $f(0, 0) = 0$. This function is continuous at the origin (if $R^2 = x^2 + y^2$, then $|f(x, y) - f(0, 0)| = |y|^3/R^2 \le R^3/R^2 = R \to 0$ as $R \to 0$). It has partial derivatives at the origin. Indeed, $f(x, 0) = 0$, and hence $f'_x(x, 0) = 0$ for any $x$ and, in particular, $f'_x(0, 0) = 0$.
Similarly, $f(0, y) = y$, and hence $f'_y(0, y) = 1$ so that $f'_y(0, 0) = 1$. Let $x = t\cos\theta$ and $y = t\sin\theta$, where $\theta$ is a numerical parameter. Then $F(t) = f(t\cos\theta, t\sin\theta) = t^3\sin^3\theta/t^2 = t\sin^3\theta$. Therefore, $F'(t) = \sin^3\theta$. This implies that $F'(0) = \sin^3\theta$. However, the chain rule fails:
$$F'(0) = \frac{df}{dt}\Big|_{t=0} \ne f'_x(0,0)\,x'(0) + f'_y(0,0)\,y'(0) = \sin\theta.$$
It is not difficult to verify that the chain rule $df/dt = f'_x x'(t) + f'_y y'(t)$ is true for all $t \ne 0$. The reader is advised to verify that the function is not differentiable at $(0, 0)$ (see Study Problem 13.12), which is the reason the chain rule is not valid at that point. It appears that, in contrast to the one-variable case, the mere existence of partial derivatives is not sufficient to validate the chain rule in the multivariable case, and a stronger condition on $f$ is required.

THEOREM 13.13. (Chain Rule). Let $f$ be a function of $n$ variables $\mathbf{r} = (x_1, x_2, ..., x_n)$. Suppose that each variable $x_i$ is, in turn, a function of $m$ variables $\mathbf{u} = (u_1, u_2, ..., u_m)$. The composition of $x_i = x_i(\mathbf{u})$ with $f(\mathbf{r})$ defines $f$ as a function of $\mathbf{u}$. If the functions $x_i$ are differentiable at a point $\mathbf{u}$ and $f$ is differentiable at the point $\mathbf{r} = (x_1(\mathbf{u}), x_2(\mathbf{u}), ..., x_n(\mathbf{u}))$, then the rate of change of $f$ with respect to $u_j$, $j = 1, 2, ..., m$, reads
$$\frac{\partial f}{\partial u_j} = \frac{\partial f}{\partial x_1}\frac{\partial x_1}{\partial u_j} + \frac{\partial f}{\partial x_2}\frac{\partial x_2}{\partial u_j} + \cdots + \frac{\partial f}{\partial x_n}\frac{\partial x_n}{\partial u_j} = \sum_{i=1}^{n}\frac{\partial f}{\partial x_i}\frac{\partial x_i}{\partial u_j}.$$

PROOF. Since the functions $x_i(\mathbf{u})$ are differentiable, the partial derivatives $\partial x_i/\partial u_k$ exist and define a good linear approximation in the sense of (13.7). In particular, for a fixed value of $\mathbf{u}$ and for every $i$,
$$\Delta x_i = x_i(\mathbf{u} + \hat{\mathbf{e}}_k h) - x_i(\mathbf{u}) = \frac{\partial x_i}{\partial u_k}\,h + \varepsilon_i(h)|h|, \qquad \varepsilon_i(h) \to 0 \ \text{as}\ h \to 0.$$
Define the vector $\Delta\mathbf{r}_h = (\Delta x_1, \Delta x_2, ..., \Delta x_n)$. It has the property that $\Delta\mathbf{r}_h \to \mathbf{0}$ as $h \to 0$. If $F(\mathbf{u}) = f(x_1(\mathbf{u}), x_2(\mathbf{u}), ..., x_n(\mathbf{u}))$, then, by the definition of the partial derivatives at a point $\mathbf{u}$,
$$\frac{\partial f}{\partial u_k} = \lim_{h\to 0}\frac{F(\mathbf{u} + \hat{\mathbf{e}}_k h) - F(\mathbf{u})}{h} = \lim_{h\to 0}\frac{f(\mathbf{r} + \Delta\mathbf{r}_h) - f(\mathbf{r})}{h}$$
if the limit exists.
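By the hypothesis, the function $f$ is differentiable and hence has partial derivatives $\partial f/\partial x_i$ at the point $\mathbf{r} = (x_1(\mathbf{u}), x_2(\mathbf{u}), ..., x_n(\mathbf{u}))$ that determine a good linear approximation (13.8) in the sense of (13.7):
$$f(\mathbf{r} + \Delta\mathbf{r}_h) - f(\mathbf{r}) = \frac{\partial f}{\partial x_1}\Delta x_1 + \frac{\partial f}{\partial x_2}\Delta x_2 + \cdots + \frac{\partial f}{\partial x_n}\Delta x_n + \varepsilon(\Delta\mathbf{r}_h)\|\Delta\mathbf{r}_h\|,$$
where $\varepsilon(\Delta\mathbf{r}_h) \to 0$ as $\Delta\mathbf{r}_h \to \mathbf{0}$ or as $h \to 0$. The substitution of this relation into the limit shows that the limit exists, and the conclusion of the theorem follows. Indeed, the first $n$ terms contain the limits
$$\lim_{h\to 0}\frac{\Delta x_i}{h} = \frac{\partial x_i}{\partial u_k} + \lim_{h\to 0}\varepsilon_i(h)\frac{|h|}{h} = \frac{\partial x_i}{\partial u_k}$$
because $|h|/h = \pm 1$ for all $h \ne 0$ and $\varepsilon_i(h) \to 0$ as $h \to 0$. The ratio
$$\frac{\|\Delta\mathbf{r}_h\|}{|h|} = \left[(\Delta x_1/h)^2 + (\Delta x_2/h)^2 + \cdots + (\Delta x_n/h)^2\right]^{1/2} \to M < \infty$$
as $h \to 0$, where $M$ is determined by the partial derivatives $\partial x_i/\partial u_k$. Therefore, the limit of the last term vanishes:
$$\lim_{h\to 0}\frac{\varepsilon(\Delta\mathbf{r}_h)\|\Delta\mathbf{r}_h\|}{h} = \lim_{h\to 0}\varepsilon(\Delta\mathbf{r}_h)\frac{|h|}{h}\cdot\lim_{h\to 0}\frac{\|\Delta\mathbf{r}_h\|}{|h|} = 0\cdot M = 0$$
because $\varepsilon(\Delta\mathbf{r}_h)|h|/h = \pm\varepsilon(\Delta\mathbf{r}_h)$ if $h \ne 0$ and $\varepsilon(\Delta\mathbf{r}_h) \to 0$ as $h \to 0$. □

It is clear from the proof that the partial derivatives $\partial f/\partial x_i$ in the chain rule are taken at the point $(x_1(\mathbf{u}), x_2(\mathbf{u}), ..., x_n(\mathbf{u}))$. For $n = m = 1$, this is the familiar chain rule for functions of one variable $df/du = f'(x)x'(u)$. If $n = 1$ and $m > 1$, it is the chain rule (13.2) established earlier. The example of polar coordinates corresponds to the case $n = m = 2$, where $\mathbf{r} = (x, y)$ and $\mathbf{u} = (r, \theta)$.

EXAMPLE 13.29. Let a function $f(x, y, z)$ be differentiable at $\mathbf{r}_0 = (1, 2, 3)$ and have the following rates of change: $f'_x(\mathbf{r}_0) = 1$, $f'_y(\mathbf{r}_0) = 2$, and $f'_z(\mathbf{r}_0) = -2$. Suppose that $x = x(t, s) = t^2 s$, $y = y(t, s) = s + t$, and $z = z(t, s) = 3s$. Find the rates of change of $f$ with respect to $t$ and $s$ at the point $\mathbf{r}_0$.

SOLUTION: In the chain rule, put $\mathbf{r} = (x, y, z)$ and $\mathbf{u} = (t, s)$. The point $\mathbf{r}_0 = (1, 2, 3)$ corresponds to the point $\mathbf{u}_0 = (1, 1)$ in the new variables. Note that $z = 3$ gives $3s = 3$ and hence $s = 1$. Then, from $y = 2$, it follows that $s + t = 2$ or $1 + t = 2$ or $t = 1$. Also, $x(1, 1) = 1$ as required.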
The partial derivatives of the old variables with respect to the new ones are $x'_t = 2ts$, $y'_t = 1$, $z'_t = 0$, $x'_s = t^2$, $y'_s = 1$, and $z'_s = 3$. They are continuous functions and hence $x(t, s)$, $y(t, s)$, and $z(t, s)$ are differentiable by Theorem 13.12. By the chain rule,
$$f'_t(\mathbf{r}_0) = f'_x(\mathbf{r}_0)x'_t(\mathbf{u}_0) + f'_y(\mathbf{r}_0)y'_t(\mathbf{u}_0) + f'_z(\mathbf{r}_0)z'_t(\mathbf{u}_0) = 1\cdot 2 + 2\cdot 1 + (-2)\cdot 0 = 4,$$
$$f'_s(\mathbf{r}_0) = f'_x(\mathbf{r}_0)x'_s(\mathbf{u}_0) + f'_y(\mathbf{r}_0)y'_s(\mathbf{u}_0) + f'_z(\mathbf{r}_0)z'_s(\mathbf{u}_0) = 1\cdot 1 + 2\cdot 1 + (-2)\cdot 3 = -3.$$
□

EXAMPLE 13.30. Let $f(x, y, z) = z^2(1 + x^2 + 2y^2)^{-1}$. Find the rate of change of $f$ along the curve $\mathbf{r}(t) = (\sin t, \cos t, e^t)$ in the direction of increasing $t$.

SOLUTION: The function $f$ is differentiable as the ratio of two polynomials (its partial derivatives are continuous):
$$f'_x = -\frac{2xz^2}{(1 + x^2 + 2y^2)^2}, \qquad f'_y = -\frac{4yz^2}{(1 + x^2 + 2y^2)^2}, \qquad f'_z = \frac{2z}{1 + x^2 + 2y^2}.$$
The components of $\mathbf{r}(t)$ are also differentiable: $x'(t) = \cos t$, $y'(t) = -\sin t$, $z'(t) = e^t$. By the chain rule for $n = 3$ and $m = 1$,
$$\frac{df}{dt} = f'_x(\mathbf{r}(t))x'(t) + f'_y(\mathbf{r}(t))y'(t) + f'_z(\mathbf{r}(t))z'(t)$$
$$= -\frac{2e^{2t}\sin t}{(2 + \cos^2 t)^2}(\cos t) - \frac{4e^{2t}\cos t}{(2 + \cos^2 t)^2}(-\sin t) + \frac{2e^t}{2 + \cos^2 t}\,e^t = \frac{e^{2t}\,(5 + \sin(2t) + \cos(2t))}{(2 + \cos^2 t)^2},$$
where $2\sin t\cos t = \sin(2t)$ and $2\cos^2 t = 1 + \cos(2t)$ have been used. □

The chain rule can be used to calculate higher-order partial derivatives.

EXAMPLE 13.31. If $g(u, v) = f(x, y)$, where $x = (u^2 - v^2)/2$ and $y = uv$, find $g''_{uv}$. Assume that $f$ has continuous second partial derivatives. If $f'_y(1, 2) = 1$, $f''_{xx}(1, 2) = f''_{yy}(1, 2) = 2$, and $f''_{xy}(1, 2) = 3$, find the value of $g''_{uv}$ at $(x, y) = (1, 2)$.

SOLUTION: One has $x'_u = u$, $x'_v = -v$, $y'_u = v$, and $y'_v = u$. Then
$$g'_u = f'_x x'_u + f'_y y'_u = f'_x u + f'_y v.$$
The derivative $g''_{uv} = (g'_u)'_v$ is calculated by applying the chain rule to the function $g'_u$:
$$g''_{uv} = u(f'_x)'_v + v(f'_y)'_v + f'_y = u(f''_{xx}x'_v + f''_{xy}y'_v) + v(f''_{yx}x'_v + f''_{yy}y'_v) + f'_y$$
$$= u(-vf''_{xx} + uf''_{xy}) + v(-vf''_{xy} + uf''_{yy}) + f'_y = uv(f''_{yy} - f''_{xx}) + (u^2 - v^2)f''_{xy} + f'_y = y(f''_{yy} - f''_{xx}) + 2xf''_{xy} + f'_y,$$
where $f''_{xy} = f''_{yx}$ has been used. The value of $g''_{uv}$ at the point in question is $2\cdot(2 - 2) + 2\cdot 3 + 1 = 7$. □

91.2. Implicit Differentiation.
Consider the function of three variables, $F(x, y, z) = x^2 + y^4 - z$. The equation $F(x, y, z) = 0$ can be solved for one of the variables, say, $z$, to obtain $z$ as a function of two variables:
$$F(x, y, z) = 0 \quad\Longrightarrow\quad z = z(x, y) = x^2 + y^4;$$
that is, the function $z(x, y)$ is defined as a root of $F(x, y, z)$ and has the characteristic property that
$$F\big(x, y, z(x, y)\big) = 0 \quad \text{for all } (x, y). \tag{13.9}$$
In the example considered, the equation $F(x, y, z) = 0$ can be solved analytically, and an explicit form of its root as a function of $(x, y)$ can be found. In general, given a function $F(x, y, z)$, an explicit form of a solution to the equation $F(x, y, z) = 0$ is not always possible to find. Putting aside the question about the very existence of a solution and its uniqueness, suppose that this equation is proved to have a unique solution when $(x, y) \in D$. In this case, the function $z(x, y)$ with the property (13.9) for all $(x, y) \in D$ is said to be defined implicitly on $D$.

Although an analytic form of an implicitly defined function is unknown, its rates of change can be found and provide important information about its local behavior. Suppose that $F$ is differentiable. Furthermore, the root $z(x, y)$ is also assumed to be differentiable on an open disk $D$ in the plane. Since relation (13.9) holds for all $(x, y) \in D$, the partial derivatives of its left side must also vanish in $D$. They can be computed by the chain rule with $n = 3$, $m = 2$, $\mathbf{r} = (x, y, z)$, and $\mathbf{u} = (u, v)$, where the relations between old and new variables are $x = u$, $y = v$, and $z = z(u, v)$. One has $x'_u = 1$, $x'_v = 0$, $y'_u = 0$, $y'_v = 1$, and $z'_u(u, v) = z'_x(x, y)$ and $z'_v(u, v) = z'_y(x, y)$ because $x = u$ and $y = v$. Therefore,
$$\frac{\partial}{\partial u}F\big(x, y, z(x, y)\big) = \frac{\partial F}{\partial x} + \frac{\partial F}{\partial z}\frac{\partial z}{\partial x} = 0 \quad\Longrightarrow\quad z'_x = -\frac{F'_x}{F'_z},$$
$$\frac{\partial}{\partial v}F\big(x, y, z(x, y)\big) = \frac{\partial F}{\partial y} + \frac{\partial F}{\partial z}\frac{\partial z}{\partial y} = 0 \quad\Longrightarrow\quad z'_y = -\frac{F'_y}{F'_z}.$$
These equations determine the rates of change of an implicitly defined function of two variables. Note that in order for these equations to make sense, the condition $F'_z \ne 0$ must be imposed.
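The implicit differentiation formulas can be checked numerically. The following is a minimal Python sketch, not part of the original text: it takes the equation $z^5 + zx - y = 0$ (the one used in Study Problem 13.9), finds the root $z(x, y)$ by Newton's method, and compares $-F'_x/F'_z$ and $-F'_y/F'_z$ with central-difference estimates of the rates of the numerically computed root. The helper names (`solve_root`, `implicit_partials`, `finite_difference_partials`), the starting guess, and the step sizes are illustrative assumptions.

```python
# Sketch: compare the implicit differentiation formulas z_x = -F_x/F_z and
# z_y = -F_y/F_z with finite differences of a numerically computed root.
# All helper names and tolerances below are illustrative choices.

def F(x, y, z):
    # Sample equation F(x, y, z) = 0 (the one from Study Problem 13.9).
    return z**5 + z * x - y

def F_x(x, y, z):
    return z

def F_y(x, y, z):
    return -1.0

def F_z(x, y, z):
    return 5 * z**4 + x

def solve_root(x, y, z_guess=1.0, tol=1e-14, max_iter=50):
    """Solve F(x, y, z) = 0 for z by Newton's method, starting at z_guess."""
    z = z_guess
    for _ in range(max_iter):
        step = F(x, y, z) / F_z(x, y, z)
        z -= step
        if abs(step) < tol:
            break
    return z

def implicit_partials(x, y):
    """Rates z_x and z_y from the implicit differentiation formulas."""
    z = solve_root(x, y)
    return -F_x(x, y, z) / F_z(x, y, z), -F_y(x, y, z) / F_z(x, y, z)

def finite_difference_partials(x, y, h=1e-6):
    """Central-difference estimates of z_x and z_y from the numerical root."""
    zx = (solve_root(x + h, y) - solve_root(x - h, y)) / (2 * h)
    zy = (solve_root(x, y + h) - solve_root(x, y - h)) / (2 * h)
    return zx, zy

if __name__ == "__main__":
    print("implicit formulas :", implicit_partials(1.0, 2.0))
    print("finite differences:", finite_difference_partials(1.0, 2.0))
```

The two pairs of numbers should agree to several digits near any point where $F'_z \ne 0$; the formulas need only the value of the root, not its explicit form.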
Several questions about the very existence and uniqueness of $z(x, y)$ for a given $F(x, y, z)$ and the differentiability of $z(x, y)$ have been left unanswered in the above analysis. The following theorem addresses them all.

THEOREM 13.14. (Implicit Function Theorem). Let $F$ be a function of $n + 1$ variables, $F(\mathbf{r}, z)$, where $\mathbf{r} = (x_1, x_2, ..., x_n)$ and $z$ is real, such that $F$ and $F'_z$ are continuous in an open ball $B$. Suppose that there exists a point $(\mathbf{r}_0, z_0) \in B$ such that $F(\mathbf{r}_0, z_0) = 0$ and $F'_z(\mathbf{r}_0, z_0) \ne 0$. Then there exist an open neighborhood $D$ of $\mathbf{r}_0$, an open interval $I$, and a unique function $z : D \to I$ such that for $(\mathbf{r}, y) \in D \times I$, $F(\mathbf{r}, y) = 0$ if and only if $y = z(\mathbf{r})$. Moreover, the function $z$ is continuous. If, in addition, $F$ is differentiable in $B$, then the function $z = z(\mathbf{r})$ is differentiable in $D$ and
$$z'_{x_i}(\mathbf{r}) = -\frac{F'_{x_i}(\mathbf{r}, z(\mathbf{r}))}{F'_z(\mathbf{r}, z(\mathbf{r}))}$$
for all $\mathbf{r}$ in $D$.

The proof of this theorem goes beyond the scope of this course. It includes proofs of the existence and uniqueness of $z(\mathbf{r})$ and its differentiability. Once these facts are established, a derivation of the implicit differentiation formula follows the same way as in the $n = 2$ case:
$$\frac{\partial F}{\partial x_i} + \frac{\partial F}{\partial z}\frac{\partial z}{\partial x_i} = 0 \quad\Longrightarrow\quad z'_{x_i}(\mathbf{r}) = -\frac{F'_{x_i}(\mathbf{r}, z(\mathbf{r}))}{F'_z(\mathbf{r}, z(\mathbf{r}))}.$$
Remark. If the function $F$ has sufficiently many continuous higher-order partial derivatives, then higher-order partial derivatives of $z(\mathbf{r})$ can be obtained by differentiation of these relations. An example is given in Study Problem 13.9.

EXAMPLE 13.32. Show that the equation $z(3x - y) = \pi\sin(xyz)$ has a unique solution $z = z(x, y)$ in a neighborhood of $(1, 1)$ such that $z(1, 1) = \pi/2$ and find the rates of change $z'_x(1, 1)$ and $z'_y(1, 1)$.

SOLUTION: Put $F(x, y, z) = \pi\sin(xyz) - z(3x - y)$. Then the existence and uniqueness of the solution can be established by verifying the hypotheses of the implicit function theorem in which $\mathbf{r} = (x, y)$, $\mathbf{r}_0 = (1, 1)$, and $z_0 = \pi/2$. First, note that the function $F$ is the sum of a polynomial and the sine function of a polynomial.
So its partial derivatives
$$F'_x = \pi yz\cos(xyz) - 3z, \qquad F'_y = \pi xz\cos(xyz) + z, \qquad F'_z = \pi xy\cos(xyz) - 3x + y$$
are continuous for all $(x, y, z)$; hence, $F$ is differentiable everywhere. Next, $F(1, 1, \pi/2) = 0$ as required. Finally, $F'_z(1, 1, \pi/2) = -2 \ne 0$. Therefore, by the implicit function theorem, there is an open disk in the $xy$ plane containing the point $(1, 1)$ in which the equation has a unique solution $z = z(x, y)$. By the implicit differentiation formulas,
$$z'_x(1, 1) = -\frac{F'_x(1, 1, \pi/2)}{F'_z(1, 1, \pi/2)} = -\frac{3\pi}{4}, \qquad z'_y(1, 1) = -\frac{F'_y(1, 1, \pi/2)}{F'_z(1, 1, \pi/2)} = \frac{\pi}{4}.$$
In particular, this result implies that, near the point $(1, 1)$, the root $z(x, y)$ decreases in the direction of the $x$ axis and increases in the direction of the $y$ axis. It should be noted that the numerical values of the derivatives can be used to accurately approximate the root $z(x, y)$ of a nonlinear equation in a neighborhood of $(1, 1)$ by linearizing the function $z(x, y)$ near $(1, 1)$. The continuity of partial derivatives ensures that $z(x, y)$ is differentiable at $(1, 1)$ and has a good linear approximation in the sense of (13.6) (see Study Problem 13.8). □

91.3. Study Problems.

Problem 13.8. Show that the equation $z(3x - y) = \pi\sin(xyz)$ has a unique solution $z = z(x, y)$ in a neighborhood of $(1, 1)$ such that $z(1, 1) = \pi/2$. Estimate $z(1.04, 0.96)$.

SOLUTION: In Example 13.32, the existence and uniqueness of $z(x, y)$ has been established by the implicit function theorem. The partial derivatives have also been evaluated, $z'_x(1, 1) = -3\pi/4$ and $z'_y(1, 1) = \pi/4$. The linearization of $z(x, y)$ near $(1, 1)$ is
$$z(1 + \Delta x, 1 + \Delta y) \approx z(1, 1) + z'_x(1, 1)\,\Delta x + z'_y(1, 1)\,\Delta y = \frac{\pi}{2}\left(1 - \frac{3\,\Delta x}{2} + \frac{\Delta y}{2}\right).$$
Putting $\Delta x = 0.04$ and $\Delta y = -0.04$, this equation yields the estimate $z(1.04, 0.96) \approx \frac{\pi}{2}(1 - 0.06 - 0.02) = 0.46\pi$. □

Problem 13.9. Let the function $z(x, y)$ be defined implicitly by $z^5 + zx - y = 0$ in a neighborhood of $(1, 2, 1)$. Find all its first and second partial derivatives. In particular, give the values of these partial derivatives at $(x, y) = (1, 2)$.
SOLUTION: Let $F(x, y, z) = z^5 + zx - y$. Then $F'_z = 5z^4 + x$. The function $z(x, y)$ exists in a neighborhood of $(1, 2)$ by the implicit function theorem because $F(1, 2, 1) = 0$ and $F'_z(1, 2, 1) = 6 \ne 0$. The first and second partial derivatives of $F$ are continuous everywhere:
$$F'_x = z, \quad F'_y = -1, \quad F'_z = 5z^4 + x, \quad F''_{xx} = 0, \quad F''_{xy} = 0, \quad F''_{xz} = 1, \quad F''_{yy} = 0, \quad F''_{yz} = 0, \quad F''_{zz} = 20z^3.$$
By implicit differentiation,
$$z'_x = -\frac{F'_x}{F'_z} = -\frac{z}{5z^4 + x}, \qquad z'_y = -\frac{F'_y}{F'_z} = \frac{1}{5z^4 + x}.$$
Taking the partial derivatives of these relations with respect to $x$ and $y$ and using the quotient rule for differentiation, the second partial derivatives are obtained:
$$z''_{xx} = -\frac{z'_x(5z^4 + x) - z(20z^3 z'_x + 1)}{(5z^4 + x)^2} = \frac{(15z^4 - x)z'_x + z}{(5z^4 + x)^2},$$
$$z''_{xy} = (z'_y)'_x = -\frac{20z^3 z'_x + 1}{(5z^4 + x)^2}, \qquad z''_{yy} = -\frac{20z^3 z'_y}{(5z^4 + x)^2}.$$
The explicit form of $z'_x$ and $z'_y$ may be substituted into these relations to express the second partial derivatives via $x$, $y$, and $z$. At the point $(1, 2)$, the values of the first partial derivatives are $z'_x(1, 2) = -1/6$ and $z'_y(1, 2) = 1/6$. Using these values, the values of the second partial derivatives are evaluated: $z''_{xx}(1, 2) = -1/27$, $z''_{xy}(1, 2) = 7/108$, and $z''_{yy}(1, 2) = -5/54$. □

91.4. Exercises.
(1) Use the chain rule to find $dz/dt$ if $z = \sqrt{1 + x^2 + 2y^2}$ and $x = 2t^3$, $y = \ln t$.
(2) Use the chain rule to find $\partial z/\partial s$ and $\partial z/\partial t$ if $z = e^{-x}\sin(xy)$ and $x = ts$, $y = s^2 + t^2$.
(3) Use the chain rule to write the partial derivatives of $F$ with respect to the new variables:
(i) $F = f(x, y)$, $x = x(u, v, w)$, $y = y(u, v, w)$
(ii) $F = f(x, y, z, t)$, $x = x(u, v)$, $y = y(u, v)$, $z = z(w, s)$, $t = t(w, s)$
(4) Find the rates of change $\partial z/\partial u$, $\partial z/\partial v$, $\partial z/\partial w$ when $(u, v, w) = (2, 1, 1)$ if $z = x^2 + yx + y^3$ and $x = uv^2 + w^3$, $y = u + v\ln w$.
(5) Find the rates of change $\partial f/\partial u$, $\partial f/\partial v$, $\partial f/\partial w$ when $(x, y, z) = (1/3, 2, 0)$ if x =2/u - v +w, y =vuw, z =e21.
(6) If $z(u, v) = f(x, y)$, where $x = e^u\cos v$ and $y = e^u\sin v$, show that $f''_{xx} + f''_{yy} = e^{-2u}(z''_{uu} + z''_{vv})$.
(7) If $z(u, v) = f(x, y)$, where $x = u^2 + v^2$ and $y = 2uv$, find all the second-order partial derivatives of $z(u, v)$.
(8) If z(i, v)= f(cc,y), where xc=Lu + v and y =Lu - v, show that (z)2 + (z/ )2 - z/ z/, (9) Find all the first and second partial derivatives of the following functions: (i) g(, Y, z) - f(c2+y2+z2) (ii) g(, Y) =f (Xc, /y) (iii) g(cc, Y, z) - f (cc, cy, ccyz) (iv) g(c, Y) =f (c/y, y/cc) (V) g(, Y,z) f(cc+y+z, x2 +y2 +z2) (vi) g(c, Y) =f (x + Y, y) (10) Find g~ +Jzif g(x y z) =f(c+y+z,c2 y2 -1z2). (11) Let xc r cosO8 and y =rsinOB. Show that 1 a (&OfN 1 a2f (12) Let xc p sin 0 cosOB, y =p sin i sinOB, z =p cos i5. The variables (p, 0, 0) are called spherical coordinates and discussed in Section 104.3. Show that Op&f 1 1 si &a _ O 1 1 &2f fxx+ y~zz p2&OP Kap sn+ sn ( J)p25112 0 02. (13) Prove that if a function f (x, y) satisfies the Laplace equation fxx + fey 0, then the function g(cc, y)= f(cc/(cc2 + y2), y/(cc2 + y2)), x 2 + y 2 > 0, also satisfies the Laplace equation. (14) Prove that if a function f (x, t) satisfies the diffusion equation ft=a2 fj§, then the function also satisfies the diffusion equation.  230 13. DIFFERENTIATION OF MULTIVARIABLE FUNCTIONS (15) Prove that if f(x, y, z) satisfies the Laplace equation f"x + f" + f"z = 0, then the function 1 a2x a2a2z 2 2 2 g(x,y,z)=--f 2 2 2 , , xz+y2+ Z20, also satisfies the Laplace equation. (16) Show that the function g(x, y) = x f(y/x2), where f is a differ- entiable function, satisfies the equation xg' + 2yg' = ng. (17) Show that the function g(x, y, z) =xnf(y/xa, z/xb), where f is a differentiable function, satisfies the equation xg' + ayg' + bzg' = ng. (18) Let the function z = f(x, y) be defined implicitly. Find its first and second partial derivatives if (i) x + 2y + 3z = ez (ii) x - z = tan--1(yz) (iii) z/z = ln(z/y) + 1 (19) Let z(x, y) be a solution of the equation z3 - xz + y= 0 such that z(3, -2) = 2. Find the linearization of z(x, y) near (3, -2) and use it to estimate z(2.8, -2.3). (20) Find f' and f', where f = (x + z)/(y + z) and z is defined by the equation zez =czex + yen. 
(21) Show that the function $z(x, y)$ defined by the equation $F(x - az, y - bz) = 0$, where $F$ is a differentiable function of two variables and $a$ and $b$ are constants, satisfies the equation $az'_x + bz'_y = 1$.
(22) Let the temperature of the air at a point $(x, y, z)$ be $T(x, y, z)$ degrees Celsius. Suppose that $T$ is a differentiable function. An insect flies through the air so that its position as a function of time $t$, in seconds, is given by $x = \sqrt{1 + t}$, $y = 2t$, $z = t^2 - 1$. If $T'_x(2, 6, 8) = 2$, $T'_y(2, 6, 8) = -1$, and $T'_z(2, 6, 8) = 1$, how fast is the temperature rising (or decreasing) on the insect's path as it flies through the point $(2, 6, 8)$?
(23) Consider a function $f = f(x, y, z)$ and the change of variables: $x = 2uv$, $y = u^2 - v^2 + w$, $z = u^3vw$. Find the partial derivatives $f'_u$, $f'_v$, and $f'_w$ at the point $u = v = w = 1$, if $f'_x = a$, $f'_y = b$, and $f'_z = c$ at $(x, y, z) = (2, 1, 1)$.
(24) Let a rectangular box have the dimensions $x$, $y$, and $z$ that change with time. Suppose that at a certain instant the dimensions are $x = 1$ m, $y = z = 2$ m, and $x$ and $y$ are increasing at the rate 2 m/s and $z$ is decreasing at the rate 3 m/s. At that instant, find the rates at which the volume, the surface area, and the largest diagonal are changing.
(25) A function is said to be homogeneous of degree $n$ if, for any number $t$, it has the property $f(tx, ty) = t^n f(x, y)$. Give an example of a polynomial function that is homogeneous of degree $n$. Show that a homogeneous differentiable function satisfies the equation $xf'_x + yf'_y = nf$. Show also that $f'_x(tx, ty) = t^{n-1}f'_x(x, y)$.
(26) Suppose that the equation $F(x, y, z) = 0$ defines implicitly $z = f(x, y)$ or $y = g(x, z)$ or $x = h(y, z)$. Assuming that the derivatives $F'_x$, $F'_y$, and $F'_z$ do not vanish, prove that $(\partial z/\partial x)(\partial x/\partial y)(\partial y/\partial z) = -1$.
(27) Let $x^2 = vw$, $y^2 = uw$, $z^2 = uv$, and $f(x, y, z) = F(u, v, w)$. Show that $xf'_x + yf'_y + zf'_z = uF'_u + vF'_v + wF'_w$.
(28) Simplify $z'_x\sec x + z'_y\sec y$ if $z = \sin y + f(\sin x - \sin y)$, where $f$ is a differentiable function.

92.
The Differential and Taylor Polynomials

Just like in the one-variable case, given variables $\mathbf{r} = (x_1, x_2, ..., x_m)$, one can introduce independent variables $d\mathbf{r} = (dx_1, dx_2, ..., dx_m)$ that are infinitesimal variations of $\mathbf{r}$ and also called differentials of $\mathbf{r}$.

DEFINITION 13.19. (Differential). Let $f(\mathbf{r})$ be a differentiable function. The function
$$df(\mathbf{r}) = f'_{x_1}(\mathbf{r})\,dx_1 + f'_{x_2}(\mathbf{r})\,dx_2 + \cdots + f'_{x_m}(\mathbf{r})\,dx_m$$
is called the differential of $f$.

The differential is a function of $2m$ independent variables $\mathbf{r}$ and $d\mathbf{r}$. Consider the graph $y = f(x)$ of a function $f$ of a single variable $x$ (see Figure 13.8, left panel). The differential $df(x_0) = f'(x_0)\,dx$ at a point $x_0$ determines the increment of $y$ along the tangent line $y = L(x) = f(x_0) + f'(x_0)(x - x_0)$ as $x$ changes from $x_0$ to $x_0 + \Delta x$, where $\Delta x = dx$. Similarly, the differential $df(x_0, y_0)$ of a function of two variables at a point $P_0 = (x_0, y_0)$ determines the increment of $z = L(x, y)$ along the tangent plane to the graph $z = f(x, y)$ at the point $(x_0, y_0, f(x_0, y_0))$ when $(x, y)$ changes from $(x_0, y_0)$ to $(x_0 + \Delta x, y_0 + \Delta y)$, where $dx = \Delta x$ and $dy = \Delta y$, as depicted in the right panel of Figure 13.8. In general, the differential $df(\mathbf{r}_0)$ and the linearization of $f$ at a point $\mathbf{r}_0$ are related as
$$L(\mathbf{r}) = f(\mathbf{r}_0) + df(\mathbf{r}_0), \qquad dx_i = \Delta x_i, \quad i = 1, 2, ..., m;$$
that is, if the infinitesimal variations (or differentials) $d\mathbf{r}$ are replaced by the deviations $\Delta\mathbf{r} = \mathbf{r} - \mathbf{r}_0$ of the variables $\mathbf{r}$ from $\mathbf{r}_0$, then the differential $df$ at the point $\mathbf{r}_0$ defines the linearization of $f$ at $\mathbf{r}_0$. According to (13.6), the difference $f(\mathbf{r}) - f(\mathbf{r}_0) - df(\mathbf{r}_0)$ tends to 0 faster than $\|\Delta\mathbf{r}\|$ as $\Delta\mathbf{r} \to \mathbf{0}$, and hence the differential can be used to study variations of a differentiable function $f$ under small variations of its arguments.

FIGURE 13.8. Geometrical significance of the differential. Left: The differential of a function of one variable.
It defines the increment of $y$ along the tangent line $y = L(x)$ to the graph $y = f(x)$ at $(x_0, y_0)$, $y_0 = f(x_0)$, when $x$ changes from $x_0$ to $x_0 + \Delta x$, where $dx = \Delta x$. As $\Delta x \to 0$, $(\Delta y - df)/\Delta x = \Delta y/\Delta x - f'(x_0) \to 0$; that is, the difference $\Delta y - df$ tends to 0 faster than $\Delta x$. Right: The differential of a function of two variables. It defines the increment of $z$ along the tangent plane $z = L(x, y)$ to the graph $z = f(x, y)$ at $(x_0, y_0, z_0)$, $z_0 = f(x_0, y_0)$, when $(x, y)$ changes from $(x_0, y_0)$ to $(x_0 + \Delta x, y_0 + \Delta y)$, where $dx = \Delta x$ and $dy = \Delta y$. The difference $\Delta z - df$ tends to 0 faster than $\Delta r = \sqrt{(\Delta x)^2 + (\Delta y)^2}$ as $\Delta r \to 0$.

EXAMPLE 13.33. Find $df(x, y)$ if $f(x, y) = \sqrt{1 + x^2 y}$. In particular, evaluate $df(1, 3)$ for $(dx, dy) = (0.1, -0.2)$. What is the significance of this number?

SOLUTION: The function has continuous partial derivatives in a neighborhood of $(1, 3)$ and hence is differentiable at $(1, 3)$. One has
$$df(x, y) = f'_x(x, y)\,dx + f'_y(x, y)\,dy = \frac{xy\,dx}{\sqrt{1 + x^2 y}} + \frac{x^2\,dy}{2\sqrt{1 + x^2 y}}.$$
Then
$$df(1, 3) = \frac{3}{2}\,dx + \frac{1}{4}\,dy = 0.15 - 0.05 = 0.1.$$
The number $f(1, 3) + df(1, 3)$ defines the value of the linearization $L(x, y)$ of $f$ at $(1, 3)$ for $(x, y) = (1 + dx, 3 + dy)$. It can be used to approximate $f(1 + dx, 3 + dy) - f(1, 3) \approx df(1, 3)$ when $dx$ and $dy$ are small enough. In particular, $f(1 + 0.1, 3 - 0.2) - f(1, 3) = 0.09476$ (a calculator value), which is to be compared with $df(1, 3) = 0.1$. □

92.1. Error Analysis. Suppose a quantity $f$ depends on several other quantities, say, $x$, $y$, and $z$, for definiteness; that is, $f$ is a function $f(x, y, z)$. Suppose measurements show that $x = x_0$, $y = y_0$, and $z = z_0$. Since, in practice, all measurements contain errors, the value $f(x_0, y_0, z_0)$ does not have much practical significance until its error is determined. For example, the volume of a rectangular box with dimensions $x$, $y$, and $z$ is the function of three variables $V(x, y, z) = xyz$.
In practice, repetitive measurements give the values of $x$, $y$, and $z$ from intervals $x \in [x_0 - \delta x, x_0 + \delta x]$, $y \in [y_0 - \delta y, y_0 + \delta y]$, and $z \in [z_0 - \delta z, z_0 + \delta z]$, where $\mathbf{r}_0 = (x_0, y_0, z_0)$ are the mean values of the dimensions, while $\delta\mathbf{r} = (\delta x, \delta y, \delta z)$ are upper bounds of the absolute errors or the maximal uncertainties of the measurements. To indicate the maximal uncertainty in the measured quantities, one writes $x = x_0 \pm \delta x$ and similarly for $y$ and $z$. Different methods of the length measurement would have different absolute error bounds. In other words, the dimensions $x$, $y$, and $z$ and the bounds $\delta x$, $\delta y$, and $\delta z$ are all independent variables. Since the error bounds should be small (at least one wishes so), the values of the dimensions obtained in each measurement are $x = x_0 + dx$, $y = y_0 + dy$, and $z = z_0 + dz$, where the differentials can take their values in the intervals $dx \in [-\delta x, \delta x] = I_{\delta x}$ and similarly for $dy$ and $dz$. The question arises: Given the mean values $\mathbf{r}_0 = (x_0, y_0, z_0)$ and the absolute error bounds $\delta\mathbf{r}$, what is the absolute error bound of the volume value calculated at $\mathbf{r}_0$? For each particular measurement, the error is $V(\mathbf{r}_0 + d\mathbf{r}) - V(\mathbf{r}_0) = dV(\mathbf{r}_0)$ if terms tending to 0 faster than $\|d\mathbf{r}\|$ can be neglected. The components of $d\mathbf{r}$ are independent variables taking their values in the specified intervals. All such triples $d\mathbf{r}$ correspond to points of the error rectangle $R_\delta = I_{\delta x} \times I_{\delta y} \times I_{\delta z}$. Then the absolute error bound is $\delta V = |\max dV(\mathbf{r}_0)|$, where the maximum is taken over all $d\mathbf{r} \in R_\delta$. For example, if $\mathbf{r}_0 = (1, 2, 3)$ is in centimeters and $\delta\mathbf{r} = (1, 1, 1)$ is in millimeters, then the absolute error bound of the volume is
$$\delta V = |\max dV(\mathbf{r}_0)| = \max(y_0 z_0\,dx + x_0 z_0\,dy + x_0 y_0\,dz) = 0.6 + 0.3 + 0.2 = 1.1\ \text{cm}^3,$$
and $V = 6 \pm 1.1\ \text{cm}^3$. Here the maximum is reached at $dx = dy = dz = 0.1$ cm. This concept can be generalized.

DEFINITION 13.20. (Absolute and Relative Error Bounds). Let $f$ be a quantity that depends on other quantities $\mathbf{r} = (x_1, x_2, ..., x_m)$ so that $f = f(\mathbf{r})$ is a differentiable function.
Suppose that the values $x_i = a_i$ are known with the absolute error bounds $\delta x_i$. Put $\mathbf{r}_0 = (a_1, a_2, ..., a_m)$ and
$$\delta f = |\max df(\mathbf{r}_0)|,$$
where the maximum is taken over all $dx_i \in [-\delta x_i, \delta x_i]$. The numbers $\delta f$ and, if $f(\mathbf{r}_0) \ne 0$, $\delta f/|f(\mathbf{r}_0)|$ are called, respectively, the absolute and relative error bounds of the value of $f$ at $\mathbf{r} = \mathbf{r}_0$.

In the above example, the relative error bound of the volume measurements is $1.1/6 \approx 0.18$; that is, the accuracy of the measurements is about 18%. In general, since
$$df(\mathbf{r}_0) = \sum_{i=1}^{m} f'_{x_i}(\mathbf{r}_0)\,dx_i$$
is linear in $dx_i$, and $\mathbf{r}_0$ is fixed, the maximum is attained by setting $dx_i$ equal to $\delta x_i$ for all $i$ for which the coefficient $f'_{x_i}(\mathbf{r}_0)$ is positive, and equal to $-\delta x_i$ for all $i$ for which the coefficient $f'_{x_i}(\mathbf{r}_0)$ is negative. So the absolute error bound can be written in the form
$$\delta f = \sum_{i=1}^{m} |f'_{x_i}(\mathbf{r}_0)|\,\delta x_i.$$

92.2. Accuracy of a Linear Approximation. If a function $f(x)$ is differentiable sufficiently many times, then its linear approximation can be systematically improved by using Taylor polynomials (see Calculus I and Calculus II). The Taylor theorem asserts that if $f(x)$ has continuous derivatives up to order $n$ on an interval $I$ containing $x_0$ and $f^{(n+1)}$ exists and is bounded on $I$, $|f^{(n+1)}(x)| \le M_{n+1}$ for some constant $M_{n+1}$, then
$$f(x) = T_n(x) + \varepsilon_{n+1}(x), \qquad T_n(x) = f(x_0) + \frac{f'(x_0)}{1!}\Delta + \frac{f''(x_0)}{2!}\Delta^2 + \cdots + \frac{f^{(n)}(x_0)}{n!}\Delta^n,$$
$$|\varepsilon_{n+1}(x)| \le \frac{M_{n+1}}{(n+1)!}\,|x - x_0|^{n+1}, \tag{13.10}$$
where $\Delta = x - x_0$. The polynomial $T_n(x)$ is called the Taylor polynomial of degree $n$. The remainder $\varepsilon_{n+1}(x)$ determines the accuracy of the approximation $f(x) \approx T_n(x)$. The first-order Taylor polynomial is the linearization of $f$ at $x = x_0$, $T_1(x) = L(x)$, and the remainder $\varepsilon_2(x)$ determines the accuracy of the linear approximation:
$$|\varepsilon_2(x)| = |f(x) - L(x)| \le \frac{M_2}{2}\,\Delta^2. \tag{13.11}$$
The differential $df$ can be viewed as the result of the action of the operator $d = dx\,(d/dx)$ on $f$: $df = dx\,f'$.
If the variation $\Delta$ of $x$ is viewed as an independent variable, like the differential $dx$, then it is convenient to introduce higher-order differentials of $f$ by the rule
$$d^n f(x) = f^{(n)}(x)\,(dx)^n = \left(dx\,\frac{d}{dx}\right)^{n} f(x),$$
where the action of the powers $d^n$ on $f$ is understood as successive actions of the operator $d$, $d^n f = d^{n-1}(df)$, in which the variables $dx$ and $x$ are independent. For example, $d^2 f = dx\,(d/dx)(f'\,dx) = (dx)^2 (d/dx) f' = (dx)^2 f''$ (when differentiating, the variable $dx$ is viewed as a constant). Then the Taylor polynomial $T_n(x)$ about $x_0$ is
$$T_n(x) = f(x_0) + \frac{1}{1!}\,df(x_0) + \frac{1}{2!}\,d^2 f(x_0) + \cdots + \frac{1}{n!}\,d^n f(x_0),$$
where $dx = x - x_0$. It represents an expansion of $f(x_0 + dx)$ in powers of the differential $dx$. The Taylor polynomial approximation $f(x_0 + dx) \approx T_n(x)$ is an approximation in which the contributions of higher powers $(dx)^k$, $k > n$, are neglected, provided $f$ is differentiable sufficiently many times. The Taylor theorem ensures that this approximation is better than a linear approximation in the sense that the approximation error decreases faster than $(dx)^n = (x - x_0)^n$ as $x \to x_0$. It also provides more information about a local behavior of the function near a particular point $x_0$ (e.g., the concavity of $f$ near $x_0$). Naturally, this concept should be quite useful in the multivariable case.

92.3. Taylor Polynomials of Two Variables. Let $f(x, y)$ be a function of two variables. The differentials $dx$ and $dy$ are another two independent variables. By analogy with the one-variable case, the differential $df$ is viewed as the result of the action of the operator $d$ on $f$:
$$df(x, y) = \left(dx\,\frac{\partial}{\partial x} + dy\,\frac{\partial}{\partial y}\right) f(x, y) = dx\,f'_x(x, y) + dy\,f'_y(x, y).$$

DEFINITION 13.21. Suppose that $f$ has continuous partial derivatives up to order $n$. The quantity
$$d^n f(x, y) = \left(dx\,\frac{\partial}{\partial x} + dy\,\frac{\partial}{\partial y}\right)^{n} f(x, y)$$
is called the $n$-th order differential of $f$, where the action of powers $d^n$ on $f$ is defined successively, $d^n f = d^{n-1}(df)$, and the variables $dx$, $dy$, $x$, and $y$ are viewed as independent when differentiating.
The differential $d^n f$ is a function of four variables $dx$, $dy$, $x$, and $y$. For example,
$$d^2 f = \left(dx\,\frac{\partial}{\partial x} + dy\,\frac{\partial}{\partial y}\right)^{2} f = \left(dx\,\frac{\partial}{\partial x} + dy\,\frac{\partial}{\partial y}\right)\left(dx\,f'_x + dy\,f'_y\right)$$
$$= (dx)^2\,\frac{\partial f'_x}{\partial x} + dx\,dy\,\frac{\partial f'_y}{\partial x} + dy\,dx\,\frac{\partial f'_x}{\partial y} + (dy)^2\,\frac{\partial f'_y}{\partial y} = f''_{xx}(dx)^2 + 2f''_{xy}\,dx\,dy + f''_{yy}(dy)^2.$$
By the continuity of partial derivatives, the order of differentiation is irrelevant, $f''_{xy} = f''_{yx}$. The numerical coefficients at each of the terms are binomial coefficients: $(a + b)^2 = a^2 + 2ab + b^2$. Since the order of differentiation is irrelevant (Clairaut's theorem), this observation holds in general:
$$d^n f = \sum_{k=0}^{n} B^n_k\,\frac{\partial^n f}{\partial x^{n-k}\,\partial y^{k}}\,(dx)^{n-k}(dy)^k, \qquad B^n_k = \frac{n!}{k!\,(n-k)!},$$
where $B^n_k$ are the binomial coefficients: $(a + b)^n = \sum_{k=0}^{n} B^n_k\,a^{n-k}b^k$.

EXAMPLE 13.34. Find $d^n f$ if $f(x, y) = e^{ax+by}$, where $a$ and $b$ are constants.

SOLUTION: Since $f'_x = ae^{ax+by}$ and $f'_y = be^{ax+by}$,
$$df = ae^{ax+by}\,dx + be^{ax+by}\,dy = e^{ax+by}(a\,dx + b\,dy).$$
Since $f''_{xx} = a^2 e^{ax+by}$, $f''_{xy} = ab\,e^{ax+by}$, and $f''_{yy} = b^2 e^{ax+by}$,
$$d^2 f = a^2 e^{ax+by}(dx)^2 + 2ab\,e^{ax+by}\,dx\,dy + b^2 e^{ax+by}(dy)^2 = e^{ax+by}\big(a^2(dx)^2 + 2ab\,dx\,dy + b^2(dy)^2\big) = e^{ax+by}(a\,dx + b\,dy)^2.$$
Furthermore, by noting that each differentiation with respect to $x$ brings down a factor $a$, while the partial derivative with respect to $y$ brings down a factor $b$, it is concluded that $\partial^n f/\partial x^{n-k}\partial y^k = a^{n-k}b^k e^{ax+by}$. Using the binomial expansion, one infers
$$d^n f = e^{ax+by}(a\,dx + b\,dy)^n$$
for all $n = 1, 2, ...$. □

DEFINITION 13.22. (Taylor Polynomials of Two Variables). Let $f$ have continuous partial derivatives up to order $n$. The Taylor polynomial of order $n$ about a point $(x_0, y_0)$ is
$$T_n(x, y) = f(x_0, y_0) + \frac{1}{1!}\,df(x_0, y_0) + \frac{1}{2!}\,d^2 f(x_0, y_0) + \cdots + \frac{1}{n!}\,d^n f(x_0, y_0),$$
where $dx = x - x_0$ and $dy = y - y_0$.

For example, put $\mathbf{r} = (x, y)$, $\mathbf{r}_0 = (x_0, y_0)$, $dx = x - x_0$, and $dy = y - y_0$. The first four Taylor polynomials are
$$T_0(\mathbf{r}) = f(\mathbf{r}_0),$$
$$T_1(\mathbf{r}) = f(\mathbf{r}_0) + f'_x(\mathbf{r}_0)\,dx + f'_y(\mathbf{r}_0)\,dy = L(\mathbf{r}),$$
$$T_2(\mathbf{r}) = T_1(\mathbf{r}) + \frac{f''_{xx}(\mathbf{r}_0)}{2}(dx)^2 + f''_{xy}(\mathbf{r}_0)\,dx\,dy + \frac{f''_{yy}(\mathbf{r}_0)}{2}(dy)^2,$$
$$T_3(\mathbf{r}) = T_2(\mathbf{r}) + \frac{f'''_{xxx}(\mathbf{r}_0)}{6}(dx)^3 + \frac{f'''_{xxy}(\mathbf{r}_0)}{2}(dx)^2\,dy + \frac{f'''_{xyy}(\mathbf{r}_0)}{2}\,dx\,(dy)^2 + \frac{f'''_{yyy}(\mathbf{r}_0)}{6}(dy)^3.$$
The linear or tangent plane approximation $f(\mathbf{r}) \approx L(\mathbf{r}) = T_1(\mathbf{r})$ is a particular case of the Taylor polynomial approximation of the first degree.

EXAMPLE 13.35. Let $P_n(x, y)$ be a polynomial of degree $n$. Find its Taylor polynomials about $(0, 0)$. In particular, find Taylor polynomials for $P_3(x, y) = 1 + 2x - xy + y^2 + 4x^3 - y^2 x$.

SOLUTION: All partial derivatives of $P_n$ of order higher than $n$ vanish. Therefore, $d^k P_n = 0$ for $k > n$, and hence for any polynomial of degree $n$, $T_n = P_n$ and also $T_k = P_n$ if $k > n$. Any polynomial can be uniquely decomposed into the sum $P_n = Q_0 + Q_1 + \cdots + Q_n$, where $Q_k$ is a homogeneous polynomial of degree $k$; it contains only monomials of degree $k$. The differential $d^k f$ is a homogeneous polynomial of degree $k$ in the variables $dx$ and $dy$. Therefore, Definition 13.22 defines $T_k$ as the sum of homogeneous polynomials in $x$ and $y$ if $(x_0, y_0) = (0, 0)$. Two polynomials are equal only if the coefficients at the corresponding monomials match. It follows from $T_n = P_n$ that $T_n = T_{n-1} + (T_n - T_{n-1}) = Q_0 + Q_1 + \cdots + Q_{n-1} + Q_n$. Since $Q_n$ and $T_n - T_{n-1}$ contain only monomials of degree $n$, the equality is possible only if $Q_n = T_n - T_{n-1}$, and hence $T_{n-1} = Q_0 + Q_1 + \cdots + Q_{n-1}$. Continuing the process recursively backward, it is concluded that $T_k = Q_0 + Q_1 + \cdots + Q_k$, $k = 0, 1, ..., n$. In particular, for the given polynomial $P_3$, one has $Q_0 = 1$, $Q_1 = 2x$, $Q_2 = -xy + y^2$, and $Q_3 = 4x^3 - y^2 x$. Therefore, its Taylor polynomials about the origin are $T_0 = 1$, $T_1 = T_0 + 2x$, $T_2 = T_1 - xy + y^2$, and $T_k = P_3$ for $k \ge 3$. □

THEOREM 13.15. (Taylor Theorem). Let $D$ be an open disk centered at $\mathbf{r}_0$ and let the partial derivatives of a function $f$ be continuous up to order $n - 1$ on $D$. Then $f(\mathbf{r}) = T_{n-1}(\mathbf{r}) + \varepsilon_n(\mathbf{r})$, where the remainder $\varepsilon_n(\mathbf{r})$ satisfies the condition
$$|\varepsilon_n(\mathbf{r})| \le h_n(\mathbf{r})\,\|\mathbf{r} - \mathbf{r}_0\|^{n-1}, \quad \text{where } h_n(\mathbf{r}) \to 0 \text{ as } \mathbf{r} \to \mathbf{r}_0.$$

In Section 92.4, Taylor polynomials for functions of any number of variables will be defined.
Theorem 13.15 is true, just as written, no matter how many variables there are; that is, $\mathbf r = (x_1, x_2, \ldots, x_m)$ for any number of variables $m$. For $n = 2$, this theorem is nothing but Theorem 13.12. The continuity of partial derivatives ensures the existence of a good linear approximation $L(\mathbf r) = T_1(\mathbf r)$ in the sense that the difference $f(\mathbf r) - T_1(\mathbf r)$ decreases to 0 faster than $\|\mathbf r - \mathbf r_0\|$ as $\mathbf r \to \mathbf r_0$. For $n > 2$, it states that the approximation of $f$ by the Taylor polynomial $T_{n-1}$ is a good approximation in the sense that the error decreases faster than $\|\mathbf r - \mathbf r_0\|^{n-1}$. A practical significance of the Taylor theorem is that higher-order differentials of a function can be used to obtain successively better approximations of values of a function near a point if the function has continuous partial derivatives of higher orders in a neighborhood of that point.

EXAMPLE 13.36. Let $f(x,y) = \sqrt{1 + x^2 y}$. Find $df(1,3)$ and $d^2 f(1,3)$ and use them to approximate $f(1 + 0.1,\, 3 - 0.2)$.

SOLUTION: Put $(dx, dy) = (0.1, -0.2)$. It was found in Example 13.33 that $df(1,3) = 0.1$. The second partial derivatives are obtained by the quotient rule (see $f'_x$ and $f'_y$ in Example 13.33):

$$f''_{xx}(1,3) = \left.\frac{y(1+x^2y)^{1/2} - x^2y^2(1+x^2y)^{-1/2}}{1+x^2y}\right|_{(1,3)} = \frac{3}{8},$$
$$f''_{xy}(1,3) = \left.\frac{2x(1+x^2y)^{1/2} - x^3y(1+x^2y)^{-1/2}}{2(1+x^2y)}\right|_{(1,3)} = \frac{5}{16},$$
$$f''_{yy}(1,3) = \left.-\frac{x^4}{4}(1+x^2y)^{-3/2}\right|_{(1,3)} = -\frac{1}{32}.$$

Therefore,

$$d^2 f(1,3) = f''_{xx}(1,3)(dx)^2 + 2f''_{xy}(1,3)\,dx\,dy + f''_{yy}(1,3)(dy)^2 = \frac{3}{8}(dx)^2 + \frac{5}{8}\,dx\,dy - \frac{1}{32}(dy)^2 = -0.01 .$$

The linear approximation is $f(1+dx, 3+dy) \approx f(1,3) + df(1,3) = 2 + 0.1 = 2.1$. The quadratic approximation is $f(1+dx, 3+dy) \approx f(1,3) + df(1,3) + \tfrac12 d^2 f(1,3) = 2.1 - 0.01/2 = 2.095$, while a calculator value of $f(1+dx, 3+dy)$ is 2.094755 (rounded to six decimal places). Evidently, the quadratic approximation (the approximation by the second-degree Taylor polynomial) is better than the linear approximation.
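The arithmetic of Example 13.36 is easy to check by machine. The sketch below is an illustration, not part of the original text: it evaluates the first and second partial derivatives of $f(x,y) = \sqrt{1 + x^2 y}$ from their simplified closed forms, $f''_{xx} = y/w^{3/2}$, $f''_{xy} = (x + x^3y/2)/w^{3/2}$, and $f''_{yy} = -x^4/(4w^{3/2})$ with $w = 1 + x^2 y$ (so $f''_{yy}(1,3) = -1/32$), and compares the linear and quadratic estimates with the directly computed value. The function name `taylor_estimates` is hypothetical.

```python
import math


def f(x, y):
    """The function of Example 13.36: f(x, y) = sqrt(1 + x^2 y)."""
    return math.sqrt(1.0 + x * x * y)


def taylor_estimates(x0, y0, dx, dy):
    """Return df, d^2 f, and the linear and quadratic estimates of f(x0+dx, y0+dy)."""
    w = 1.0 + x0 * x0 * y0
    fx = x0 * y0 / math.sqrt(w)
    fy = x0 * x0 / (2.0 * math.sqrt(w))
    fxx = y0 / w ** 1.5
    fxy = (x0 + 0.5 * x0 ** 3 * y0) / w ** 1.5
    fyy = -(x0 ** 4) / (4.0 * w ** 1.5)
    d1 = fx * dx + fy * dy
    d2 = fxx * dx * dx + 2.0 * fxy * dx * dy + fyy * dy * dy
    linear = f(x0, y0) + d1
    quadratic = linear + 0.5 * d2
    return d1, d2, linear, quadratic


d1, d2, linear, quadratic = taylor_estimates(1.0, 3.0, 0.1, -0.2)
exact = f(1.1, 2.8)

print("df        =", d1)
print("d^2 f     =", d2)
print("linear    =", linear, " error =", abs(exact - linear))
print("quadratic =", quadratic, " error =", abs(exact - quadratic))
```

The quadratic estimate lands much closer to the directly computed value than the linear one, as the Taylor theorem suggests it should for small $dx$ and $dy$.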
Yet, the Taylor theorem does not allow us to estimate the accuracy of the approximation because the function $h_n$ remains unknown. What is the order of approximation needed to obtain an error smaller than some prescribed value?

COROLLARY 13.3. (Accuracy of Taylor Polynomial Approximations). If, in addition to the hypotheses of Theorem 13.15, the function $f$ has partial derivatives of order $n$ that are bounded on $D$, that is, there exist numbers $M_{nk}$, $k = 0, 1, \ldots, n$, such that $|\partial^n f(\mathbf r)/\partial x^{n-k}\partial y^k| \le M_{nk}$ for all $\mathbf r \in D$, then the remainder satisfies

$$|\varepsilon_n(\mathbf r)| \le \sum_{k=0}^{n} \frac{B^n_k M_{nk}}{n!}\,|x - x_0|^{n-k}\,|y - y_0|^k$$

for all $(x,y) \in D$, where $B^n_k = n!/(k!(n-k)!)$ are binomial coefficients.

Next, note that $|x - x_0| \le \|\mathbf r - \mathbf r_0\|$ and $|y - y_0| \le \|\mathbf r - \mathbf r_0\|$, and hence $|x - x_0|^{n-k}|y - y_0|^k \le \|\mathbf r - \mathbf r_0\|^n$. Making use of this inequality, one infers that

$$|\varepsilon_n(\mathbf r)| \le \frac{M_n}{n!}\,\|\mathbf r - \mathbf r_0\|^n, \qquad (13.12)$$

where the constant $M_n = \sum_{k=0}^{n} B^n_k M_{nk}$. In particular, for the linear approximation, $n = 2$,

$$|f(\mathbf r) - L(\mathbf r)| \le \frac{M_2}{2}\,\|\mathbf r - \mathbf r_0\|^2, \qquad (13.13)$$

where $M_2 = M_{20} + 2M_{11} + M_{02}$. The results (13.12) and (13.13) are to be compared with the similar results (13.10) and (13.11) in the one-variable case.

If the second partial derivatives are continuous and bounded near $\mathbf r_0$, then variations of their values may be neglected in a sufficiently small neighborhood of $\mathbf r_0$, and the numbers $M_{20}$, $M_{11}$, and $M_{02}$ may be approximated by the absolute values of the corresponding partial derivatives at $\mathbf r_0$ so that

$$|\varepsilon_2| \lesssim \frac12\Big(|f''_{xx}(\mathbf r_0)|(dx)^2 + 2|f''_{xy}(\mathbf r_0)|\,|dx\,dy| + |f''_{yy}(\mathbf r_0)|(dy)^2\Big)$$

for sufficiently small variations $dx = x - x_0$ and $dy = y - y_0$. Such an estimate is often sufficient for practical purposes to assess the accuracy of the linear approximation. This estimate works even better if $f$ has continuous partial derivatives of the third order because the second partial derivatives would have a good linear approximation and variations of their values near $\mathbf r_0$ are of order $\|d\mathbf r\|$.
Consequently, they can only produce variations of $\varepsilon_2$ of order $\|d\mathbf r\|^3$, which can be neglected as compared to $\|d\mathbf r\|^2$ for sufficiently small $\|d\mathbf r\|$.

EXAMPLE 13.37. Find the linear approximation near $(0,0)$ of $f(x,y) = \sqrt{1 + x + 2y}$ and assess its accuracy in the square $|x| < 1/4$, $|y| < 1/4$.

SOLUTION: One has $f'_x = \tfrac12(1 + x + 2y)^{-1/2}$ and $f'_y = (1 + x + 2y)^{-1/2}$ so that $f'_x(0,0) = 1/2$ and $f'_y(0,0) = 1$. The linear approximation is $T_1(x,y) = L(x,y) = 1 + x/2 + y$. The second partial derivatives are $f''_{xx} = -\tfrac14(1 + x + 2y)^{-3/2}$, $f''_{xy} = -\tfrac12(1 + x + 2y)^{-3/2}$, and $f''_{yy} = -(1 + x + 2y)^{-3/2}$. Their absolute values are maximal if the combination $1 + x + 2y$ is minimal on the square. Setting $x = y = -1/4$ shows that $1/4 < 1 + x + 2y$ in the square, so that $(1 + x + 2y)^{-3/2} < 8$. Therefore, $|f''_{xx}| \le M_{20} = 2$, $|f''_{xy}| \le M_{11} = 4$, and $|f''_{yy}| \le M_{02} = 8$. Thus, by Corollary 13.3 with $n = 2$,

$$|f(x,y) - L(x,y)| \le \tfrac12\big(M_{20}\,x^2 + 2M_{11}|xy| + M_{02}\,y^2\big) = x^2 + 4|xy| + 4y^2$$

in the square. □

EXAMPLE 13.38. Use the linear approximation or the differential to estimate the amount of aluminum in a closed aluminum can with diameter 10 cm and height 10 cm if the aluminum is 0.05 cm thick. Assess the accuracy of the estimate.

SOLUTION: The volume of a cylinder of radius $r$ and height $h$ is $f(h,r) = \pi h r^2$. The volume of a closed cylindrical shell (or the can) of thickness $\delta$ is therefore $V = f(h + 2\delta, r + \delta) - f(h,r)$, where $h$ and $r$ are the internal height and radius of the shell. Put $dh = 2\delta = 0.1$ and $dr = \delta = 0.05$. Then $V \approx df(10,5)$. One has $f'_h = \pi r^2$ and $f'_r = 2\pi h r$; hence, $df(10,5) = f'_h(10,5)\,dh + f'_r(10,5)\,dr = 25\pi\,dh + 100\pi\,dr = 7.5\pi$ cm³. To assess the accuracy, note that $f$ is a polynomial, and therefore all its partial derivatives of any order are continuous. In particular, $f''_{hh} = 0$, $f''_{hr} = 2\pi r$, and $f''_{rr} = 2\pi h$. Since $dh$ and $dr$ are small compared to $r = 5$ and $h = 10$, the variations of the second derivatives in the rectangle $r \in [5 - dr, 5 + dr]$, $h \in [10 - dh, 10 + dh]$ may be neglected.
Then $|\varepsilon_2| \le \tfrac12\big(M_{20}(dh)^2 + 2M_{11}|dh\,dr| + M_{02}(dr)^2\big)$, where the estimates $M_{20} = |f''_{hh}(10,5)| = 0$, $M_{11} = |f''_{hr}(10,5)| = 10\pi$, and $M_{02} = |f''_{rr}(10,5)| = 20\pi$ can be used. So $|\varepsilon_2| \le 0.075\pi$. The relative error is $|\varepsilon_2|/V = 0.01$, or 1%. □

92.4. Multivariable Taylor Polynomials. For more than two variables, Taylor polynomials are defined similarly. Let $\mathbf r = (x_1, x_2, \ldots, x_m)$ and let $d\mathbf r = (dx_1, dx_2, \ldots, dx_m)$. Suppose that a function $f$ has continuous partial derivatives up to order $n$. The $n$-th order differential of $f(\mathbf r)$ is defined by

$$d^n f(\mathbf r) = \left(dx_1\frac{\partial}{\partial x_1} + dx_2\frac{\partial}{\partial x_2} + \cdots + dx_m\frac{\partial}{\partial x_m}\right)^n f(\mathbf r),$$

where the variables $\mathbf r$ and $d\mathbf r$ are viewed as independent when differentiating. The Taylor polynomial of degree $n$ about a point $\mathbf r_0$ is

$$T_n(\mathbf r) = f(\mathbf r_0) + \frac{1}{1!}\,df(\mathbf r_0) + \frac{1}{2!}\,d^2 f(\mathbf r_0) + \cdots + \frac{1}{n!}\,d^n f(\mathbf r_0),$$

where $d\mathbf r = \mathbf r - \mathbf r_0$. The Taylor theorem has a natural extension to the multivariable case: $f(\mathbf r) = T_{n-1}(\mathbf r) + \varepsilon_n(\mathbf r)$, where the remainder $\varepsilon_n(\mathbf r)$ satisfies the condition (13.12). Taylor polynomials obey the recurrence relation

$$T_n(\mathbf r) = T_{n-1}(\mathbf r) + \frac{1}{n!}\,d^n f(\mathbf r_0).$$

So, for practical purposes, the error $|\varepsilon_n|$ of the approximation $f \approx T_{n-1}$ may be estimated by $d^n f(\mathbf r_0)/n!$, where the $dx_j$ are replaced by their absolute values $|dx_j|$ and the values of partial derivatives are also replaced by their absolute values (just as was done when estimating $|\varepsilon_2|$ in the case of two variables), provided the partial derivatives of $f$ of order $n$ or higher are continuous in a neighborhood of $\mathbf r_0$.

Calculation of higher-order derivatives to find Taylor polynomials might be a technically tedious problem. In some special cases, however, it can be avoided. The concept is illustrated by the following example.

EXAMPLE 13.39. Find $T_3$ for the function $f(x,y,z) = \sin(xy + z)$ about the origin.

SOLUTION: The Taylor polynomial $T_3$ in question is a polynomial of degree 3 in $x$, $y$, and $z$, which is uniquely determined by the coefficients of monomials of degree less than or equal to 3. Put $u = xy + z$. The variable $u$ is small near the origin.
So the Taylor polynomial approximation for $f$ near the origin is determined by the Taylor polynomials for $\sin u$ about $u = 0$. The latter is obtained from the Maclaurin series $\sin u = u - \tfrac16 u^3 + \varepsilon_5(u)$, where $\varepsilon_5$ contains only monomials of degree 5 and higher. Since the polynomial $u$ vanishes at the origin, its powers $u^n$ may contain only monomials of degree $n$ and higher. Therefore, $T_3$ is obtained from

$$u - \tfrac16 u^3 = (xy + z) - \tfrac16(xy + z)^3 = z + xy - \tfrac16\big(z^3 + 3(xy)z^2 + 3(xy)^2 z + (xy)^3\big)$$

by retaining in the latter all monomials up to degree 3, which yields $T_3(\mathbf r) = z + xy - \tfrac16 z^3$. Evidently, the procedure is far simpler than calculating 19 partial derivatives (up to the third order)! □

92.5. Study Problems.

Problem 13.10. Find $T_1$, $T_2$, and $T_3$ for $f(x,y,z) = (1 + xy)/(1 + x + y^2 + z^3)$ about the origin.

SOLUTION: The function $f$ is a rational function. It is therefore sufficient to find a suitable Taylor polynomial for the function $(1 + x + y^2 + z^3)^{-1}$ and then multiply it by the polynomial $1 + xy$, retaining only monomials up to degree 3. Put $u = x + y^2 + z^3$. Then $(1 + u)^{-1} = 1 - u + u^2 - u^3 + \cdots$ (as a geometric series). Note that, for $n \ge 4$, the terms $u^n$ contain only monomials of degree 4 and higher and hence can be omitted. Up to degree 3, one has $u^2 = x^2 + 2xy^2 + \cdots$ and $u^3 = x^3 + \cdots$. Therefore,

$$(1 + xy)(1 - u + u^2 - u^3) = (1 + xy)(1 - x - y^2 - z^3 + x^2 + 2xy^2 - x^3 + \cdots).$$

Carrying out the multiplication and arranging the monomials in the order of increasing degrees, one infers:

$$T_1(x,y,z) = 1 - x,$$
$$T_2(x,y,z) = T_1(x,y,z) + x^2 + xy - y^2,$$
$$T_3(x,y,z) = T_2(x,y,z) - x^3 - x^2 y + 2xy^2 - z^3 .$$

Problem 13.11. (Multivariable Taylor and Maclaurin Series) Suppose that a function $f$ has continuous partial derivatives of any order and the remainder in the Taylor polynomial approximation $f = T_{n-1} + \varepsilon_n$ near $\mathbf r_0$ converges to 0 as $n \to \infty$ (i.e., $\varepsilon_n \to 0$). Then the function can be represented by the Taylor series about a point $\mathbf r_0$:

$$f(\mathbf r) = f(\mathbf r_0) + \sum_{n=1}^{\infty} \frac{1}{n!}\,d^n f(\mathbf r_0),$$

where $d\mathbf r = \mathbf r - \mathbf r_0$.
The Taylor series about $\mathbf r_0 = \mathbf 0$ is called the Maclaurin series. Find the Maclaurin series of $\sin(xy^2)$.

SOLUTION: Since the argument of the sine is the polynomial $xy^2$, the Maclaurin series of $f$ can be obtained from the Maclaurin series of $\sin u$ by setting $u = xy^2$ in it. From Calculus II,

$$f(\mathbf r) = \sin u = \sum_{n=1}^{\infty} \frac{(-1)^{n+1}}{(2n-1)!}\,u^{2n-1} = \sum_{n=1}^{\infty} \frac{(-1)^{n+1}}{(2n-1)!}\,x^{2n-1} y^{4n-2}.$$

□

92.6. Exercises.

(1) Find the differential $df$ of each of the following functions:
(i) $f(x,y) = x^3 + y^3 - 3xy(x - y)$
(ii) $f(x,y) = y\cos(x^2 y)$
(iii) $f(x,y) = \sin(x^2 + y^2)$
(iv) $f(x,y,z) = x + yz + y e^{xyz}$
(v) $f(x,y,z) = \ln(x^x y^y z^z)$
(vi) $f(x,y,z) = y/(1 + xyz)$
(vii) $f(\mathbf r) = \sqrt{a^2 - \|\mathbf r\|^2}$, where $a$ is a constant and $\mathbf r = (x_1, x_2, \ldots, x_m)$

(2) Four positive numbers, each less than 100, are rounded and then multiplied together. Use differentials to estimate the maximum possible error in the computed product that might result from the rounding.

(3) A boundary stripe 10 cm wide is painted around a rectangle whose dimensions are 50 m by 100 m. Use differentials to approximate the number of square meters of paint in the stripe. Assess the accuracy of the approximation.

(4) A rectangle has sides of $x = 6$ m and $y = 8$ m. Use differentials to estimate the change of the length of the diagonal and the area of the rectangle if $x$ is increased by 2 cm and $y$ is decreased by 5 cm. Assess the accuracy of the estimates.

(5) Consider a sector of a disk with radius $R = 20$ cm and the angle $\theta = \pi/3$. Use the differential to determine how much the radius should be decreased in order for the area of the sector to remain the same when the angle is increased by 1°. Assess the accuracy of the estimate.

(6) Let the quantities $f$ and $g$ be measured with relative errors $R_f$ and $R_g$. Show that the relative error of the product $fg$ is the sum $R_f + R_g$.

(7) Measurements of the radius $r$ and the height $h$ of a cylinder are $r = 2.2 \pm 0.1$ and $h = 3.1 \pm 0.2$, in meters.
Find the absolute and relative errors of the volume of the cylinder calculated from these data.

(8) The adjacent sides of a triangle have lengths $a = 100 \pm 2$ and $b = 200 \pm 5$, in meters, and the angle between them is $\theta = 60° \pm 1°$. Find the relative and absolute errors in calculation of the length of the third side of the triangle.

(9) If $R$ is the total resistance of $n$ resistors, connected in parallel, with resistances $R_j$, $j = 1, 2, \ldots, n$, then $R^{-1} = R_1^{-1} + R_2^{-1} + \cdots + R_n^{-1}$. If each resistance $R_j$ is known with a relative error of 0.5%, what is the relative error of $R$?

(10) Use the Taylor theorem to assess the maximal error of the linear approximation of the following functions about the origin in the ball of radius $R$ (i.e., for r [...]

[...] 0, that is at the distance $h$ from $\mathbf r_0$ in the direction of $\hat{\mathbf u}$. So the slope is given by the derivative $F'(0)$. Therefore, the following definition is natural.

DEFINITION 13.23. (Directional Derivative). Let $f$ be a function on an open set $D$. The directional derivative of $f$ at $\mathbf r_0 \in D$ in the direction of a unit vector $\hat{\mathbf u}$ is the limit

$$D_u f(\mathbf r_0) = \lim_{h \to 0} \frac{f(\mathbf r_0 + h\hat{\mathbf u}) - f(\mathbf r_0)}{h}$$

if the limit exists.

The number $D_u f(\mathbf r_0)$ is the rate of change of $f$ at $\mathbf r_0$ in the direction of $\hat{\mathbf u}$. Suppose that $f$ is a differentiable function. By definition, $D_u f(\mathbf r_0) = df(\mathbf r(h))/dh$ taken at $h = 0$, where $\mathbf r(h) = \mathbf r_0 + h\hat{\mathbf u}$. So, by the chain rule,

$$\frac{df(\mathbf r(h))}{dh} = f'_{x_1}(\mathbf r(h))\,x_1'(h) + f'_{x_2}(\mathbf r(h))\,x_2'(h) + \cdots + f'_{x_m}(\mathbf r(h))\,x_m'(h).$$

Setting $h = 0$ in this relation and taking into account that $\mathbf r'(h) = \hat{\mathbf u}$ or $x_i'(h) = u_i$, where $\hat{\mathbf u} = (u_1, u_2, \ldots, u_m)$, one infers that

$$D_u f(\mathbf r_0) = f'_{x_1}(\mathbf r_0)\,u_1 + f'_{x_2}(\mathbf r_0)\,u_2 + \cdots + f'_{x_m}(\mathbf r_0)\,u_m . \qquad (13.14)$$

Remark. If $f$ has partial derivatives at $\mathbf r_0$, but is not differentiable at $\mathbf r_0$, then the relation (13.14) is false. An example is given in Study Problem 13.12. Note that (13.14) follows from the chain rule, but the mere existence of partial derivatives is not sufficient for the chain rule to hold.
Furthermore, even if a function has directional derivatives at a point in every direction, it may not be differentiable at that point (no good linear approximation exists at that point).

Equation (13.14) provides a convenient way to compute the directional derivative if $f$ is differentiable. Recall also that if the direction is specified by a nonunit vector $\mathbf u$, then the corresponding unit vector can be obtained by dividing it by its length $\|\mathbf u\|$, that is, $\hat{\mathbf u} = \mathbf u/\|\mathbf u\|$.

EXAMPLE 13.40. The height of a hill is $f(x,y) = (9 - 3x^2 - y^2)^{1/2}$, where the $x$ and $y$ axes are directed from west to east and from south to north, respectively. A hiker is at the point $\mathbf r_0 = (1, 2)$. Suppose the hiker is facing in the northwest direction. What is the slope the hiker sees?

SOLUTION: A unit vector in the plane can always be written in the form $\hat{\mathbf u} = (\cos\varphi, \sin\varphi)$, where the angle $\varphi$ is counted counterclockwise from the positive $x$ axis; that is, $\varphi = 0$ corresponds to the east direction, $\varphi = \pi/2$ to the north direction, $\varphi = \pi$ to the west direction, and so on. So, for the northwest direction, $\varphi = 3\pi/4$ and $\hat{\mathbf u} = (-1/\sqrt2, 1/\sqrt2) = (u_1, u_2)$. The partial derivatives are $f'_x = -3x/(9 - 3x^2 - y^2)^{1/2}$ and $f'_y = -y/(9 - 3x^2 - y^2)^{1/2}$. Their values at $\mathbf r_0 = (1, 2)$ are $f'_x(1,2) = -3/\sqrt2$ and $f'_y(1,2) = -2/\sqrt2$. By (13.14), the slope is

$$D_u f(\mathbf r_0) = f'_x(\mathbf r_0)\,u_1 + f'_y(\mathbf r_0)\,u_2 = 3/2 - 1 = 1/2 .$$

If the hiker goes northwest, he has to climb at an angle of $\tan^{-1}(1/2) \approx 27°$ relative to the horizon. □

EXAMPLE 13.41. Find the directional derivative of $f(x,y,z) = x^2 + 3xz + z^2 y$ at the point $(1, 1, -1)$ in the direction toward the point $(3, -1, 0)$. Does the function increase or decrease in this direction?

SOLUTION: Put $\mathbf r_0 = (1, 1, -1)$ and $\mathbf r_1 = (3, -1, 0)$. Then the vector $\mathbf u = \mathbf r_1 - \mathbf r_0 = (2, -2, 1)$ points from the point $\mathbf r_0$ toward the point $\mathbf r_1$ according to the rules of vector algebra. But it is not a unit vector because its length is $\|\mathbf u\| = 3$.
So the unit vector in the same direction is $\hat{\mathbf u} = \mathbf u/3 = (2/3, -2/3, 1/3) = (u_1, u_2, u_3)$. The partial derivatives are $f'_x = 2x + 3z$, $f'_y = z^2$, and $f'_z = 3x + 2zy$. Their values at $\mathbf r_0$ read $f'_x(\mathbf r_0) = -1$, $f'_y(\mathbf r_0) = 1$, and $f'_z(\mathbf r_0) = 1$. By (13.14), the directional derivative is

$$D_u f(\mathbf r_0) = f'_x(\mathbf r_0)\,u_1 + f'_y(\mathbf r_0)\,u_2 + f'_z(\mathbf r_0)\,u_3 = -2/3 - 2/3 + 1/3 = -1 .$$

Since the directional derivative is negative, the function decreases at $\mathbf r_0$ in the direction toward $\mathbf r_1$ (the rate of change is negative in that direction). □

93.2. The Gradient and Its Geometrical Significance.

DEFINITION 13.24. (The Gradient). Let $f$ be a differentiable function of several variables $\mathbf r = (x_1, x_2, \ldots, x_m)$ on an open set $D$ and let $\mathbf r_0 \in D$. The vector whose components are the partial derivatives of $f$ at $\mathbf r_0$,

$$\nabla f(\mathbf r_0) = \big(f'_{x_1}(\mathbf r_0),\, f'_{x_2}(\mathbf r_0),\, \ldots,\, f'_{x_m}(\mathbf r_0)\big),$$

is called the gradient of $f$ at the point $\mathbf r_0$.

So, for two-variable functions $f(x,y)$, the gradient is $\nabla f = (f'_x, f'_y)$; for three-variable functions $f(x,y,z)$, the gradient is $\nabla f = (f'_x, f'_y, f'_z)$; and so on. Comparing (13.14) with the definition of the gradient and recalling the definition of the dot product, the directional derivative can now be written in the compact form

$$D_u f(\mathbf r_0) = \nabla f(\mathbf r_0) \cdot \hat{\mathbf u} . \qquad (13.15)$$

This equation is the most suitable for analyzing the significance of the gradient.

Consider first the cases of two- and three-variable functions. The gradient is a vector in either a plane or space, respectively. In Example 13.40, the gradient at $(1, 2)$ is $\nabla f(1,2) = (-3/\sqrt2, -2/\sqrt2)$. In Example 13.41, the gradient at $(1, 1, -1)$ is $\nabla f(1,1,-1) = (-1, 1, 1)$. Recall the geometrical property of the dot product $\mathbf a \cdot \mathbf b = \|\mathbf a\|\,\|\mathbf b\| \cos\theta$, where $\theta \in [0, \pi]$ is the angle between the nonzero vectors $\mathbf a$ and $\mathbf b$. The value $\theta = 0$ corresponds to parallel vectors $\mathbf a$ and $\mathbf b$. When $\theta = \pi/2$, the vectors are orthogonal. The vectors point in opposite directions if $\theta = \pi$. Assume that $\nabla f(\mathbf r_0) \ne \mathbf 0$. Let $\theta$ be the angle between the gradient $\nabla f(\mathbf r_0)$ and the unit vector $\hat{\mathbf u}$. Then

$$D_u f(\mathbf r_0) = \nabla f(\mathbf r_0) \cdot \hat{\mathbf u} = \|\nabla f(\mathbf r_0)\|\,\|\hat{\mathbf u}\| \cos\theta = \|\nabla f(\mathbf r_0)\| \cos\theta \qquad (13.16)$$

because $\|\hat{\mathbf u}\| = 1$ (the unit vector). As the components of the gradient are fixed numbers (the values of the partial derivatives at a particular point $\mathbf r_0$), the directional derivative at $\mathbf r_0$ varies only if the vector $\hat{\mathbf u}$ changes. Thus, the rates of change of $f$ in all directions that have the same angle $\theta$ with the gradient are the same. In the two-variable case, only two such directions are possible if $\hat{\mathbf u}$ is not parallel to the gradient, while in the three-variable case the rays from $\mathbf r_0$ in all such directions form a cone whose axis is along the gradient, as depicted in the left and right panels of Figure 13.10, respectively. It is then concluded that the rate of change of $f$ attains its absolute maximum or minimum when $\cos\theta$ does.

FIGURE 13.10. Left: The same rate of change of a function of two variables at a point $P_0$ occurs in two directions that have the same angle with the gradient $\nabla f(P_0)$. Right: The same rate of change of a function of three variables at a point $P_0$ occurs in infinitely many directions that have the same angle with the gradient $\nabla f(P_0)$.

FIGURE 13.11. Left: The gradient at a point $P$ is normal to a level curve $f(x,y) = k$ through $P$ of a function $f$ of two variables. Middle: A curve $C$ of steepest descent or ascent for a function $f$ has the characteristic property that the gradient $\nabla f$ is tangent to it. The level curves (surfaces) of $f$ are normal to $C$. The function $f$ increases most rapidly along $C$ in the direction of $\nabla f$, and $f$ decreases most rapidly along $C$ in the opposite direction $-\nabla f$. Right: The gradient of a function of three variables is normal to any curve through $P$ in the level surface $f(x,y,z) = k$. So $\nabla f(P)$ is a normal to the tangent plane through $P$ to the level surface.
Therefore, the maximal rate is attained in the direction of the gradient ($\theta = 0$) and is equal to the magnitude of the gradient $\|\nabla f(\mathbf r_0)\|$, whereas the minimal rate of change $-\|\nabla f(\mathbf r_0)\|$ occurs in the direction of $-\nabla f(\mathbf r_0)$, that is, opposite to the gradient ($\theta = \pi$).

The graph of a function of two variables $z = f(x,y)$ may be viewed as the shape of a hill. Then the gradient at a particular point shows the direction of the steepest ascent, while its opposite points in the direction of the steepest descent. In Example 13.40, the maximal slope at the point $(1, 2)$ is $\|\nabla f(\mathbf r_0)\| = (1/\sqrt2)\,\|(-3, -2)\| = \sqrt{13/2}$. It occurs in the direction of $(-3/\sqrt2, -2/\sqrt2)$ or $(-3, -2)$ (the multiplication of a vector by a positive constant does not change its direction). If $\varphi$ is the angle between the positive $x$ axis (or the vector $\hat{\mathbf e}_1$) and the gradient, then $\tan\varphi = 2/3$ with $\varphi$ in the third quadrant, or $\varphi \approx 213.7°$. If the hiker goes in this direction, he has to climb up at an angle of $\tan^{-1}(\sqrt{13/2}) \approx 69°$ with the horizon. Also, note that the hiker's original direction was $\varphi = 135°$, which makes the angle $78.7°$ with the direction of the steepest ascent. So the direction $\varphi \approx 213.7° + 78.7° = 292.4°$ has the same slope as the hiker's original one. As has been argued, in the two-variable case, there can only be two directions with the same slope.

Next, consider a level curve $f(x,y) = k$ of a differentiable function of two variables. Suppose that there is a differentiable vector function $\mathbf r(t) = (x(t), y(t))$ that traces out the level curve. This vector function should satisfy the condition that $f(x(t), y(t)) = k$ for all values of the parameter $t$. By the definition of level curves, the function $f$ has a constant value $k$ along its level curve. Therefore, by the chain rule,

$$\frac{d}{dt} f(x(t), y(t)) = 0 \quad\Longrightarrow\quad \frac{df}{dt} = f'_x\,x'(t) + f'_y\,y'(t) = \nabla f(\mathbf r(t)) \cdot \mathbf r'(t) = 0$$

for any value of $t$.
For any particular value $t = t_0$, the point $\mathbf r_0 = \mathbf r(t_0)$ lies on the level curve, while the derivative $\mathbf r'(t_0)$ is a tangent vector to the curve at the point $\mathbf r_0$. Thus, the gradient $\nabla f(\mathbf r_0)$ is orthogonal to a tangent vector at the point $\mathbf r_0$ to the level curve of $f$ through that point. This is often expressed by saying that the gradient of $f$ is always normal to smooth level curves of $f$.

The existence of a smooth vector function that traverses a level curve of $f$ can be established by the implicit function theorem. Suppose that $f$ has continuous partial derivatives and $\nabla f(\mathbf r_0) \ne \mathbf 0$ at a point $\mathbf r_0 = (x_0, y_0)$ on the level curve $f(x,y) = k$. The level curve can also be defined as the set of roots of the function $F(x,y) = f(x,y) - k$, that is, the set of solutions of $F(x,y) = 0$. In particular, the point $\mathbf r_0$ is a root, $F(x_0, y_0) = 0$. The function $F$ has continuous partial derivatives and $\nabla F(\mathbf r_0) = \nabla f(\mathbf r_0) \ne \mathbf 0$. The components of the gradient do not vanish simultaneously, and without loss of generality, one can assume that $F'_y(\mathbf r_0) = f'_y(\mathbf r_0) \ne 0$. Then, by the implicit function theorem, there is a function $y = g(x)$ such that $F(x, g(x)) = 0$ in some open interval containing $x_0$; that is, the graph $y = g(x)$ coincides with the level curve $f(x,y) = k$ in a neighborhood of $(x_0, y_0)$, where $y_0 = g(x_0)$. Furthermore, the derivative $g'(x) = -F'_x/F'_y = -f'_x/f'_y$ exists in that interval. Hence, the vector function $\mathbf r(t) = (t, g(t))$ traverses the graph $y = g(x)$ and the level curve near $\mathbf r_0 = \mathbf r(t_0)$, where $t_0 = x_0$. It is smooth because $\mathbf r'(t) = (1, g'(t))$ so that $\mathbf r'(t) \ne \mathbf 0$.

Recall that a function $f(x,y)$ can be described by a contour map, which is a collection of level curves. If level curves are smooth enough to have tangent vectors everywhere, then one can define a curve through a particular point that is normal to all level curves in some neighborhood of that point. This curve is called the curve of steepest descent or ascent through that point.
The tangent vector of this curve at any point is parallel to the gradient at that point. The values of the function increase (or decrease) most rapidly along this curve. If a hiker follows the direction of the gradient of the height, he would go along the path of steepest ascent or descent.

The case of functions of three variables can be analyzed along similar lines. Let a function $f(x,y,z)$ have continuous partial derivatives in a neighborhood of $\mathbf r_0 = (x_0, y_0, z_0)$ such that $\nabla f(\mathbf r_0) \ne \mathbf 0$. Consider a level surface of $f$ through $\mathbf r_0$, $f(x,y,z) = k$, which is the set of roots of the function $F(x,y,z) = f(x,y,z) - k$. Since the components of the gradient $\nabla F(\mathbf r_0) = \nabla f(\mathbf r_0)$ do not vanish simultaneously, one can assume that, say, $F'_z(\mathbf r_0) = f'_z(\mathbf r_0) \ne 0$. By the implicit function theorem, there is a function $g(x,y)$ such that the graph $z = g(x,y)$ coincides with the level surface near $\mathbf r_0$; that is, $F(x, y, g(x,y)) = 0$ for all $(x,y)$ near $(x_0, y_0)$. The function $g$ has continuous partial derivatives, and its linearization at $(x_0, y_0)$ defines a plane tangent to the graph $z = g(x,y)$ at $\mathbf r_0$, where $z_0 = g(x_0, y_0)$. A normal of the tangent plane is $\mathbf n = (g'_x, g'_y, -1)$, where the derivatives are taken at the point $(x_0, y_0)$. Using $g'_x = -F'_x/F'_z = -f'_x/f'_z$ and $g'_y = -F'_y/F'_z = -f'_y/f'_z$, where the derivatives of $f$ are taken at $\mathbf r_0$, it follows that

$$\mathbf n = -\frac{1}{f'_z(\mathbf r_0)}\big(f'_x(\mathbf r_0),\, f'_y(\mathbf r_0),\, f'_z(\mathbf r_0)\big) = -\frac{1}{f'_z(\mathbf r_0)}\,\nabla f(\mathbf r_0).$$

Thus, the gradient $\nabla f(\mathbf r_0)$ is proportional to $\mathbf n$ and hence is also normal to the tangent plane. Furthermore, if $\mathbf r(t) = (x(t), y(t), z(t))$ is a smooth curve on the level surface, that is, $f(\mathbf r(t)) = k$ for all values of $t$, then $df/dt = 0$ (the values of $f$ do not change along the curve), and, by the chain rule, $df/dt = f'_x x' + f'_y y' + f'_z z' = \nabla f(\mathbf r(t)) \cdot \mathbf r'(t) = 0$. So the gradient $\nabla f(\mathbf r_0)$ is orthogonal to a tangent vector to any such curve through $\mathbf r_0$, and all tangent vectors to curves through $\mathbf r_0$ lie in the tangent plane to the level surface through $\mathbf r_0$.
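Both conclusions of this argument can be spot-checked numerically. The sketch below is illustrative and rests on assumptions that are not in the text: the function $f(x,y,z) = x^2 + 2y^2 + 3z^2$, the level value 9, the particular curve on that level surface, and the helper names `curve`, `tangent`, `dot`, and `cross` are all hypothetical choices. It checks that $\nabla f \cdot \mathbf r'(t) \approx 0$ along the curve and that the graph normal $(g'_x, g'_y, -1)$ is parallel to $\nabla f$ at the point $(2, 1, 1)$.

```python
import math


def grad_f(x, y, z):
    """Gradient of the illustrative function f(x, y, z) = x^2 + 2y^2 + 3z^2."""
    return (2.0 * x, 4.0 * y, 6.0 * z)


def curve(t):
    """A smooth curve lying in the level surface f = 9 (hypothetical choice)."""
    s = 0.5 * t
    return (3.0 * math.cos(t) * math.cos(s),
            3.0 / math.sqrt(2.0) * math.sin(t) * math.cos(s),
            math.sqrt(3.0) * math.sin(s))


def tangent(t, h=1e-6):
    """Central-difference approximation of r'(t)."""
    p, q = curve(t + h), curve(t - h)
    return tuple((a - b) / (2.0 * h) for a, b in zip(p, q))


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def cross(u, v):
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])


# 1) The gradient is orthogonal to the tangent vector of a curve in the level surface.
dots = [dot(grad_f(*curve(t)), tangent(t)) for t in (0.3, 1.0, 2.5)]

# 2) Near (2, 1, 1) the surface is the graph z = g(x, y), g = sqrt((9 - x^2 - 2y^2)/3).
x0, y0 = 2.0, 1.0
z0 = math.sqrt((9.0 - x0 ** 2 - 2.0 * y0 ** 2) / 3.0)
normal = (-x0 / (3.0 * z0), -2.0 * y0 / (3.0 * z0), -1.0)  # (g'_x, g'_y, -1)
parallel_check = cross(normal, grad_f(x0, y0, z0))

print("gradient . tangent :", dots)
print("normal x gradient  :", parallel_check)
```

The dot products vanish up to the finite-difference error, and the cross product of the graph normal with the gradient is zero, so the two vectors are parallel, in agreement with $\mathbf n = -\nabla f(\mathbf r_0)/f'_z(\mathbf r_0)$.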
All these findings are summarized in the following theorem.

THEOREM 13.16. (Geometrical Properties of the Gradient). Let $f$ be differentiable at $\mathbf r_0$. Let $S$ be the level set through the point $\mathbf r_0$, and assume $\nabla f(\mathbf r_0) \ne \mathbf 0$. Then
(1) The maximal rate of change of $f$ at $\mathbf r_0$ occurs in the direction of the gradient $\nabla f(\mathbf r_0)$ and is equal to its magnitude $\|\nabla f(\mathbf r_0)\|$.
(2) The minimal rate of change of $f$ at $\mathbf r_0$ occurs in the direction opposite to the gradient, $-\nabla f(\mathbf r_0)$, and equals $-\|\nabla f(\mathbf r_0)\|$.
(3) If $f$ has continuous partial derivatives on an open ball $D$ containing $\mathbf r_0$, then the portion of $S$ inside $D$ is a smooth surface (or curve), and $\nabla f$ is normal to $S$ at $\mathbf r_0$.

EXAMPLE 13.42. Find an equation of the tangent plane to the ellipsoid $x^2 + 2y^2 + 3z^2 = 9$ at the point $(2, 1, 1)$.

SOLUTION: The equation of the ellipsoid can be viewed as the level surface $f(x,y,z) = 9$ of the function $f(x,y,z) = x^2 + 2y^2 + 3z^2$ through the point $\mathbf r_0 = (2, 1, 1)$ because $f(2,1,1) = 4 + 2 + 3 = 9$. By the geometrical property of the gradient, the vector $\mathbf n = \nabla f(\mathbf r_0)$ is normal to the plane in question because the components of $\nabla f = (2x, 4y, 6z)$ are continuous. One has $\mathbf n = (4, 4, 6)$. An equation of the plane through the point $(2, 1, 1)$ and normal to $\mathbf n$ is $4(x - 2) + 4(y - 1) + 6(z - 1) = 0$ or $2x + 2y + 3z = 9$. □

Theorem 13.16 holds for functions of more than three variables as well. Equation (13.15) was obtained for any number of variables, and the representation of the dot product (13.16) holds in any Euclidean space. Thus, the first two properties of the gradient are valid in any multivariable case. The third property is harder to visualize as the level surface of a function of $m$ variables is an $(m - 1)$-dimensional surface embedded in an $m$-dimensional Euclidean space.
To this end, it is only noted that if $\mathbf r(t)$ is a smooth curve in the level set $f(\mathbf r) = k$ of a differentiable function, then $f$ has a constant value along any such curve, and, by the chain rule, it follows that $df(\mathbf r(t))/dt = \nabla f(\mathbf r(t)) \cdot \mathbf r'(t) = 0$ for any $t$. At any particular point $\mathbf r_0 = \mathbf r(t_0)$, all tangent vectors $\mathbf r'(t_0)$ to such curves through $\mathbf r_0$ are orthogonal to a single vector $\nabla f(\mathbf r_0)$. Intuitively, these vectors should form an $(m - 1)$-dimensional Euclidean space (called a tangent space to the level surface at $\mathbf r_0$), just like all vectors in a plane in three-dimensional Euclidean space are orthogonal to a normal of the plane.

Remark. The gradient can be viewed as the result of the action of the operator $\nabla = (\partial/\partial x_1, \partial/\partial x_2, \ldots, \partial/\partial x_m)$ on a function $f$; $\nabla f$ is understood in the sense of multiplication of the "vector" $\nabla$ by a scalar $f$. With this notation, the differential operator $d$ has a compact form $d = d\mathbf r \cdot \nabla$. The linearization $L(\mathbf r)$ of $f(\mathbf r)$ at $\mathbf r_0$ and the differentials of $f$ also have a simple form for any number of variables:

$$L(\mathbf r) = f(\mathbf r_0) + \nabla f(\mathbf r_0) \cdot (\mathbf r - \mathbf r_0), \qquad d^n f(\mathbf r) = (d\mathbf r \cdot \nabla)^n f(\mathbf r).$$

93.3. Study Problems.

Problem 13.12. (Differentiability and Directional Derivative). Let $f(x,y) = y^3/(x^2 + y^2)$ if $(x,y) \ne (0,0)$ and $f(0,0) = 0$. Show that $D_u f(0,0)$ exists for any $\hat{\mathbf u}$, but it is not given by the relation (13.14). Show that this function is not differentiable at $(0,0)$. Thus, the existence of all directional derivatives at a point does not imply differentiability at that point. In other words, although the function has a rate of change at a point in every direction, a good linear approximation may not exist at that point.

SOLUTION: Put $\hat{\mathbf u} = (\cos\theta, \sin\theta)$ for $0 \le \theta < 2\pi$. By the definition of the directional derivative,

$$D_u f(0,0) = \lim_{h \to 0} \frac{f(h\cos\theta, h\sin\theta) - f(0,0)}{h} = \lim_{h \to 0} \frac{h^3 \sin^3\theta}{h^3} = \sin^3\theta .$$

In particular, for $\theta = 0$, $\hat{\mathbf u} = (1, 0)$, and $D_u f(0,0) = f'_x(0,0) = 0$; similarly for $\theta = \pi/2$, $\hat{\mathbf u} = (0, 1)$, and $D_u f(0,0) = f'_y(0,0) = 1$.
If the relation (13.14) were used, one would have found that $D_u f(0,0) = \sin\theta$, which contradicts the above result. If a good linear approximation exists, then it should be $L(x,y) = f'_x(0,0)\,x + f'_y(0,0)\,y = y$. But $(f(x,y) - L(x,y))/(x^2 + y^2)^{1/2} = -yx^2/(x^2 + y^2)^{3/2}$ does not vanish as $(x,y) \to (0,0)$ because it has a nonzero constant value along any ray $(x,y) = (t, at)$, $t > 0$, $a \ne 0$. So $f$ is not differentiable at the origin. □

Problem 13.13. Suppose that three level surfaces $f(x,y,z) = 1$, $g(x,y,z) = 2$, and $h(x,y,z) = 3$ intersect along a smooth curve $C$. Let $P$ be a point on $C$ in whose neighborhood $f$, $g$, and $h$ have continuous partial derivatives and their gradients do not vanish at $P$. Find $\nabla f \cdot (\nabla g \times \nabla h)$ at $P$.

SOLUTION: Let $\mathbf v$ be a tangent vector to $C$ at the point $P$ (it exists because the curve is smooth). Since $C$ lies in the surface $f(x,y,z) = 1$, the gradient $\nabla f(P)$ is orthogonal to $\mathbf v$. Similarly, the gradients $\nabla g(P)$ and $\nabla h(P)$ must be orthogonal to $\mathbf v$. Therefore, all the gradients must be in a plane perpendicular to the vector $\mathbf v$. The triple product of any three coplanar vectors vanishes, and hence $\nabla f \cdot (\nabla g \times \nabla h) = 0$ at $P$. □

Problem 13.14. (Energy Conservation in Mechanics). Consider Newton's second law $m\mathbf a = \mathbf F$. Suppose that the force is the gradient $\mathbf F = -\nabla U$, where $U = U(\mathbf r)$. Let $\mathbf r = \mathbf r(t)$ be the trajectory satisfying Newton's law. Prove that the quantity $E = mv^2/2 + U(\mathbf r)$, where $v = \|\mathbf r'(t)\|$ is the speed, is a constant of motion, that is, $dE/dt = 0$. This constant is called the total energy of a particle.

SOLUTION: First, note that $v^2 = \mathbf v \cdot \mathbf v$. Hence, $(v^2)' = 2\mathbf v \cdot \mathbf v' = 2\mathbf v \cdot \mathbf a$. Using the chain rule, $dU/dt = U'_x x'(t) + U'_y y'(t) + U'_z z'(t) = \mathbf r' \cdot \nabla U = \mathbf v \cdot \nabla U$. It follows from these two relations that

$$\frac{dE}{dt} = \frac{m}{2}(v^2)' + \frac{dU}{dt} = m\mathbf v \cdot \mathbf a + \mathbf v \cdot \nabla U = \mathbf v \cdot (m\mathbf a - \mathbf F) = 0 .$$

So the total energy is conserved for the trajectory of the motion. □

93.4. Exercises.

(1) Let $f(x,y)$ be a differentiable function.
How would you specify the directions at a particular point in which the function does not change at all? How many such directions exist if the first partial derivatives do not vanish at that point? Answer the same questions for a function f (x, y, z). (2) For each of the following functions, find the gradient and the direc- tional derivative at a specified point in the direction parallel to a given vector v. Indicate whether the function increases or decreases in that direction. (i) fv(x, y) = x2y, (1, 2), v= (4, 5) (ii) fv(x, y) = x/(1 + my), (1, 1), v= (2, 1) (iii) f (x, y, z) =X 2yJ - zyJ2 + zz2, (1, 2, -1), y = (1, -2, 2) (iv) f (x, y, z) = tan-1 (1 + X + y2 + z3)(1-11,v (v) f (x, y, z) = x/ + yz, (1, 1, 3), v=(2,6,3) (vi) f(x, y, z) = (x + y)/z, (2, 1, 1), v= (2, -1, -2) (3) Find the maximal and minimal rates of change of each of the fol- lowing functions at a specified point and the directions in which they occur. Find the directions in which the function does not change. (i) f (X, y) = z/y2, (2, 1) (ii) f(x,y) = xcv, (2, 1) (iii) fv(x, y, z) = zz/(1 + yz), (1, 2, 3) (iv) f (x, y, z) = x sin(yz), (1, 2, 7r/3) (v) f (x, y, z) = xyz , (2, 2, 1) (4) Let fv(x, y) = y/(1 + X2 + y). Find all unit vectors n along which the rate of change of f at (2, -3) is a number -1 < p < 1 times the maximal rate of change of f at (2, -3). (5) For the function f(x, y, z) =jx2-y2x+z3y at the point Po(1, 2, -1) find: (i) The maximal rate of change of f and the direction in which it occurs; (ii) A direction in which the rate of change is half of the maximal rate of change. How many such directions exist? (iii) The rate of change in the direction to the point P1(3, 1, 1) (6) If f and u are differentiable functions, prove that Vf (u) = f'(u)Vu. (7) Find Vll x r||2, where e is a constant vector. (8) If f, tt, and v are differentiable functions, prove that Vf~u, v)= f'Vt+ f'Vv. (9) Find the directional derivative of f(r) =(cc/a)2 +(y/b)2 +(z/c)2 at a point r in the direction of r. 
Find the points at which this derivative is equal to |Vf||  93. DIRECTIONAL DERIVATIVE AND THE GRADIENT 255 (10) Find the angle between the gradients of f =z/(x2 + y2 + z2) at the points (1, 2, 2) and (-3, 1, 0). (11) Let f = z/ /z2 + y2 + z2. Sketch the level surfaces of f and |Vf |. What is the significance of the level surfaces of ||Vf||? Find the maximal and minimal values of f and ||Vf|| in the region 1 < z < 2. (12) Let a curve C be defined as the intersection of the plane sin B(x - zo) - cosO(y - yo) = 0, where 0 is a parameter, and the graph z= f(x, y), where f is differentiable. Find tan a, where a is the angle between the tangent line to C at (zo, Yo, f(zo, Yo)) and the zy plane. (13) Consider the function f (x, y, z) = 2 z + zy and three points P0(1, 2, 2), P1(-1, 4, 1), and P2(-2, -2, 2). In which direction does f change faster at Po, toward P1 or toward P2? What is the direction in which f increases most rapidly at Po? (14) For the function f (x, y, z) =czy+zy+zz at the point P0(1, -1, 0), find: (i) The maximal rate of change (ii) The rate of change in the direction v = (-1, 2, -2) (iii) The angle 0 between v and the direction in which the maximal rate of f occurs (15) Let f (x, y, z) =z/(x 2 + y2 + z2)1/2. Find the rate of change of f in the direction of the tangent vector to the curve r(t) = (t, 2t2, -2t2) at the point (1, 2, -2). (16) Find the points at which the gradient of f = x3_+ y3 + z3 - 3xyz is (i) Orthogonal to the z axis (ii) Parallel to the z axis (iii) Zero (17) Let f =ln|r - ro, where ro is a fixed vector. Find points in space where |Vf||= 1. 
(18) For each of the following surfaces, find the tangent plane and the normal line at a specified point:
(i) $x^2 + y^2 + z^2 = 169$, $(3, 4, 12)$
(ii) $x^2 - 2y^2 + z^2 + yz = 2$, $(2, 1, -1)$
(iii) $x = \tan^{-1}(y/z)$, $(\pi/4, 1, 1)$
(iv) $z = y + \ln(x/z)$, $(1, 1, 1)$
(v) $2^{x/z} + 2^{y/z} = 8$, $(2, 2, 1)$
(19) Find the points of the surface $x^2 + 2y^2 + 3z^2 + 2xy + 2xz + 4yz = 8$ at which the tangent planes are parallel to the coordinate planes.
(20) Find the tangent planes to the surface $x^2 + 2y^2 + 3z^2 = 21$ that are parallel to the plane $x + 4y + 6z = 0$.
(21) Find the points on the ellipsoid $x^2/a^2 + y^2/b^2 + z^2/c^2 = 1$ at which the normal line makes equal angles with the coordinate axes.
(22) Consider the paraboloid $z = x^2 + y^2$.
(i) Give the parametric equations of the normal line through a point $P_0(x_0, y_0, z_0)$ on the paraboloid;
(ii) Consider all normal lines through points with a fixed value of $z_0$ (say, $z_0 = 2$). Show that all such lines intersect at one single point that lies on the $z$ axis and find the coordinates of this point.
(23) Find the points on the hyperboloid $x^2 - y^2 + 2z^2 = 5$ where the normal line is parallel to the line that joins the points $(3, -1, 0)$ and $(5, 3, 8)$.
(24) Find an equation of the plane tangent to the surface $x^2 + y^2 - 4z^2 = 1$ at a generic point $(x_0, y_0, z_0)$ of the surface.
(25) Find the rate of change of the function $h(x, y) = \sqrt{10 - x^2y^2}$ at the point $P_0(1, 1)$ in the direction toward the point $P(-2, 5)$. Let $h(x, y)$ be the height in a neighborhood of $P_0$. Would you be climbing up or going down when you go from $P_0$ toward $P$?
(26) Your Mars rover is caught on the slope of a mountain by a dust storm. The visibility is 0. Your current position is $P_0(1, 2)$. You can escape in the direction of a cave located at $P_1(4, -2)$ or in the direction of the base located at $P_2(17, 14)$. Which way would you drive to avoid steep climbing or descending if the height in the area can be approximated by the function $h(x, y) = xy + x^2$?
(27) You are flying a small aircraft on the planet Weirdo. You have disturbed a nest of nasty everything-eating bugs. The onboard radar indicates that the concentration of the bugs is $C(x, y, z) = 100 - x^2 - 2y^2 - 3z^2$ and $C(x, y, z) = 0$ if $x^2 + 2y^2 + 3z^2 > 100$. If your current position is $(2, 3, 1)$, in which direction would you fire a mass-destruction microwave laser to kill as many poor bugs as possible near you? Find the optimal escape trajectory.
(28) Let two level curves $f(x, y) = 0$ and $g(x, y) = 0$ of functions $f$ and $g$, whose partial derivatives are continuous, intersect at some point $P_0$. The rate of change of the function $f$ at $P_0$ along the curve $g(x, y) = 0$ is half of its maximal rate of change at $P_0$. What is the angle at which the curves intersect (the angle between the tangent lines)?
(29) Suppose that the directional derivatives $D_{\hat{\mathbf{u}}}f = a$ and $D_{\hat{\mathbf{v}}}f = b$ of a differentiable function $f(x, y)$ are known at a particular point $P_0$ for two unit nonparallel vectors $\hat{\mathbf{u}}$ and $\hat{\mathbf{v}}$ that make the angles $\theta$ and $\varphi$ with the $x$ axis, counted counterclockwise from the latter, respectively. Find the gradient of $f$ at $P_0$.
(30) Three tests of drilling into rock along the directions $\mathbf{u} = (1, 2, 2)$, $\mathbf{v} = (0, 4, 3)$, and $\mathbf{w} = (0, 0, 1)$ showed that the gold concentration increases at the rates 3 g/m, 3 g/m, and 1 g/m, respectively. Assume that the concentration is a differentiable function. In what direction would you drill to maximize the gold yield, and at what rate does the gold concentration increase in that direction? If the concentration is not differentiable, would you follow your previous finding about the drilling direction? Explain.
(31) A level surface of a differentiable function $f(x, y, z)$ contains the curves $\mathbf{r}_1(t) = (2 + 3t,\ 1 - t^2,\ 3 - 4t + t^2)$ and $\mathbf{r}_2(t) = (1 + t^2,\ 2t^3 - 1,\ 2t + 1)$. Can this information be used to find the tangent plane to the surface at $(2, 1, 3)$? If so, find an equation of the plane.
(32) Prove that tangent planes to the surface $xyz = a^3 > 0$ and the coordinate planes form tetrahedrons of equal volumes.
(33) Prove that the total length of the intervals from the origin to the points of intersection of tangent planes to the surface $\sqrt{x} + \sqrt{y} + \sqrt{z} = \sqrt{a}$, $a > 0$, with the coordinate axes is constant.
(34) Two surfaces are called orthogonal at a point of intersection if the normal lines to the surfaces at that point are orthogonal. Show that the surfaces $x^2 + y^2 + z^2 = r^2$, $x^2 + y^2 = z^2\tan^2\varphi$, and $y\cos\theta = x\sin\theta$ are pairwise orthogonal at their points of intersection for any values of the constants $r > 0$, $0 < \varphi < \pi$, and $0 < \theta < 2\pi$.
(35) Find the directional derivative of $f(x, y, z)$ in the direction of the gradient of $g(x, y, z)$. What is the geometrical significance of this derivative?
(36) Find the angle at which the cylinder $x^2 + y^2 = a^2$ intersects the surface $bz = xy$ at a generic point of intersection $(x_0, y_0, z_0)$.
(37) A ray of light reflects from a mirrored surface at a point $P$ just as it would reflect from the mirrored plane tangent to the surface at $P$ (if the light travels along a vector $\mathbf{u}$, then the reflected light travels along a vector obtained from $\mathbf{u}$ by reversing the direction of the component parallel to the normal to the surface). Show that the light coming from the top of the $z$ axis and parallel to it will be focused by the parabolic mirror $az = x^2 + y^2$, $a > 0$, to a single point. Find its coordinates. This property of parabolic mirrors is used to design telescopes.
(38) Let $f(x, y) = y$ if $y \neq x^2$ and $f(x, y) = 0$ if $y = x^2$. Find $D_{\mathbf{u}}f(0, 0)$ for all unit vectors $\mathbf{u}$. Show that $f(x, y)$ is not differentiable at $(0, 0)$. Is the function continuous at $(0, 0)$?

94. Maximum and Minimum Values

94.1. Critical Points of Multivariable Functions. The positions of the local maxima and minima of a one-variable function play an important role when analyzing its overall behavior. In Calculus I, it was shown how the derivatives can be used to find local maxima and minima. Here this analysis is extended to multivariable functions.

The following notation will be used. An open ball of radius $\delta$ centered at a point $\mathbf{r}_0$ is denoted $B_\delta = \{\mathbf{r} \mid \|\mathbf{r} - \mathbf{r}_0\| < \delta\}$; that is, it is the set of points whose distance from $\mathbf{r}_0$ is less than $\delta > 0$. A neighborhood $N_\delta$ of a point $\mathbf{r}_0$ in a set $D$ is the set of common points of $D$ and $B_\delta$; that is, $N_\delta = D \cap B_\delta$ contains all points in $D$ whose distance from $\mathbf{r}_0$ is less than $\delta$.

DEFINITION 13.25. (Absolute and Local Maxima or Minima). A function $f$ on a set $D$ is said to have a local maximum at $\mathbf{r}_0 \in D$ if there is a neighborhood $N_\delta$ of $\mathbf{r}_0$ such that $f(\mathbf{r}_0) \geq f(\mathbf{r})$ for all $\mathbf{r} \in N_\delta$. The number $f(\mathbf{r}_0)$ is called a local maximum value. If there is a neighborhood $N_\delta$ of $\mathbf{r}_0$ such that $f(\mathbf{r}_0) \leq f(\mathbf{r})$ for all $\mathbf{r} \in N_\delta$, then $f$ is said to have a local minimum at $\mathbf{r}_0$, and the number $f(\mathbf{r}_0)$ is called a local minimum value. If the inequality $f(\mathbf{r}_0) \geq f(\mathbf{r})$ or $f(\mathbf{r}_0) \leq f(\mathbf{r})$ holds for all points $\mathbf{r}$ in the domain of $f$, then $f$ has an absolute maximum or absolute minimum at $\mathbf{r}_0$, respectively.

Minimal and maximal values are also called extremum values. In the one-variable case, Fermat's theorem asserts that if a differentiable function has a local extremum at $x_0$, then its derivative vanishes at $x_0$. The tangent line to the graph of $f$ at $x_0$ is horizontal: $y = f(x_0) + df(x_0) = f(x_0) + f'(x_0)\,dx = f(x_0)$. There is an extension of Fermat's theorem.

THEOREM 13.17. (Necessary Condition for a Local Extremum). If a differentiable function $f$ has a local extremum at an interior point $\mathbf{r}_0$ of its domain $D$, then $df(\mathbf{r}_0) = 0$ or $\nabla f(\mathbf{r}_0) = \mathbf{0}$ (all partial derivatives of $f$ vanish at $\mathbf{r}_0$).

PROOF. Consider a smooth curve $\mathbf{r}(t)$ through the point $\mathbf{r}_0$ such that $\mathbf{r}(t_0) = \mathbf{r}_0$. Then $d\mathbf{r}(t_0) = \mathbf{r}'(t_0)\,dt \neq \mathbf{0}$ (the curve is smooth and hence has a nonzero tangent vector). The function $F(t) = f(\mathbf{r}(t))$ defines the values of $f$ along the curve. Therefore, $F(t)$ must have a local extremum at $t = t_0$.
Since $f$ is differentiable, the differential $dF(t_0) = F'(t_0)\,dt$ exists by the chain rule: $dF(t_0) = d\mathbf{r}(t_0)\cdot\nabla f(\mathbf{r}_0)$. By Fermat's theorem, $dF(t_0) = 0$ and hence $d\mathbf{r}(t_0)\cdot\nabla f(\mathbf{r}_0) = 0$. This relation means that the vectors $d\mathbf{r}(t_0)$ and $\nabla f(\mathbf{r}_0)$ are orthogonal for all smooth curves through $\mathbf{r}_0$. The only vector that is orthogonal to every vector is the zero vector, and the conclusion of the theorem follows: $\nabla f(\mathbf{r}_0) = \mathbf{0}$. □

In particular, for a differentiable function $f$ of two variables, this theorem states that the tangent plane to the graph of $f$ at a local extremum is horizontal.

The converse of this theorem is not true. Let $f(x, y) = xy$. It is differentiable everywhere, and its partial derivatives are $f'_x = y$ and $f'_y = x$. They vanish at the origin, $\nabla f(0, 0) = \mathbf{0}$. However, the function has neither a local maximum nor a local minimum there. Indeed, consider a straight line through the origin, $x = at$, $y = bt$. Then the values of $f$ along the line are $F(t) = f(x(t), y(t)) = abt^2$. So $F(t)$ has a minimum at $t = 0$ if $ab > 0$ or a maximum if $ab < 0$. Each case is possible. For example, if $a = b = 1$, then $ab = 1 > 0$; if $a = -b = 1$, then $ab = -1 < 0$. Thus, $f$ cannot have a local extremum at $(0, 0)$. The graph $z = xy$ is a hyperbolic paraboloid rotated through an angle $\pi/4$ about the $z$ axis (see Example 11.29). It looks like a saddle. If the graph of $f(x, y)$ has a horizontal tangent plane at $(x_0, y_0)$ and looks like a hyperbolic paraboloid in a small neighborhood of $(x_0, y_0)$, then the point $(x_0, y_0)$ is called a saddle point (see Figure 13.12, right panel).

Remark. The above analysis might give the impression that if $\nabla f(\mathbf{r}_0) = \mathbf{0}$ and the values of $f$ along any straight line through $\mathbf{r}_0$ have a local extremum (i.e., $F(t) = f(\mathbf{r}_0 + \mathbf{v}t)$ has either a local maximum or a local minimum at $t = 0$ for all vectors $\mathbf{v}$), then $f$ has a local extremum at $\mathbf{r}_0$. This conjecture is false! An example is given in Study Problem 13.16.
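The example $f(x, y) = xy$ and the warning in the Remark can be checked numerically. The following Python sketch is only an illustration and is not part of the original text. The helper names `grad`, `along_line`, `f_saddle`, and `f_remark` are hypothetical, the gradient is approximated by central differences with an assumed step `h`, and `f_remark` encodes the function of Study Problem 13.16, $f(x, y) = x^2 + y^2 - 2x^2y - 4x^6y^2/(x^4 + y^2)^2$ with $f(0, 0) = 0$.

```python
import math


def grad(f, x, y, h=1e-6):
    """Approximate the gradient (f'_x, f'_y) by central differences (assumed step h)."""
    fx = (f(x + h, y) - f(x - h, y)) / (2 * h)
    fy = (f(x, y + h) - f(x, y - h)) / (2 * h)
    return fx, fy


def along_line(f, a, b, t):
    """F(t) = f(a t, b t): values of f on the line through the origin along (a, b)."""
    return f(a * t, b * t)


def f_saddle(x, y):
    """f(x, y) = x y; its gradient vanishes at the origin."""
    return x * y


def f_remark(x, y):
    """Function of Study Problem 13.16, defined to be 0 at the origin."""
    if x == 0 and y == 0:
        return 0.0
    return x**2 + y**2 - 2 * x**2 * y - 4 * x**6 * y**2 / (x**4 + y**2) ** 2


if __name__ == "__main__":
    print("gradient of xy at the origin:", grad(f_saddle, 0.0, 0.0))
    # F(t) = a b t^2: a minimum at t = 0 if ab > 0, a maximum if ab < 0.
    print("xy on the line (1, 1):  ", along_line(f_saddle, 1, 1, 0.1))
    print("xy on the line (1, -1): ", along_line(f_saddle, 1, -1, 0.1))

    # f_remark is positive close to the origin on each sampled straight line ...
    t = 1e-3
    for k in range(8):
        phi = k * math.pi / 4
        a, b = math.cos(phi), math.sin(phi)
        positive = along_line(f_remark, a, b, t) > 0 and along_line(f_remark, a, b, -t) > 0
        print(f"direction angle {k}*pi/4: positive on both sides of 0? {positive}")
    # ... but it is negative on the parabola (x, y) = (t, t^2), where f = -t^4.
    print("on the parabola at t = 0.1:", f_remark(0.1, 0.1**2))
```

Sampling a finite set of directions proves nothing by itself; the sketch only makes the two phenomena visible. The product $xy$ changes sign between the lines $(1, 1)$ and $(1, -1)$, and `f_remark` is positive near the origin on the sampled lines while it takes negative values on the parabola $(t, t^2)$, so the origin cannot be a local minimum of that function.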
A local extremum may occur at a point at which the function is not differentiable. For example, $f(x, y) = |x| + |y|$ is continuous everywhere and has an absolute minimum at $(0, 0)$. However, the partial derivatives $f'_x(0, 0)$ and $f'_y(0, 0)$ do not exist (e.g., $f'_x(x, y) = (|x|)'$, which is $1$ if $x > 0$ and is $-1$ if $x < 0$, so $f'_x(0, y)$ does not exist and neither does $f'_x(0, 0)$).

FIGURE 13.12. Left: The graph $z = f(x, y)$ near a local minimum of $f$. The values of $f$ are no less than $f(x_0, y_0)$ for all $(x, y)$ in a sufficiently small neighborhood $N_\delta$ of $(x_0, y_0)$. Middle: The graph $z = f(x, y)$ near a local maximum of $f$. The values of $f$ do not exceed $f(x_0, y_0)$ for all $(x, y)$ in a sufficiently small neighborhood $N_\delta$ of $(x_0, y_0)$. Right: The graph $z = f(x, y)$ near a saddle point of $f$. In a sufficiently small neighborhood $N_\delta$ of $(x_0, y_0)$, the values of $f$ have a local maximum along some lines through $(x_0, y_0)$ and a local minimum along the other lines through $(x_0, y_0)$.

DEFINITION 13.26. (Critical Points). An interior point $\mathbf{r}_0$ of the domain of a function $f$ is said to be a critical point of $f$ if either $\nabla f(\mathbf{r}_0) = \mathbf{0}$ or the gradient does not exist at $\mathbf{r}_0$.

Thus, if $f$ has a local maximum or minimum at $\mathbf{r}_0$, then $\mathbf{r}_0$ is a critical point of $f$. However, not all critical points correspond to either a local maximum or a local minimum.

94.2. Concavity. Recall from Calculus I that if the graph of a function $f(x)$ lies above all its tangent lines in an interval $I$, then $f$ is concave upward on $I$. If the graph lies below all its tangent lines in $I$, then $f$ is concave downward on $I$. Furthermore, if $f'(x_0) = 0$ (the tangent line is horizontal at $x_0$) and $f$ is concave upward in a small open interval $I$ containing $x_0$, then $f(x) > f(x_0)$ for all $x \neq x_0$ in $I$, and hence $f$ has a local minimum at $x_0$. Similarly, $f$ has a local maximum at $x_0$, where $f'(x_0) = 0$, if it is concave downward in a neighborhood of $x_0$. If the function $f$ is twice differentiable on $I$, then it is concave upward if $f''(x) > 0$ on $I$ and it is concave downward if $f''(x) < 0$ on $I$.

The concavity test can be restated in terms of the second-order differential $d^2f(x) = f''(x)(dx)^2$, which is a function of two independent variables $x$ and $dx$. If $d^2f(x) > 0$ for $dx \neq 0$, $f$ is concave upward; if $d^2f(x) < 0$ for $dx \neq 0$, $f$ is concave downward. Suppose that $f'(x_0) = 0$, $f''(x_0) \neq 0$, and $f''$ is continuous at $x_0$. The continuity of $f''$ ensures that $d^2f(x)$ has the same sign as $d^2f(x_0)$ for all $x$ near $x_0$ and all $dx \neq 0$. Hence, the graph of $f$ has a fixed concavity in a neighborhood of $x_0$. Thus, if $d^2f(x_0) < 0$ ($dx \neq 0$), then $f$ has a local maximum at $x_0$; if $d^2f(x_0) > 0$ ($dx \neq 0$), then $f$ has a local minimum at $x_0$. It turns out that this sufficient condition for a function to have a local extremum has a natural extension to functions of several variables.

THEOREM 13.18. (Sufficient Condition for a Local Extremum). Suppose that a function $f$ has continuous second partial derivatives in an open ball containing a point $\mathbf{r}_0$ and $\nabla f(\mathbf{r}_0) = \mathbf{0}$. Then $f$ has a local maximum at $\mathbf{r}_0$ if $d^2f(\mathbf{r}_0) < 0$, and $f$ has a local minimum at $\mathbf{r}_0$ if $d^2f(\mathbf{r}_0) > 0$, for all $d\mathbf{r}$ such that $\|d\mathbf{r}\| \neq 0$.

The proof of this theorem is omitted. However, an analogy can be made with the one-variable case. By the Taylor theorem (Theorem 13.15), values of a function $f$ in a sufficiently small neighborhood of a point $\mathbf{r}_0$ are well approximated as $f(\mathbf{r}) = f(\mathbf{r}_0) + df(\mathbf{r}_0) + \tfrac{1}{2}d^2f(\mathbf{r}_0)$, where $d\mathbf{r} = \mathbf{r} - \mathbf{r}_0$. The first two terms define a linearization $L(\mathbf{r}) = f(\mathbf{r}_0) + df(\mathbf{r}_0)$ (or tangent plane) of $f$ at $\mathbf{r}_0$. Therefore, $f(\mathbf{r}) - L(\mathbf{r}) = \tfrac{1}{2}d^2f(\mathbf{r}_0)$, where the contributions of terms smaller than $\|d\mathbf{r}\|^2$ have been neglected according to Theorem 13.15.
This equation shows that if $d^2f(\mathbf{r}_0) < 0$ for all $\|d\mathbf{r}\| \neq 0$ (as a function of independent variables $d\mathbf{r}$), then the values of $f$ are strictly less than the values of its linearization in a neighborhood $0 < \|\mathbf{r} - \mathbf{r}_0\| < \delta$ if $\delta > 0$ is small enough. The continuity of the second partial derivatives in a neighborhood of $\mathbf{r}_0$ ensures that, for all $\mathbf{r}$ near $\mathbf{r}_0$ and all $d\mathbf{r}$, $d^2f(\mathbf{r})$ is a continuous function of two independent variables $\mathbf{r}$ and $d\mathbf{r}$. Therefore, if $d^2f(\mathbf{r}_0) < 0$, $\|d\mathbf{r}\| \neq 0$, then $d^2f(\mathbf{r}) < 0$ for all $\mathbf{r}$ near $\mathbf{r}_0$. The values of $f$ along any smooth curve through $\mathbf{r}_0$ have a local maximum at $\mathbf{r}_0$ if, in addition, $df(\mathbf{r}_0) = 0$. In the two-variable case, one can say that the graph of $f$ is concave downward near $\mathbf{r}_0$; it looks like a paraboloid concave downward (see Figure 13.12, middle panel). The function has a local maximum. Similarly, if $d^2f(\mathbf{r}_0) > 0$ for all $\|d\mathbf{r}\| \neq 0$, then the graph of $f$ lies above its tangent planes for all $0 < \|\mathbf{r} - \mathbf{r}_0\| < \delta$. The function has a local minimum at $\mathbf{r}_0$. The graph of $f$ near $\mathbf{r}_0$ looks like a paraboloid concave upward (see Figure 13.12, left panel).

94.3. Second-Derivative Test. The differential $d^2f(\mathbf{r}_0)$ is a homogeneous quadratic polynomial in the variables $d\mathbf{r}$. Its sign is determined by its coefficients, which are the second-order partial derivatives of $f$ at $\mathbf{r}_0$. The case of functions of two variables is discussed first. Suppose that a function $f(x, y)$ has continuous second derivatives in an open ball centered at $\mathbf{r}_0$. The second derivatives $a = f''_{xx}(\mathbf{r}_0)$, $b = f''_{yy}(\mathbf{r}_0)$, and $c = f''_{xy}(\mathbf{r}_0) = f''_{yx}(\mathbf{r}_0)$ (Clairaut's theorem) can be arranged into a $2\times 2$ symmetric matrix whose diagonal elements are $a$ and $b$ and whose off-diagonal elements are $c$. The quadratic polynomial of a variable $\lambda$,

$$P_2(\lambda) = \det\begin{pmatrix} a - \lambda & c \\ c & b - \lambda \end{pmatrix} = (a - \lambda)(b - \lambda) - c^2,$$

is called the characteristic polynomial of the matrix of second partial derivatives of $f$ at $\mathbf{r}_0$.

THEOREM 13.19. (Second-Derivative Test). Let $\mathbf{r}_0$ be a critical point of a function $f$.
Suppose that the second-order partial derivatives of $f$ are continuous in an open disk containing $\mathbf{r}_0$. Let $P_2(\lambda)$ be the characteristic polynomial of the matrix of second derivatives at $\mathbf{r}_0$. Let $\lambda_i$, $i = 1, 2$, be the roots of $P_2(\lambda)$. Then
(1) If the roots are strictly positive, $\lambda_i > 0$, then $f$ has a local minimum at $\mathbf{r}_0$.
(2) If the roots are strictly negative, $\lambda_i < 0$, then $f$ has a local maximum at $\mathbf{r}_0$.
(3) If the roots do not vanish but have different signs, then $f$ has neither a local maximum nor a local minimum at $\mathbf{r}_0$.
(4) If at least one of the roots vanishes, then $f$ may have a local maximum, a local minimum, or none of the above (the second-derivative test is inconclusive).
In case (3), the critical point is said to be a saddle point of $f$, and the graph of $f$ crosses its tangent plane at $\mathbf{r}_0$.

PROOF. Consider a rotation $(dx, dy) = (dx'\cos\varphi - dy'\sin\varphi,\ dy'\cos\varphi + dx'\sin\varphi)$. Following the proof of Theorem 11.8 (classification of quadric cylinders), the second-order differential is written in the new variables $(dx', dy')$ as

$$d^2f(\mathbf{r}_0) = a(dx)^2 + 2c\,dx\,dy + b(dy)^2 = a'(dx')^2 + 2c'\,dx'\,dy' + b'(dy')^2,$$
$$a' = \tfrac{1}{2}\bigl(a + b + (a - b)\cos(2\varphi) + 2c\sin(2\varphi)\bigr),$$
$$b' = \tfrac{1}{2}\bigl(a + b - (a - b)\cos(2\varphi) - 2c\sin(2\varphi)\bigr),$$
$$2c' = 2c\cos(2\varphi) - (a - b)\sin(2\varphi).$$

The rotation angle is chosen so that $c' = 0$. Put $A^2 = (a - b)^2 + 4c^2$. If $\cos(2\varphi) = (a - b)/A$ and $\sin(2\varphi) = 2c/A$, then $c' = 0$. With this choice, $a' = \tfrac{1}{2}(a + b + A)$ and $b' = \tfrac{1}{2}(a + b - A)$. Next note that $a' + b' = a + b$ and $a'b' = \tfrac{1}{4}\bigl((a + b)^2 - A^2\bigr) = ab - c^2$. On the other hand, the roots of the quadratic equation $P_2(\lambda) = 0$ satisfy the same conditions $\lambda_1 + \lambda_2 = a + b$ and $\lambda_1\lambda_2 = ab - c^2$. Thus, $a' = \lambda_1$, $b' = \lambda_2$, and $d^2f(\mathbf{r}_0) = \lambda_1(dx')^2 + \lambda_2(dy')^2$. If $\lambda_1$ and $\lambda_2$ are strictly positive, then $d^2f(\mathbf{r}_0) > 0$ for all $(dx, dy) \neq (0, 0)$, and by Theorem 13.18 the function has a local minimum at $\mathbf{r}_0$.
If $\lambda_1$ and $\lambda_2$ are strictly negative, then $d^2f(\mathbf{r}_0) < 0$ for all $(dx, dy) \neq (0, 0)$, and by Theorem 13.18 the function has a local maximum at $\mathbf{r}_0$. If $\lambda_1$ and $\lambda_2$ do not vanish but have opposite signs, $\lambda_1\lambda_2 < 0$, then in a neighborhood of $\mathbf{r}_0$, the graph of $f$ looks like

$$z = f(\mathbf{r}_0) + \lambda_1(x' - x'_0)^2 + \lambda_2(y' - y'_0)^2,$$

where the coordinates $(x', y')$ are obtained from $(x, y)$ by rotation through an angle $\varphi$. When $\lambda_1$ and $\lambda_2$ have different signs, this surface is a hyperbolic paraboloid (a saddle), and $f$ has neither a local minimum nor a local maximum. Case (4) is easily proved by examples (see Study Problem 13.17). □

COROLLARY 13.4. Let the function $f$ satisfy the hypotheses of Theorem 13.19. Put $D = ab - c^2$, where $a = f''_{xx}(\mathbf{r}_0)$, $b = f''_{yy}(\mathbf{r}_0)$, and $c = f''_{xy}(\mathbf{r}_0)$. Then
(1) If $D > 0$ and $a > 0$ ($b > 0$), then $f(\mathbf{r}_0)$ is a local minimum.
(2) If $D > 0$ and $a < 0$ ($b < 0$), then $f(\mathbf{r}_0)$ is a local maximum.
(3) If $D < 0$, then $f(\mathbf{r}_0)$ is not a local extremum.

This corollary is a simple consequence of the second-derivative test. Note that $\lambda_1\lambda_2 = D$. So $D < 0$ exactly when $\lambda_1\lambda_2 < 0$, that is, when $\mathbf{r}_0$ is a saddle point. If $D > 0$, then the roots have the same sign and $ab > c^2 \geq 0$, so $a$ and $b$ have the same sign as $\lambda_1 + \lambda_2 = a + b$. Hence the conditions (1) and (2) are equivalent to the cases when $\lambda_1$ and $\lambda_2$ are strictly positive or strictly negative, respectively.

EXAMPLE 13.43. Find all critical points of the function $f(x, y) = \tfrac{1}{3}x^3 + xy^2 - x^2 - y^2$ and determine whether $f$ has a local maximum, minimum, or saddle at them.

SOLUTION: Critical Points. The function is a polynomial, and therefore it has continuous partial derivatives of any order everywhere. So its critical points are solutions of the system of equations

$$f'_x = x^2 + y^2 - 2x = 0, \qquad f'_y = 2xy - 2y = 0.$$

It is important not to lose solutions when transforming the system of equations $\nabla f(\mathbf{r}) = \mathbf{0}$ for the critical points. It follows from the second equation that $y = 0$ or $x = 1$. Therefore, the original system of equations is equivalent to two systems of equations:

$$\begin{cases} x^2 + y^2 - 2x = 0 \\ x = 1 \end{cases} \quad\text{or}\quad \begin{cases} x^2 + y^2 - 2x = 0 \\ y = 0 \end{cases}$$

Solutions of the first system are $(1, 1)$ and $(1, -1)$.
Solutions of the second system are $(0, 0)$ and $(2, 0)$. Thus, the function has four critical points.

Second-Derivative Test. The second derivatives are

$$f''_{xx} = 2x - 2, \qquad f''_{yy} = 2x - 2, \qquad f''_{xy} = 2y.$$

For the points $(1, \pm 1)$, $a = b = 0$ and $c = \pm 2$. The characteristic polynomial is $P_2(\lambda) = \lambda^2 - 4$. Its roots $\lambda = \pm 2$ do not vanish and have opposite signs. Therefore, the function has a saddle at the points $(1, \pm 1)$. For the point $(0, 0)$, $a = b = -2$ and $c = 0$. The characteristic polynomial is $P_2(\lambda) = (-2 - \lambda)^2$. It has one root of multiplicity 2, that is, $\lambda_1 = \lambda_2 = -2 < 0$, and $f$ has a local maximum at $(0, 0)$. Finally, for the point $(2, 0)$, $a = b = 2$ and $c = 0$. The characteristic polynomial $P_2(\lambda) = (2 - \lambda)^2$ has one root of multiplicity 2, $\lambda_1 = \lambda_2 = 2 > 0$; that is, the function has a local minimum at $(2, 0)$. □

EXAMPLE 13.44. Investigate the function $f(x, y) = e^{x^2 - y}(5 - 2x + y)$ for extreme values.

SOLUTION: The function is defined on the whole plane, and, as the product of an exponential and a polynomial, it has continuous partial derivatives of any order. So its extreme values, if any, can be investigated by the second-derivative test.

Critical Points. Using the product rule for derivatives,

$$f'_x = e^{x^2 - y}\bigl(2x(5 - 2x + y) - 2\bigr) = 0 \ \Longrightarrow\ x(5 - 2x + y) = 1,$$
$$f'_y = e^{x^2 - y}\bigl((-1)(5 - 2x + y) + 1\bigr) = 0 \ \Longrightarrow\ 5 - 2x + y = 1.$$

The substitution of the second equation into the first one yields $x = 1$. Then it follows from the second equation that $y = -2$. So the function has just one critical point $(1, -2)$.

Second-Derivative Test. Using the product rule for derivatives,

$$f''_{xx} = (f'_x)'_x = e^{x^2 - y}\bigl[2x\bigl(2x(5 - 2x + y) - 2\bigr) + 2(5 - 2x + y) - 4x\bigr],$$
$$f''_{yy} = (f'_y)'_y = e^{x^2 - y}\bigl[(-1)\bigl((-1)(5 - 2x + y) + 1\bigr) - 1\bigr],$$
$$f''_{xy} = (f'_y)'_x = e^{x^2 - y}\bigl[2x\bigl((-1)(5 - 2x + y) + 1\bigr) + 2\bigr].$$

Therefore, $a = f''_{xx}(1, -2) = -2e^3$, $b = f''_{yy}(1, -2) = -e^3$, and $c = f''_{xy}(1, -2) = 2e^3$. Hence $D = ab - c^2 = -2e^6 < 0$. By Corollary 13.4, the only critical point is a saddle point. The function has no extreme values. □

94.4. Study Problems.

Problem 13.15.
Find all critical points of the function $f(x, y) = \sin(x)\sin(y)$ and determine whether they are a local maximum, a local minimum, or a saddle point.

SOLUTION: The function has continuous partial derivatives of any order on the whole plane. So the second-derivative test applies to study critical points.

Critical Points. If $n$ and $m$ are integers, then

$$f'_x = \cos(x)\sin(y) = 0 \ \Longrightarrow\ x = \frac{\pi}{2} + \pi n \ \text{ or } \ y = \pi m,$$
$$f'_y = \sin(x)\cos(y) = 0.$$

If $x = \frac{\pi}{2} + \pi n$, then it follows from the second equation that $y = \frac{\pi}{2} + \pi m$. If $y = \pi m$, then it follows from the second equation that $x = \pi n$. Thus, for any pair of integers $n$ and $m$, the points $\mathbf{r}_{nm} = (\frac{\pi}{2} + \pi n, \frac{\pi}{2} + \pi m)$ and $\mathbf{r}'_{nm} = (\pi n, \pi m)$ are critical points of the function.

Second-Derivative Test. The test has to be applied to each critical point. The second partial derivatives are

$$f''_{xx} = -\sin(x)\sin(y), \qquad f''_{yy} = -\sin(x)\sin(y), \qquad f''_{xy} = \cos(x)\cos(y).$$

For the critical points $\mathbf{r}_{nm}$, one has $a = f''_{xx}(\mathbf{r}_{nm}) = -(-1)^{n+m}$, $b = f''_{yy}(\mathbf{r}_{nm}) = -(-1)^{n+m} = a$, and $c = f''_{xy}(\mathbf{r}_{nm}) = 0$. The characteristic equation is $(a - \lambda)^2 = 0$ and hence $\lambda_1 = \lambda_2 = -(-1)^{n+m}$. If $n + m$ is even, then the roots are negative and $f(\mathbf{r}_{nm}) = 1$ is a local maximum. If $n + m$ is odd, then the roots are positive and $f(\mathbf{r}_{nm}) = -1$ is a local minimum. For the critical points $\mathbf{r}'_{nm}$, one has $a = f''_{xx}(\mathbf{r}'_{nm}) = 0$, $b = f''_{yy}(\mathbf{r}'_{nm}) = 0$, and $c = f''_{xy}(\mathbf{r}'_{nm}) = (-1)^{n+m}$. The characteristic equation $\lambda^2 - c^2 = \lambda^2 - 1 = 0$ has two roots $\lambda = \pm 1$ of opposite signs. Thus, $\mathbf{r}'_{nm}$ are saddle points of $f$. In fact, the local extrema of this function are also its absolute extrema. □

Problem 13.16. Define $f(0, 0) = 0$ and

$$f(x, y) = x^2 + y^2 - 2x^2y - \frac{4x^6y^2}{(x^4 + y^2)^2} \quad\text{if } (x, y) \neq (0, 0).$$

Show that, for all $(x, y)$, the following inequality holds: $4x^4y^2 \leq (x^4 + y^2)^2$. Use it and the squeeze principle to conclude that $f$ is continuous. Next, consider a line through $(0, 0)$ and parallel to $\hat{\mathbf{u}} = (\cos\varphi, \sin\varphi)$ and the values of $f$ on it: $F_\varphi(t) = f(t\cos\varphi, t\sin\varphi)$. Show that $F_\varphi(0) = 0$, $F'_\varphi(0) = 0$, and $F''_\varphi(0) = 2$ for all $0 \leq \varphi < 2\pi$.
Thus, $f$ has a minimum at $(0, 0)$ along any straight line through $(0, 0)$. Show that nevertheless $f$ has no minimum at $(0, 0)$ by studying its values along the parabolic curve $(x, y) = (t, t^2)$.

SOLUTION: One has $0 \leq (a - b)^2 = a^2 - 2ab + b^2$ and hence $2ab \leq a^2 + b^2$ for any numbers $a$ and $b$. Therefore, $4ab = 2ab + 2ab \leq 2ab + a^2 + b^2 = (a + b)^2$. By setting $a = x^4$ and $b = y^2$, the said inequality is established.

The continuity of the last term in $f$ at $(0, 0)$ has to be verified. By the found inequality,

$$\frac{4x^6y^2}{(x^4 + y^2)^2} \leq \frac{x^2(x^4 + y^2)^2}{(x^4 + y^2)^2} = x^2 \to 0 \quad\text{as } (x, y) \to (0, 0).$$

Thus, $f(x, y) \to f(0, 0) = 0$ as $(x, y) \to (0, 0)$, and $f$ is continuous everywhere. If $\sin\varphi = 0$ (that is, $\varphi = 0$ or $\varphi = \pi$), the line coincides with the $x$ axis, $(x, y) = (t, 0)$, and one has $F_\varphi(t) = t^2$, from which it follows that $F_\varphi(0) = F'_\varphi(0) = 0$ and $F''_\varphi(0) = 2$. When $\sin\varphi \neq 0$, one has

$$F_\varphi(t) = t^2 + at^3 + \frac{bt^4}{\bigl(1 + \frac{\cos^4\varphi}{\sin^2\varphi}\,t^2\bigr)^2}, \qquad a = -2\cos^2\varphi\,\sin\varphi, \quad b = -\frac{4\cos^6\varphi}{\sin^2\varphi}.$$

A straightforward differentiation shows that $F_\varphi(0) = F'_\varphi(0) = 0$ and $F''_\varphi(0) = 2$ as stated, so $F_\varphi(t)$ has a local minimum at $t = 0$; that is, $f$ attains a local minimum at $(0, 0)$ along any straight line through $(0, 0)$. Nevertheless, the latter does not imply that $f$ has a minimum at $(0, 0)$! Indeed, along the parabola $(x, y) = (t, t^2)$, the function $f$ behaves as $f(t, t^2) = -t^4$, which attains an absolute maximum at $t = 0$. Thus, along the parabola, $f$ has a maximum value at the origin and hence cannot have a local minimum there. The problem illustrates the remark given earlier in this section. □

Problem 13.17. Suppose that $\lambda_1 = 0$ or $\lambda_2 = 0$ (or both) in the second-derivative test for a function $f$. Give examples of $f$ when $f$ has a local maximum, or a local minimum, or its graph looks like a saddle, or none of the above.

SOLUTION: Consider the function $f(x, y) = x^2 + sy^4$, where $s$ is a number. It has a critical point $(0, 0)$ because $f'_x(0, 0) = f'_y(0, 0) = 0$, and $a = f''_{xx}(0, 0) = 2$, $b = f''_{yy}(0, 0) = 0$, and $c = f''_{xy}(0, 0) = 0$.
Therefore, $P_2(\lambda) = -(2 - \lambda)\lambda$ has the roots $\lambda_1 = 2$ and $\lambda_2 = 0$. If $s > 0$, then $f(x, y) \geq 0$ for all $(x, y)$ and $f$ has a minimum at $(0, 0)$. Let $s = -1$. Then the curves $x = \pm y^2$ divide the plane into four sectors with the vertex at the critical point (the origin). In the sectors containing the $x$ axis, $f(x, y) \geq 0$, whereas in the sectors containing the $y$ axis, $f(x, y) \leq 0$. Thus, the graph of $f$ has the shape of a saddle. The function $f(x, y) = -(x^2 + sy^4)$ has a maximum at $(0, 0)$ if $s > 0$. If $s < 0$, the graph of $f$ has the shape of a saddle. So, if one of the roots vanishes, then $f$ may have a local maximum, a local minimum, or a saddle. The same conclusion is reached when $\lambda_1 = \lambda_2 = 0$ by studying the functions $f(x, y) = \pm(x^4 + sy^4)$ along similar lines of argument.

Furthermore, consider the function $f(x, y) = xy^2$. It also has a critical point at the origin, and all its second derivatives vanish at $(0, 0)$, that is, $P_2(\lambda) = \lambda^2$ and $\lambda_1 = \lambda_2 = 0$. The function vanishes along the coordinate axes. So the plane is divided into four sectors (quadrants) in each of which the function has a fixed sign. The function is positive in the first and fourth quadrants ($x > 0$) and negative in the second and third quadrants ($x < 0$). It then follows that $f$ has no maximum or minimum, and its graph does not have the shape of a saddle. Next, put $f(x, y) = x^2 - y^3$. It has a critical point $(0, 0)$ and $a = 2$, $b = 0$, and $c = 0$, that is, $\lambda_1 = 2$ and $\lambda_2 = 0$. The zeros of $f$ form the curve $y = x^{2/3}$, which divides the plane into two parts so that $f$ is negative above this curve and $f$ is positive below it. Therefore, the graph of $f$ does not have the shape of a saddle, and $f$ does not have a minimum or maximum at $(0, 0)$. The behavior of the functions $xy^2$ and $x^2 - y^3$ near their critical point resembles the behavior of a function of one variable near its critical point that is also an inflection point. □

94.5. Exercises.
(1) For each of the following functions, find all critical points and determine if they are a relative maximum, a relative minimum, or a saddle point:
(i) $f(x, y) = x^2 + (y - 2)^2$
(ii) $f(x, y) = x^2 - (y - 2)^2$
(iii) $f(x, y) = (x - y + 1)^2$
(iv) $f(x, y) = x^2 - xy + y^2 - 2x + y$
(v) f(x, y) = 3 _ 2 - X2 - 3x - y + 1
(vi) $f(x, y) = x^2y^3(6 - x - y)$
(vii) $f(x, y) = x^3 + y^3 - 3xy$
(viii) $f(x, y) = x^4 + y^4 - x^2 - 2xy - 2y^2$
(ix) $f(x, y) = xy + 50/x + 20/y$, $x > 0$, $y > 0$
(x) f(x, y) = X2 + 2 1+
(xi) $f(x, y) = \cos(x)\cos(y)$
(xii) $f(x, y) = \cos x + y^2$
(xiii) $f(x, y) = y^3 + 6xy + 8x^3$
(xiv) f(x, y) = X3 - 2,/+ y2
(xv) $f(x, y) = y(1 - x - y)$
(xvi) $f(x, y) = x\cos y$
(xvii) $f(x, y) = xy\sqrt{1 - x^2/a^2 - y^2/b^2}$
(xviii) $f(x, y) = (ax + by + c)/\sqrt{1 + x^2 + y^2}$, $a^2 + b^2 + c^2 \neq 0$
(xix) f(x, y) = (5x + 7y - 25)ex22_,2
(xx) $f(x, y) = \sin x + \sin y + \cos(x + y)$
(xxi) $f(x, y) = \tfrac{1}{3}x^3 + xy^2 - x^2 - y^2$
(xxii) f(x, y) = jy_3 + xy + X3
(xxiii) $f(x, y) = x^2 + xy + y^2 - 4\ln x - 10\ln y$
(xxiv) $f(x, y) = xy\ln(x^2 + y^2)$
(xxv) $f(x, y) = x + y + \sin(x)\sin(y)$
(xxvi) $f(x, y) = \sin(x) + \cos(y) + \cos(x - y)$
(xxvii) $f(x, y) = x - 2y + \ln(x^2 + y^2) + 3\tan^{-1}(y/x)$
(2) Let the function $z = z(x, y)$ be defined implicitly by the given equation. Use implicit differentiation to find extreme values of $z(x, y)$:
(i) $x^2 + y^2 + z^2 - 2x + 2y - 4z - 10 = 0$
(ii) $x^2 + y^2 + z^2 - xz - yz + 2x + 2y + 2z - 2 = 0$
(iii) $(x^2 + y^2 + z^2)^2 = a^2(x^2 + y^2 - z^2)$

95. Maximum and Minimum Values (Continued)

95.1. Second-Derivative Test for Multivariable Functions. Theorem 13.18 holds for any number of variables, and there is a multivariable analog of the second-derivative test (Theorem 13.19). As in the two-variable case, the numbers $f''_{x_ix_j}(\mathbf{r}_0) = D_{ij}$ can be arranged into an $m \times m$ matrix. By Clairaut's theorem, this matrix is symmetric, $D_{ij} = D_{ji}$. The polynomial of degree $m$,

$$P_m(\lambda) = \det\begin{pmatrix} D_{11} - \lambda & D_{12} & D_{13} & \cdots & D_{1m} \\ D_{21} & D_{22} - \lambda & D_{23} & \cdots & D_{2m} \\ D_{31} & D_{32} & D_{33} - \lambda & \cdots & D_{3m} \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ D_{m1} & D_{m2} & D_{m3} & \cdots & D_{mm} - \lambda \end{pmatrix},$$

is called the characteristic polynomial of the matrix of second derivatives. The following facts are established by methods of linear algebra:
(1) The characteristic polynomial of a real symmetric $m \times m$ matrix has $m$ real roots $\lambda_1, \lambda_2, \ldots, \lambda_m$ (a root of multiplicity $k$ counted $k$ times).
(2) There exists a rotation $d\mathbf{r} = (dx_1, dx_2, \ldots, dx_m) \to d\mathbf{r}' = (dx'_1, dx'_2, \ldots, dx'_m)$, which is a linear homogeneous transformation that preserves the length, $\|d\mathbf{r}\| = \|d\mathbf{r}'\|$, such that

$$d^2f(\mathbf{r}_0) = \sum_{i=1}^{m}\sum_{j=1}^{m} f''_{x_ix_j}(\mathbf{r}_0)\,dx_i\,dx_j = \sum_{i=1}^{m}\sum_{j=1}^{m} D_{ij}\,dx_i\,dx_j = \lambda_1(dx'_1)^2 + \lambda_2(dx'_2)^2 + \cdots + \lambda_m(dx'_m)^2.$$

(3) The roots of the characteristic polynomial satisfy the conditions

$$\lambda_1 + \lambda_2 + \cdots + \lambda_m = D_{11} + D_{22} + \cdots + D_{mm}, \qquad \lambda_1\lambda_2\cdots\lambda_m = \det D.$$

Fact (2) implies that if all roots of the characteristic polynomial are strictly positive, then $d^2f(\mathbf{r}_0) > 0$ for all $\|d\mathbf{r}\| = \|d\mathbf{r}'\| \neq 0$, and hence $f(\mathbf{r}_0)$ is a local minimum by Theorem 13.18. Similarly, if all the roots are strictly negative, then $f(\mathbf{r}_0)$ is a local maximum. Corollary 13.4 follows from fact (3) for $m = 2$. For $m > 2$, these properties of the roots are insufficient to establish a multivariable analog of Corollary 13.4. Fact (3) also implies that if $\det D = 0$, then one or more roots are 0. Hence, $d^2f(\mathbf{r}_0) = 0$ for some $d\mathbf{r} \neq \mathbf{0}$, and the hypotheses of Theorem 13.18 are not fulfilled.

THEOREM 13.20. (Second-Derivative Test for $m$ Variables). Let $\mathbf{r}_0$ be a critical point of $f$ and suppose that $f$ has continuous second-order partial derivatives in some open ball centered at $\mathbf{r}_0$. Let $\lambda_i$, $i = 1, 2, \ldots, m$, be the roots of the characteristic polynomial $P_m(\lambda)$ of the matrix of second derivatives $D_{ij} = f''_{x_ix_j}(\mathbf{r}_0)$.
(1) If all the roots are strictly positive, $\lambda_i > 0$, then $f$ has a local minimum.
(2) If all the roots are strictly negative, $\lambda_i < 0$, then $f$ has a local maximum.
(3) If none of the roots vanishes but the roots have different signs, then $f$ has no local minimum or maximum at $\mathbf{r}_0$.
(4) If some of the roots vanish, then $f$ may have a local maximum, or a local minimum, or none of the above (the test is inconclusive).

In case (3), the difference $f(\mathbf r) - f(\mathbf r_0)$ changes its sign in a neighborhood of $\mathbf r_0$. It is an $m$-dimensional analog of a saddle point. Case (4) holds if $\det D = 0$, which is easy to verify. In general, roots of $P_m(\lambda)$ are found numerically. If some of the roots are guessed, then a synthetic division can be used to reduce the order of the equation. If $P_m(\lambda_1) = 0$, then there is a polynomial $Q_{m-1}$ of degree $m - 1$ such that $P_m(\lambda) = (\lambda - \lambda_1)Q_{m-1}(\lambda)$, so that the other roots satisfy $Q_{m-1}(\lambda) = 0$. The signs of the roots can also be established by a graphical method (an example is given in Study Problem 13.18).

EXAMPLE 13.45. Investigate the function $f(x, y, z) = \tfrac13 x^3 + \tfrac12 y^2 + z^2 + xy + 2z$ for extreme values.

SOLUTION: The function is a polynomial, so it has continuous partial derivatives of any order everywhere. Its critical points therefore satisfy the equations

$$\begin{cases} f'_x = x^2 + y = 0 \\ f'_y = y + x = 0 \\ f'_z = 2z + 2 = 0 \end{cases} \quad\Longrightarrow\quad \begin{cases} x^2 = x \\ y = -x \\ z = -1 \end{cases}$$

The first equation has two solutions, $x = 0$ and $x = 1$. So the function has two critical points $\mathbf r_1 = (0, 0, -1)$ and $\mathbf r_2 = (1, -1, -1)$. The second-order partial derivatives are

$$f''_{xx} = 2x, \quad f''_{xy} = f''_{yy} = 1, \quad f''_{xz} = f''_{yz} = 0, \quad f''_{zz} = 2 .$$

For the critical point $\mathbf r_1$, the characteristic polynomial

$$\det\begin{pmatrix} -\lambda & 1 & 0 \\ 1 & 1 - \lambda & 0 \\ 0 & 0 & 2 - \lambda \end{pmatrix} = (2 - \lambda)(\lambda^2 - \lambda - 1)$$

has the roots $2$ and $(1 \pm \sqrt5)/2$. They do not vanish but have different signs. So $\mathbf r_1$ is a saddle point of $f$ (no extreme value). For the critical point $\mathbf r_2$, the characteristic polynomial

$$\det\begin{pmatrix} 2 - \lambda & 1 & 0 \\ 1 & 1 - \lambda & 0 \\ 0 & 0 & 2 - \lambda \end{pmatrix} = (2 - \lambda)(\lambda^2 - 3\lambda + 1)$$

has positive roots $2 > 0$ and $(3 \pm \sqrt5)/2 > 0$. So $f(1, -1, -1) = -7/6$ is a local minimum.

95.2. When the Second-Derivative Test Is Inconclusive.
If at least one of the roots of the characteristic polynomial vanishes, the second-derivative test is inconclusive. How can the local behavior of a function be analyzed near its critical point?
If the function in question has continuous partial derivatives of sufficiently high orders in a neighborhood of a critical point $\mathbf r_0$, then the Taylor theorem provides a useful technique for answering this question. The local behavior of a function near $\mathbf r_0$ is determined by higher-order differentials $d^n f(\mathbf r_0)$, where $d\mathbf r = \mathbf r - \mathbf r_0$. It is generally easier to study the concavity of a polynomial than that of a general function. The concept is illustrated by the following example.

EXAMPLE 13.46. Investigate the local behavior of the function $f(x, y) = \sin(xy)/(xy)$ if $x \neq 0$ and $y \neq 0$, and $f(0, y) = f(x, 0) = 1$.

SOLUTION: Since $u = xy$ is small near the origin, by the Taylor theorem $\sin u = u - u^3/6 + \varepsilon(u)u^3$, where $\varepsilon(u) \to 0$ as $u \to 0$. Therefore,

$$f(x, y) = 1 - \frac{u^2}{6}\bigl(1 - 6\varepsilon(u)\bigr) = 1 - \frac{x^2y^2}{6}\bigl(1 - 6\varepsilon(xy)\bigr).$$

In a disk $x^2 + y^2 < \delta^2$ of a sufficiently small radius $\delta > 0$, $1 - 6\varepsilon(xy) > 0$ because $\varepsilon(xy) \to 0$ as $(x, y) \to (0, 0)$. So the function attains a local maximum at $(0, 0)$ because $x^2y^2 \ge 0$ in the disk. The inequality $|\sin u| \le |u|$ suggests that $f$ attains the maximum value 1 also along the coordinate axes (critical points are not isolated). The established local behavior of the function implies that its second partial derivatives vanish at $(0, 0)$, and hence both roots of the characteristic polynomial $P_2(\lambda) = \lambda^2$ vanish (the second-derivative test is inconclusive).

95.3. Absolute Maximal and Minimal Values.
For a function $f$ of one variable, the extreme value theorem says that if $f$ is continuous on a closed interval $[a, b]$, then $f$ has an absolute minimum value and an absolute maximum value (see Calculus I). For example, the function $f(x) = x^2$ on $[-1, 2]$ attains an absolute minimum value at $x = 0$ and an absolute maximum value at $x = 2$. The function is differentiable for all $x$, and therefore its critical points are determined by $f'(x) = 2x = 0$.
So the absolute minimum value occurs at the critical point $x = 0$ inside the interval, while the absolute maximum value occurs at a boundary point of the interval, which is not a critical point of $f$. Thus, to find the absolute maximum and minimum values of a function $f$ on a closed interval in the domain of $f$, the values of $f$ must be evaluated and compared not only at the critical points but also at the boundaries of the interval.

The situation for multivariable functions is similar. For example, the function $f(x, y) = x^2 + y^2$ whose arguments are restricted to the square $D = [0, 1] \times [0, 1]$ attains its absolute maximum and minimum values on the boundary of $D$, as shown in the left panel of Figure 13.13.

DEFINITION 13.27. (Closed Set). A set $D$ in a Euclidean space is said to be closed if it contains all its limit points.

FIGURE 13.13. Left: The graph $z = x^2 + y^2$ (a circular paraboloid) over the square $D = [0, 1] \times [0, 1]$. The function $f(x, y) = x^2 + y^2$ attains its absolute maximum and minimum values on the boundary of $D$: $f(0, 0) \le f(x, y) \le f(1, 1)$ for all points in $D$. Right: The graph $z = -xy$. The values of $f(x, y) = -xy$ along the circle $x^2 + y^2 = 4$ are shown by the curve on the graph. The function has two local maxima and two local minima on the disk $x^2 + y^2 \le 4$, while it has no maximum and minimum values on the entire plane.

Recall that any neighborhood of a limit point of $D$ contains points of $D$. If a limit point of $D$ is not an interior point of $D$, then it lies on a boundary of $D$. So a closed set contains its boundaries. All points of an open interval $(a, b)$ are its limit points, but, in addition, the boundaries $a$ and $b$ are also its limit points, so when they are added, a closed set $[a, b]$ is obtained. Similarly, the set in the plane $D = \{(x, y) \mid x^2 + y^2 < 1\}$ has limit points on the circle $x^2 + y^2 = 1$ (the boundary of $D$), which is not in $D$. By adding these points, a closed set is obtained, $x^2 + y^2 \le 1$.

DEFINITION 13.28. (Bounded Set).
A set $D$ in a Euclidean space is said to be bounded if it is contained in some ball.

In other words, for any two points in a bounded set, the distance between them cannot exceed some value (the diameter of the ball that contains the set).

THEOREM 13.21. (Extreme Value Theorem). If $f$ is continuous on a closed, bounded set $D$ in a Euclidean space, then $f$ attains an absolute maximum value $f(\mathbf r_1)$ and an absolute minimum value $f(\mathbf r_2)$ at some points $\mathbf r_1 \in D$ and $\mathbf r_2 \in D$.

The closedness of $D$ is essential. For example, if the function $f(x, y) = x^2 + y^2$ is restricted to the open square $D = (0, 1) \times (0, 1)$, then it has no extreme values on $D$. For all $(x, y)$ in $D$, $f(0, 0) < f(x, y) < f(1, 1)$ and there are points in $D$ arbitrarily close to $(0, 0)$ and $(1, 1)$, but the points $(0, 0)$ and $(1, 1)$ are not in $D$. So $f$ takes values on $D$ arbitrarily close to 0 and 2, never reaching them. The boundedness of $D$ is also crucial. For example, if the function $f(x, y) = x^2 + y^2$ is restricted to the first quadrant, $x \ge 0$ and $y \ge 0$, then $f$ has no maximum value on $D$.

It should be noted that the continuity of $f$ and the closedness and boundedness of $D$ are sufficient (not necessary) conditions for $f$ to attain its absolute extreme values on $D$. There are noncontinuous or continuous functions on an unbounded or nonclosed region $D$ (or both) that attain their extreme values on $D$. Such examples are given in Calculus I, and functions of a single variable may always be viewed as a particular case of functions of two or more variables: $f(x, y) = g(x)$.

By the extreme value theorem, it follows that the points $\mathbf r_1$ and $\mathbf r_2$ are either critical points of $f$ (because local extrema always occur at critical points) or lie on the boundary of $D$. So, to find the absolute minimum and maximum values of a continuous function $f$ on a closed, bounded set $D$, one has to
(1) Find the values of $f$ at the critical points of $f$ in $D$.
(2) Find the extreme values of $f$ on the boundary of $D$.
(3) The largest of the values obtained in steps 1 and 2 is the absolute maximum value, and the smallest of these values is the absolute minimum value.

EXAMPLE 13.47. Find the absolute maximum and minimum values of $f(x, y) = x^2 + y^2 + xy$ on the disk $x^2 + y^2 \le 4$ and the points at which they occur.

SOLUTION: The function $f$ is a polynomial. It has continuous partial derivatives of any order on the whole plane. The disk $x^2 + y^2 \le 4$ is a closed, bounded set in the plane. So the hypotheses of the extreme value theorem are fulfilled.

Step 1. Critical points of $f$ satisfy the system of equations $f'_x = 2x + y = 0$ and $f'_y = 2y + x = 0$; that is, $(0, 0)$ is the only critical point of $f$ and it happens to be in the disk. The value of $f$ at the critical point is $f(0, 0) = 0$.

Step 2. The boundary of the disk is the circle $x^2 + y^2 = 4$. To find the extreme values of $f$ on it, take the parametric equations of the circle $x(t) = 2\cos t$, $y(t) = 2\sin t$, where $t \in [0, 2\pi]$. The values of the function on the boundary are

$$F(t) = f(x(t), y(t)) = 4 + 4\cos t\,\sin t = 4 + 2\sin(2t).$$

The function $F(t)$ attains its maximal value 6 on $[0, 2\pi]$ when $\sin(2t) = 1$, or $t = \pi/4$ and $t = \pi/4 + \pi$. These values of $t$ correspond to the points $(\sqrt2, \sqrt2)$ and $(-\sqrt2, -\sqrt2)$. Similarly, $F(t)$ attains its minimal value 2 on $[0, 2\pi]$ when $\sin(2t) = -1$, or $t = 3\pi/4$ and $t = 3\pi/4 + \pi$. These values of $t$ correspond to the points $(-\sqrt2, \sqrt2)$ and $(\sqrt2, -\sqrt2)$.

Step 3. The largest number of 0, 2, and 6 is 6. So the absolute maximum value of $f$ is 6; it occurs at the points $(\sqrt2, \sqrt2)$ and $(-\sqrt2, -\sqrt2)$. The smallest number of 0, 2, and 6 is 0. So the absolute minimum value of $f$ is 0; it occurs at the point $(0, 0)$.

EXAMPLE 13.48. Find the absolute maximum and minimum values of $f(x, y, z) = x^2 + y^2 - z^2 + 2z$ on the closed set $D = \{(x, y, z) \mid x^2 + y^2 \le z \le 4\}$.

SOLUTION: The set $D$ is the solid bounded from below by the paraboloid $z = x^2 + y^2$ and from the top by the plane $z = 4$.
It is a bounded set, and $f$ has continuous partial derivatives of any order on the whole space.

Step 1. Since $f$ is differentiable everywhere, its critical points satisfy the equations $f'_x = 2x = 0$, $f'_y = 2y = 0$, and $f'_z = -2z + 2 = 0$. There is only one critical point $(0, 0, 1)$, and it happens to be in $D$. The value of $f$ at it is $f(0, 0, 1) = 1$.

Step 2. The boundary consists of two surfaces, the disk $S_1 = \{(x, y, z) \mid z = 4,\ x^2 + y^2 \le 4\}$ in the plane $z = 4$ and the portion of the paraboloid $S_2 = \{(x, y, z) \mid z = x^2 + y^2,\ x^2 + y^2 \le 4\}$. The values of $f$ on $S_1$ are $F_1(x, y) = f(x, y, 4)$, where the points $(x, y)$ lie in the disk of radius 2, $x^2 + y^2 \le 4$. The problem now is to find the maximal and minimal values of a two-variable function $F_1$ on the disk. In principle, at this point, Steps 1, 2, and 3 have to be applied to $F_1$. These technicalities can be avoided in this particular case by noting that $F_1(x, y) = x^2 + y^2 - 8 = r^2 - 8$, where $r^2 = x^2 + y^2 \le 4$. Therefore, the maximal value of $F_1$ is reached when $r^2 = 4$, and its minimal value is reached when $r^2 = 0$. So the maximal and minimal values of $f$ on $S_1$ are $-4$ and $-8$.

The values of $f$ on $S_2$ are $F_2(x, y) = f(x, y, x^2 + y^2) = 3r^2 - r^4 = g(r)$, where $r^2 = x^2 + y^2 \le 4$ or $r \in [0, 2]$. The critical points of $g(r)$ satisfy the equation $g'(r) = 6r - 4r^3 = 0$ whose solutions are $r = 0$ and $r = \pm\sqrt{3/2}$. Therefore, the maximal value of $f$ on $S_2$ is $9/4$, which is the largest of $g(0) = 0$, $g(\sqrt{3/2}) = 9/4$, and $g(2) = -4$, and the minimal value is $-4$ as the smallest of these numbers.

Step 3. The absolute maximum value of $f$ on $D$ is $\max\{1, -8, -4, 9/4\} = 9/4$, and the absolute minimum value of $f$ on $D$ is $\min\{1, -8, -4, 9/4\} = -8$. Both extreme values of $f$ occur on the boundary of $D$: $f(0, 0, 4) = -8$, and the absolute maximal value is attained along the circle of intersection of the plane $z = 3/2$ with the paraboloid $z = x^2 + y^2$.

95.4. Study Problems.

Problem 13.18.
Investigate the extreme values of the function $f(x, y, z) = x + y^2/(4x) + z^2/y + 2/z$ if $x > 0$, $y > 0$, and $z > 0$.

SOLUTION: The critical points are determined by the equations

$$f'_x = 1 - \frac{y^2}{4x^2} = 0, \qquad f'_y = \frac{y}{2x} - \frac{z^2}{y^2} = 0, \qquad f'_z = \frac{2z}{y} - \frac{2}{z^2} = 0 .$$

The first equation is equivalent to $y = 2x$ (since $x > 0$ and $y > 0$). The substitution of this relation into the second equation gives $z = y$ because $y > 0$ and $z > 0$. The substitution of this relation into the third equation yields $z = 1$ as $z > 0$. There is only one critical point $\mathbf r_0 = (\tfrac12, 1, 1)$ in the positive octant. The second partial derivatives at $\mathbf r_0$ are

$$f''_{xx}(\mathbf r_0) = \frac{y^2}{2x^3}\Big|_{\mathbf r_0} = 4, \quad f''_{xy}(\mathbf r_0) = -\frac{y}{2x^2}\Big|_{\mathbf r_0} = -2, \quad f''_{xz}(\mathbf r_0) = 0,$$
$$f''_{yy}(\mathbf r_0) = \Bigl(\frac{1}{2x} + \frac{2z^2}{y^3}\Bigr)\Big|_{\mathbf r_0} = 3, \quad f''_{yz}(\mathbf r_0) = -\frac{2z}{y^2}\Big|_{\mathbf r_0} = -2, \quad f''_{zz}(\mathbf r_0) = \Bigl(\frac{2}{y} + \frac{4}{z^3}\Bigr)\Big|_{\mathbf r_0} = 6 .$$

The characteristic equation of the second-derivative matrix is

$$\det\begin{pmatrix} 4 - \lambda & -2 & 0 \\ -2 & 3 - \lambda & -2 \\ 0 & -2 & 6 - \lambda \end{pmatrix} = (4 - \lambda)\bigl[(3 - \lambda)(6 - \lambda) - 4\bigr] - 4(6 - \lambda) = -\lambda^3 + 13\lambda^2 - 46\lambda + 32 = 0 .$$

First of all, $\lambda = 0$ is not a root. To analyze the signs of the roots, the following method is employed. The characteristic equation is written in the form $\lambda(\lambda^2 - 13\lambda + 46) = 32$. This equation determines the points of intersection of the graph $y = g(\lambda) = \lambda(\lambda^2 - 13\lambda + 46)$ with the horizontal line $y = 32$. The polynomial $g(\lambda)$ has one simple root, $g(0) = 0$, because the quadratic equation $\lambda^2 - 13\lambda + 46 = 0$ has no real roots. Therefore, $g(\lambda) > 0$ if $\lambda > 0$ and $g(\lambda) < 0$ if $\lambda < 0$. This implies that the intersection of the horizontal line $y = 32 > 0$ with the graph $y = g(\lambda)$ occurs only for $\lambda > 0$. Thus, all three roots of the characteristic polynomial $P_3(\lambda)$ are positive, and hence $f(1/2, 1, 1) = 4$ is a minimum.

Problem 13.19. (The Least Squares Method). Suppose that a scientist has a reason to believe that two quantities $x$ and $y$ are related linearly, $y = mx + b$, where $m$ and $b$ are unknown constants. The scientist performs an experiment and collects data as points on the plane $(x_i, y_i)$, $i = 1, 2, \ldots, N$. Since the data contain errors, the points do not lie on a straight line.
Let $d_i = y_i - (mx_i + b)$ be the vertical deviation of the point $(x_i, y_i)$ from the line $y = mx + b$. The method of least squares determines the constants $m$ and $b$ by demanding that the sum of squares $\sum_i d_i^2$ attains its minimal value, thus providing the "best" fit to the data points. Find $m$ and $b$.

SOLUTION: Consider the function $f(m, b) = \sum_{i=1}^{N} d_i^2$. Its critical points satisfy the equations $f'_b = -2\sum_{i=1}^{N} d_i = 0$ and $f'_m = -2\sum_{i=1}^{N} x_i d_i = 0$ because $(d_i)'_b = -1$ and $(d_i)'_m = -x_i$. The substitution of the explicit form of $d_i$ into these equations yields the following system:

$$m\sum_{i=1}^{N} x_i + bN = \sum_{i=1}^{N} y_i, \qquad m\sum_{i=1}^{N} x_i^2 + b\sum_{i=1}^{N} x_i = \sum_{i=1}^{N} x_i y_i,$$

whose solution determines the slope $m$ and the constant $b$ of the least squares linear fit to the data points. Note that the second-derivative test here is not really necessary to conclude that $f$ has a minimum at the critical point. Explain why!

95.5. Exercises.

(1) For each of the following functions, find all critical points and determine if they are a relative maximum, a relative minimum, or a saddle point:
(i) $f(x, y, z) = x^2 + y^2 + z^2 + 2x + 4y - 8z$
(ii) f(x, y, z) = x2 + y-3 + z2 + 2xy - 2z
(iii) f(x, y, z) = x2 + y-3 + z2 + 2xy + 2z
(iv) $f(x, y, z) = \sin x + z\sin y$
(v) $f(x, y, z) = x^2 + y^3 + z^2 - 2xy - 4zy$
(vi) $f(x, y, z) = x + y^2/(4x) + z^2/y + 2/z$
(vii) $f(x, y, z) = a^2/x + x^2/y + y^2/z + z^2/b$, $x > 0$, $y > 0$, $z > 0$
(viii) $f(x, y, z) = \sin x + \sin y + \sin z + \sin(x + y + z)$, where $(x, y, z) \in [0, \pi] \times [0, \pi] \times [0, \pi]$
(ix) $f(x_1, \ldots, x_m) = \sum_{k=1}^{m} \sin x_k$
(x) $f(\mathbf r) = (R^2 - \|\mathbf r\|^2)^2$, where $\mathbf r = (x_1, \ldots, x_m)$ and $R$ is a constant
(xi) $f(x_1, \ldots, x_m) = x_1 + x_2/x_1 + x_3/x_2 + \cdots + x_m/x_{m-1} + 2/x_m$, $x_i > 0$, $i = 1, 2, \ldots, m$

(2) Given two positive numbers $a$ and $b$, find $m$ numbers $x_i$, $i = 1, 2, \ldots, m$, between $a$ and $b$ so that the ratio
$$\frac{x_1 x_2 \cdots x_m}{(a + x_1)(x_1 + x_2)\cdots(x_m + b)}$$
is maximal.

(3) Use multivariable Taylor polynomials to show that the origin is a critical point of each of the following functions. Determine if there is a local maximum, a local minimum, or none of the above.
(i) $f(x, y) = x^2 + xy^2 + y^4$
(ii) $f(x, y) = \ln(1 + x^2y^2)$
(iii) $f(x, y) = x^2\ln(1 + x^2y^2)$
(iv) $f(x, y) = xy(\cos(x^2y) - 1)$
(v) $f(x, y) = (x^2 + 2y^2)\tan^{-1}(x + y)$
(vi) $f(x, y) = \cos(e^{xy} - 1)$
(vii) $f(x, y) = \ln(y^2\sin^2 x + 1)$
(viii) $f(x, y) = e^{x + y^2} - 1 - \sin(x - y^2)$
(ix) $f(x, y, z) = \sin(xy + z^2)/(xy + z^2)$
(x) $f(x, y, z) = 2 - 2\cos(x + y + z) - x^2 - y^2 - z^2$

(4) Let $f(x, y, z) = xy^2z^3(a - x - 2y - 3z)$, $a > 0$. Find all its critical points and determine if they are a relative maximum, a relative minimum, or a saddle point.

(5) Give examples of a function $f(x, y)$ of two variables on a region $D$ that attains its extreme values and has the following properties:
(i) $f$ is continuous on $D$, and $D$ is not closed.
(ii) $f$ is not continuous on $D$, and $D$ is bounded and closed.
(iii) $f$ is not continuous on $D$, and $D$ is not bounded and not closed.
Do such examples contradict the extreme value theorem? Explain.

(6) For each of the following functions, find the extreme values on the specified set $D$:
(i) $f(x, y) = 1 + 2x - 3y$, $D$ is the closed triangle with vertices $(0, 0)$, $(1, 2)$, and $(2, 4)$
(ii) $f(x, y) = x^2 + y^2 + xy^2 - 1$, $D = \{(x, y) \mid |x| \le 1,\ |y| \le 2\}$
(iii) $f(x, y) = yx^2$, $D = \{(x, y) \mid x \ge 0,\ y \ge 0,\ x^2 + y^2 \le 4\}$
(iv) $f(x, y, z) = xy^2 + z$, $D = \{(x, y, z) \mid |x| \le 1,\ |y| \le 1,\ |z| \le 1\}$
(v) $f(x, y, z) = xy^2 + z$, D = {(x, y, z)|I1l

f(ro)) for all $\mathbf r$ in some neighborhood of $\mathbf r_0$ that satisfy the constraints, that is, $g_a(\mathbf r) = 0$.

Note that a function $f$ may not have local maxima or minima in its domain. However, when its arguments become subject to constraints, it may well have local maxima and minima on the set defined by the constraints. In the example considered, $f(x, y) = -xy$ has no local maxima or minima, but, when it is restricted to the circle by imposing the constraint $g(x, y) = x^2 + y^2 - 4 = 0$, it happens to have two local minima and two local maxima.

96.1. Critical Points of a Function Subject to a Constraint.
The extreme value problem with constraints amounts to finding the critical points of a function whose arguments are subject to constraints. The example discussed above shows that the equation $\nabla f = 0$ no longer determines the critical points of a differentiable function if its arguments are constrained.

Consider first the case of a single constraint for two variables $\mathbf r = (x, y)$. Let $\mathbf r_0$ be a point at which $f(\mathbf r)$ has a local extremum on the set $S$ defined by the constraint $g(\mathbf r) = 0$. Let us assume that the function $g$ has continuous partial derivatives in a neighborhood of $\mathbf r_0$ and $\nabla g(\mathbf r_0) \neq 0$. Then the equation $g(\mathbf r) = 0$ defines a smooth curve through the point $\mathbf r_0$ (recall the argument given before Theorem 13.16). Let $\mathbf r(t)$ be parametric equations of this curve in a neighborhood of $\mathbf r_0$, that is, for some $t = t_0$, $\mathbf r(t_0) = \mathbf r_0$. The function $F(t) = f(\mathbf r(t))$ defines values of $f$ along the curve and has a local extremum at $t_0$. Let $f$ be differentiable at $\mathbf r_0$ and $\nabla f(\mathbf r_0) \neq 0$. Since the curve is smooth, the vector function $\mathbf r(t)$ is differentiable, and it is concluded that $F$ has no rate of change at $t = t_0$, $F'(t_0) = 0$. The chain rule yields

$$F'(t_0) = f'_x(\mathbf r_0)x'(t_0) + f'_y(\mathbf r_0)y'(t_0) = \nabla f(\mathbf r_0) \cdot \mathbf r'(t_0) = 0 \quad\Longrightarrow\quad \nabla f(\mathbf r_0) \perp \mathbf r'(t_0),$$

provided $\mathbf r'(t_0) \neq 0$ (the curve is smooth). The gradient $\nabla f(\mathbf r_0)$ is orthogonal to a tangent vector to the curve at the point where $f$ has a local extremum on the curve. By Theorem 13.16, the gradient $\nabla g(\mathbf r)$ at any point is normal to the level curve $g(\mathbf r) = 0$, that is, $\nabla g(\mathbf r(t)) \perp \mathbf r'(t)$ for any $t$, provided $\nabla g(\mathbf r_0) \neq 0$. Therefore, the gradients $\nabla f(\mathbf r_0)$ and $\nabla g(\mathbf r_0)$ must be parallel at $\mathbf r_0$ (see Figure 13.14). The characteristic geometrical property of the point $\mathbf r_0$ is that the level curve of $f$ and the curve $g(x, y) = 0$ intersect at $\mathbf r_0$ and are tangential to one another. For this very reason, $f$ has no rate of change along the curve $g(x, y) = 0$ at $\mathbf r_0$. This geometrical statement can be translated into an algebraic one: there should exist a number $\lambda$ such that $\nabla f(\mathbf r_0) = \lambda\nabla g(\mathbf r_0)$.
This proves the following theorem.

THEOREM 13.22. (Critical Points Subject to a Constraint). Suppose that $f$ has a local extreme value at a point $\mathbf r_0$ on the set defined by a constraint $g(\mathbf r) = 0$. Suppose that $g$ has continuous partial derivatives in a neighborhood of $\mathbf r_0$ and $\nabla g(\mathbf r_0) \neq 0$. If $f$ is differentiable at $\mathbf r_0$, then there exists a number $\lambda$ such that
$$\nabla f(\mathbf r_0) = \lambda\nabla g(\mathbf r_0).$$

FIGURE 13.14. Left: Relative orientations of the gradients $\nabla f$ and $\nabla g$ along the curve $g(x, y) = 0$. At the point $P_0$, the function $f$ has a local extreme value along the curve $g = 0$. At this point, the gradients are parallel, and the level curve of $f$ through $P_0$ and the curve $g = 0$ have a common tangent line. Right: Relative orientations of the gradients $\nabla f$ and $\nabla g$ along any curve in the constraint surface $g(x, y, z) = 0$. At the point $P_0$, the function $f$ has a local extreme value on the surface $g = 0$. At this point, the gradients are parallel, and the level surface of $f$ through $P_0$ and the surface $g = 0$ have a common tangent plane.

The theorem holds for three-variable functions as well. Indeed, let $\mathbf r(t)$ be a curve through $\mathbf r_0$ in the level surface $g(x, y, z) = 0$. Then the derivative $F'(t) = (d/dt)f(\mathbf r(t)) = f'_x x' + f'_y y' + f'_z z' = \nabla f \cdot \mathbf r'$ must vanish at $t_0$, that is, $F'(t_0) = \nabla f(\mathbf r_0) \cdot \mathbf r'(t_0) = 0$. Therefore, $\nabla f(\mathbf r_0)$ is orthogonal to a tangent vector of any curve in the surface $S$ at $\mathbf r_0$. On the other hand, by the properties of the gradient, the vector $\nabla g(\mathbf r_0)$ is orthogonal to $\mathbf r'(t_0)$ for every such curve. Therefore, at the point $\mathbf r_0$, the gradients of $f$ and $g$ must be parallel. A similar line of reasoning proves the theorem for any number of variables.

This theorem provides a powerful method to find critical points of $f$ subject to a constraint $g = 0$. It is called the method of Lagrange multipliers. To find the critical points of $f$, the following system of equations must be solved:

$$\nabla f(\mathbf r) = \lambda\nabla g(\mathbf r), \qquad g(\mathbf r) = 0. \tag{13.17}$$
If $\mathbf r = (x, y)$, this is a system of three equations, $f'_x = \lambda g'_x$, $f'_y = \lambda g'_y$, and $g = 0$, for three variables $(x, y, \lambda)$. For each solution $(x_0, y_0, \lambda_0)$, the corresponding critical point of $f$ is $(x_0, y_0)$. The numerical value of $\lambda$ is not relevant; only its existence must be established by solving the system. In the three-variable case, the system contains four equations for four variables $(x, y, z, \lambda)$. For each solution $(x_0, y_0, z_0, \lambda_0)$, the corresponding critical point of $f$ is $(x_0, y_0, z_0)$.

EXAMPLE 13.50. Use the method of Lagrange multipliers to solve the problem in Example 13.33.

SOLUTION: Put $g(x, y) = x^2 + y^2 - 4$. The functions $f(x, y) = -xy$ and $g$ have continuous partial derivatives everywhere as they are polynomials. Then

$$\begin{cases} f'_x = \lambda g'_x \\ f'_y = \lambda g'_y \\ g = 0 \end{cases} \quad\Longrightarrow\quad \begin{cases} -y = 2\lambda x \\ -x = 2\lambda y \\ x^2 + y^2 = 4 \end{cases}$$

The substitution of the first equation into the second one gives $x = 4\lambda^2 x$. This means that either $x = 0$ or $\lambda = \pm 1/2$. If $x = 0$, then $y = 0$ by the first equation, which contradicts the constraint. For $\lambda = 1/2$, $x = -y$ and the constraint gives $2x^2 = 4$ or $x = \pm\sqrt2$. The critical points corresponding to $\lambda = 1/2$ are $(\sqrt2, -\sqrt2)$ and $(-\sqrt2, \sqrt2)$. If $\lambda = -1/2$, $x = y$ and the constraint gives $2x^2 = 4$ or $x = \pm\sqrt2$. The critical points corresponding to $\lambda = -1/2$ are $(\sqrt2, \sqrt2)$ and $(-\sqrt2, -\sqrt2)$. So $f(\pm\sqrt2, \mp\sqrt2) = 2$ is the maximal value and $f(\pm\sqrt2, \pm\sqrt2) = -2$ is the minimal one.

EXAMPLE 13.51. A rectangular box without a lid is to be made from cardboard. Find the dimensions of the box of a given volume $V$ such that the cost of material is minimal.

SOLUTION: Let the dimensions be $x$, $y$, and $z$, where $z$ is the height. The amount of cardboard needed is determined by the surface area $f(x, y, z) = xy + 2xz + 2yz$. The question is to find the minimal value of $f$ subject to the constraint $g(x, y, z) = xyz - V = 0$.
The Lagrange multiplier method gives

$$\begin{cases} f'_x = \lambda g'_x \\ f'_y = \lambda g'_y \\ f'_z = \lambda g'_z \\ g = 0 \end{cases} \;\Longrightarrow\; \begin{cases} y + 2z = \lambda yz \\ x + 2z = \lambda xz \\ 2x + 2y = \lambda xy \\ xyz = V \end{cases} \;\Longrightarrow\; \begin{cases} xy + 2xz = \lambda V \\ xy + 2zy = \lambda V \\ 2xz + 2yz = \lambda V \\ xyz = V \end{cases}$$

where the last system has been obtained by multiplying the first equation by $x$, the second one by $y$, and the third one by $z$ with the subsequent use of the constraint. Combining the first two equations, one infers that $2z(y - x) = 0$. Since $z \neq 0$ ($V \neq 0$), one has $y = x$. Combining the first and third equations, one infers that $y(x - 2z) = 0$ and hence $x = 2z$. The substitution of $y = x = 2z$ into the constraint yields $4z^3 = V$. Hence, the optimal dimensions are $x = y = (2V)^{1/3}$ and $z = (2V)^{1/3}/2$. The amount of cardboard minimizing the cost is $3(2V)^{2/3}$ (the value of $f$ at the critical point). From the geometry of the problem, it is clear that $f$ attains its minimum value at the only critical point.

The method of Lagrange multipliers can be used to determine extreme values of a function on a set $D$. Recall that the extreme values may occur on the boundary of $D$. In Example 13.47, explicit parametric equations of the boundary of $D$ have been used (Step 2). Instead, an algebraic equation of the boundary, $g(x, y) = x^2 + y^2 - 4 = 0$, can be used in combination with the method of Lagrange multipliers. Indeed, if $f(x, y) = x^2 + y^2 + xy$, then its critical points along the boundary circle satisfy the system of equations:

$$\begin{cases} f'_x = \lambda g'_x \\ f'_y = \lambda g'_y \\ g = 0 \end{cases} \quad\Longrightarrow\quad \begin{cases} 2x + y = 2\lambda x \\ 2y + x = 2\lambda y \\ x^2 + y^2 = 4 \end{cases}$$

By subtracting the second equation from the first one, it follows that $x - y = 2\lambda(x - y)$. Hence, either $x = y$ or $\lambda = 1/2$. In the former case, the constraint yields $2x^2 = 4$ or $x = \pm\sqrt2$. The corresponding critical points are $(\pm\sqrt2, \pm\sqrt2)$. If $\lambda = 1/2$, then from the first two equations in the system, one infers that $x = -y$. The constraint becomes $2x^2 = 4$ or $x = \pm\sqrt2$, and the critical points are $(\pm\sqrt2, \mp\sqrt2)$.

Remark. The condition $\nabla g(\mathbf r_0) \neq 0$ is crucial for the method of Lagrange multipliers to work. If $\nabla f(\mathbf r_0) \neq 0$ (i.e.,
$\mathbf r_0$ is not a critical point of $f$ without the constraint), then (13.17) has no solution when $\nabla g(\mathbf r_0) = 0$, and the method of Lagrange multipliers fails. Recall that the derivation of (13.17) requires that a curve defined by the equation $g(\mathbf r) = 0$ is smooth near $\mathbf r_0$, which may no longer be the case if $\nabla g(\mathbf r_0) = 0$ (see the implicit function theorem). So, if $f$ attains its local extreme value at a point which is a cusp of the curve $g(x, y) = 0$, then it cannot be determined by (13.17). For example, the equation $g(x, y) = x^3 - y^2 = 0$ defines a curve that has a cusp at $(0, 0)$ and $\nabla g(0, 0) = 0$, i.e., the curve has no normal vector at the origin. Since $x = y^{2/3} \ge 0$ on the curve, the function $f(x, y) = x$ attains its absolute minimum value 0 along this curve at the origin. However, $\nabla f(0, 0) = (1, 0)$ and the method of Lagrange multipliers fails to detect this point because there is no $\lambda$ at which Eqs. (13.17) are satisfied. On the other hand, the function $f(x, y) = x^2$ also attains its absolute minimum value at the origin. Equations (13.17) have the solution $(x, y) = (0, 0)$ and $\lambda = 0$. The difference between the two cases is that in the latter case $\nabla f(0, 0) = 0$, i.e., $(0, 0)$ is also a critical point of $f$ without the constraint. The method of Lagrange multipliers also becomes inapplicable if $g$ is not differentiable at $\mathbf r_0$ (see the exercises). The behavior of a function $f$ at the points where $\nabla g$ does not exist or vanishes has to be investigated by different means.

96.2. The Case of Two or More Constraints.
Let a function of three variables $f$ have a local extreme value at a point $\mathbf r_0$ on the set defined by two constraints $g_1(\mathbf r) = 0$ and $g_2(\mathbf r) = 0$. Provided $g_1$ and $g_2$ have continuous partial derivatives in a neighborhood of $\mathbf r_0$ and their gradients do not vanish at $\mathbf r_0$, each constraint defines a surface in the domain of $f$ (level surfaces of $g_1$ and $g_2$).
So the set defined by the constraints is the curve of intersection of the level surfaces $g_1 = 0$ and $g_2 = 0$. Let $\mathbf v$ be a tangent vector to the curve at $\mathbf r_0$. Since the curve lies in the level surface $g_1 = 0$, by the earlier arguments, $\nabla f(\mathbf r_0) \perp \mathbf v$ and $\nabla g_1(\mathbf r_0) \perp \mathbf v$. On the other hand, the curve also lies in the level surface $g_2 = 0$ and hence $\nabla g_2(\mathbf r_0) \perp \mathbf v$. It follows that the gradients $\nabla f$, $\nabla g_1$, and $\nabla g_2$ become coplanar at the point $\mathbf r_0$ as they lie in the plane normal to $\mathbf v$. Suppose that the vectors $\nabla g_1(\mathbf r_0)$ and $\nabla g_2(\mathbf r_0)$ are not parallel or, equivalently, $\nabla g_1(\mathbf r_0)$ is not proportional to $\nabla g_2(\mathbf r_0)$. Then any vector in the plane normal to $\mathbf v$ is a linear combination of them (see Study Problem 11.6). Therefore, there exist numbers $\lambda_1$ and $\lambda_2$ such that

$$\nabla f(\mathbf r) = \lambda_1\nabla g_1(\mathbf r) + \lambda_2\nabla g_2(\mathbf r), \qquad g_1(\mathbf r) = g_2(\mathbf r) = 0$$

when $\mathbf r = \mathbf r_0$. This is a system of five equations for five variables $(x, y, z, \lambda_1, \lambda_2)$. For any solution $(x_0, y_0, z_0, \lambda_{10}, \lambda_{20})$, the point $(x_0, y_0, z_0)$ is a critical point of $f$ on the set defined by the constraints. In general, the following result can be proved by a similar line of reasoning.

THEOREM 13.23. (Critical Points Subject to Constraints). Suppose that functions $g_a$, $a = 1, 2, \ldots, M$, of $m$ variables, $m > M$, have continuous partial derivatives in a neighborhood of a point $\mathbf r_0$ and a function $f$ has a local extreme value at $\mathbf r_0$ in the set defined by the constraints $g_a(\mathbf r) = 0$. Suppose that $\nabla g_a(\mathbf r_0)$ are nonzero vectors none of which can be expressed as a linear combination of the others and $f$ is differentiable at $\mathbf r_0$. Then there exist numbers $\lambda_a$ such that
$$\nabla f(\mathbf r_0) = \lambda_1\nabla g_1(\mathbf r_0) + \lambda_2\nabla g_2(\mathbf r_0) + \cdots + \lambda_M\nabla g_M(\mathbf r_0).$$

EXAMPLE 13.52. Find extreme values of the function $f(x, y, z) = xyz$ on the curve that is an intersection of the sphere $x^2 + y^2 + z^2 = 1$ and the plane $x + y + z = 0$.

SOLUTION: Put $g_1(x, y, z) = x^2 + y^2 + z^2 - 1$ and $g_2(x, y, z) = x + y + z$. One has $\nabla g_1 = (2x, 2y, 2z)$, which can only vanish at $(0, 0, 0)$, and hence $\nabla g_1 \neq 0$ on the sphere. Also, $\nabla g_2 = (1, 1, 1) \neq 0$.
Therefore, critical points of $f$ on the set defined by the constraints are determined by the equations:

$$\begin{cases} f'_x = \lambda_1 g'_{1x} + \lambda_2 g'_{2x} \\ f'_y = \lambda_1 g'_{1y} + \lambda_2 g'_{2y} \\ f'_z = \lambda_1 g'_{1z} + \lambda_2 g'_{2z} \\ g_1 = 0 \\ g_2 = 0 \end{cases} \quad\Longrightarrow\quad \begin{cases} yz = 2\lambda_1 x + \lambda_2 \\ xz = 2\lambda_1 y + \lambda_2 \\ xy = 2\lambda_1 z + \lambda_2 \\ 1 = x^2 + y^2 + z^2 \\ 0 = x + y + z \end{cases}$$

Subtract the second equation from the first one to obtain $(y - x)z = 2\lambda_1(x - y)$. It follows, then, that either $x = y$ or $z = -2\lambda_1$. Suppose first that $y = x$. Then $z = -2x$ by the fifth equation. The substitution of $x = y$ and $z = -2x$ into the fourth equation yields $6x^2 = 1$ or $x = \pm 1/\sqrt6$. Therefore, the points $\mathbf r_1 = (1/\sqrt6, 1/\sqrt6, -2/\sqrt6)$ and $\mathbf r_2 = (-1/\sqrt6, -1/\sqrt6, 2/\sqrt6)$ are critical points provided there exist the corresponding values $\lambda_1$ and $\lambda_2$ such that all equations are satisfied. For example, take $\mathbf r_1$. Then the second and third equations become

$$-\frac13 = \frac{2}{\sqrt6}\lambda_1 + \lambda_2, \qquad \frac16 = -\frac{4}{\sqrt6}\lambda_1 + \lambda_2 .$$

So $\lambda_1 = -1/(2\sqrt6)$ and $\lambda_2 = -1/6$. The existence of $\lambda_1$ and $\lambda_2$ for the point $\mathbf r_2$ is verified similarly.

Next, suppose that $z = -2\lambda_1$. Subtract the third equation from the second one to obtain $(z - y)x = 2\lambda_1(y - z)$. It follows that either $y = z$ or $x = -2\lambda_1$. Let $y = z$. The fifth equation yields $x = -2y$, and the fourth equation is reduced to $6y^2 = 1$. Therefore, there are two more critical points: $\mathbf r_3 = (-2/\sqrt6, 1/\sqrt6, 1/\sqrt6)$ and $\mathbf r_4 = (2/\sqrt6, -1/\sqrt6, -1/\sqrt6)$. The reader is to verify the existence of $\lambda_1$ and $\lambda_2$ in these cases (note that $\lambda_1 = -z/2$). Finally, let $x = -2\lambda_1$ and $z = -2\lambda_1$. These conditions imply that $x = z$ and, by the fifth equation, $y = -2x$. The fourth equation yields $6x^2 = 1$ so that there is another pair of critical points: $\mathbf r_5 = (1/\sqrt6, -2/\sqrt6, 1/\sqrt6)$ and $\mathbf r_6 = (-1/\sqrt6, 2/\sqrt6, -1/\sqrt6)$ (the reader is to verify the existence of $\lambda_1$ and $\lambda_2$).

The intersection of the sphere and the plane is a circle. So it is sufficient to compare values of $f$ at the critical points to determine the extreme values of $f$. It follows that $f$ attains the maximum value $2/(6\sqrt6)$ at $\mathbf r_2$, $\mathbf r_4$, and $\mathbf r_6$ and the minimum value $-2/(6\sqrt6)$ at $\mathbf r_1$, $\mathbf r_3$, and $\mathbf r_5$.
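The multipliers left for the reader in Example 13.52 can be checked numerically. The following is a minimal sketch, not part of the original text: the helper names (`grad_f`, `multipliers`, `check_all`) are hypothetical, and the code only tests the six points found above rather than solving the system from scratch. For each point it solves two of the three gradient equations for $\lambda_1$ and $\lambda_2$ and reports how well the remaining equation is satisfied.

```python
import math

S = 1.0 / math.sqrt(6.0)

# The six candidate points r1, ..., r6 found in Example 13.52.
POINTS = [
    (S, S, -2 * S), (-S, -S, 2 * S),
    (-2 * S, S, S), (2 * S, -S, -S),
    (S, -2 * S, S), (-S, 2 * S, -S),
]


def f(p):
    """Objective f(x, y, z) = xyz."""
    x, y, z = p
    return x * y * z


def grad_f(p):
    """Gradient of f: (yz, xz, xy)."""
    x, y, z = p
    return (y * z, x * z, x * y)


def constraints(p):
    """Values of g1 = x^2 + y^2 + z^2 - 1 and g2 = x + y + z."""
    x, y, z = p
    return (x * x + y * y + z * z - 1.0, x + y + z)


def multipliers(p):
    """Solve grad f = lam1 * (2x, 2y, 2z) + lam2 * (1, 1, 1).

    Two component equations with distinct coordinates determine lam1 and
    lam2; the returned residual measures the worst violation over all
    three component equations.
    """
    gf = grad_f(p)
    i = max(range(3), key=lambda k: p[k])  # index of the largest coordinate
    j = min(range(3), key=lambda k: p[k])  # index of the smallest coordinate
    lam1 = (gf[i] - gf[j]) / (2.0 * (p[i] - p[j]))
    lam2 = gf[i] - 2.0 * lam1 * p[i]
    residual = max(abs(gf[k] - 2.0 * lam1 * p[k] - lam2) for k in range(3))
    return lam1, lam2, residual


def check_all():
    """Return (point, f value, lam1, lam2, residual, constraint values)."""
    rows = []
    for p in POINTS:
        lam1, lam2, residual = multipliers(p)
        rows.append((p, f(p), lam1, lam2, residual, constraints(p)))
    return rows


if __name__ == "__main__":
    for p, value, lam1, lam2, residual, cons in check_all():
        print(p, value, lam1, lam2, residual, cons)
```

A residual near zero at a point means that multipliers exist there, which is all the theorem requires; comparing the printed values of $f$ then separates the maxima from the minima.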
Let $f(\mathbf r)$ be a function subject to a constraint $g(\mathbf r) = 0$. Define the function
$$F(\mathbf r, \lambda) = f(\mathbf r) - \lambda g(\mathbf r),$$
where $\lambda$ is viewed as an additional independent variable. Then critical points of $F$ are determined by (13.17). Indeed, the condition $\partial F/\partial\lambda = 0$ yields the constraint $g(\mathbf r) = 0$, while the differentiation with respect to the variables $\mathbf r$ gives $\nabla F = \nabla f - \lambda\nabla g = 0$, which coincides with the first equation in (13.17). Similarly, if there are several constraints, critical points of the function with additional variables $\lambda_a$, $a = 1, 2, \ldots, M$,

$$F(\mathbf r, \lambda_1, \lambda_2, \ldots, \lambda_M) = f(\mathbf r) - \sum_{a=1}^{M} \lambda_a g_a(\mathbf r) \tag{13.18}$$

coincide with the critical points of $f$ subject to the constraints $g_a = 0$ as stated in Theorem 13.23. The functions $F$ and $f$ have the same values on the set defined by the constraints $g_a = 0$ because they differ by a linear combination of constraint functions with the coefficients being the Lagrange multipliers. The above observation provides a simple way to formulate the equations for critical points subject to constraints.

96.3. Finding Local Maxima and Minima.
In the simplest case of a function $f$ of two variables subject to a constraint, the nature of critical points (local maximum or minimum) can be determined by geometrical means. Suppose that the level curve $g(x, y) = 0$ is closed. Then, by the extreme value theorem, $f$ attains its maximum and minimum values on it at some of the critical points. Suppose $f$ attains its absolute maximum at a critical point $\mathbf r_1$. Then $f$ should have either a local minimum or an inflection at the neighboring critical point $\mathbf r_2$ along the curve. Let $\mathbf r_3$ be the critical point next to $\mathbf r_2$ along the curve. Then $f$ has a local minimum at $\mathbf r_2$ if $f(\mathbf r_2) < f(\mathbf r_3)$ and an inflection if $f(\mathbf r_2) > f(\mathbf r_3)$. This procedure may be continued until all critical points are exhausted. Compare this pattern of critical points with the behavior of a height along a closed hiking path.

Remark.
If the constraints can be solved, then an explicit form of $f$ on the set defined by the constraints can be found, and the standard second-derivative test applies! For instance, in Example 13.51, the constraint can be solved: $z = V/(xy)$. The values of the function $f$ on the constraint surface are $F(x, y) = f(x, y, V/(xy)) = xy + 2V(x + y)/(xy)$. The equations $F'_x = 0$ and $F'_y = 0$ determine the critical point $x = y = (2V)^{1/3}$ (and $z = V/(xy) = (2V)^{1/3}/2$). So the second-derivative test can be applied to the function $F(x, y)$ at the critical point $x = y = (2V)^{1/3}$ to show that indeed $F$ has a minimum and hence $f$ has a minimum on the constraint surface.

There is an analog of the second-derivative test for critical points of functions subject to constraints. Its general formulation is not simple. So the discussion is limited to the simplest case of a function of two variables subject to a constraint. Suppose that $g$ has continuous partial derivatives in a neighborhood of $\mathbf{r}_0$ and $\nabla g(\mathbf{r}_0) \ne 0$. Then $g'_x$ and $g'_y$ cannot simultaneously vanish at the critical point. Without loss of generality, assume that $g'_y \ne 0$ at $\mathbf{r}_0 = (x_0, y_0)$. By the implicit function theorem, there is a neighborhood of $\mathbf{r}_0$ in which the equation $g(x, y) = 0$ has a unique solution $y = h(x)$. The values of $f$ on the level curve $g = 0$ near the critical point are $F(x) = f(x, h(x))$. By the chain rule, one infers that $F' = f'_x + f'_y h'$ and
\[
F'' = \frac{d}{dx}\bigl(f'_x + f'_y h'\bigr) = f''_{xx} + 2f''_{xy}h' + f''_{yy}(h')^2 + f'_y h''. \tag{13.19}
\]
So, in order to find $F''(x_0)$, one has to calculate $h'(x_0)$ and $h''(x_0)$. This task is accomplished by the implicit differentiation. By the definition of $h(x)$, $G(x) = g(x, h(x)) = 0$ for all $x$ in an open interval containing $x_0$. Therefore, $G'(x) = 0$, which defines $h'$ because $G' = g'_x + g'_y h' = 0$ and $h' = -g'_x/g'_y$. Similarly, $G''(x) = 0$ yields
\[
G'' = g''_{xx} + 2g''_{xy}h' + g''_{yy}(h')^2 + g'_y h'' = 0, \tag{13.20}
\]
which can be solved for $h''$, where $h' = -g'_x/g'_y$.
The substitution of $h'(x_0)$, $h''(x_0)$, and all the values of all the partial derivatives of $f$ at the critical point $(x_0, y_0)$ into (13.19) gives the value $F''(x_0)$. If $F''(x_0) > 0$ (or $F''(x_0) < 0$), then $f$ has a local minimum (or maximum) at $(x_0, y_0)$ along the curve $g = 0$. Note also that $F'(x_0) = 0$ as required owing to the conditions $f'_x = \lambda g'_x$ and $f'_y = \lambda g'_y$ satisfied at the critical point. If $g'_y(\mathbf{r}_0) = 0$, then $g'_x(\mathbf{r}_0) \ne 0$, and there is a function $x = h(y)$ that solves the equation $g(x, y) = 0$. So, by swapping $x$ and $y$ in the above arguments, the same conclusion is proved to hold.

EXAMPLE 13.53. Show that the point $\mathbf{r}_0 = (0, 0)$ is a critical point of the function $f(x, y) = x^2y + y + x$ subject to the constraint $e^{xy} = x + y + 1$ and determine whether $f$ has a local minimum or maximum at it.

SOLUTION: Critical Point. Put $g(x, y) = e^{xy} - x - y - 1$. Then $g(0, 0) = 0$; that is, the point $(0, 0)$ satisfies the constraint. The first partial derivatives of $f$ and $g$ are $f'_x = 2xy + 1$, $f'_y = x^2 + 1$, $g'_x = ye^{xy} - 1$, and $g'_y = xe^{xy} - 1$. Therefore, both equations $f'_x(0, 0) = \lambda g'_x(0, 0)$ and $f'_y(0, 0) = \lambda g'_y(0, 0)$ are satisfied at $\lambda = -1$. Thus, the point $(0, 0)$ is a critical point of $f$ subject to the constraint $g = 0$.

Second-Derivative Test. Since $g'_y(0, 0) = -1 \ne 0$, there is a function $y = h(x)$ near $x = 0$ such that $G(x) = g(x, h(x)) = 0$. By the implicit differentiation, $h'(0) = -g'_x(0, 0)/g'_y(0, 0) = -1$. The second partial derivatives of $g$ are
\[
g''_{xx} = y^2e^{xy}, \qquad g''_{yy} = x^2e^{xy}, \qquad g''_{xy} = e^{xy} + xye^{xy}.
\]
The derivative $h''(0)$ is found from (13.20), where $g''_{xx}(0, 0) = g''_{yy}(0, 0) = 0$, $g''_{xy}(0, 0) = 1$, $h'(0) = -1$, and $g'_y(0, 0) = -1$:
\[
h''(0) = -\bigl[g''_{xx}(0, 0) + 2g''_{xy}(0, 0)h'(0) + g''_{yy}(0, 0)(h'(0))^2\bigr]/g'_y(0, 0) = -2.
\]
The second partial derivatives of $f$ are
\[
f''_{xx} = 2y, \qquad f''_{yy} = 0, \qquad f''_{xy} = 2x.
\]
The substitution of $f''_{xx}(0, 0) = f''_{yy}(0, 0) = f''_{xy}(0, 0) = 0$, $h'(0) = -1$, $f'_y(0, 0) = 1$, and $h''(0) = -2$ into (13.19) gives $F''(0) = -2 < 0$. Therefore, $f$ attains a local maximum at $(0, 0)$ along the curve $g = 0$.
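The value $F''(0) = -2$ can be cross-checked in two independent ways. The sketch below is illustrative only: it first substitutes the partial derivatives at the origin into (13.20) and (13.19), and then, as an assumption of this sketch rather than a method from the text, solves $g(x, y) = 0$ for $y = h(x)$ by Newton's method and differentiates $F(x) = f(x, h(x))$ by central finite differences. The function names and the step `d` are hypothetical choices.

```python
import math

def f(x, y):
    return x * x * y + y + x

def g(x, y):
    return math.exp(x * y) - x - y - 1.0

def g_y(x, y):
    return x * math.exp(x * y) - 1.0

# (a) Formulas (13.20) and (13.19) with the partial derivatives at (0, 0).
fy, fxx, fxy, fyy = 1.0, 0.0, 0.0, 0.0
gx, gy, gxx, gxy, gyy = -1.0, -1.0, 0.0, 1.0, 0.0
h1 = -gx / gy                                        # h'(0)
h2 = -(gxx + 2.0 * gxy * h1 + gyy * h1 ** 2) / gy    # h''(0) from (13.20)
F2_formula = fxx + 2.0 * fxy * h1 + fyy * h1 ** 2 + fy * h2  # (13.19)

# (b) Numerical check: solve g(x, y) = 0 for y near the origin.
def h(x, tol=1e-15, max_iter=50):
    """Newton iteration in y for g(x, y) = 0, started from y = -x."""
    y = -x
    for _ in range(max_iter):
        step = g(x, y) / g_y(x, y)
        y -= step
        if abs(step) < tol:
            break
    return y

def F(x):
    return f(x, h(x))

d = 1e-3  # finite-difference step (an arbitrary small value)
F1_numeric = (F(d) - F(-d)) / (2.0 * d)
F2_numeric = (F(d) - 2.0 * F(0.0) + F(-d)) / d ** 2

print("h'(0) =", h1, " h''(0) =", h2)
print("F''(0) from (13.19):", F2_formula)
print("F'(0), F''(0) by finite differences:", F1_numeric, F2_numeric)
```

Both routes should agree up to the finite-difference error, which supports the conclusion that $f$ has a local maximum at the origin along the curve.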
Note also that $F'(0) = f'_x(0, 0) + f'_y(0, 0)h'(0) = 1 - 1 = 0$ as required. □

The implicit differentiation and the implicit function theorem can be used to establish the second-derivative test for the multivariable case with constraints (see another example in Study Problem 13.21).

96.4. Study Problems.

Problem 13.20. An axially symmetric solid consists of a circular cylinder and a right-angled circular cone attached to one of the cylinder's bases. What are the dimensions of the solid at which it has a maximal volume if the surface area of the solid has a fixed value $S$?

SOLUTION: Let $r$ and $h$ be the radius and height of the cylinder. Since the cone is right-angled, its height is $r$. The surface area is the sum of three terms: the area of the base (disk) $\pi r^2$, the area of the side of the cylinder $2\pi rh$, and the surface area $S_c$ of the cone. A cone with an angle $\alpha$ at the vertex is obtained by rotation of a straight line $y = mx$, where $m = \tan(\alpha/2)$, about the $x$ axis. In the present case, $\alpha = \pi/2$ and $m = 1$. If $a$ is the height of the cone, then the surface area of the cone is (see Calculus II)
\[
S_c = \int_0^a 2\pi y\sqrt{1 + (y')^2}\,dx = 2\sqrt{2}\,\pi\int_0^a x\,dx = \sqrt{2}\,\pi a^2 = \sqrt{2}\,\pi r^2
\]
because $a = r$ in the present case. Similarly, the volume of the cone is
\[
V_c = \int_0^a \pi y^2\,dx = \pi\int_0^a x^2\,dx = \frac{\pi r^3}{3}.
\]
Therefore, the problem is reduced to finding the maximal value of the function (volume) $V(r, h) = \pi r^2h + \pi r^3/3$ subject to the constraint $2\pi rh + (1 + \sqrt{2})\pi r^2 = S$. Put $g(r, h) = 2\pi rh + (1 + \sqrt{2})\pi r^2 - S$. Then critical points of $V$ satisfy the equations:
\[
\begin{array}{lcl}
V'_r = \lambda g'_r & \Longrightarrow & 2\pi rh + \pi r^2 = \lambda\bigl(2\pi h + 2(1 + \sqrt{2})\pi r\bigr),\\
V'_h = \lambda g'_h & \Longrightarrow & \pi r^2 = 2\pi\lambda r,\\
0 = g(r, h) & \Longrightarrow & S/\pi = 2rh + (1 + \sqrt{2})r^2.
\end{array}
\]
Since $r \ne 0$ (the third equation is not satisfied if $r = 0$), the second equation implies that $\lambda = r/2$. The substitution of the latter into the first equation yields $rh = \sqrt{2}\,r^2$ or $h = \sqrt{2}\,r$. Then it follows from the third equation that $(1 + 3\sqrt{2})\pi r^2 = S$, so the sought-after dimensions are $r = \sqrt{S/((1 + 3\sqrt{2})\pi)}$ and $h = \sqrt{2}\,r$. □

Problem 13.21. Let functions $f$ and $g$ of three variables $\mathbf{r} = (x, y, z)$ have continuous partial derivatives up to order 2.
Use the implicit differentiation to establish the second-derivative test for critical points of $f$ on the surface $g = 0$.

SOLUTION: Suppose that $\nabla g(\mathbf{r}_0) \ne 0$ at a critical point $\mathbf{r}_0$. Without loss of generality, one can assume that $g'_z(\mathbf{r}_0) \ne 0$. By the implicit function theorem, there exists a function $z = h(x, y)$ such that $G(x, y) = g(x, y, h(x, y)) = 0$ in some neighborhood of the critical point. Then the equations $G'_x(x, y) = 0$ and $G'_y(x, y) = 0$ determine the first partial derivatives of $h$:
\[
g'_x + g'_z h'_x = 0 \ \Longrightarrow\ h'_x = -g'_x/g'_z; \qquad g'_y + g'_z h'_y = 0 \ \Longrightarrow\ h'_y = -g'_y/g'_z.
\]
The second partial derivatives $h''_{xx}$, $h''_{yy}$, and $h''_{xy}$ are found from the equations
\[
\begin{array}{lcl}
G''_{xx} = 0 & \Longrightarrow & g''_{xx} + 2g''_{xz}h'_x + g''_{zz}(h'_x)^2 + g'_z h''_{xx} = 0,\\
G''_{yy} = 0 & \Longrightarrow & g''_{yy} + 2g''_{yz}h'_y + g''_{zz}(h'_y)^2 + g'_z h''_{yy} = 0,\\
G''_{xy} = 0 & \Longrightarrow & g''_{xy} + g''_{xz}h'_y + g''_{yz}h'_x + g''_{zz}h'_xh'_y + g'_z h''_{xy} = 0.
\end{array}
\]
The values of the function $f(x, y, z)$ on the level surface $g(x, y, z) = 0$ near the critical points are $F(x, y) = f(x, y, h(x, y))$. To apply the second-derivative test to the function $F$, its second partial derivatives have to be computed at the critical point. The implicit differentiation gives
\[
\begin{aligned}
F''_{xx} &= (f'_x + f'_z h'_x)'_x = f''_{xx} + 2f''_{xz}h'_x + f''_{zz}(h'_x)^2 + f'_z h''_{xx},\\
F''_{yy} &= (f'_y + f'_z h'_y)'_y = f''_{yy} + 2f''_{yz}h'_y + f''_{zz}(h'_y)^2 + f'_z h''_{yy},\\
F''_{xy} &= (f'_x + f'_z h'_x)'_y = f''_{xy} + f''_{xz}h'_y + f''_{yz}h'_x + f''_{zz}h'_xh'_y + f'_z h''_{xy},
\end{aligned}
\]
where the first and second partial derivatives of $h$ have been found earlier. If $(x_0, y_0, z_0)$ is the critical point found by the Lagrange multiplier method, then $a = F''_{xx}(x_0, y_0)$, $b = F''_{yy}(x_0, y_0)$, and $c = F''_{xy}(x_0, y_0)$ in the second-derivative test for the two-variable function $F$. □

96.5. Exercises.
(1) Use Lagrange multipliers to find the maximum and minimum values of each of the following functions subject to the specified constraints:
(i) $f(x, y) = xy$, $x + y = 1$
(ii) $f(x, y) = x^2 + y^2$, $x/a + y/b = 1$
(iii) $f(x, y) = xy^2$, $2x^2 + y^2 = 6$
(iv) f(x, y) = Y2, 2+ 2
(v) $f(x, y) = x + y$, $x^2/16 + y^2/9 = 1$
(vi) $f(x, y) = 2x^2 - 2y^2$, $x^4 + y^4 = 32$
(vii) $f(x, y) = Ax^2 + 2Bxy + Cy^2$, $x^2 + y^2 = 1$
(viii) $f(x, y, z) = xyz$, $3x^2 + 2y^2 + z^2 = 6$
(ix) $f(x, y, z) = x - 2y + 2z$, $x^2 + y^2 + z^2 = 1$
(x) $f(x, y, z) = x^2 + y^2 + z^2$, $x^2/a^2 + y^2/b^2 + z^2/c^2 = 1$
(xi) $f(x, y, z) = -x + 3y - 3z$, $x + y - z = 0$, $y^2 + 2z^2 = 1$
(xii) $f(x, y, z) = xy + yz$, $xy = 1$, $y^2 + 2z^2 = 1$
(xiii) $f(x, y, z) = xy + yz$, $x^2 + y^2 = 2$, $y + z = 2$ ($x > 0$, $y > 0$, $z > 0$)
(xiv) $f(x, y, z) = \sin(x)\sin(y)\sin(z)$, $x + y + z = \pi/2$ ($x > 0$, $y > 0$, $z > 0$)
(xv) $f(x, y, z) = x^2/a^2 + y^2/b^2 + z^2/c^2$, $x^2 + y^2 + z^2 = 1$, $n_1x + n_2y + n_3z = 0$, where $\mathbf{n} = (n_1, n_2, n_3)$ is a unit vector
(xvi) $f(\mathbf{r}) = \mathbf{u}\cdot\mathbf{r}$, $\|\mathbf{r}\| = R$, where $\mathbf{r} = (x_1, ..., x_m)$, $\mathbf{u}$ is a constant unit vector, and $R$ is a constant
(xvii) $f(\mathbf{r}) = \mathbf{r}\cdot\mathbf{r}$, $\mathbf{n}\cdot\mathbf{r} = 1$, where $\mathbf{n}$ has strictly positive components and $\mathbf{r} = (x_1, x_2, ..., x_m)$
(xviii) $f(\mathbf{r}) = x_1^n + x_2^n + \cdots + x_m^n$, $x_1 + x_2 + \cdots + x_m = a$, where $n > 0$ and $a > 0$

(2) Prove the inequality
\[
\frac{x^n + y^n}{2} \ge \left(\frac{x + y}{2}\right)^n
\]
if $n \ge 1$, $x \ge 0$, and $y \ge 0$. Hint: Minimize the function $f = (x^n + y^n)/2$ under the condition $x + y = s$.

(3) Find the minimal value of the function $f(x, y) = y$ on the curve $x^2 + y^4 - y^3 = 0$. Explain why the method of Lagrange multipliers fails. Hint: Sketch the curve near the origin.

(4) Use the method of Lagrange multipliers to maximize the function $f(x, y) = 3x + 2y$ on the curve $\sqrt{x} + \sqrt{y} = 5$. Compare the obtained value with $f(0, 25)$. Explain why the method of Lagrange multipliers fails.

(5) Find three positive numbers whose sum is a fixed number $c > 0$ and whose product is maximal.

(6) Use the method of Lagrange multipliers to solve the following exercises from Section 95.5:
(i) Exercise 10
(ii) Exercise 11
(iii) Exercise 12
(iv) Exercise 13
(v) Exercise 14

(7) The cross section of a cylindrical tub is a half-disk.
If the tub has total area $S$, what are the dimensions at which the tub has maximal volume?

(8) Find a rectangle with a fixed perimeter $2p$ that forms a solid of the maximal volume under rotation about one of its sides.

(9) Find a triangle with a fixed perimeter $2p$ that forms a solid of the maximal volume under rotation about one of its sides.

(10) Find a rectangular box with the maximal volume that is contained in a half-ball of radius $R$.

(11) Find a rectangular box with the maximal volume that is contained in an ellipsoid $x^2/a^2 + y^2/b^2 + z^2/c^2 = 1$.

(12) Consider a circular cone obtained by rotation of a straight line segment of length 1 about the axis through an endpoint of the segment. If the angle between the segment and the axis is $\theta$, find a rectangular box within the cone that has a maximal volume.

(13) The solid consists of a rectangular box and two identical pyramids whose bases are opposite faces of the box. The edges of the pyramid adjacent at the vertex opposite to its base have equal lengths. If the solid has a fixed volume $V$, at what angle between the edges of the pyramid and its base is the surface area of the solid minimal?

(14) Use Lagrange multipliers to find the distance between the parabola $y = x^2$ and the line $x - y = 2$.

(15) Find the maximum value of the function $f(\mathbf{r}) = \sqrt[m]{x_1x_2\cdots x_m}$ given that $x_1 + x_2 + \cdots + x_m = c$, where $c$ is a positive constant. Deduce from the result that if $x_i \ge 0$, $i = 1, 2, ..., m$, then
\[
\sqrt[m]{x_1x_2\cdots x_m} \le \frac{1}{m}\,(x_1 + x_2 + \cdots + x_m);
\]
that is, the geometrical mean of $m$ numbers is no larger than the arithmetic mean. When is the equality reached?

(16) Give an alternative proof of the Cauchy-Schwarz inequality in a Euclidean space (Theorem 13.1) using the method of Lagrange multipliers to maximize the function of $2m$ variables $f(\mathbf{x}, \mathbf{y}) = \mathbf{x}\cdot\mathbf{y}$ subject to the constraints $\mathbf{x}\cdot\mathbf{x} = 1$ and $\mathbf{y}\cdot\mathbf{y} = 1$, where $\mathbf{x} = (x_1, ..., x_m)$ and $\mathbf{y} = (y_1, ..., y_m)$.
Hint: After maximizing the function, put $\mathbf{x} = \mathbf{a}/\|\mathbf{a}\|$ and $\mathbf{y} = \mathbf{b}/\|\mathbf{b}\|$ for any two nonzero vectors $\mathbf{a}$ and $\mathbf{b}$.

CHAPTER 14
Multiple Integrals

97. Double Integrals

97.1. The Volume Problem. Suppose one needs to determine the volume of a hill whose height $f(\mathbf{r})$ as a function of position $\mathbf{r} = (x, y)$ is known. For example, the hill must be leveled to construct a highway. Its volume is required to estimate the number of truck loads needed to move the soil away. The following procedure can be used to estimate the volume. The base $D$ of the hill is first partitioned into small pieces $D_p$ of area $\Delta A_p$, where $p = 1, 2, ..., N$ enumerates the pieces; that is, the union of all the pieces $D_p$ is the region $D$. The partition elements should be small enough so that the height $f(\mathbf{r})$ has no significant variation when $\mathbf{r}$ is in $D_p$. The volume of the portion of the hill above each partition element $D_p$ is approximately $\Delta V_p \approx f(\mathbf{r}_p)\,\Delta A_p$, where $\mathbf{r}_p$ is a point in $D_p$ (see the left panel of Figure 14.1). The approximation becomes better for smaller $D_p$. The volume of the hill can therefore be estimated as
\[
V \approx \sum_{p=1}^{N} f(\mathbf{r}_p)\,\Delta A_p.
\]
For practical purposes, the values $f(\mathbf{r}_p)$ can be found, for example, from a detailed contour map of $f$.

The approximation is expected to become better and better as the size of the partition elements gets smaller (naturally, their number $N$ has to increase). If $R_p$ is the smallest radius of a disk that contains $D_p$, then put $R_N = \max_p R_p$, which determines the size of the largest partition element. When a larger number $N$ of partition elements is taken to improve the accuracy of the approximation, one has to reduce $R_N$ at the same time to make variations of $f$ within each partition element smaller. Note that the reduction of the maximal area $\max_p \Delta A_p$ versus the maximal size $R_N$ may not be good enough to improve the accuracy of the estimate.
If $D_p$ looks like a narrow strip, its area is small, but the variation of the height $f$ along the strip may be significant and the accuracy of the approximation $\Delta V_p \approx f(\mathbf{r}_p)\,\Delta A_p$ is poor. One can therefore expect that the exact value of the volume is obtained in the limit
\[
V = \lim_{\substack{N\to\infty\\ (R_N\to 0)}} \sum_{p=1}^{N} f(\mathbf{r}_p)\,\Delta A_p. \tag{14.1}
\]
The volume $V$ may be viewed as the volume of a solid bounded from above by the surface $z = f(x, y)$, which is the graph of $f$, and by the portion $D$ of the $xy$ plane. Naturally, it is not expected to depend on the way the region $D$ is partitioned, neither should it depend on the choice of sample points $\mathbf{r}_p$ in each partition element.

FIGURE 14.1. Left: The volume of a solid region bounded from above by the graph $z = f(x, y)$ and from below by a portion $D$ of the $xy$ plane is approximated by the sum of volumes $\Delta V_p = z_p\,\Delta A_p$ of columns with the base area $\Delta A_p$ and the height $z_p = f(\mathbf{r}_p)$, where $\mathbf{r}_p$ is a sample point within the base and $p$ enumerates the columns. Right: A rectangular partition of a region $D$ is obtained by embedding $D$ into a rectangle $R_D$. Then the rectangle $R_D$ is partitioned into smaller rectangles $R_{jk}$.

The limit (14.1) resembles the limit of a Riemann sum for a single-variable function $f(x)$ on an interval $[a, b]$ used to determine the area under the graph of $f$. Indeed, if $x_k$, $k = 0, 1, ..., N$, $x_0 = a < x_1 < \cdots < x_{N-1} < x_N = b$ is the partition of $[a, b]$, then $\Delta A_p$ is the analog of $\Delta x_j = x_j - x_{j-1}$, $j = 1, 2, ..., N$, the number $R_N$ is the analog of $\Delta_N = \max_j \Delta x_j$, and the values $f(\mathbf{r}_p)$ are analogous to $f(x_j^*)$, where $x_j^* \in [x_{j-1}, x_j]$. The area under the graph is then
\[
A = \lim_{\substack{N\to\infty\\ (\Delta_N\to 0)}} \sum_{j=1}^{N} f(x_j^*)\,\Delta x_j = \int_a^b f(x)\,dx.
\]
So the limit (14.1) seems to define an integral over a two-dimensional region $D$ (i.e., with respect to both variables $x$ and $y$ used to label points in $D$).
This observation leads to the concept of a double integral. However, the qualitative construction used to analyze the volume problem still lacks the level of rigor used to define the single-variable integration. For example, how does one choose the "shape" of the partition elements $D_p$, or how does one calculate their areas? These kinds of questions were not even present in the single-variable case and have to be addressed.

97.2. The Double Integral. Let $D$ be a closed, bounded region. The boundaries of $D$ are assumed to be piecewise-smooth curves. Let $f(\mathbf{r})$ be a bounded function on $D$, that is, $m \le f(\mathbf{r}) \le M$ for some numbers $M$ and $m$ and all $\mathbf{r} \in D$. The numbers $m$ and $M$ are called lower and upper bounds of $f$ on $D$. Evidently, upper and lower bounds are not unique because any number smaller than $m$ is also a lower bound, and, similarly, any number greater than $M$ is an upper bound. However, the smallest upper bound and the largest lower bound are unique.

DEFINITION 14.1. (Supremum and Infimum).
Let $f$ be bounded on $D$. The smallest upper bound of $f$ on $D$ is called the supremum of $f$ on $D$ and denoted by $\sup_D f$. The largest lower bound of $f$ on $D$ is called the infimum of $f$ on $D$ and denoted by $\inf_D f$.

As a bounded region, $D$ can always be embedded in a rectangle $R_D = \{(x, y)\,|\,x \in [a, b],\ y \in [c, d]\}$ (i.e., $D$ is a subset of $R_D$). The function $f$ is then extended to the rectangle $R_D$ by setting its values to 0 for all points outside $D$, that is, $f(\mathbf{r}) = 0$ if $\mathbf{r} \in R_D$ and $\mathbf{r} \notin D$. Consider a partition $x_j$, $j = 0, 1, ..., N_1$, of the interval $[a, b]$, where $x_j = a + j\,\Delta x$, $\Delta x = (b - a)/N_1$, and a partition $y_k$, $k = 0, 1, ..., N_2$, of the interval $[c, d]$, where $y_k = c + k\,\Delta y$ and $\Delta y = (d - c)/N_2$. These partitions induce a partition of the rectangle $R_D$ by rectangles $R_{jk} = \{(x, y)\,|\,x \in [x_{j-1}, x_j],\ y \in [y_{k-1}, y_k]\}$, where $j = 1, 2, ..., N_1$ and $k = 1, 2, ..., N_2$. The area of each partition rectangle $R_{jk}$ is $\Delta A = \Delta x\,\Delta y$. This partition is called a rectangular partition of $R_D$. It is depicted in the right panel of Figure 14.1.
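The construction above (embedding $D$ in a rectangle $R_D$, extending $f$ by zero, and forming the rectangles $R_{jk}$) can be sketched in a few lines of code. This is an illustration under stated assumptions, not part of the text: the region (the unit disk), the height function $f = 1 - x^2 - y^2$, the choice of rectangle centers as sample points, and all function names are hypothetical. The sum of $f(\mathbf{r}_{jk})\,\Delta A$ over the rectangles is then an estimate of the volume in the sense of (14.1); for this particular $f$ the exact volume is $\pi/2$.

```python
import math

def rectangular_partition(a, b, c, d, n1, n2):
    """Partition [a, b] x [c, d] into n1*n2 rectangles R_jk.

    Returns the list of rectangles (x_{j-1}, x_j, y_{k-1}, y_k)
    and the common area dA = dx * dy.
    """
    dx = (b - a) / n1
    dy = (d - c) / n2
    rects = [(a + (j - 1) * dx, a + j * dx, c + (k - 1) * dy, c + k * dy)
             for j in range(1, n1 + 1) for k in range(1, n2 + 1)]
    return rects, dx * dy

def extend_by_zero(f, in_region):
    """Extension of f to the whole rectangle: zero outside the region D."""
    return lambda x, y: f(x, y) if in_region(x, y) else 0.0

def riemann_sum(f_ext, rects, dA):
    """Sum of f at the center of each rectangle times the rectangle area."""
    total = 0.0
    for x0, x1, y0, y1 in rects:
        total += f_ext(0.5 * (x0 + x1), 0.5 * (y0 + y1)) * dA
    return total

# Hypothetical example: a "hill" of height 1 - x^2 - y^2 over the unit disk.
in_disk = lambda x, y: x * x + y * y <= 1.0
hill = lambda x, y: 1.0 - x * x - y * y
f_ext = extend_by_zero(hill, in_disk)

rects, dA = rectangular_partition(-1.0, 1.0, -1.0, 1.0, 200, 200)
volume = riemann_sum(f_ext, rects, dA)
print("estimate:", volume, " exact:", math.pi / 2)
```

Increasing `n1` and `n2` refines the partition; the estimate is expected to approach the exact value as $\Delta x$ and $\Delta y$ tend to zero.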
For every partition rectangle $R_{jk}$, there are numbers $M_{jk} = \sup f(\mathbf{r})$ and $m_{jk} = \inf f(\mathbf{r})$, the supremum and infimum of $f$ on $R_{jk}$.

DEFINITION 14.2. (Upper and Lower Sums).
Let $f$ be a bounded function on a closed, bounded region $D$. Let $R_D$ be a rectangle that contains $D$ and let the function $f$ be defined to have zero value for all points of $R_D$ that do not belong to $D$. Given a rectangular partition $R_{jk}$ of $R_D$, let $M_{jk} = \sup f$ and $m_{jk} = \inf f$ be the supremum and infimum of $f$ on $R_{jk}$. The sums
\[
U(f, N_1, N_2) = \sum_{j=1}^{N_1}\sum_{k=1}^{N_2} M_{jk}\,\Delta A, \qquad L(f, N_1, N_2) = \sum_{j=1}^{N_1}\sum_{k=1}^{N_2} m_{jk}\,\Delta A
\]
are called the upper and lower sums.

The upper and lower sums are examples of double sequences.

DEFINITION 14.3. (Double Sequence).
A double sequence is a rule that assigns a number $a_{nm}$ to an ordered pair of integers $(n, m)$, $n, m = 1, 2, ...$.

In other words, a double sequence is a function $f$ of two variables $(x, y)$ whose domain consists of points with integer-valued coordinates, $a_{nm} = f(n, m)$. Similarly to ordinary numerical sequences, one can define a limit of a double sequence.

DEFINITION 14.4. (Limit of a Double Sequence).
If, for any positive number $\varepsilon$, there exists an integer $N$ such that $|a_{nm} - a| < \varepsilon$ for all $n, m > N$, then the sequence is said to converge to $a$ and the number $a$ is called the limit of the sequence and denoted $\lim_{n,m\to\infty} a_{nm} = a$.

The limit of a double sequence is analogous to the limit of a function of two variables. A limit of a double sequence can be found by studying the corresponding limit of a function of two variables whose range contains the double sequence. Suppose $a_{nm} = f(1/n, 1/m)$ and $f(x, y) \to 0$ as $(x, y) \to (0, 0)$. The latter means that, for any $\varepsilon > 0$, there is a number $\delta > 0$ such that $|f(x, y)| < \varepsilon$ for all $\|\mathbf{r}\| < \delta$, where $\mathbf{r} = (x, y)$. In particular, for $\mathbf{r} = (1/n, 1/m)$, the condition $\|\mathbf{r}\|^2 = 1/n^2 + 1/m^2 < \delta^2$ is satisfied for all $n, m > N > \sqrt{2}/\delta$. Hence, for all such $n$, $m$, $|a_{nm}| < \varepsilon$ [...] in it, $m_{jk} \le f(\mathbf{r}^*_{jk}) \le M_{jk}$.
It follows from this inequality that $L(f, N_1, N_2) \le R(f, N_1, N_2) \le U(f, N_1, N_2)$. Since $f$ is integrable, the limits of the upper and lower sums exist and coincide. The conclusion of the theorem follows from the squeeze principle for limits. □

Approximation of Double Integrals. If $f$ is integrable, its double integral can be approximated by a suitable Riemann sum. A commonly used choice of sample points is to take $\mathbf{r}^*_{jk}$ to be the intersection of the diagonals of partition rectangles $R_{jk}$, that is, $\mathbf{r}^*_{jk} = (\bar{x}_j, \bar{y}_k)$, where $\bar{x}_j$ and $\bar{y}_k$ are the midpoints of the intervals $[x_{j-1}, x_j]$ and $[y_{k-1}, y_k]$, respectively. This rule is called the midpoint rule. The accuracy of the midpoint rule approximation can be assessed by finding the upper and lower sums; their difference gives the upper bound on the absolute error of the approximation. Alternatively, if the integral is to be evaluated up to some significant decimals, the partition in the Riemann sum has to be refined until its value does not change in the significant digits. The integrability of $f$ guarantees the convergence of Riemann sums and the independence of the limit from the choice of sample points.

97.3. Continuity and Integrability. Not every bounded function is integrable. There are functions whose behavior is so irregular that one cannot give any meaning to the volume under their graph by converging upper and lower sums.

An Example of a Nonintegrable Function. Let $f$ be defined on the square $x \in [0, 1]$ and $y \in [0, 1]$ so that $f(x, y) = 1$ if both $x$ and $y$ are rational, $f(x, y) = 2$ if both $x$ and $y$ are irrational, and $f(x, y) = 0$ otherwise. This function is not integrable. Recall that any interval $[a, b]$ contains both rational and irrational numbers. Therefore, any partition rectangle $R_{jk}$ contains points whose coordinates are both rational, or both irrational, or pairs of rational and irrational numbers. Hence, $M_{jk} = 2$ and $m_{jk} = 0$.
The lower sum vanishes for any partition and therefore its limit is 0, whereas the upper sum is $2\sum_{jk}\Delta A = 2A = 2$ for any partition, where $A$ is the area of the square. The limits of the upper and lower sums do not coincide, $2 \ne 0$, and the double integral of $f$ does not exist. The Riemann sum for this function can converge to any number between 2 and 0, depending on the choice of sample points. For example, if the sample points have rational coordinates, then the Riemann sum equals 1. If the sample points have irrational coordinates, then the Riemann sum equals 2. If the sample points are such that one coordinate is rational while the other is irrational, then the Riemann sum vanishes.

FIGURE 14.2. Left: The graph of a piecewise-constant function. The function has a jump discontinuity along a straight line. The volume under the graph is $V = MA_1 + mA_2$. Despite the jump discontinuity, the function is integrable and the value of the double integral coincides with the volume $V$. Right: Additivity of the double integral. If a region $D$ is split by a curve into two regions $D_1$ and $D_2$, then the double integral of $f$ over $D$ is the sum of integrals over $D_1$ and $D_2$. The additivity of the double integral is analogous to the additivity of the volume: The volume under the graph $z = f(x, y)$ and above $D$ is the sum of volumes above $D_1$ and $D_2$.

The following theorem describes a class of integrable functions that is sufficient in many practical applications.

THEOREM 14.2. (Integrability of Continuous Functions).
Let $D$ be a closed, bounded region whose boundaries are piecewise-smooth curves. If a function $f$ is continuous on $D$, then it is integrable on $D$.

Note that the converse is not true; that is, the class of integrable functions is wider than the class of all continuous functions. This is a rather natural conclusion in view of the analogy between the double integral and the volume.
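The convergence of the upper and lower sums for a continuous function can be observed numerically. The sketch below is an illustration under a specific assumption: it uses $f(x, y) = x^2 + y^2$ on the unit square, which increases in both $x$ and $y$, so that the supremum and infimum of $f$ on each partition rectangle are attained at its upper right and lower left corners. For a general function, finding $M_{jk}$ and $m_{jk}$ is not this simple. The function name `sums` is hypothetical. For this $f$ the double integral equals $2/3$ and the difference of the upper and lower sums is $2/N$ for an $N \times N$ partition.

```python
def sums(n):
    """Lower, midpoint, and upper sums for f = x^2 + y^2 on [0, 1] x [0, 1].

    An n-by-n rectangular partition is used. Because f increases in both
    variables, inf f on R_jk is f(x_{j-1}, y_{k-1}) and sup f is f(x_j, y_k).
    """
    f = lambda x, y: x * x + y * y
    h = 1.0 / n
    dA = h * h
    lower = mid = upper = 0.0
    for j in range(1, n + 1):
        for k in range(1, n + 1):
            lower += f((j - 1) * h, (k - 1) * h) * dA
            mid += f((j - 0.5) * h, (k - 0.5) * h) * dA
            upper += f(j * h, k * h) * dA
    return lower, mid, upper

for n in (10, 100):
    lower, mid, upper = sums(n)
    print(n, lower, mid, upper, upper - lower)
```

The midpoint sum lies between the lower and upper sums, and the gap between the latter two shrinks as the partition is refined, in agreement with Theorem 14.2.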
The volume of a solid below a graph $z = f(x, y) \ge 0$ of a continuous function on $D$ should exist. On the other hand, let $f(x, y)$ be defined on $D = \{(x, y)\,|\,x \in [0, 2],\ y \in [0, 1]\}$ so that $f(x, y) = m$ if $x < 1$ and $f(x, y) = M$ if $x \ge 1$. The function is piecewise constant and has a jump discontinuity along the line $x = 1$ in $D$. Its graph is shown in the left panel of Figure 14.2. The volume below the graph $z = f(x, y)$ and above $D$ is easy to find; it is the sum of the volumes of two rectangular boxes with the same base area $A_1 = A_2 = 1$ and different heights $M$ and $m$, $V = MA_1 + mA_2 = M + m$. The double integral of $f$ exists and also equals $M + m$. Indeed, for any rectangular partition, the numbers $M_{jk}$ and $m_{jk}$ differ only for partition rectangles intersected by the discontinuity line $x = 1$, that is, $M_{jk} - m_{jk} = M - m$ for all such rectangles. Therefore, the difference between the upper and lower sums is $l\,\Delta x\,(M - m)$, where $l = 1$ is the length of the discontinuity curve. In the limit $\Delta x \to 0$, the difference vanishes. As noted earlier, the upper and lower sums are the upper and lower estimates of the volume and should therefore converge to it as their limits coincide. Using a similar line of arguments, one can prove the following.

COROLLARY 14.1. Let $D$ be a closed, bounded region whose boundaries are piecewise-smooth curves. If a function $f$ is bounded on $D$ and discontinuous only on a finite number of smooth curves, then it is integrable on $D$.

97.4. Exercises.

(1) For each of the following functions and the specified rectangular domain $D$, find the double integral using its definition:
(i) $f(x, y) = k = \text{const}$, $D = \{(x, y)\,|\,a \le x \le b,\ c \le y \le d\}$
(ii) $f(x, y) = k_1 = \text{const}$ if $y > 0$ and $f(x, y) = k_2 = \text{const}$ if $y \le 0$, $D = \{(x, y)\,|\,0 \le x \le 1,\ -1 \le y \le 1\}$
(iii) $f(x, y) = y$, $D = \{(x, y)\,|\,0 \le x \le 1,\ 0 \le y \le 1\}$
Hint: $1 + 2 + \cdots + N = N(N + 1)/2$.

(2) Let $D$ be the rectangle $1 \le x \le 2$, $1 \le y \le 3$. Consider a rectangular partition of $D$ by lines $x = 1 + j/N$ and $y = 1 + 2k/N$, $j, k = 1, 2, ..., N$.
For the function $f(x, y) = x^2 + y^2$, find
(i) The lower and upper sums, $L$ and $U$
(ii) The limit of the difference $U - L$ as $N \to \infty$
(iii) The limit of the sums as $N \to \infty$

(3) For each of the following functions, use a Riemann sum with specified $N_1$ and $N_2$ and sample points at lower right corners to estimate the double integral over a given region $D$:
(i) [...] $y \le 4\}$
(ii) $f(x, y) = \sin(x + y)$, $(N_1, N_2) = (3, 3)$, $D = \{(x, y)\,|\,0 \le x \le \pi,\ 0 \le y \le \pi\}$

(4) Approximate the integral of $f(x, y) = (24 + x^2 + y^2)^{-1/2}$ over the disk $x^2 + y^2 \le 25$ by a Riemann sum. Use a partition by squares whose vertices have integer-valued coordinates and sample points at vertices of the squares that are farthest from the origin.

(5) Evaluate each of the following double integrals by first identifying it as the volume of a solid:
(i) $\iint_D k\,dA$ if $D$ is the disk $x^2 + y^2 \le 1$ and $k$ is a constant
(ii) $\iint_D \sqrt{1 - x^2 - y^2}\,dA$ if $D$ is the disk $x^2 + y^2 \le 1$
(iii) $\iint_D (1 - x - y)\,dA$ if $D$ is the triangle with vertices $(0, 0)$, $(0, 1)$, and $(1, 0)$
(iv) $\iint_D (k - x)\,dA$ if $D$ is the rectangle $0 \le x \le k$ and $0 \le y \le a$
(v) $\iint_D \bigl(2 - \sqrt{x^2 + y^2}\bigr)\,dA$ if $D$ is the part of the disk $x^2 + y^2 \le 1$ in the first quadrant (Hint: The volume of a circular solid cone with the base being the disk of radius $R$ and the height $h$ is $\pi R^2h/3$.)

(6) Let $I$ be the integral of $\sin(x + y)$ over the disk $x^2 + y^2 \le 1$. Suppose that the integration region is partitioned by rectangles of area $\Delta A$. If $R$ is a Riemann sum, find $\Delta A$ such that $|I - R| < 0.001$ for any choice of sample points.

98. Properties of the Double Integral

The properties of the double integral are similar to those of an ordinary integral and can be established directly from the definition.

Linearity. Let $f$ and $g$ be functions integrable on $D$ and let $c$ be a number. Then
\[
\iint_D (f + g)\,dA = \iint_D f\,dA + \iint_D g\,dA, \qquad \iint_D cf\,dA = c\iint_D f\,dA.
\]

Area. The double integral
\[
A(D) = \iint_D dA \tag{14.2}
\]
is called the area of $D$ (if it exists).
If $D$ is bounded by piecewise-smooth curves, then it exists because the unit function $f = 1$ is continuous on $D$. By the geometrical interpretation of the double integral, the number $A(D)$ is the volume of the solid cylinder with the cross section $D$ and the unit height ($f = 1$). Intuitively, the region $D$ can always be covered by the union of adjacent rectangles of area $\Delta A = \Delta x\,\Delta y$. In the limit $(\Delta x, \Delta y) \to (0, 0)$, the total area of these rectangles converges to the area of $D$.

Additivity. Suppose that $D$ is the union of $D_1$ and $D_2$ such that the area of their intersection is 0; that is, $D_1$ and $D_2$ may only have common points at their boundaries or no common points at all. If $f$ is integrable on $D$, then
\[
\iint_D f\,dA = \iint_{D_1} f\,dA + \iint_{D_2} f\,dA.
\]
This property is difficult to prove directly from the definition. However, it appears rather natural when making the analogy of the double integral and the volume. If the region $D$ is cut into two pieces $D_1$ and $D_2$, then the solid above $D$ is also cut into two solids, one above $D_1$ and the other above $D_2$. Naturally, the volume is additive (see the right panel of Figure 14.2).

Suppose that $f$ is nonnegative on $D_1$ and nonpositive on $D_2$. The double integral over $D_1$ is the volume of the solid above $D_1$ and below the graph of $f$. Since $-f \ge 0$ on $D_2$, the double integral over $D_2$ is the negative volume of the solid below $D_2$ and above the graph of $f$. When $f$ becomes negative, its graph goes below the plane $z = 0$ (the $xy$ plane). So the double integral is the difference of the volumes above and below the $xy$ plane. Therefore, it may vanish or take negative values, depending on which volume is larger. This property is analogous to the familiar relation between the ordinary integral and the area under the graph. It is illustrated in Figure 14.3 (left panel).

Positivity. If $f(\mathbf{r}) \ge 0$ for all $\mathbf{r} \in D$, then
\[
\iint_D f\,dA \ge 0,
\]
and, as a consequence of the linearity,
\[
\iint_D f\,dA \ge \iint_D g\,dA
\]
if $f(\mathbf{r}) \ge g(\mathbf{r})$ for all $\mathbf{r} \in D$.

Upper and Lower Bounds.
Let $m = \inf_D f$ and $M = \sup_D f$. Then $m \le f(\mathbf{r}) \le M$ for all $\mathbf{r} \in D$. From the positivity property for the double integrals of $f(x, y) - m \ge 0$ and $M - f(x, y) \ge 0$ over $D$ and (14.2), it follows that
\[
mA(D) \le \iint_D f\,dA \le MA(D).
\]
This inequality is easy to visualize. If $f$ is positive, then the double integral is the volume of the solid below the graph of $f$. The solid lies in the cylinder with cross section $D$. The graph of $f$ lies between the planes $z = m$ and $z = M$. Therefore, the volume of the cylinder of height $m$ cannot exceed the volume of the solid, whereas the latter cannot exceed the volume of the cylinder of height $M$ as shown in the right panel of Figure 14.3.

FIGURE 14.3. Left: A function $f$ is nonnegative on the region $D_1$ and nonpositive on $D_2$. The double integral of $f(x, y)$ over the union of regions $D_1$ and $D_2$ is the difference of the indicated volumes. The volume below the $xy$ plane and above the graph of $f$ contributes to the double integral with the negative sign. Right: An illustration to the upper and lower bounds of the double integral of a function $f$ over a region $D$. If $A(D)$ is the area of $D$ and $m \le f(x, y) \le M$ in $D$, then the volume under the graph of $f$ is no less than the volume $mA(D)$ and no larger than $MA(D)$.

THEOREM 14.3. (Integral Mean Value Theorem).
If $f$ is continuous on $D$, then there exists a point $\mathbf{r}_0 \in D$ such that
\[
\iint_D f\,dA = f(\mathbf{r}_0)A(D).
\]

PROOF. Let $h$ be a number. Put $g(h) = \iint_D (f - h)\,dA = \iint_D f\,dA - hA(D)$. From the upper and lower bounds for the double integral, it follows that $g(M) \le 0$ and $g(m) \ge 0$. Since $g(h)$ is linear in $h$, there exists $h = h_0 \in [m, M]$ such that $g(h_0) = 0$. On the other hand, a continuous function on a closed, bounded region $D$ takes its maximal and minimal values as well as all the values between them (Extreme Value Theorem 12.21). Therefore, for any $m \le h_0 \le M$, there is $\mathbf{r}_0 \in D$ such that $f(\mathbf{r}_0) = h_0$.

A geometrical interpretation of the integral mean value theorem is rather simple.
Imagine that the solid below the graph of $f$ is made of clay (see the left panel of Figure 14.4). The shape of a piece of clay may be deformed while the volume is preserved under deformation. The nonflat top of the solid can be deformed so that it becomes flat, turning the solid into a cylinder of height $h_0$, which, by volume preservation, should be between the smallest and the largest heights of the original solid. The integral mean value theorem merely states the existence of such an average height at which the volume of the cylinder coincides with the volume of the solid with a nonflat top. The continuity of the function is sufficient (but not necessary) to establish that there is a point at which the average height coincides with the value of the function.

FIGURE 14.4. Left: A clay solid with a nonflat top (the graph of a continuous function $f$) may be deformed to the solid of the same volume and with the same horizontal cross section $D$, but with a flat top $h_0$. The function $f$ takes the value $h_0$ at some point of $D$. This illustrates the integral mean value theorem. Middle: A partition of a disk by concentric circles of radii $r = r_p$ and rays $\theta = \theta_k$ as described in Example 14.1. A partition element is the region $r_{p-1} \le r \le r_p$ and $\theta_{k-1} \le \theta \le \theta_k$. Right: The volume below the graph $z = x^2 + y^2$ and above the disk $D$, $x^2 + y^2 \le 1$. The corresponding double integral is evaluated in Example 14.1 by taking the limit of Riemann sums for the partition of $D$ shown in the middle panel.

DEFINITION 14.7. (Average Value of a Function).
Let $f$ be integrable on $D$ and let $A(D)$ be the area of $D$. The average value of $f$ on $D$ is the integral:
\[
\frac{1}{A(D)}\iint_D f\,dA.
\]

If $f$ is continuous on $D$, then the integral mean value theorem asserts that $f$ attains its average value at some point in $D$. The continuity hypothesis is crucial here.
For example, the function depicted in the left panel of Figure 14.2 is discontinuous. Its average value is $(MA_1 + mA_2)/(A_1 + A_2)$, which generally does not coincide with either $M$ or $m$.

Integrability of the Absolute Value. Suppose that $f$ is integrable on a bounded, closed region $D$. Then its absolute value $|f|$ is also integrable and

\[ \left| \iint_D f\,dA \right| \le \iint_D |f|\,dA. \]

A proof of the integrability of $|f|$ is rather technical. Once the integrability of $|f|$ is established, the inequality is a simple consequence of $|a + b| \le |a| + |b|$ applied to a Riemann sum of $f$. Making the analogy between the double integral and the volume, suppose that $f \ge 0$ on $D_1$ and $f \le 0$ on $D_2$, where $D_{1,2}$ are two portions of $D$. If $V_1$ and $V_2$ stand for the volumes of the solids bounded by the graph of $f$ and $D_1$ and $D_2$, respectively, then the double integral of $f$ over $D$ is $V_1 - V_2$, while the double integral of $|f|$ is $V_1 + V_2$. Naturally, $|V_1 - V_2| \le V_1 + V_2$ for positive $V_{1,2}$.

The converse is not true. The integrability of the absolute value $|f|$ does not generally imply the integrability of $f$. The reader is advised to consider the function $f(x, y) = 1$ if $x$ and $y$ are rational and $f(x, y) = -1$ otherwise, where $(x, y)$ span the rectangle $[0, 1] \times [0, 1]$. Note that $|f(x, y)| = 1$ is integrable.

Independence of Partition. It has been argued that the volume of a solid under the graph of $f$ and above a region $D$ can be computed by (14.1) in which the Riemann sum is defined for an arbitrary (nonrectangular) partition of $D$. Can the double integral of $f$ over $D$ be computed in the same way? The analysis is limited to the case when $f$ is continuous.

DEFINITION 14.8. (Uniform Continuity). Let $f$ be a function on a region $D$ in a Euclidean space. If, for any number $\varepsilon > 0$, there exists a number $\delta > 0$ such that $|f(\mathbf{r}) - f(\mathbf{r}')| < \varepsilon$ whenever $\|\mathbf{r} - \mathbf{r}'\| < \delta$ for $\mathbf{r}$ and $\mathbf{r}'$ in $D$, then $f$ is said to be uniformly continuous on $D$.

… So the variations of $f$ cannot be bounded by a fixed number $\varepsilon$ uniformly in any disk of some nonzero radius in $D$, and $f$ is not uniformly continuous in $D$.
Similarly, take $f(x, y) = x^2$, which is continuous in the unbounded rectangle $D = [0, \infty) \times [0, 1]$. Then in a disk whose center is sufficiently far from the line $x = 0$, the values of $f$ can have variations as large as desired within this disk. For an interval $[x_1, x_2]$ of length $\delta > 0$, the variation $x_2^2 - x_1^2 = \delta(x_2 + x_1)$ can be made as large as desired by taking $x_2$ large enough no matter how small $\delta$ is. Hence, $f$ is not uniformly continuous in $D$.

Let $f$ be continuous in a closed, bounded region $D$. Let $D$ be partitioned by piecewise-smooth curves into partition elements $D_p$, $p = 1, 2, ..., N$, so that the union of $D_p$ is $D$ and $A(D) = \sum_{p=1}^{N} \Delta A_p$, where $\Delta A_p$ is the area of $D_p$ defined by (14.2). If $R_p$ is the smallest radius of a disk that contains $D_p$, put $R_N = \max_p R_p$; that is, $R_p$ characterizes the size of the partition element $D_p$, and $R_N$ is the size of the largest partition element. Recall that the largest partition element does not necessarily have the largest area. The partition is said to be refined if $R_{N'} < R_N$ for $N < N'$; that is, the size of the largest partition element decreases. Under the aforementioned conditions, the following theorem holds.

THEOREM 14.5. (Independence of the Partition). For any choice of sample points $\mathbf{r}_p^*$ and any choice of partition elements $D_p$,

\[ \iint_D f\,dA = \lim_{N \to \infty\ (R_N \to 0)} \sum_{p=1}^{N} f(\mathbf{r}_p^*)\,\Delta A_p. \]

PROOF. As $f$ is continuous on $D$, there are points $\mathbf{r}_p \in D_p$ such that

\[ \iint_D f\,dA = \sum_{p=1}^{N} \iint_{D_p} f\,dA = \sum_{p=1}^{N} f(\mathbf{r}_p)\,\Delta A_p. \]

The first equality follows from the additivity of the double integral, and the second one holds by the integral mean value theorem. Consider the Riemann sum

\[ R(f, N) = \sum_{p=1}^{N} f(\mathbf{r}_p^*)\,\Delta A_p, \]

where $\mathbf{r}_p^* \in D_p$ are sample points. If $\mathbf{r}_p^* \ne \mathbf{r}_p$, then the Riemann sum need not coincide with the double integral. However, its limit as $N \to \infty$ equals the double integral. Indeed, put $c_p = |f(\mathbf{r}_p^*) - f(\mathbf{r}_p)|$ and $c_N = \max c_p$, $p = 1, 2, ..., N$. By Theorem 14.4, $f$ is uniformly continuous on $D$. For any $\varepsilon > 0$, there is $\delta > 0$ such that variations of $f$ in any disk of radius $\delta$ in $D$ do not exceed $\varepsilon$.
Since $R_N \to 0$ as $N \to \infty$, $R_N < \delta$ for all $N$ larger than some $N_0$. Hence, $c_N < \varepsilon$ because any partition element $D_p$ is contained in a disk of radius $R_p \le R_N < \delta$. Therefore, $c_N \to 0$ and

\[ \left| \iint_D f\,dA - R(f, N) \right| \le \sum_{p=1}^{N} c_p\,\Delta A_p \le c_N \sum_{p=1}^{N} \Delta A_p = c_N A(D) \to 0 \]

as $N \to \infty$. □

A practical significance of this theorem is that the double integral can be approximated by Riemann sums for any convenient partition of the integration region. Note that the region $D$ is no longer required to be embedded in a rectangle and $f$ does not have to be extended outside of $D$. This property is useful for evaluating double integrals by means of change of variables discussed later in this chapter. It is also useful to simplify calculations of Riemann sums.

EXAMPLE 14.1. Find the double integral of $f(x, y) = x^2 + y^2$ over the disk $D$, $x^2 + y^2 \le 1$, using the partition of $D$ by concentric circles and rays from the origin.

SOLUTION: Consider circles $x^2 + y^2 = r_p^2$, where $r_p = p\,\Delta r$, $\Delta r = 1/N$, and $p = 0, 1, 2, ..., N$. If $\theta$ is the polar angle in the plane, then points with a fixed value of $\theta$ form a ray from the origin. Let the disk $D$ be partitioned by circles of radii $r_p$ and rays $\theta = \theta_k = k\,\Delta\theta$, $\Delta\theta = 2\pi/n$, $k = 1, 2, ..., n$. Each partition element lies in the sector of angle $\Delta\theta$ and is bounded by two circles whose radii differ by $\Delta r$ (see the middle panel of Figure 14.4). The area of a sector of radius $r_p$ is $r_p^2\,\Delta\theta/2$. Therefore, the area of a partition element between circles of radii $r_p$ and $r_{p+1}$ is

\[ \Delta A_p = r_{p+1}^2\,\Delta\theta/2 - r_p^2\,\Delta\theta/2 = (r_{p+1}^2 - r_p^2)\,\Delta\theta/2 = (r_{p+1} + r_p)\,\Delta r\,\Delta\theta/2. \]

In the Riemann sum, use the midpoint rule; that is, the sample points are intersections of the circles of radius $\bar{r}_p = (r_{p+1} + r_p)/2$ and the rays with angles $\bar{\theta}_k = (\theta_{k+1} + \theta_k)/2$. The values of $f$ at the sample points are $f(\mathbf{r}_p^*) = \bar{r}_p^2$, the area elements are $\Delta A_p = \bar{r}_p\,\Delta r\,\Delta\theta$, and the corresponding Riemann sum reads

\[ R(f, N, n) = \sum_{k=1}^{n} \sum_{p=1}^{N} \bar{r}_p^3\,\Delta r\,\Delta\theta = 2\pi \sum_{p=1}^{N} \bar{r}_p^3\,\Delta r \]

because $\sum_{k=1}^{n} \Delta\theta = 2\pi$, the total range of $\theta$ in the disk $D$. The sum over $p$ is the Riemann sum for the single-variable function $g(r) = r^3$ on the interval $r \in [0, 1]$.
In the limit $N \to \infty$, this sum converges to the integral of $g$ over the interval $[0, 1]$, that is,

\[ \iint_D (x^2 + y^2)\,dA = 2\pi \lim_{N \to \infty} \sum_{p=1}^{N} \bar{r}_p^3\,\Delta r = 2\pi \int_0^1 r^3\,dr = \frac{\pi}{2}. \]

So, by choosing the partition according to the shape of $D$, the double Riemann sum has been reduced to a Riemann sum for a single-variable function. □

The numerical value of the double integral in this example is the volume of the solid that lies between the paraboloid $z = x^2 + y^2$ and the disk $D$ of unit radius. It can also be represented as the volume of the cylinder with height $h = 1/2$, $V = hA(D) = \pi h = \pi/2$. This observation illustrates the integral mean value theorem. The function $f$ takes the value $h = 1/2$ on the circle $x^2 + y^2 = 1/2$ of radius $1/\sqrt{2}$ in $D$.

98.1. Exercises.

(1) Evaluate each of the following double integrals by using the properties of the double integral and its interpretation as the volume of a solid:
(i) $\iint_D k\,dA$, where $k$ is a constant and $D$ is the square $-2 \le x \le 2$, $-2 \le y \le 2$ with a circular hole of radius 1 (i.e., $x^2 + y^2 \ge 1$ in $D$)
(ii) $\iint_D f\,dA$, where $D$ is a disk $x^2 + y^2 \le 4$ and $f$ is a piecewise-constant function: $f(x, y) = 2$ if $1 \le x^2 + y^2 \le 4$ and f (x, y) -3 if 0 … 0 and y > 0
(ii) $\iint_D (ax^2 - by^2)\,dA \le (a + b)\pi/2$, where $D$ is the disk $x^2 + y^2 \le 1$ (Hint: Put $r^2 = x^2 + y^2$. Then use $x^2 \le r^2$ and $y^2 \le r^2$ and apply the result of Example 14.1.)

(3) Find the lower and upper bounds for each of the following integrals:
(i) $\iint_D xy^3\,dA$, where $D$ is the square $1 \le x \le 2$, $1 \le y \le 2$
(ii) ffD 1+cce-U dA, where D is the square …

… on the interval $y \in [c, d]$. So, if the functions $g_j(y)$ are integrable on $[c, d]$, then the limit of their Riemann sums is the integral of $g_j$ over the interval. If $f$ is continuous on $D$, then it must also be continuous along the lines $x = x_j$ in $D$; that is, $g_j(y) = f(x_j, y)$ is continuous and hence integrable on $[c, d]$. Thus,

\[ \lim_{N_2 \to \infty} \sum_{k=1}^{N_2} f(x_j, y_k)\,\Delta y = \int_c^d f(x_j, y)\,dy. \tag{14.4} \]

Define a function $A(x)$ by

\[ A(x) = \int_c^d f(x, y)\,dy. \tag{14.5} \]
The value of $A$ at $x$ is given by the integral of $f$ with respect to $y$; the integration with respect to $y$ is carried out as if $x$ were a fixed number. For example, put $f(x, y) = x^2 y + e^{xy}$ and $[c, d] = [0, 1]$. Then an antiderivative $F(x, y)$ of $f(x, y)$ with respect to $y$ is $F(x, y) = x^2 y^2/2 + e^{xy}/x$, which means that $F'_y(x, y) = f(x, y)$. Therefore,

\[ A(x) = \int_0^1 (x^2 y + e^{xy})\,dy = \Big[ x^2 y^2/2 + e^{xy}/x \Big]_0^1 = x^2/2 + e^x/x - 1/x. \]

A geometrical interpretation of $A(x)$ is simple. If $f \ge 0$, then $A(x_j)$ is the area of the cross section of the solid below the graph $z = f(x, y)$ by the plane $x = x_j$, and $A(x_j)\,\Delta x$ is the volume of the slice of the solid of width $\Delta x$ (see the right panel of Figure 14.5). The second sum in the Riemann sum for the double integral is the Riemann sum of $A(x)$ on the interval $[a, b]$:

\[ \iint_D f\,dA = \lim_{N_1 \to \infty} \sum_{j=1}^{N_1} A(x_j)\,\Delta x = \int_a^b A(x)\,dx = \int_a^b \left( \int_c^d f(x, y)\,dy \right) dx, \]

where the integral exists by the continuity of $A$. The integral on the right side of this equality is called the iterated integral. In what follows, the parentheses in the iterated integral will be omitted. The order in which the integrals are evaluated is specified by the order of the differentials in it; for example, $dy\,dx$ means that the integration with respect to $y$ is to be carried out first. In a similar fashion, by computing the limit $\Delta x \to 0$ first, the double integral can be expressed as an iterated integral in which the integration is carried out with respect to $x$ and then with respect to $y$. So the following result has been established.

FIGURE 14.5. An illustration to Fubini's theorem. The volume of a solid below the graph $z = f(x, y)$ and above a rectangle $R$ is the sum of the volumes of the slices. Left: The slicing is done parallel to the $x$ axis so that the volume of each slice is $\Delta y\,A(y)$, where $A(y)$ is the area of the cross section by a plane with a fixed value of $y$.
Right: The slicing is done parallel to the $y$ axis so that the volume of each slice is $\Delta x\,A(x)$, where $A(x)$ is the area of the cross section by a plane with a fixed value of $x$ as given in (14.5).

THEOREM 14.7. (Fubini's Theorem). If $f$ is continuous on the rectangle $D = [a, b] \times [c, d]$, then

\[ \iint_D f(x, y)\,dA = \int_c^d \int_a^b f(x, y)\,dx\,dy = \int_a^b \int_c^d f(x, y)\,dy\,dx. \]

Think of a loaf of bread with a rectangular base and with a top having the shape of the graph $z = f(x, y)$. It can be sliced along either of the two directions parallel to adjacent sides of its base. Fubini's theorem says that the volume of the loaf is the sum of the volumes of the slices and is independent of how the slicing is done.

EXAMPLE 14.2. Find the volume of the solid bounded from above by the portion of the paraboloid $z = 4 - x^2 - 2y^2$ and from below by the portion of the paraboloid $z = -4 + x^2 + 2y^2$, where $(x, y) \in [0, 1] \times [0, 1]$.

SOLUTION: If the height of the solid at any $(x, y) \in D$ is $h(x, y) = z_{top}(x, y) - z_{bot}(x, y)$, where the graphs $z = z_{top}(x, y)$ and $z = z_{bot}(x, y)$ are the top and bottom boundaries of the solid, then the volume is

\[ V = \iint_D h(x, y)\,dA = \iint_D [z_{top}(x, y) - z_{bot}(x, y)]\,dA = \iint_D (8 - 2x^2 - 4y^2)\,dA \]
\[ = \int_0^1 \int_0^1 (8 - 2x^2 - 4y^2)\,dy\,dx = \int_0^1 \Big[ (8 - 2x^2)y - 4y^3/3 \Big]_0^1 dx = \int_0^1 (8 - 2x^2 - 4/3)\,dx = 6. \]
□

COROLLARY 14.2. (Factorization of Iterated Integrals). Let $D$ be a rectangle $[a, b] \times [c, d]$. Suppose $f(x, y) = g(x)h(y)$, where the functions $g$ and $h$ are integrable on $[a, b]$ and $[c, d]$, respectively. Then

\[ \iint_D f(x, y)\,dA = \int_a^b g(x)\,dx \int_c^d h(y)\,dy. \]

So the double integral becomes the product of two ordinary integrals in this case. This simple consequence of Fubini's theorem is quite useful.

EXAMPLE 14.3. Evaluate the double integral of $f(x, y) = \sin(x + y)$ over the rectangle $[0, \pi] \times [-\pi/2, \pi/2]$.

SOLUTION: One has $\sin(x + y) = \sin x \cos y + \cos x \sin y$. The integral of $\sin y$ over $[-\pi/2, \pi/2]$ vanishes by symmetry.
So, by the factorization property of the iterated integral, only the first term contributes to the double integral:

\[ \iint_D \sin(x + y)\,dA = \int_0^{\pi} \sin x\,dx \int_{-\pi/2}^{\pi/2} \cos y\,dy = 4. \]
□

The following example illustrates the use of the additivity of a double integral.

EXAMPLE 14.4. Evaluate the double integral of $f(x, y) = 15x^4 y^2$ over the region $D$, which is the rectangle $[-2, 2] \times [-2, 2]$ with the rectangular hole $[-1, 1] \times [-1, 1]$.

SOLUTION: Let $D_1 = [-2, 2] \times [-2, 2]$ and let $D_2 = [-1, 1] \times [-1, 1]$. The rectangle $D_1$ is the union of $D$ and $D_2$ such that their intersection has no area. Hence,

\[ \iint_{D_1} f\,dA = \iint_D f\,dA + \iint_{D_2} f\,dA \quad \Longrightarrow \quad \iint_D f\,dA = \iint_{D_1} f\,dA - \iint_{D_2} f\,dA. \]

By evaluating the double integrals over $D_{1,2}$,

\[ \iint_{D_1} 15x^4 y^2\,dA = 15 \int_{-2}^{2} x^4\,dx \int_{-2}^{2} y^2\,dy = 2^{10}, \qquad \iint_{D_2} 15x^4 y^2\,dA = 15 \int_{-1}^{1} x^4\,dx \int_{-1}^{1} y^2\,dy = 4, \]

the double integral over $D$ is obtained, $1024 - 4 = 1020$. □

99.2. Study Problem.

Problem 14.1. Suppose a function $f$ has continuous second derivatives on the rectangle $D = [0, 1] \times [0, 1]$. Find $\iint_D f''_{xy}\,dA$ if $f(0, 0) = 1$, $f(0, 1) = 2$, $f(1, 0) = 3$, and $f(1, 1) = 5$.

SOLUTION: By Fubini's theorem,

\[ \iint_D f''_{xy}\,dA = \int_0^1 \int_0^1 \frac{\partial}{\partial x} f'_y(x, y)\,dx\,dy = \int_0^1 f'_y(x, y) \Big|_{x=0}^{x=1} dy = \int_0^1 [f'_y(1, y) - f'_y(0, y)]\,dy \]
\[ = \int_0^1 \frac{d}{dy} [f(1, y) - f(0, y)]\,dy = [f(1, y) - f(0, y)] \Big|_0^1 = [f(1, 1) - f(0, 1)] - [f(1, 0) - f(0, 0)] = 1. \]

By Clairaut's theorem, $f''_{xy} = f''_{yx}$, and the value of the integral is independent of the order of integration. □

99.3. Exercises.

(1) Evaluate the following double integrals over specified rectangular regions:
(i) $\iint_D (x + y)\,dA$, $D = [0, 1] \times [0, 2]$
(ii) $\iint_D xy^2\,dA$, $D = [0, 1] \times [-1, 1]$
(iii) ffD c+2ydA, $D = [1, 2] \times [0, 1]$
(iv) $\iint_D (1 + 3x^2 y)\,dA$, $D = [0, 1] \times [0, 2]$
(v) ffD eyxdA, $D = [0, 1] \times [0, 1]$
(vi) $\iint_D \cos(x + 2y)\,dA$, $D = [0, \pi] \times [0, \pi/4]$
(vii) ffD dA, $D = [0, 1] \times [0, 1]$
(viii) ffD 2 2dA, $D = [0, 1] \times [1, 2]$
(ix) $\iint_D (x - y)^n\,dA$, $D = [0, 1] \times [0, 1]$, where $n$ is a positive integer
(x) ffDex'y + exdA, $D = [0, 1] \times [0, 2]$
(xi) $\iint_D \sin^2(x) \sin^2(y)\,dA$, $D = [0, \pi] \times [0, \pi]$
(xii) $\iint_D \ln(x + y)\,dA$, $D = [1, 2] \times [1, 2]$
(xiii) ffD 2 … 0, as a vertically simple region.

FIGURE 14.7. Left: An algebraic description of a vertically simple region as given in (14.6): for every $x \in [a, b]$, the $y$ coordinate ranges over the interval $y_{bot}(x) \le y \le y_{top}(x)$. Right: An algebraic description of a horizontally simple region $D$ as given in (14.7): for every $y \in [c, d]$, the $x$ coordinate ranges over the interval $x_{bot}(y) \le x \le x_{top}(y)$.

SOLUTION: The $x$ coordinate of any point in the disk lies in the interval $[a, b] = [-1, 1]$ (see Figure 14.8, left panel). Take a vertical line corresponding to a fixed value of $x$ in this interval. This line intersects the half-disk along the segment whose one endpoint lies on the $x$ axis; that is, $y = 0 = y_{bot}(x)$. The other endpoint lies on the circle. Solving the equation of the circle for $y$, one finds $y = \pm\sqrt{1 - x^2}$. Since $y \ge 0$ in the half-disk, the positive solution has to be taken, $y = \sqrt{1 - x^2} = y_{top}(x)$. So the region is bounded by two graphs $y = 0$ and $y = \sqrt{1 - x^2}$. For every $-1 \le x \le 1$, 0 … 0 and consider the solid bounded from above by the graph $z = f(x, y)$ and from below by the region $D$. The area of the cross section of the solid by the coordinate plane corresponding to a fixed value of $x$ is given by (14.5):

\[ A(x) = \int_c^d f(x, y)\,dy = \int_{y_{bot}(x)}^{y_{top}(x)} f(x, y)\,dy. \]

So just like in the case of rectangular domains, the above limit equals $A(x)$. That the area is given by an integral over a single interval is only possible for a vertically simple base $D$ of the solid.
If $D$ were not vertically simple, then such a slice would not have been a single slice but rather a few disjoint slices, depending on how many disjoint intervals are in the intersection of a vertical line with $D$. In this case, the integration with respect to $y$ would have yielded a sum of integrals over all such intervals. The reason the integration with respect to $y$ is to be carried out first only for vertically simple regions is exactly to avoid the necessity to integrate over a union of disjoint intervals. Finally, the value of the double integral is given by the integral of $A(x)$ over the interval $[a, b]$. Recall that the volume of a slice of width $dx$ and cross section area $A(x)$ is $dV = A(x)\,dx$ so that the total volume of the solid is given by the integral $V = \int_a^b A(x)\,dx$ (as the sum of volumes of all slices in the solid).

Iterated Integral for Vertically Simple Regions. Let $D$ be a vertically simple region; that is, it admits the algebraic description (14.6). The double integral of $f$ over $D$ is then given by the iterated integral

\[ \iint_D f(x, y)\,dA = \int_a^b \int_{y_{bot}(x)}^{y_{top}(x)} f(x, y)\,dy\,dx. \tag{14.8} \]

FIGURE 14.9. Illustration to Example 14.7. Left: The integration region as a vertically simple region: $-1 \le x \le 1$ and, for every such $x$, $x^2 \le y \le 1$. Right: The integration region as a horizontally simple region: $0 \le y \le 1$ and, for every such $y$, $-\sqrt{y} \le x \le \sqrt{y}$.

Iterated Integral for Horizontally Simple Regions. Naturally, for horizontally simple regions, the integration with respect to $x$ should be carried out first. Therefore, the limit $\Delta x \to 0$ should be taken first in the Riemann sum. The technicalities are similar to the case of vertically simple regions. Let $D$ be a horizontally simple region; that is, it admits the algebraic description (14.7). The double integral of $f$ over $D$ is then given by the iterated integral

\[ \iint_D f(x, y)\,dA = \int_c^d \int_{x_{bot}(y)}^{x_{top}(y)} f(x, y)\,dx\,dy. \tag{14.9} \]

Iterated Integrals for Nonsimple Regions. If the integration region $D$ is not simple, how can one evaluate the double integral? Any nonsimple region can be cut by suitable smooth curves into simple regions $D_p$, $p = 1, 2, ..., n$. The double integral over simple regions can then be evaluated. The double integral over $D$ is then the sum of the double integrals over $D_p$ by the additivity property. Sometimes, it is also convenient to cut the integration region into two or more pieces even if the region is simple (see Example 14.8).

EXAMPLE 14.7. Evaluate the double integral of $f(x, y) = 6yx^2$ over the region $D$ bounded by the line $y = 1$ and the parabola $y = x^2$.

SOLUTION: The region $D$ is both horizontally and vertically simple. It is therefore possible to use either (14.8) or (14.9). To find an algebraic description of $D$ as a vertically simple region, one has to first specify the maximal range of the $x$ coordinate in $D$. It is determined by the intersection of the line $y = 1$ and the parabola $y = x^2$, that is, $1 = x^2$, and hence $x \in [a, b] = [-1, 1]$ for all points of $D$ (see the left panel of Figure 14.9). For any $x \in [-1, 1]$, the $y$ coordinate of points of $D$ attains the smallest value on the parabola (i.e., $y_{bot}(x) = x^2$), and the largest value on the line (i.e., $y_{top}(x) = 1$). One has

\[ \iint_D 6yx^2\,dA = 6 \int_{-1}^{1} x^2 \int_{x^2}^{1} y\,dy\,dx = 3 \int_{-1}^{1} x^2 (1 - x^4)\,dx = 8/7. \]

It is also instructive to obtain this result using the reverse order of integration. To find an algebraic description of $D$ as a horizontally simple region, one has to first specify the maximal range of the $y$ coordinate in $D$. The smallest value of $y$ is 0 and the largest value is 1; that is, $y \in [c, d] = [0, 1]$ for all points of $D$. For any fixed $y \in [0, 1]$, the $x$ coordinate of points of $D$ attains the smallest and largest values on the parabola $y = x^2$ or $x = \pm\sqrt{y}$, that is, $x_{bot}(y) = -\sqrt{y}$ and $x_{top}(y) = \sqrt{y}$ (see the right panel of Figure 14.9).
One has

\[ \iint_D 6yx^2\,dA = 6 \int_0^1 y \int_{-\sqrt{y}}^{\sqrt{y}} x^2\,dx\,dy = 2 \int_0^1 y\,(2y^{3/2})\,dy = 4 \int_0^1 y^{5/2}\,dy = 8/7. \]
□

100.3. Reversing the Order of Integration. Reversing the order of integration can simplify the technicalities involved in evaluating a double integral, although not always.

FIGURE 14.10. Illustration to Example 14.8. Left: The integration region $D$ as a vertically simple region. An algebraic description requires splitting the maximal range of $x$ into two intervals. For every $-1 \le x \le 0$, the $y$ coordinate ranges over the interval $-\sqrt{x + 1} \le y \le \sqrt{x + 1}$, whereas for every $0 \le x \le 8$, $x/2 - 1 \le y \le \sqrt{x + 1}$. Accordingly, when converting the double integral to the iterated integral, the region $D$ has to be split into two parts in which $x \le 0$ and $x \ge 0$. Right: The integration region $D$ as a horizontally simple region. For every $-1 \le y \le 3$, the $x$ coordinate ranges over the interval $y^2 - 1 \le x \le 2y + 2$. So the double integral can be converted to a single iterated integral.

EXAMPLE 14.8. Evaluate the double integral of $f(x, y) = 2x$ over the region $D$ bounded by the line $x = 2y + 2$ and the parabola $x = y^2 - 1$.

SOLUTION: The region $D$ is both vertically and horizontally simple. However, the iterated integral based on the algebraic description of $D$ as a vertically simple region is more involved. Indeed, the largest value of the $x$ coordinate in $D$ occurs at one of the points of intersection of the line and the parabola, $2y + 2 = y^2 - 1$ or $(y - 1)^2 = 4$, and hence $y = -1, 3$. The largest value of $x$ in $D$ is $x = 3^2 - 1 = 8$. The smallest value of $x$ occurs at the point of intersection of the parabola with the $x$ axis, $x = -1$. So $[a, b] = [-1, 8]$. For any fixed $x \in [-1, 0]$, the range of the $y$ coordinate is determined by the parabola $x = y^2 - 1$. Solutions of this equation are $y = \pm\sqrt{x + 1}$, and the range of the $y$ coordinate is $-\sqrt{x + 1} \le y \le \sqrt{x + 1}$.
For any fixed $x \in [0, 8]$, the largest value of $y$ still occurs on the parabola, $y = \sqrt{x + 1}$, while the smallest value occurs on the line, $x = 2y + 2$ or $y = (x - 2)/2$, so that $(x - 2)/2 \le y \le \sqrt{x + 1}$. The boundaries of $D$ are

\[ y = y_{top}(x) = \sqrt{x + 1}, \qquad y = y_{bot}(x) = \begin{cases} -\sqrt{x + 1} & \text{if } -1 \le x \le 0 \\ x/2 - 1 & \text{if } 0 \le x \le 8 \end{cases} \]

… $\int_{-a}^{a} f(x)\,dx = 0$, which is quite useful. For example, an indefinite integral of $\sin(x^{2011})$ cannot be expressed in elementary functions. Nevertheless, to find its definite integral over any symmetric interval $[-a, a]$, an explicit form of the indefinite integral is not necessary. Indeed, the function $\sin(x^{2011})$ is antisymmetric, and hence its integral over any symmetric interval vanishes. A similar property can be established for double integrals.

Consider a transformation that maps each point $(x, y)$ of the plane to another point $(x_s, y_s)$. A region $D$ is said to be symmetric under a transformation $(x, y) \to (x_s, y_s)$ if the image $D^s$ of $D$ coincides with $D$ (i.e., $D^s = D$). For example, let $D$ be bounded by an ellipse $x^2/a^2 + y^2/b^2 = 1$. Then $D$ is symmetric under reflections about the $y$ axis, the $x$ axis, or their combination, that is, $(x, y) \to (x_s, y_s) = (-x, y)$, $(x, y) \to (x_s, y_s) = (x, -y)$, or $(x, y) \to (x_s, y_s) = (-x, -y)$. A transformation of the plane $(x, y) \to (x_s, y_s)$ is said to be area preserving if the image $D^s$ of any region $D$ under this transformation has the same area, that is, $A(D) = A(D^s)$. For example, translations, rotations, reflections about lines, and their combinations are area-preserving transformations.

THEOREM 14.8. (Symmetry Property). Let a region $D$ be symmetric under an area-preserving transformation $(x, y) \to (x_s, y_s)$ such that $f(x_s, y_s) = -f(x, y)$. Then the integral of $f$ over $D$ vanishes:

\[ \iint_D f(x, y)\,dA = 0. \]

A proof is postponed until the change of variables in double integrals is discussed. Here the simplest case of a reflection about a line is considered. If $D$ is symmetric under this reflection, then the line cuts $D$ into two equal-area regions $D_1$ and $D_2$ so that $D_1^s = D_2$ and $D_2^s = D_1$.
The double integral is independent of the choice of partition (see (14.3)). Consider a partition of $D_1$ by elements $D_{1p}$, $p = 1, 2, ..., N$. By symmetry, the images $D_{1p}^s$ of the partition elements $D_{1p}$ form a partition of $D_2$ such that $\Delta A_p = A(D_{1p}) = A(D_{1p}^s)$ by area preservation. Choose elements $D_{1p}$ and $D_{1p}^s$ to partition the region $D$ as shown in the left panel of Figure 14.11. Now recall that the double integral is also independent of the choice of sample points. Suppose $(x_p, y_p)$ are sample points in $D_{1p}$. Choose sample points in $D_{1p}^s$ to be the images $(x_{ps}, y_{ps})$ of $(x_p, y_p)$ under the reflection. With these choices of the partition of $D$ and sample points, the Riemann sum (14.3) vanishes:

\[ \iint_D f\,dA = \lim_{N \to \infty} \sum_{p=1}^{N} \Big( f(x_p, y_p)\,\Delta A_p + f(x_{ps}, y_{ps})\,\Delta A_p \Big) = 0, \]

where the two terms in the sum correspond to partitions of $D_1$ and $D_2$ in $D$; by the hypothesis, the function $f$ is antisymmetric under the reflection and therefore $f(x_{ps}, y_{ps}) = -f(x_p, y_p)$ for all $p$. From a geometrical point of view, the portion of the solid bounded by the graph $z = f(x, y)$ that lies above the $xy$ plane has exactly the same shape as that below the $xy$ plane, and therefore their volumes contribute with opposite signs to the double integral and cancel each other (see the right panel of Figure 14.11).

FIGURE 14.11. Left: The region $D$ is symmetric relative to the reflection about the line. Under this reflection, $D_1 \to D_2$ and $D_2 \to D_1$. Any partition of $D_1$ by elements $D_{1p}$ induces the partition of $D_2$ by taking the images of $D_{1p}$ under the reflection. Right: The graph of a function $f$ that is skew-symmetric under the reflection. If $f$ is positive in $D_2$, then it is negative in $D_1$. The volume $V_2$ of the solid below the graph and above $D_2$ is exactly the same as the volume $V_1 = V_2$ of the solid above the graph and below $D_1$. But the latter solid lies below the $xy$ plane, and hence the double integral over $D$ is $V_2 - V_1 = 0$.

EXAMPLE 14.10.
Evaluate the double integral of $\sin[(x - y)^3]$ over the portion $D$ of the disk $x^2 + y^2 \le 1$ that lies in the first quadrant ($x, y \ge 0$).

SOLUTION: The region $D$ is symmetric under the reflection about the line $y = x$ (see the left panel of Figure 14.12), that is, $(x, y) \to (x_s, y_s) = (y, x)$, whereas the function is antisymmetric,

\[ f(x_s, y_s) = f(y, x) = \sin[(y - x)^3] = \sin[-(x - y)^3] = -\sin[(x - y)^3] = -f(x, y). \]

By the symmetry property, the double integral vanishes. □

FIGURE 14.12. Left: Illustration to Example 14.10. The region is symmetric under the reflection about the line $y = x$. Right: The integration region $D$ in Example 14.11. It can be viewed as the difference of the elliptic region $D_1$ and the square $D_2$. The elliptic region is symmetric under the reflection about the $x$ axis, whereas the function $f(x, y) = x^2 y^3$ is skew-symmetric, $f(x, -y) = -f(x, y)$. So the integral over $D_1$ must vanish, and the double integral over $D$ is the negative of the integral over $D_2$.

EXAMPLE 14.11. Evaluate the double integral of $f(x, y) = x^2 y^3$ over the region $D$, which is obtained from the elliptic region $x^2/4 + y^2/9 \le 1$ by removing the square $[0, 1] \times [0, 1]$.

SOLUTION: Let $D_1$ and $D_2$ be the elliptic and square regions, respectively. The elliptic region $D_1$ is large enough to include the square $D_2$ as shown in the right panel of Figure 14.12. Therefore, the additivity of the double integral can be used (compare Example 14.4) to transform the double integral over a nonsimple region $D$ into two double integrals over simple regions:

\[ \iint_D x^2 y^3\,dA = \iint_{D_1} x^2 y^3\,dA - \iint_{D_2} x^2 y^3\,dA = -\iint_{D_2} x^2 y^3\,dA = -\int_0^1 x^2\,dx \int_0^1 y^3\,dy = -\frac{1}{12}; \]

the integral over $D_1$ vanishes because the elliptic region $D_1$ is symmetric under the reflection $(x, y) \to (x_s, y_s) = (x, -y)$, whereas the integrand is antisymmetric, $f(x, -y) = x^2 (-y)^3 = -x^2 y^3 = -f(x, y)$. □

100.5. Study Problems.

Problem 14.2. Prove the Dirichlet formula

\[ \int_0^a \int_0^x f(x, y)\,dy\,dx = \int_0^a \int_y^a f(x, y)\,dx\,dy. \]
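Before the proof, the Dirichlet formula can be sanity-checked numerically. The sketch below is only an illustration under stated assumptions: the helper names `midpoint`, `lhs`, and `rhs` are hypothetical, the midpoint rule is just one convenient quadrature, and the test integrand $f(x, y) = xy$ with $a = 2$ is an arbitrary choice (its exact iterated integral is 2). It compares $\int_0^a \int_0^x f\,dy\,dx$ with $\int_0^a \int_y^a f\,dx\,dy$.

```python
def midpoint(g, lo, hi, n=200):
    """Midpoint-rule approximation of the integral of g over [lo, hi]."""
    h = (hi - lo) / n
    return h * sum(g(lo + (i + 0.5) * h) for i in range(n))


def lhs(f, a, n=200):
    """Left side: integrate over y from 0 to x first, then over x from 0 to a."""
    return midpoint(lambda x: midpoint(lambda y: f(x, y), 0.0, x, n), 0.0, a, n)


def rhs(f, a, n=200):
    """Right side: integrate over x from y to a first, then over y from 0 to a."""
    return midpoint(lambda y: midpoint(lambda x: f(x, y), y, a, n), 0.0, a, n)


if __name__ == "__main__":
    f = lambda x, y: x * y   # assumed test integrand, not taken from the text
    a = 2.0                  # assumed upper limit
    print(lhs(f, a), rhs(f, a))   # both values should be close to 2
```

Agreement of the two numbers for a few integrands does not prove the formula, but it is a quick way to catch a mistake in the limits when the order of integration is reversed.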
SOLUTION: The left side of the equation is an iterated integral for the double integral $\iint_D f\,dA$. Let us find the shape of $D$. According to the limits of integration, $D$ admits the following algebraic description (as a vertically simple region). For every $0 \le x \le a$, the $y$ coordinate changes in the interval $0 \le y \le x$. So the region $D$ is the triangle bounded by the lines $y = 0$, $y = x$, and $x = a$. To reverse the order of integration, let us find an algebraic description of $D$ as a horizontally simple region. The maximal range of $y$ in $D$ is the interval $[0, a]$. For every fixed $0 \le y \le a$, the $x$ coordinate spans the interval $y \le x \le a$ in $D$. So the two sides of the Dirichlet formula represent the same double integral as iterated integrals in different orders and hence are equal. □

Problem 14.3. Reverse the order of integration

\[ \int_1^2 \int_{2-x}^{\sqrt{2x - x^2}} f(x, y)\,dy\,dx. \]

SOLUTION: The given iterated integral represents a double integral $\iint_D f\,dA$, where the integration region admits the following description (as a vertically simple region). For every fixed $1 \le x \le 2$, the $y$ coordinates span the interval $2 - x \le y \le \sqrt{2x - x^2}$. So $D$ is bounded by the graphs $y = 2 - x$ (a line) and $y = \sqrt{2x - x^2}$ or $y^2 = 2x - x^2$ or, after completing the squares, $(x - 1)^2 + y^2 = 1$ (a circle of radius 1 centered at $(1, 0)$). The circle and the line intersect at the points $(1, 1)$ and $(2, 0)$. Thus, the region $D$ is the part of the disk $(x - 1)^2 + y^2 \le 1$ that lies above the line $y = 2 - x$. The reader is advised to sketch it. To reverse the order of integration, let us find an algebraic description of $D$ as a horizontally simple region. The maximal range of $y$ is the interval $[0, 1]$, which is determined by the points of intersection of the circle and the line.
Viewing the region $D$ along the $x$ axis, one can see that, for every fixed $0 \le y \le 1$, the smallest value of $x$ in $D$ is attained on the line $y = 2 - x$ or $x = 2 - y = x_{bot}(y)$, while its greatest value in $D$ is attained on the circle $(x - 1)^2 + y^2 = 1$ or $x - 1 = \pm\sqrt{1 - y^2}$ or $x = 1 + \sqrt{1 - y^2} = x_{top}(y)$ because the solution with the plus sign corresponds to the part of the circle that lies above the line. Hence, the integral in the reversed order reads

\[ \int_0^1 \int_{2-y}^{1+\sqrt{1-y^2}} f(x, y)\,dx\,dy. \]
□

100.6. Exercises.

(1) For each of the two orders of integration, specify the limits in the iterated integrals for $\iint_D f(x, y)\,dA$, splitting the integration region when necessary, if
(i) $D$ is the triangle with vertices $(0, 0)$, $(2, 1)$, and $(-2, 1)$
(ii) $D$ is the trapezoid with vertices $(0, 0)$, $(1, 0)$, $(1, 2)$, and $(0, 1)$
(iii) $D$ is the disk $x^2 + y^2 \le 1$
(iv) $D$ is the disk $x^2 + y^2 \le y$
(v) $D$ is the ring $1 \le x^2 + y^2 \le 4$

(2) Evaluate the following double integrals over the specified region:
(i) $\iint_D xy\,dA$, where $D$ is bounded by the curves $y = x^2$ and $y = x$
(ii) $\iint_D (2 + y)\,dA$, where $D$ is the region bounded by the graphs of $x = 3$ and $x = 4 - y^2$
(iii) $\iint_D (x + y)\,dA$, where $D$ is bounded by the curves $x = y^4$ and $x = y$
(iv) $\iint_D (2 + y)\,dA$, where $D$ is the region bounded by the three lines $x = 3$, $y + x = 0$, and $y - x = 0$; find the value of the integral by geometric means
(v) $\iint_D x^2 y\,dA$, where $D$ is the region bounded by the graphs of $y = 2 + x^2$ and $y = 4 - x^2$
(vi) ffD 1 - y2 dA, where $D$ is the triangle with vertices $(0, 0)$, $(0, 1)$, and $(1, 1)$
(vii) $\iint_D xy\,dA$, where $D$ is bounded by the lines $y = 1$, $x = -3y$, and $x = 2y$
(viii) ffD y xz2 - y2dA, where $D$ is the triangle with vertices $(0, 0)$, $(0, 1)$, and $(1, 1)$
(ix) ffD(2a-x-1/2 dA, where $D$ is bounded by the coordinate axes and by the shortest arc of the circle of radius $a$ and centered at $(a, a)$
(x) $\iint_D |xy|\,dA$, where $D$ is the disk of radius $a$ centered at the origin
(xi) IID(2 + y2)dA, where $D$ is the parallelogram with the sides
$y = x$, $y = x + a$, $y = a$, and $y = 3a$ ($a > 0$)
(xii) $\iint_D y^2\,dA$, where $D$ is bounded by the $x$ axis and by one arc of the cycloid $x = a(t - \sin t)$, $y = a(1 - \cos t)$, $0 \le t \le 2\pi$

(3) Sketch the solid region whose volume is given by the following integrals:
(i) I flX(x2 + y2) dy d
(ii) $\iint_D (x + y)\,dA$, where $D$ is defined by the inequalities $0 \le x + y \le 1$, $x \ge 0$, and $y \ge 0$
(iii) ffD cc2 + y2dA, where $D$ is defined by the inequality x2+y2 <
(iv) $\iint_D (x^2 + y^2)\,dA$, where $D$ is defined by the inequality $|x| + |y| \le 1$
(v) ffD 1 - (42)2 - (y/3)2dA, where $D$ is defined by the inequality $(x/2)^2 + (y/3)^2 \le 1$

(4) Use the double integral to find the volume of the specified solid region $E$:
(i) $E$ is bounded by the plane $x + y + z = 1$ and the coordinate planes.
(ii) $E$ lies under the paraboloid $z = 2x^2 + y^2$ and above the region in the $xy$ plane bounded by the curves $x = y^2$ and $x = 1$.
(iii) $E$ is bounded by the cylinder $x^2 + y^2 = 1$ and the planes $y = z$, $x = 0$, and $z = 0$ in the first octant.
(iv) $E$ is bounded by the cylinders $x^2 + y^2 = a^2$ and $y^2 + z^2 = a^2$.
(v) $E$ is enclosed by the parabolic cylinders $y = 1 - x^2$, $y = x^2 - 1$ and the planes $x + y + z = 2$, $2x + 2y - z = 10$.

(5) Sketch the region of integration and reverse the order of integration in each of the following iterated integrals. Evaluate the integral if the integrand is specified:
(i) f f f(x,y)dcdy
(ii) f f cos(x2) dc dy Hint: After reversing the integration order, make the substitution = x2 to do the integral.
(iii) jj fX 3 f (x, y) dy dc
(iv) 2 f f(x,y)dcdy
(v) ffgXf (cx,y)dy dc
(vi) 1If(ccy)ddy
(viii) 1(1 + ys)-1 dc
(ix) 126 I(2/4 ~c )d c
(x) ifY f(x, y) dy dx
(xi) f2a f2ax2f (x, y) dy dx (a > 0)
(xii) f2 fs f(x, y) dy dx

(6) Use the symmetry and the properties of the double integral to find:
(i) $\iint_D e^{x^2}\sin(y^3)\,dA$, where $D$ is the triangle with vertices $(0,1)$, $(0,-1)$, and $(1,0)$
(ii) $\iint_D (y^9 - p\,x^9)\,dA$, where $p=\pm 1$ and $D=\{(x,y)\,|\,1\le |x|+|y|\le 2\}$
(iii) $\iint_D x\,dA$, where $D$ is bounded by the ellipse $x^2/a^2+y^2/b^2=1$ and has the triangular hole with vertices $(0,b)$, $(0,-b)$, and $(a,0)$
(iv) $\iint_D (\cos(x^2)+\sin(y^2))\,dA$, where $D$ is the disk $x^2+y^2\le a^2$

(7) Find the area of the following regions:
(i) $D$ is bounded by the curves $xy=a^2$ and $x+y=5a/2$, $a>0$.
(ii) $D$ is bounded by the curves $y^2=2px+p^2$ and $y^2=-2qx+q^2$, where $p$ and $q$ are positive numbers.
(iii) $D$ is bounded by $(x-y)^2+x^2=a^2$.

101. Double Integrals in Polar Coordinates

The polar coordinates are defined by the following relations:
$$x=r\cos\theta,\quad y=r\sin\theta,\qquad\text{or}\qquad r=\sqrt{x^2+y^2},\quad \theta=\tan^{-1}(y/x),$$
where $r$ is the distance from the origin to the point $(x,y)$ and $\theta$ is the angle between the positive $x$ axis and the ray from the origin through the point $(x,y)$, counted counterclockwise. The value of $\tan^{-1}$ must be taken according to the geometrical definition of $\theta$. If $(x,y)$ lies in the first quadrant, then the value of $\tan^{-1}$ must be in the interval $[0,\pi/2)$, with $\tan^{-1}(\infty)=\pi/2$ on the positive $y$ axis, and similarly for the other quadrants. These equations define a one-to-one correspondence between all points $(x,y)\ne(0,0)$ of the plane and points of the strip $(r,\theta)\in(0,\infty)\times[0,2\pi)$. The pairs $(r,\theta)=(0,\theta)$ correspond to the origin $(x,y)=(0,0)$. Alternatively, one can also set the range of $\theta$ to be the interval $[-\pi,\pi)$. The ordered pair $(r,\theta)$ can be viewed as a point of an auxiliary plane or polar plane. In what follows, the $r$ axis in this plane is set to be vertical, and the $\theta$ axis is set to be horizontal.
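The quadrant rule for the inverse tangent described above can be illustrated with a short Python sketch. It is not part of the original text: the function names `to_polar` and `to_rect` are hypothetical, and the sketch assumes the convention that the polar angle lies in $[0, 2\pi)$. The standard-library function `math.atan2` selects the quadrant from the signs of $x$ and $y$ and returns a value in $(-\pi, \pi]$, which is then shifted.

```python
import math

def to_polar(x, y):
    """Return (r, theta) with r >= 0 and theta in [0, 2*pi)."""
    r = math.hypot(x, y)            # r = sqrt(x^2 + y^2)
    theta = math.atan2(y, x)        # quadrant-aware angle in (-pi, pi]
    if theta < 0.0:
        theta += 2.0 * math.pi      # shift negative angles into [0, 2*pi)
    if theta >= 2.0 * math.pi:      # guard against rounding up to 2*pi
        theta = 0.0
    return r, theta

def to_rect(r, theta):
    """Return (x, y) = (r cos(theta), r sin(theta))."""
    return r * math.cos(theta), r * math.sin(theta)

if __name__ == "__main__":
    # A point in the third quadrant: the naive arctan(y/x) would give pi/4.
    print(to_polar(-1.0, -1.0))
    # Round trip back to rectangular coordinates.
    print(to_rect(*to_polar(3.0, -4.0)))
```

For the origin the sketch returns $\theta = 0$, which is one of the many admissible pairs $(0,\theta)$.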
The relations $x=r\cos\theta$, $y=r\sin\theta$ define a transformation of any region $D'$ in the polar plane to a region $D$ in the $xy$ plane; that is, to every ordered pair $(r,\theta)$ corresponding to a point of $D'$, an ordered pair $(x,y)$ corresponding to a point of $D$ is assigned. Accordingly, the inverse transformation $r=\sqrt{x^2+y^2}$, $\theta=\tan^{-1}(y/x)$ maps a region $D$ in the $xy$ plane to a region $D'$ in the polar plane. The boundaries of $D'$ are mapped onto the boundaries of $D$ by $x=r\cos\theta$ and $y=r\sin\theta$.

FIGURE 14.13. Left: A partition of $D'$ by the coordinate lines $r=r_j$ and $\theta=\theta_k$, where $r_{j+1}-r_j=\Delta r$ and $\theta_{k+1}-\theta_k=\Delta\theta$. A partition element is a rectangle $D'_{jk}$. Its area is $\Delta A'=\Delta r\,\Delta\theta$. Right: A partition of $D$ by the images of the coordinate curves $r=r_j$ (concentric circles) and $\theta=\theta_k$ (rays extended from the origin). A partition element $D_{jk}$ is the image of the rectangle $D'_{jk}$. Its area is $\Delta A_{jk}=\frac12(r_{j+1}^2-r_j^2)\,\Delta\theta=\frac12(r_{j+1}+r_j)\,\Delta A'$.

For example, let $D$ be the portion of the disk $x^2+y^2\le 1$ in the first quadrant. Then the shape of $D'$ can be found from the images of boundaries of $D$ in the polar plane:
$$\begin{array}{ccc}
\text{boundaries of } D & \longleftrightarrow & \text{boundaries of } D'\\
x^2+y^2=1 & \longleftrightarrow & r=1\\
y=0,\ x\ge 0 & \longleftrightarrow & \theta=0\\
x=0,\ y\ge 0 & \longleftrightarrow & \theta=\pi/2
\end{array}$$
Since $r\ge 0$, the region $D'$ is the rectangle $(r,\theta)\in[0,1]\times[0,\pi/2]=D'$. The boundary of $D'$ always contains $r=0$ if the origin belongs to $D$. If $D$ is invariant under rotations about the origin (a disk or a ring), then $\theta$ takes its full range $[0,2\pi]$ in $D'$.

Let $D'$ be a region in the polar plane and let $D$ be its image in the $xy$ plane. Let $R'$ be a rectangle containing $D'$ so that the image of $R'$ contains $D$. As before, a function $f$ on $D$ is extended outside $D$ by setting its values to 0. Consider a rectangular partition of $R'$ such that each partition rectangle $D'_{jk}$ is bounded by the coordinate lines $r=r_j$, $r=r_{j+1}=r_j+\Delta r$, $\theta=\theta_k$, and $\theta=\theta_{k+1}=\theta_k+\Delta\theta$ as shown in Figure 14.13 (left panel). Each partition rectangle has the area $\Delta A'=\Delta r\,\Delta\theta$.
The image of the coordinate line $r=r_j$ in the $xy$ plane is the circle of radius $r_j$ centered at the origin. The image of the coordinate line $\theta=\theta_k$ in the $xy$ plane is the ray from the origin that makes the angle $\theta_k$ with the positive $x$ axis, counted counterclockwise. The rays and circles are called coordinate curves of the polar coordinate system, that is, the curves along which either the coordinate $r$ or the coordinate $\theta$ remains constant (concentric circles and rays, respectively). A rectangular partition of $D'$ induces a partition of $D$ by coordinate curves of the polar coordinates. Each partition element $D_{jk}$ is the image of the rectangle $D'_{jk}$ and is bounded by two circles and two rays.

Let $f(x,y)$ be an integrable function on $D$. The double integral of $f$ over $D$ can be computed as the limit of the Riemann sum. According to (14.3), the limit does not depend on either the choice of partition or the sample points. Let $\Delta A_{jk}$ be the area of $D_{jk}$. The area of the sector of the disk of radius $r$ that has the angle $\Delta\theta$ is $r^2\Delta\theta/2$. Therefore,
$$\Delta A_{jk}=\tfrac12\left(r_{j+1}^2-r_j^2\right)\Delta\theta=\tfrac12\left(r_{j+1}+r_j\right)\Delta r\,\Delta\theta=\tfrac12\left(r_{j+1}+r_j\right)\Delta A'.$$
In (14.3), put $\Delta A_p=\Delta A_{jk}$, with $\mathbf r_p^*\in D_{jk}$ being the image of a sample point $(r_j^*,\theta_k^*)\in D'_{jk}$ so that $f(\mathbf r_p^*)=f(r_j^*\cos\theta_k^*,\,r_j^*\sin\theta_k^*)$. The limit in (14.3) is understood as the double limit $(\Delta r,\Delta\theta)\to(0,0)$. Owing to the independence of the limit of the choice of sample points, put $r_j^*=(r_{j+1}+r_j)/2$ (the midpoint rule). With this choice, $(r_{j+1}+r_j)\,\Delta r/2=r_j^*\,\Delta r$. By taking the limit of the Riemann sum (14.3),
$$\lim_{N\to\infty}\sum_{p=1}^{N}f(\mathbf r_p^*)\,\Delta A_p=\lim_{N_1,N_2\to\infty}\sum_{j=1}^{N_1}\sum_{k=1}^{N_2}f(r_j^*\cos\theta_k^*,\,r_j^*\sin\theta_k^*)\,r_j^*\,\Delta A',$$
one obtains the double integral of the function $f(r\cos\theta,r\sin\theta)J(r)$ over the region $D'$ (the image of $D$ under the inverse transformation), where $J(r)=r$ is called the Jacobian of the polar coordinates. The Jacobian defines the area element transformation $dA=J\,dA'=r\,dA'$.

DEFINITION 14.10. (Double Integral in Polar Coordinates).
Let D be the image of D' in the polar plane spanned by ordered pairs (r, 0) of polar coordinates. The double integral of f over D in polar coordinates is fff(, y)dA = fff(rcoso, rsino6)J(r) dA', J(r) =r . D D  101. DOUBLE INTEGRALS IN POLAR COORDINATES 333 In particular, the area of a region D is given by the double integral A(D) = dA = r dA' in the polar coordinates. A similarity between the double integral in rectangular and polar coordinates is that they both use partitions by corresponding coordinate curves. Note that horizontal and vertical lines are coordinate curves of the rectangular coordinates. So the very term "a double integral in polar coordinates" refers to a specific parti- tioning D in the Riemann sum, namely, by coordinate curves of polar coordinates (by circles and rays). The double integral over D' can be evaluated by the standard means, that is, by converting it to a suitable iterated integral with respect to r and 0. Suppose that D' is a vertically simple region as shown in Figure 14.14 (right panel): D' ={(r,8)|rbet () 0=7/2. So, in the polar plane, the region D' is bounded by the horizontal line r = 2, the graph r = 2 cos 0, and the vertical line 0 = 7/2. It is convenient to use an algebraic description of D' as a vertically simple region; that is, (r,0) E D' if rbot(0) = 2 cos < r< 2 = rtop () and 0 E [0, 7/2] = [01, 02] (because reop(0) = rbot(0)). Second, the func- tion is written in polar coordinates, f (r cos 0, r sin 0) = r2 sin 0 cos 0. Multiplying it by the Jacobian J = r, the integrand is obtained. One  336 14. MULTIPLE INTEGRALS has J/02 rtop(0) xy dA = r sin 0 cos 0 dA' = sin 0cosO / r3drdO D D' o1 rbot(6) /7r/22 sin(9cos(9f r3drdO 0 2cos 6 = 4 (1 - cos O)4 cos sin dO 07/ = 4 (1-u)4udu=4 v4(1-v) d , where two changes of variables have been used to simplify the calcula- tions, u = cosO 6and v= 1 - u. D EXAMPLE 14.14. 
Find the area of the region $D$ that is bounded by two spirals $r=\theta$ and $r=2\theta$, where $\theta\in[0,2\pi]$, and the positive $x$ axis.

Before solving the problem, let us make a few comments about the shape of $D$. The boundaries $r=\theta$ and $r=2\theta$ are polar graphs. Given a value of $\theta$, $r=\theta$ (or $r=2\theta$) is the distance from the point on the graph to the origin. As this distance increases monotonically with increasing $\theta$, the polar graphs are spirals winding about the origin. The region $D$ lies between two spirals; it is not simple in any direction (see the left panel of Figure 14.16). By converting the polar graph $r=\theta$ into the rectangular coordinates, one has $\sqrt{x^2+y^2}=\tan^{-1}(y/x)$ or $y=x\tan\left(\sqrt{x^2+y^2}\right)$. There is no way to find an analytic solution of this equation to express $y$ as a function of $x$ or vice versa. Therefore, had one tried to evaluate the double integral in the rectangular coordinates by cutting the region $D$ into simple pieces, one would have faced an unsolvable problem of finding the equations for the boundaries of $D$ in the form $y=y_{\rm top}(x)$ and $y=y_{\rm bot}(x)$!

SOLUTION: The region $D$ is bounded by three curves: two spirals (polar graphs) and the line $y=0$, $x\ge 0$. They are the images of the lines $r=\theta$, $r=2\theta$, and $\theta=2\pi$ in the polar plane as shown in the right panel of Figure 14.16. These lines form the boundaries of $D'$. An algebraic description of $D'$ as a vertically simple region is convenient to use: $(r,\theta)\in D'$ if $r_{\rm bot}(\theta)=\theta\le r\le 2\theta=r_{\rm top}(\theta)$ and $\theta\in[0,2\pi]=[\theta_1,\theta_2]$. Hence,
$$A(D)=\iint_D dA=\iint_{D'}r\,dA'=\int_0^{2\pi}\int_{\theta}^{2\theta}r\,dr\,d\theta=\frac32\int_0^{2\pi}\theta^2\,d\theta=4\pi^3.\qquad\Box$$

FIGURE 14.16. An illustration to Example 14.14. Left: The integration region $D$ lies between two spirals. It is not simple in any direction. Right: The region $D'$ in the polar plane whose image is $D$. The region $D'$ is simple and is bounded by straight lines.

EXAMPLE 14.15. Find the volume of the part of the solid bounded by the cone $z=\sqrt{x^2+y^2}$ and the paraboloid $z=2-x^2-y^2$ that lies in the first octant.
SOLUTION: The solid is shown in the left panel of Figure 14.17. The intersection of the cone (bottom boundary) and paraboloid (top boundary) is a circle of unit radius. Indeed, put $r=\sqrt{x^2+y^2}$. Then the points of intersection satisfy the condition $\sqrt{x^2+y^2}=2-x^2-y^2$ or $r=2-r^2$ or $r=1$. So the projection $D$ of the solid onto the $xy$ plane along the $z$ axis is the part of the disk $r\le 1$ in the first quadrant. For any $(x,y)\in D$, the height is $h=z_{\rm top}(x,y)-z_{\rm bot}(x,y)=2-r^2-r$ (i.e., independent of the polar angle $\theta$). The region $D$ is the image of the rectangle $D'=[0,1]\times[0,\pi/2]$ in the polar plane. The volume is
$$\iint_D h(x,y)\,dA=\iint_{D'}(2-r^2-r)\,r\,dA'=\int_0^{\pi/2}d\theta\int_0^1(2r-r^3-r^2)\,dr=\frac{5\pi}{24}.\qquad\Box$$

FIGURE 14.17. An illustration to Example 14.15. Left: The solid whose volume is sought. Its vertical projection onto the $xy$ plane is $D$, which is the part of the disk $r\le 1$ in the first quadrant. At a point $(x,y)$ in $D$, the height $h(x,y)$ of the solid is the difference between the values of the $z$ coordinate on the top and bottom boundaries (the paraboloid and the cone, respectively). Right: The region $D'$ in the polar plane whose image is $D$.

101.1. Study Problem.

Problem 14.4. Find the area of the four-leaved rose bounded by the polar graph $r=\cos(2\theta)$.

SOLUTION: The polar graph comes through the origin $r=0$ four times when $\theta=\pi/4$, $\theta=\pi/4+\pi/2$, $\theta=\pi/4+\pi$, and $\theta=\pi/4+3\pi/2$. These angles may be changed by adding an integer multiple of $\pi$, owing to the periodicity of $\cos(2\theta)$. Therefore, each leaf of the rose corresponds to the range of $\theta$ between two neighboring zeros of $\cos(2\theta)$. Since all leaves have the same area, it is sufficient to find the area of one leaf, say, for $-\pi/4\le\theta\le\pi/4$. With this choice, the leaf is the image of the vertically simple region $D'=\{(r,\theta)\,|\,0\le r\le\cos(2\theta),\ -\pi/4\le\theta\le\pi/4\}$ in the polar plane.
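The description of $D'$ as a vertically simple region can be checked numerically before doing the integral by hand. The following Python sketch is an illustration only, not part of the original solution; the function name `leaf_area` and the grid sizes are assumptions. It forms a midpoint Riemann sum of the Jacobian $r$ over $D'$, sampling $\theta$ in $[-\pi/4,\pi/4]$ and $r$ in $[0,\cos(2\theta)]$.

```python
import math

def leaf_area(n_theta=200, n_r=200):
    """Midpoint Riemann sum of the Jacobian r over one leaf region D'.

    D' = {(r, theta) : 0 <= r <= cos(2*theta), -pi/4 <= theta <= pi/4}.
    """
    theta_lo, theta_hi = -math.pi / 4, math.pi / 4
    d_theta = (theta_hi - theta_lo) / n_theta
    total = 0.0
    for k in range(n_theta):
        theta = theta_lo + (k + 0.5) * d_theta   # midpoint in theta
        r_top = math.cos(2.0 * theta)            # upper boundary r_top(theta)
        d_r = r_top / n_r
        for j in range(n_r):
            r = (j + 0.5) * d_r                  # midpoint in r
            total += r * d_r * d_theta           # area element r dr dtheta
    return total

if __name__ == "__main__":
    one_leaf = leaf_area()
    print(one_leaf, 4.0 * one_leaf)   # area of one leaf and of the whole rose
```

The sum should agree closely with the analytic value of the area of one leaf, $\pi/8$, so that the four leaves together have area $\pi/2$.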
Therefore, its area is given by the double integral 11 1 1/4 cos(20) A(D)f= dAf= rdA'f= r dr d8 JD JD' -7/4 0 - fj4 cos2(20) d = - 4(1 + cos(40)) dO 1 1 , /4 4 r/ 1- + - sin(4K) =- 4 4 l -7/4 8  101. DOUBLE INTEGRALS IN POLAR COORDINATES 339 Thus, the total area is 4A(D) = 7/2. D 101.2. Exercises. (1) Sketch the region whose area is given by the iterated integral in polar coordinates and evaluate the integral: (i) f"f rdrdO (ii) f_/2 f2acoso r d' dO (2) Convert the double integral ffD f(x, y) dA to an iterated integral in polar coordinates if (i) D is the disk x2+ y2 < R2 (ii) D is the disk x2 + y2 < ax, a > 0 (iii) D is the ring a2 < X2 + y2 G b2 (iv) D is the parabolic segment -a < x < a, x2/a y < a (3) Evaluate the double integral by changing to polar coordinates: (i) ffD xy dA, where D is the part of the ring a2 X2 + y2 < b2 in the first quadrant (ii) ffD sin(x2 + y2) dA, where D is the disk x2 + y2 < a2 (iii) ffD arctan(y/x) dA, where D is the part of the ring 0 < a2 < X 2 +y2 b2 between the lines y = v/3x and y = zv/3 in the first quadrant (iv) ffD ln(2 +y2) dA, where D is the portion of the ring 0 < a2 < x2 + y2 0 (v) ffD sin( z2 _+ y2) dA, where D is the ring r2 0 (iii) j, fa f(r,)drd, 0 < a < 27 (5) Sketch the region of integration and evaluate the integral by con- verting it to polar coordinates: (i) fI fly ex2+y2dccdy (ii) f_ f_ (cx+y)dy dc (iii) fs 102y-y2 c/c2 + y2dcc dy (iv) Ij/ f ccy dy dcc + f jj ccy dy dcc + Ii 10 z~ cy dy dcc (6) Convert the iterated integral in rectangular coordinates to an iter- ated integral in polar coordinates:  340 14. MULTIPLE INTEGRALS (i) f2fXN/ f( z2+y2) dy dx (ii) f0f f (x, y) dy dz (7) Convert the double integral to an iterated integral in polar coordi- nates: (i) ffDf(z2 + y2) dA, where D is the disk x2 + y2 1 (ii) ffDf(x2+y2)dA,where D = {(xy)|y < x Ix I 1} (iii) ffD f(y/x) dA, where D is the disk x2 + y2 0 (vii) D is bounded by the curve (x2+ y2)2= 8a2xy and (x - a)2 + (y - a)2 a2, a > 0. 
(9) Find the volume of the specified solid E: (i) E is bounded by the cones z = 3 z2+ y2 and z = 4 - cx2 + y2. (ii) E is bounded by the cone z = x2+ y2, the plane z = 0, and the cylinders x2 + y2 = 1, x2 +y2 4. (iii) E is bounded by the paraboloid z= 1 - x2 - y2 and the plane z=-3. (iv) E is bounded by the hyperboloid x2 + y2 - z2 = -1 and the plane z = 2. (v) Elies under the paraboloid z = x2 + y2, above the zy plane, and inside the cylinder x2 + y2 = 2x. (10) Find lim1 ff(x, y) dA, D: x2 + y2 0 in D'. Note that the Jacobian vanishes only on the boundary of D' and, hence, the hypotheses of Theorem 14.9 are fulfilled. 4. The double integral in the new variables is evaluated by Fu- bini's theorem: J x+y)5y5 dA= unv5 dA'= fu fdu vdv= 12 1 D' 0 12 6 72 This example and the example of polar coordinate show that the transformation is not one-to-one on the sets where the Jacobian van- ishes (the line L = 0 is mapped to a single point) and the inverse transformation fails to exist. It turns out that this observation is of a general nature. THEOREM 14.10. (Inverse Function Theorem). Let the transformation (u, v) - (x, y) be defined on an open set U' containing a point (Lio, vo). Suppose that the functions x(Li,v) and y(Li,v) have continuous partial derivatives in U' and the Jacobian of the transformation does not vanish at the point (Lio, vo). Then there exists an inverse transformation Li Liux, y), v =v~x, y) in an open set U containing the image point (zo0, yo) =(z(Lio, vo), y(Lio, vo)) and  348 14. MULTIPLE INTEGRALS the functions u(x, y) and v(x, y) have continuous partial derivatives in U. By this theorem, the Jacobian of the inverse transformation can be calculated as &(u, v)/&(x, y) so that the area transformation law is du dv =&|0(u, v)/&(x, y)|dz dy and the following statement holds. COROLLARY 14.3. 
If $u=u(x,y)$ and $v=v(x,y)$ is the inverse of the transformation $x=x(u,v)$ and $y=y(u,v)$, then
$$\frac{\partial(x,y)}{\partial(u,v)}=\frac{1}{\dfrac{\partial(u,v)}{\partial(x,y)}}=\frac{1}{\det\begin{pmatrix}u'_x & u'_y\\ v'_x & v'_y\end{pmatrix}}.\tag{14.11}$$

The analogy with a change of variables in the one-dimensional case can be made. If $x=f(u)$, where $f$ has continuous derivative $f'(u)$ that does not vanish, then, by the inverse function theorem for functions of one variable (Theorem 12.6), there is an inverse function $u=g(x)$ whose derivative is continuous and $g'(x)=1/f'(u)$, where $u=g(x)$. Then the transformation of the differential $dx$ can be written in two equivalent forms, just like the transformation of the area element $dA=dx\,dy$:
$$dx=f'(u)\,du=\frac{du}{g'(x)}\qquad\longleftrightarrow\qquad dx\,dy=\left|\frac{\partial(x,y)}{\partial(u,v)}\right|du\,dv=\frac{du\,dv}{\left|\dfrac{\partial(u,v)}{\partial(x,y)}\right|}.$$
Equation (14.11) defines the Jacobian as a function of $(x,y)$. Sometimes it is technically simpler to express the product $f(x,y)J(x,y)$ in the new variables rather than doing so for $f$ and $J$ separately. This is illustrated by the following example.

EXAMPLE 14.17. Use a suitable change of variables to evaluate the double integral of $f(x,y)=xy^3$ over the region $D$ that lies in the first quadrant and is bounded by the lines $y=x$ and $y=3x$ and by the hyperbolas $yx=1$ and $yx=2$.

SOLUTION: The equations of the lines can be written in the form $y/x=1$ and $y/x=3$ because $y,x>0$ in $D$ (see Figure 14.19). Note that the equations of boundaries of $D$ depend on just two particular combinations $y/x$ and $yx$ that take constant values on the boundaries of $D$. So, if the new variables are defined by the relations $u=u(x,y)=y/x$ and $v=v(x,y)=xy$, then the image region $D'$ in the $uv$ plane is a rectangle $u\in[1,3]$ and $v\in[1,2]$. Indeed, the boundaries $y/x=1$ and $y/x=3$ are mapped onto the vertical lines $u=1$ and $u=3$, while the hyperbolas $yx=1$ and $yx=2$ are mapped onto the horizontal

FIGURE 14.19. An illustration to Example 14.17. The transformation of the integration region $D$.
Equations of the boundaries of $D$, $y=3x$, $y=x$, $xy=2$, and $xy=1$, are written in the new variables $u=y/x$ and $v=xy$ to obtain the equations of the boundaries of $D'$, $u=3$, $u=1$, $v=2$, and $v=1$, respectively. The correspondence between the boundaries of $D$ and $D'$ is indicated by encircled numbers enumerating the boundary curves.

lines $v=1$ and $v=2$. Let us put aside for a moment the problem of expressing $x$ and $y$ as functions of new variables, which is needed to express $f$ and $J$ as functions of $u$ and $v$, and find first the Jacobian as a function of $x$ and $y$ by means of (14.11):
$$J=\left|\det\begin{pmatrix}u'_x & u'_y\\ v'_x & v'_y\end{pmatrix}\right|^{-1}=\left|\det\begin{pmatrix}-y/x^2 & 1/x\\ y & x\end{pmatrix}\right|^{-1}=\left|-\frac{2y}{x}\right|^{-1}=\frac{x}{2y}.$$
The absolute value bars may be omitted as $x$ and $y$ are strictly positive in $D$. The integrand becomes $fJ=x^2y^2/2=v^2/2$. So finding the functions $x=x(u,v)$ and $y=y(u,v)$ happens to be unnecessary in this example! Hence,
$$\iint_D xy^3\,dA=\frac12\iint_{D'}v^2\,dA'=\frac12\int_1^3 du\int_1^2 v^2\,dv=\frac73.$$
The reader is advised to evaluate the double integral in the original rectangular coordinates to compare the amount of work needed with this solution. $\Box$

The following example illustrates how a change of variables can be used to simplify the integrand of a double integral.

FIGURE 14.20. Left: The integration region $D$ in Example 14.18 is bounded by the lines $x+y=1$, $x+y=2$, $x=0$, and $y=0$. Right: The image $D'$ of $D$ under the change of variables $u=x+y$ and $v=y-x$. The boundaries of $D'$ are obtained by substituting the new variables into the equations for boundaries of $D$ so that $x+y=1\to u=1$, $x+y=2\to u=2$, $x=0\to v=u$, and $y=0\to v=-u$.

EXAMPLE 14.18. Evaluate the double integral of the function $f(x,y)=\cos[(y-x)/(y+x)]$ over the trapezoidal region with vertices $(1,0)$, $(2,0)$, $(0,1)$, and $(0,2)$.
SOLUTION: An iterated integral in the rectangular coordinates would contain the integral of the cosine function of a rational argument (with respect to either x or y), which is difficult to evaluate. So a change of variables should be used to simplify the argument of the cosine function. The region D is bounded by the lines x + y = 1, x + y = 2, x = 0, and y = 0. Put u = x + y and v = y - x so that the function in the new variables becomes f = cos(v/u). The lines x + y = 1 and x + y = 2 are mapped onto the vertical lines u = 1 and u = 2. Since y = (u + v)/2 and x = (u - v)/2, the line x = 0 is mapped onto the line v = u, while the line y = 0 is mapped onto the line v = -u. Thus, the region D' = {(u, v) - u < v 0, y > 0 defined by the transformation x = ve", y = ve-". Calculate the Ja- cobian. Determine the range of (u, v) in which the transformation is one-to-one. Find the inverse transformation and sketch coordinate curves of hyperbolic coordinates. (3) Find the conditions on the parameters of a linear transformation x = ailu+biv+ ci, y = a2i+ b2v+ c2 so that the transformation is area preserving. In particular, prove that the rotations discussed in Study Problem 11.2 are area preserving. (4) Find the image D of the specified region D' under the given trans- formation: (i) D' = [0, 1] x [0, 1] and the transformation is x = u, y = v(1 - Li2). (ii) D' is the triangle with vertices (0, 0), (1, 0), and (1, 1), and the transformation is x = v2, y Li. (iii) D' is the region defined by the inequality |ul + |vl < 1, and the transformation is x i= + v, y i=7 - v. (5) Find a linear transformation that maps the triangle D' with vertices (0, 0), (0, 1), and (1, 0) onto the triangle D with vertices (0, 0), (a, b), and (b, a), where a and b are positive, nonequal numbers. Use this transformation to evaluate the integral of f~x, y) =bzc - ay over the  354 14. MULTIPLE INTEGRALS triangle D. 
(6) Evaluate the double integral using the specified change of variables: (i) ffD(8x + 4y) dA, where D is the parallelogram with vertices (3, -1), (-3, 1), (-1, 3), and (5, 1); the change of variables is x (v - 3u)/4, y = (u + v)/4 (ii) ffD(x2 - xy + y2) dA, where D is the region bounded by the ellipse x2-xy+y2 = 1; the change of variables is x = u-v/v/3, y = u+v/v/3 (iii) ffDg(x2 - y2)-1/2 dA, where D is in the first quadrant and bounded by hyperbolas x2 _ y2 = 1, X2 _2 = 4 and by the lines x = 2y, x = 4y; the change of variables is x = u cosh v, y =iu sinh v (iv) f fD e(4/Y (xc+ y)3/y2 dA, where D is bounded by is the lines y = x, y = 2x, x+y = 1, and x+y = 2; the change of variables c=/y, v = x+y Hint: Follow the procedure based on (14.11) as illustrated in Example 14.17. (7) Find the image D' of the square a < c < a+h, b < y < b+h, where a, b, and h are positive numbers, under the transformation u = y2/x, v = cy. Find the ratio of the area A(D') to the area A(D). What is the limit of the ratio when h -- 0? (8) Use the specified change of variables to convert the iterated integral to an iterated integral in the new variables: (i) fjfjf(x,y)dydz, where 0 < a< b and0 < a <#,3if u = x and v = y/z (ii) ffj f(x,y)dyd if u x+ y and v x -y (9) Convert the double integral ffD f(x, y) dA to an iterated integral in the new variables, where D is bounded by the curve cc +V y va, a> 0, and the lines x = 0, y =0 if x = ucos4 v and y = usin4 v. (10) Evaluate the double integral by making a suitable change of vari- ables: (i) f fD yz2 dA, where D is in the first quadrant and bounded by the curves zy = 1, zy = 2, yz2 = 1, and yz2 = 2 (ii) ffD ex-y dA, where D is given by the inequality Icc +|y < 1 (iii) IfID(1 + 3cc2) dA, where D is bounded by the lines cc + y =1, cc + y =2 and by the curves y - cc3 =0, y - cc3 1 (iv) IID(y + 2cc2) dA, where the domain D is bounded by two parabolas, y =cx2, y =cc2 + 2 and by two hyperbolas ccy -1 (cc < 0), ccy =1 (cc > 0)  102. 
CHANGE OF VARIABLES IN DOUBLE INTEGRALS 355 (v) ffD(c + y)/x2 dA, where D is bounded by four lines y = x, y= 2x, y+z= 1, and y+x= 2 (vi) ffD y - x/(x+y), where D is the square with vertices (0, 2a), (aa), (2a, 2a), and (a, 3a) with a > 0 (vii) ffD cos(x2/a2 + y2/b2) dA, where D is bounded by the ellipse x2/a2 + y2/b2=1 (viii) ffD (x + y) dA, where D is bounded by x2 + y2 =cX + y (ix) ffD(Iz|+ lyl) dA, where D is defined by Iz|+ ly| <1 (x) ffD(1 - a - i-1/2 dA, where D is bounded by the ellipse x2/a2 + y2/b2 1 (11) Let f be continuous on [0, 1]. Show that ffD f(x + y) dA = f0 uf (u) du if D is the triangle with vertices (0, 0), (0, 1), and (1, 0). (12) Use a suitable change of variables to reduce the double integral to a single integral: (i) ffD f(x + y) dA, where D is defined by Icc+ ly| <1 (ii) ffD f(a + by + c) dA, where D is the disk x2 + y2 < 1 and a2 + b2#g 0 (iii) ffD f(xy) dA, where D lies in the first quadrant and is bounded by the curves zy = 1, zy = 2, y = x, and y = 4x (13) Let n and m be positive integers. Prove that if ffD cnym dA = 0, where D is bounded by an ellipse x2/a2 + y2/b2 = 1, then at least one of the numbers n and m is odd. (14) Suppose that the level curves of a function f(x, y) are simple closed curves and the region D is bounded by two level curves f(x, y) a and f(x, y) = b. Prove that f (,y)F'(u) du, where F(u) is the area of the region between the curves f(x, y) = a and f(x, y) = u. Hint: Split the region D by infinitesimally close level curves of the function f. (15) Use the generalized polar coordinates with a suitable choice of parameters to find the area of a region D if (i) D is bounded by the curves x2/a2 + y3b3 =X2 + y2 and lies in the first quadrant. (ii) D is bounded by the curves c3/a3 _/n 3/b3 =cx2/2 _ y2/k2 and lies in the first quadrant. (iii) D is bounded by the curve (cc/a + y/b)5 =cx2 y2/c4. (16) Use the double integral and a suitable change of variables to find the area of D if  356 14. 
(i) $D$ is bounded by the curves $x+y=a$, $x+y=b$, $y=mx$, and $y=nx$ and lies in the first quadrant
(ii) $D$ is bounded by the curves $y^2=2ax$, $y^2=2bx$, $x^2=2cy$, and $x^2=2ky$, where 0 0 and b> 0
(iv) $D$ is bounded by the curves $(x/a)^{2/3}+(y/b)^{2/3}=1$, $(x/a)^{2/3}+(y/b)^{2/3}=4$, $x/a=y/b$, and $8x/a=y/b$ and lies in the first quadrant
(v) $D$ is bounded by the ellipses $x^2/\cosh^2u+y^2/\sinh^2u=1$, where $u=u_1$ and $u=u_2>u_1$, and by the hyperbolas $x^2/\cos^2v-y^2/\sin^2v=1$, where $v=v_1$ and $v=v_2>v_1$. Hint: Consider the transformation $x=\cosh u\cos v$, $y=\sinh u\sin v$.

103. Triple Integrals

Suppose a solid region $E$ is filled with an inhomogeneous material. The latter means that, if a small volume $\Delta V$ of the material is taken at two distinct points of $E$, then the masses of these two pieces are different, despite the equality of their volumes. The inhomogeneity of the material can be characterized by the mass density as a function of position. Let $\Delta m(\mathbf r)$ be the mass of a small piece of material of volume $\Delta V$ cut out around a point $\mathbf r$. Then the mass density is defined by
$$\sigma(\mathbf r)=\lim_{\Delta V\to 0}\frac{\Delta m(\mathbf r)}{\Delta V}.$$
The limit is understood in the following sense. If $R$ is the radius of the smallest ball that contains the region of volume $\Delta V$, then the limit means that $R\to 0$ (i.e., roughly speaking, all the dimensions of the piece decrease simultaneously in the limit). The mass density is measured in units of mass per unit volume. For example, the value $\sigma(\mathbf r)=5$ g/cm$^3$ means that a piece of material of volume 1 cm$^3$ cut out around the point $\mathbf r$ has a mass of approximately 5 g, provided the density varies little over the piece.

Suppose that the mass density of the material in a region $E$ is known. The question is: What is the total mass of the material in $E$? A practical answer to this question is to partition the region $E$ so that each partition element $E_p$, $p=1,2,\dots,N$, has a mass $\Delta m_p$. The total mass is $M=\sum_p\Delta m_p$. If a partition element $E_p$ has a volume $\Delta V_p$, then $\Delta m_p\approx\sigma(\mathbf r_p)\,\Delta V_p$ for some $\mathbf r_p\in E_p$ (see the left panel of Figure 14.22).
If $R_p$ is the radius of the smallest ball that contains $E_p$, put $R_N=\max_p R_p$. Then, by increasing the number $N$ of partition elements so that $R_p\le R_N\to 0$ as $N\to\infty$, the approximation $\Delta m_p\approx\sigma(\mathbf r_p)\,\Delta V_p$ becomes more and more accurate by the definition of the mass density because $\Delta V_p\to 0$ for all $p$. So the total mass is
$$M=\lim_{\substack{N\to\infty\\ (R_N\to 0)}}\sum_{p=1}^{N}\sigma(\mathbf r_p)\,\Delta V_p,\tag{14.12}$$
which is to be compared with (14.1). In contrast to (14.1), the summation over the partition should include a triple sum, one sum per each direction in space. This gives an intuitive idea of a triple integral. Its abstract mathematical construction follows exactly the footsteps of the double-integral construction.

FIGURE 14.22. Left: A partition element of a solid region, where $\mathbf r_p$ is the position vector of a sample point in it. If $\sigma(\mathbf r)$ is the mass density, then the mass of the partition element is $\Delta m(\mathbf r_p)\approx\sigma(\mathbf r_p)\,\Delta V_p$, where $\Delta V_p$ is the volume of the partition element. The total mass is the sum of $\Delta m(\mathbf r_p)$ over the partition of the solid $E$ as given in (14.12). Right: An illustration to Example 14.20. A ball is symmetric under the reflection about the $xy$ plane: $(x,y,z)\to(x,y,-z)$. If the function $f$ is skew-symmetric under this reflection, $f(x,y,-z)=-f(x,y,z)$, then the triple integral of $f$ over the ball vanishes.

103.1. Definition of a Triple Integral.

Smooth Surface. In Section 85.5, a surface was defined as a continuous deformation of an open set in a plane that has continuous inverse. A small piece of a surface can be viewed as the graph of a continuous function of two variables. Similarly to the notion of a smooth curve, a smooth surface can be defined. If the graph has a tangent plane at every point and the normal to the tangent plane changes continuously along the graph, then the surface is called smooth. Consider a level set of a function $g(\mathbf r)$ of three variables $\mathbf r=(x,y,z)$.
Suppose that $g$ has continuous partial derivatives and the gradient $\nabla g$ does not vanish. As explained in Section 93.2 (see the discussion of Theorem 13.16), a level set $g(\mathbf r)=k$ is a surface whose normal vector is the gradient $\nabla g$. If $g$ has continuous partial derivatives, then the components of the normal are continuous. Thus, a surface is said to be smooth in a neighborhood of a point $\mathbf r_0$ if it coincides with a level set $g(\mathbf r)=g(\mathbf r_0)$ of a function $g$ that has continuous partial derivatives and whose gradient does not vanish at $\mathbf r_0$. A surface is smooth if it is smooth in a neighborhood of its every point. A surface is piecewise smooth if it consists of several smooth pieces adjacent along smooth curves.

Rectangular Partition. A region $E$ in space is assumed to be closed and bounded; that is, it contains its boundary and is contained in a ball of some (finite) radius. The boundaries of $E$ are assumed to be piecewise-smooth surfaces. The region $E$ is then embedded in a rectangular box $R_E=[a,b]\times[c,d]\times[s,q]$, that is, $x\in[a,b]$, $y\in[c,d]$, and $z\in[s,q]$. If $f(\mathbf r)$ is a bounded function on $E$, then it is extended to $R_E$ by setting its values to 0 outside $E$. The box $R_E$ is partitioned by the coordinate planes $x=x_i=a+i\,\Delta x$, $i=0,1,\dots,N_1$, where $\Delta x=(b-a)/N_1$; $y=y_j=c+j\,\Delta y$, $j=0,1,\dots,N_2$, where $\Delta y=(d-c)/N_2$; and $z=z_k=s+k\,\Delta z$, $k=0,1,\dots,N_3$, where $\Delta z=(q-s)/N_3$. Each partition element is a rectangular box (a partition rectangle) $R_{ijk}$ of volume $\Delta V=\Delta x\,\Delta y\,\Delta z$. The total number of partition rectangles is $N=N_1N_2N_3$.

Upper and Lower Sums. By analogy with Definition 14.2, the lower and upper sums are defined. Put $M_{ijk}=\sup f(\mathbf r)$ and $m_{ijk}=\inf f(\mathbf r)$, where the supremum and infimum are taken over the partition rectangle $R_{ijk}$. Then the upper and lower sums are
$$U(f,\mathbf N)=\sum_{i=1}^{N_1}\sum_{j=1}^{N_2}\sum_{k=1}^{N_3}M_{ijk}\,\Delta V,\qquad L(f,\mathbf N)=\sum_{i=1}^{N_1}\sum_{j=1}^{N_2}\sum_{k=1}^{N_3}m_{ijk}\,\Delta V,$$
where $\mathbf N=(N_1,N_2,N_3)$. So the upper and lower sums are triple sequences (a rule that assigns a number $a_{nmk}$ to an ordered triple of integers $(n,m,k)$ is a triple sequence).
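The upper and lower sums just defined can be illustrated with a small Python sketch. It is an illustration under stated assumptions rather than a general algorithm: the function name `upper_lower_sums` is hypothetical, the region $E$ is taken to coincide with the box $R_E$, and the integrand is assumed to be nondecreasing in each variable, so that its supremum and infimum over each partition rectangle are attained at the two opposite corners. For a general $f$, finding $M_{ijk}$ and $m_{ijk}$ would require a separate search over each rectangle.

```python
def upper_lower_sums(f, box, n1, n2, n3):
    """Upper and lower sums of f over box = ((a, b), (c, d), (s, q)).

    Assumes f is nondecreasing in x, y, and z, so that on every
    partition rectangle the infimum is at the lower corner and the
    supremum is at the upper corner.
    """
    (a, b), (c, d), (s, q) = box
    dx, dy, dz = (b - a) / n1, (d - c) / n2, (q - s) / n3
    dV = dx * dy * dz
    upper = 0.0
    lower = 0.0
    for i in range(n1):
        for j in range(n2):
            for k in range(n3):
                x0, y0, z0 = a + i * dx, c + j * dy, s + k * dz
                lower += f(x0, y0, z0) * dV                   # m_ijk * dV
                upper += f(x0 + dx, y0 + dy, z0 + dz) * dV    # M_ijk * dV
    return upper, lower

if __name__ == "__main__":
    unit_cube = ((0.0, 1.0), (0.0, 1.0), (0.0, 1.0))
    f = lambda x, y, z: x * y * z   # nondecreasing in each variable on the cube
    for n in (10, 20, 40):
        print(n, upper_lower_sums(f, unit_cube, n, n, n))
```

For $f=xyz$ on the unit cube the exact integral is $1/8$; the printed pairs should bracket this value, with the gap between the upper and lower sums shrinking as the partition is refined.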
The limit of a triple sequence is defined similarly to the limit of a double sequence ($a_{nm}$ is replaced by $a_{nmk}$ in Definition 14.4).

DEFINITION 14.13. (Triple Integral).
If the limits of the upper and lower sums exist as $N_{1,2,3}\to\infty$ (or $(\Delta x,\Delta y,\Delta z)\to(0,0,0)$) and coincide, then $f$ is said to be Riemann integrable on $E$, and the limit of the upper and lower sums
$$\iiint_E f(x,y,z)\,dV=\lim_{\mathbf N\to\infty}U(f,\mathbf N)=\lim_{\mathbf N\to\infty}L(f,\mathbf N)$$
is called the triple integral of $f$ over the region $E$.

The limit is understood as a three-variable limit $(\Delta x,\Delta y,\Delta z)\to(0,0,0)$ or as the limit of a triple sequence.

103.2. Properties of Triple Integrals. The properties of triple integrals are the same as those of the double integral discussed in Section 98; that is, the linearity, additivity, positivity, integrability of the absolute value $|f|$, and upper and lower bounds hold for triple integrals.

Continuity and Integrability. The relation between continuity and integrability is much the same as in the case of double integrals.

THEOREM 14.12. (Integrability of Continuous Functions).
Let $E$ be a closed, bounded spatial region whose boundaries are piecewise-smooth surfaces. If a function $f$ is continuous on $E$, then it is integrable on $E$. Furthermore, if $f$ has bounded discontinuities only on a finite number of smooth surfaces in $E$, then it is also integrable on $E$.

In particular, a constant function is integrable, and the volume of a region $E$ is given by the triple integral
$$V(E)=\iiint_E dV.$$
If $m\le f(\mathbf r)\le M$ for all $\mathbf r$ in $E$, then
$$mV(E)\le\iiint_E f\,dV\le MV(E).$$

The Integral Mean Value Theorem. The integral mean value theorem (Theorem 14.3) is extended to triple integrals. If $f$ is continuous in $E$, then there exists a point $\mathbf r_0$ in $E$ such that
$$\iiint_E f(\mathbf r)\,dV=V(E)\,f(\mathbf r_0).$$
Its proof follows the same line of reasoning as in the case of double integrals.

Riemann Sums.
If a function $f$ is integrable, then its triple integral is the limit of a Riemann sum, and its value is independent of the partition of $E$ and of the choice of sample points $\mathbf{r}_p$ in the partition elements:
$$\iiint_E f(\mathbf{r})\,dV = \lim_{\substack{N\to\infty \\ (R_N \to 0)}} \sum_{p=1}^{N} f(\mathbf{r}_p)\,\Delta V_p. \qquad (14.13)$$
This equation can be used to approximate triple integrals when evaluating them numerically, just as in the case of double integrals.

Symmetry. If a transformation in space preserves the volume of any region, then it is called volume preserving. Rotations, reflections, and translations in space are volume-preserving transformations. Suppose that, under a volume-preserving transformation, a region $E$ is mapped onto itself; that is, $E$ is symmetric relative to this transformation. If $\mathbf{r}_s \in E$ is the image of $\mathbf{r} \in E$ under this transformation and the integrand is skew-symmetric, $f(\mathbf{r}_s) = -f(\mathbf{r})$, then the triple integral of $f$ over $E$ vanishes.

EXAMPLE 14.20. Evaluate the triple integral of $f(x, y, z) = x^2 \sin(y^4 z) + 2$ over a ball centered at the origin of radius $R$.

SOLUTION: Put $g(x, y, z) = x^2 \sin(y^4 z)$ so that $f = g + h$, where $h = 2$ is a constant function. By the linearity property, the triple integral of $f$ is the sum of the triple integrals of $g$ and $h$ over the ball. The ball is symmetric relative to the reflection transformation $(x, y, z) \to (x, y, -z)$, whereas the function $g$ is skew-symmetric, $g(x, y, -z) = -g(x, y, z)$. Therefore, its triple integral vanishes, and
$$\iiint_E f\,dV = \iiint_E g\,dV + \iiint_E h\,dV = 0 + 2\iiint_E dV = 2V(E) = \frac{8\pi R^3}{3}.$$

One can think of the numerical value of a triple integral of $f$ over $E$ as the total amount of a quantity distributed in the region $E$ with the density $f$ (the amount of the quantity per unit volume). For example, $f$ can be viewed as the density of electric charge distributed in a dielectric occupying a region $E$. The total electric charge stored in the region $E$ is then given by the triple integral of the density over $E$. The electric charge can be positive and negative.
So, if the total positive charge in $E$ is exactly the same as the negative charge, the triple integral vanishes.

FIGURE 14.23. Left: An algebraic description of a solid region simple in the direction of the $z$ axis. The solid $E$ is vertically projected into the $xy$ plane: every point $(x, y, z)$ of $E$ goes into the point $(x, y, 0)$. The projection points form the region $D_{xy}$. Since $E$ is simple in the $z$ direction, for every $(x, y, 0)$ in $D_{xy}$, the $z$ coordinate of the point $P(x, y, z)$ in $E$ ranges over the interval $z_{\text{bot}}(x, y) \le z \le z_{\text{top}}(x, y)$. In other words, $E$ lies between the graphs $z = z_{\text{bot}}(x, y)$ and $z = z_{\text{top}}(x, y)$. Right: An illustration to the algebraic description (14.14) of a solid $E$ as simple in the $y$ direction. $E$ is projected along the $y$ axis to the $xz$ plane, forming a region $D_{xz}$. For every $(x, 0, z)$ in $D_{xz}$, the $y$ coordinate of the point $P(x, y, z)$ in $E$ ranges over the interval $y_{\text{bot}}(x, z) \le y \le y_{\text{top}}(x, z)$. In other words, $E$ lies between the graphs $y = y_{\text{bot}}(x, z)$ and $y = y_{\text{top}}(x, z)$.

103.3. Iterated Triple Integrals. Similar to a double integral, a triple integral can be converted to a triple iterated integral, which can then be evaluated by means of ordinary single-variable integration.

DEFINITION 14.14. (Simple Region). A spatial region $E$ is said to be simple in the direction of a vector $\mathbf{v}$ if any straight line parallel to $\mathbf{v}$ intersects $E$ along at most one straight line segment.

A triple integral can be converted to an iterated integral if $E$ is simple in a particular direction. If there is no such direction, then $E$ should be split into a union of simple regions, and the additivity property of triple integrals is then used. Suppose that $\mathbf{v} = \mathbf{e}_3$; that is, $E$ is simple along the $z$ axis. Then the region $E$ admits the following description:
$$E = \{(x, y, z) \mid z_{\text{bot}}(x, y) \le z \le z_{\text{top}}(x, y),\ (x, y) \in D_{xy}\}.$$
Indeed, consider all lines parallel to the $z$ axis that intersect $E$.
These lines also intersect the $xy$ plane. The region $D_{xy}$ in the $xy$ plane is the set of all such points of intersection. One might think of $D_{xy}$ as a shadow made by the solid $E$ when it is illuminated by rays of light parallel to the $z$ axis. Take any line through $(x, y) \in D_{xy}$ parallel to the $z$ axis. By the simplicity of $E$, any such line intersects $E$ along a single segment. If $z_{\text{bot}}$ and $z_{\text{top}}$ are the minimal and maximal values of the $z$ coordinate along the intersection segment, then, for any $(x, y, z) \in E$, one has $z_{\text{bot}} \le z \le z_{\text{top}}$ and $(x, y) \in D_{xy}$. Naturally, the values $z_{\text{bot}}$ and $z_{\text{top}}$ may depend on $(x, y) \in D_{xy}$. Thus, the region $E$ is bounded from the top by the graph $z = z_{\text{top}}(x, y)$ and from the bottom by the graph $z = z_{\text{bot}}(x, y)$. If $E$ is simple along the $y$ or $x$ axis, then $E$ admits similar descriptions:
$$E = \{(x, y, z) \mid y_{\text{bot}}(x, z) \le y \le y_{\text{top}}(x, z),\ (x, z) \in D_{xz}\}, \qquad (14.14)$$
$$E = \{(x, y, z) \mid x_{\text{bot}}(y, z) \le x \le x_{\text{top}}(y, z),\ (y, z) \in D_{yz}\}, \qquad (14.15)$$
where $D_{xz}$ and $D_{yz}$ are projections of $E$ into the $xz$ and $yz$ planes, respectively; they are defined analogously to $D_{xy}$.

According to (14.13), the limit of the Riemann sum is independent of partitioning $E$ and choosing sample points (a generalization of Theorem 14.5 to the three-dimensional case is straightforward because its proof is based on Theorem 14.4, which holds in any number of dimensions). Let $D_p$, $p = 1, 2, ..., N$, be a partition of the region $D_{xy}$. Consider the portion $E_p$ of $E$ that is projected onto the partition element $D_p$; $E_p$ is a column with $D_p$ being its cross section by a horizontal plane. Since $E$ is bounded, there are numbers $s$ and $q$ such that $s \le z_{\text{bot}}(x, y) \le z_{\text{top}}(x, y) \le q$ for all $(x, y) \in D_{xy}$; that is, $E$ always lies between two horizontal planes $z = s$ and $z = q$. Consider slicing the solid $E$ by equispaced horizontal planes $z = s + k\,\Delta z$, $k = 0, 1, ..., N_3$, $\Delta z = (q - s)/N_3$. Then each column $E_p$ is partitioned by these planes into small regions $E_{pk}$. The union of all $E_{pk}$ forms a partition of $E$, which will be used in the Riemann sum (14.13).
The volume of $E_{pk}$ is $\Delta V_{pk} = \Delta z\,\Delta A_p$, where $\Delta A_p$ is the area of $D_p$. Assuming, as usual, that $f$ is defined by zero values outside $E$, sample points may be selected so that, if $(x_p, y_p, 0) \in D_p$, then $(x_p, y_p, z_k^*) \in E_{pk}$, that is, $z_{k-1} \le z_k^* \le z_k$ for $k = 1, ..., N_3$. The three-variable limit (14.13) exists and hence can be taken in any particular order (recall Theorem 14.6). Take first the limit $N_3 \to \infty$ or $\Delta z \to 0$. The double limit of the sum over the partition of $D_{xy}$ is understood as before; that is, as $N \to \infty$, the radii $R_p$ of the smallest disks containing $D_p$ go to 0 uniformly, $R_p \le R_N \to 0$. Therefore,
$$\iiint_E f\,dV = \lim_{\substack{N\to\infty \\ (R_N \to 0)}} \sum_{p=1}^{N} \Big( \lim_{N_3\to\infty} \sum_{k=1}^{N_3} f(x_p, y_p, z_k^*)\,\Delta z \Big) \Delta A_p = \lim_{\substack{N\to\infty \\ (R_N \to 0)}} \sum_{p=1}^{N} \Big( \int_{z_{\text{bot}}(x_p, y_p)}^{z_{\text{top}}(x_p, y_p)} f(x_p, y_p, z)\,dz \Big) \Delta A_p$$
because, for every $(x_p, y_p) \in D_{xy}$, the function $f$ vanishes outside the interval $z \in [z_{\text{bot}}(x_p, y_p), z_{\text{top}}(x_p, y_p)]$. The integration of $f$ with respect to $z$ over the interval $[z_{\text{bot}}(x, y), z_{\text{top}}(x, y)]$ defines a function $F(x, y)$ whose values $F(x_p, y_p)$ at sample points in the partition elements $D_p$ appear in the parentheses. A comparison of the resulting expression with (14.3) leads to the conclusion that, after taking the second limit, one obtains the double integral of $F(x, y)$ over $D_{xy}$.

THEOREM 14.13. (Iterated Triple Integral). Let $f$ be integrable on a solid region $E$ bounded by a piecewise smooth surface. Suppose that $E$ is simple in the $z$ direction so that it is bounded by the graphs $z = z_{\text{bot}}(x, y)$ and $z = z_{\text{top}}(x, y)$ for $(x, y) \in D_{xy}$. Then
$$\iiint_E f(x, y, z)\,dV = \iint_{D_{xy}} \int_{z_{\text{bot}}(x, y)}^{z_{\text{top}}(x, y)} f(x, y, z)\,dz\,dA = \iint_{D_{xy}} F(x, y)\,dA.$$

103.4. Evaluation of Triple Integrals. In practical terms, an evaluation of a triple integral over a region $E$ is carried out by the following steps:
Step 1. Determine the direction along which $E$ is simple. If no such direction exists, split $E$ into a union of simple regions and use the additivity property. For definiteness, suppose that $E$ happens to be $z$ simple.
Step 2. Find the projection $D_{xy}$ of $E$ into the $xy$ plane.
Step 3.
Find the bottom and top boundaries of $E$ as the graphs of some functions $z = z_{\text{bot}}(x, y)$ and $z = z_{\text{top}}(x, y)$.
Step 4. Evaluate the integral of $f$ with respect to $z$ to obtain $F(x, y)$.
Step 5. Evaluate the double integral of $F(x, y)$ over $D_{xy}$ by converting it to a suitable iterated integral.

Similar iterated integrals can be written when $E$ is simple in the $y$ or $x$ direction. According to (14.14) (or (14.15)), the first integration is carried out with respect to $y$ (or $x$), and the double integral is evaluated over $D_{xz}$ (or $D_{yz}$). If $E$ is simple in any direction, then any of the iterated integrals can be used. In particular, just as in the case of double integrals, the choice of an iterated integral for a simple region $E$ should be motivated by the simplicity of an algebraic description of the top and bottom boundaries or by the simplicity of the integrations involved. Technical difficulties may strongly depend on the order in which the iterated integral is evaluated.

Fubini's theorem can be extended to triple integrals.

THEOREM 14.14. (Fubini's Theorem). Let $f$ be integrable on a rectangular region $E = [a, b] \times [c, d] \times [s, q]$. Then
$$\iiint_E f\,dV = \int_a^b \int_c^d \int_s^q f(x, y, z)\,dz\,dy\,dx,$$
and the iterated integral can be evaluated in any order.

Here $D_{xy} = [a, b] \times [c, d]$, and the top and bottom boundaries are the planes $z = q$ and $z = s$. Alternatively, one can take $D_{yz} = [c, d] \times [s, q]$, $x_{\text{bot}}(y, z) = a$, and $x_{\text{top}}(y, z) = b$ to obtain an iterated integral in a different order (where the $x$ integration is carried out first). In particular, if $f(x, y, z) = g(x)h(y)w(z)$, then
$$\iiint_E f(x, y, z)\,dV = \int_a^b g(x)\,dx \int_c^d h(y)\,dy \int_s^q w(z)\,dz,$$
which is an extension of the factorization property stated in Corollary 14.2 to triple integrals.

EXAMPLE 14.21. Evaluate the triple integral of $f(x, y, z) = xy^2z^3$ over the rectangle $E = [0, 2] \times [1, 2] \times [0, 3]$.

SOLUTION: By Fubini's theorem,
$$\iiint_E xy^2z^3\,dV = \int_0^2 x\,dx \int_1^2 y^2\,dy \int_0^3 z^3\,dz = 2 \cdot \frac{7}{3} \cdot \frac{81}{4} = \frac{189}{2}.$$

EXAMPLE 14.22.
Evaluate the triple integral of $f(x, y, z) = (x^2 + y^2)z$ over the portion of the solid bounded by the cone $z = \sqrt{x^2 + y^2}$ and the paraboloid $z = 2 - x^2 - y^2$ in the first octant.

SOLUTION: Following the step-by-step procedure outlined above, the integration region is $z$ simple. The top boundary is the graph of $z_{\text{top}}(x, y) = 2 - x^2 - y^2$, and the graph of $z_{\text{bot}}(x, y) = \sqrt{x^2 + y^2}$ is the bottom boundary. To determine the region $D_{xy}$, note that it has to be bounded by the projection of the curve of intersection of the cone and paraboloid onto the $xy$ plane. The intersection curve is defined by $z_{\text{bot}} = z_{\text{top}}$ or $r = 2 - r^2$, where $r = \sqrt{x^2 + y^2}$, and hence $r = 1$, which is the circle of unit radius. Since $E$ is in the first octant, $D_{xy}$ is the quarter of the disk of unit radius in the first quadrant. One has
$$\iiint_E (x^2 + y^2)z\,dV = \iint_{D_{xy}} (x^2 + y^2) \int_{\sqrt{x^2 + y^2}}^{2 - x^2 - y^2} z\,dz\,dA = \frac{1}{2} \iint_{D_{xy}} (x^2 + y^2)\big[(2 - x^2 - y^2)^2 - (x^2 + y^2)\big]\,dA$$
$$= \frac{1}{2} \int_0^{\pi/2} d\theta \int_0^1 r^2\big[(2 - r^2)^2 - r^2\big]\,r\,dr = \frac{\pi}{8} \int_0^1 u\big[(2 - u)^2 - u\big]\,du = \frac{7\pi}{96},$$
where the double integral has been transformed into polar coordinates because $D_{xy}$ becomes the rectangle $D'_{xy} = [0, 1] \times [0, \pi/2]$ in the polar plane. The integration with respect to $r$ is carried out by the substitution $u = r^2$.

FIGURE 14.24. Left: The integration region in Example 14.23. The $x$ axis is vertical. The region is bounded by the plane $x = 4$ (top) and the paraboloid $x = y^2 + z^2$ (bottom). Its projection into the $yz$ plane is the disk of radius 2 as the plane and paraboloid intersect along the circle $4 = y^2 + z^2$. Right: An illustration to Study Problem 14.6.

EXAMPLE 14.23. Evaluate the triple integral of $f(x, y, z) = \sqrt{y^2 + z^2}$ over the region $E$ bounded by the paraboloid $x = y^2 + z^2$ and the plane $x = 4$.

SOLUTION: It is convenient to choose an iterated integral for $E$ described as an $x$ simple region (see (14.15)). There are two reasons for doing so. First, the integrand $f$ is independent of $x$, and hence
the first integration with respect to $x$ is trivial. Second, the boundaries of $E$ are already given in the form required by (14.15), that is, $x_{\text{bot}}(y, z) = y^2 + z^2$ and $x_{\text{top}}(y, z) = 4$. The region $D_{yz}$ is determined by the curve of intersection of the boundaries of $E$, $x_{\text{top}} = x_{\text{bot}}$ or $y^2 + z^2 = 4$. Therefore, $D_{yz}$ is the disk of radius 2 (see the left panel of Figure 14.24). One has
$$\iiint_E \sqrt{y^2 + z^2}\,dV = \iint_{D_{yz}} \sqrt{y^2 + z^2} \int_{y^2 + z^2}^{4} dx\,dA = \iint_{D_{yz}} \sqrt{y^2 + z^2}\,\big[4 - (y^2 + z^2)\big]\,dA$$
$$= \int_0^{2\pi} d\theta \int_0^2 r\,[4 - r^2]\,r\,dr = \frac{128\pi}{15},$$
where the double integral over $D_{yz}$ has been converted to polar coordinates in the $yz$ plane.

103.5. Study Problems.

Problem 14.6. Evaluate the triple integral of $f(x, y, z) = z$ over the region $E$ bounded by the cylinder $x^2 + z^2 = 1$ and the planes $z = 0$, $y = 1$, and $y = x$ in the first octant.

SOLUTION: The region is $z$ simple and bounded by the $xy$ plane from the bottom (i.e., $z_{\text{bot}}(x, y) = 0$) and by the cylinder from the top (i.e., $z_{\text{top}}(x, y) = \sqrt{1 - x^2}$, by taking the positive solution of $x^2 + z^2 = 1$). The integration region is shown in the right panel of Figure 14.24. The region $D_{xy}$ is bounded by the lines of intersection of the $xy$ plane with the planes $x = 0$, $y = x$, and $y = 1$. Thus, $D_{xy}$ is the triangle bounded by the lines $x = 0$, $y = 1$, and $y = x$. One has
$$\iiint_E z\,dV = \iint_{D_{xy}} \int_0^{\sqrt{1 - x^2}} z\,dz\,dA = \frac{1}{2} \iint_{D_{xy}} (1 - x^2)\,dA = \frac{1}{2} \int_0^1 (1 - x^2) \int_x^1 dy\,dx = \frac{5}{24},$$
where the double integral has been evaluated by using the description of $D_{xy}$ as a vertically simple region, $y_{\text{bot}} = x \le y \le 1 = y_{\text{top}}$ for all $x \in [0, 1] = [a, b]$.

Problem 14.7. Evaluate the triple integral of the function $f(x, y, z) = xy^2z^3$ over the region $E$ that is a ball of radius 3 centered at the origin with a cubic cavity $[0, 1] \times [0, 1] \times [0, 1]$.

SOLUTION: The region $E$ is not simple in any direction. The additivity property must be used. Let $E_1$ be the ball and let $E_2$ be the cavity. By the additivity property,
$$\iiint_E xy^2z^3\,dV = \iiint_{E_1} xy^2z^3\,dV - \iiint_{E_2} xy^2z^3\,dV = 0 - \int_0^1 x\,dx \int_0^1 y^2\,dy \int_0^1 z^3\,dz = -\frac{1}{24}.$$
The triple integral over $E_1$ vanishes by the symmetry argument (the ball is symmetric under the reflection $(x, y, z) \to (-x, y, z)$, whereas $f(-x, y, z) = -f(x, y, z)$). The second integral is evaluated by Fubini's theorem.

103.6. Exercises.
(1) Evaluate the triple integral over the specified solid region by converting it to an appropriate iterated integral:
(i) $\iiint_E (xy - 3z^2)\,dV$, where $E = [0, 1] \times [1, 2] \times [0, 2]$
(ii) $\iiint_E 6z\,dV$, where $E$ is defined by the inequalities $0 \le x \le z$, 0 ...
... 0. Hint: Use the generalized polar coordinates to evaluate the integral (see Study Problem 14.5).
(vii) $E$ is bounded by the surfaces $z = x + y$, $z = xy$, $x + y = 1$, $x = 0$, and $y = 0$.
(viii) $E$ is bounded by the surfaces $x^2 + z^2 = a^2$, $x + y = \pm a$, and $x - y = \pm a$.
(ix) $E$ is bounded by the surfaces $az = x^2 + y^2$ and $z = a - x - y$ and by the coordinate surfaces, where $a > 0$.
(x) $E$ is bounded by the surfaces $z = 6 - x^2 - y^2$ and $z = \sqrt{x^2 + y^2}$.
(3) Use symmetry and other properties of the triple integral to evaluate:
(i) $\iiint_E 24xy^2z^3\,dV$, where $E$ is bounded by the elliptic cylinder $(x/a)^2 + (y/b)^2 = 1$ and by the paraboloids $z = \pm[c - (x/a)^2 - (y/b)^2]$ and has the rectangular cavity $x \in [0, 1]$, $y \in [-1, 1]$, and $z \in [0, 1]$. Assume that $a$, $b$, and $c$ are larger than 2.
(ii) $\iiint_E (\sin^2(xz) - \cos^2(xy))\,dV$, where $E$ lies between the spheres: 1 ...

FIGURE 14.25. Coordinate surfaces of cylindrical coordinates: cylinders $r = r_0$, half-planes $\theta = \theta_0$ bounded by the $z$ axis, and horizontal planes $z = z_0$. Any point in space can be viewed as the point of intersection of three coordinate surfaces.

... $\theta \in [0, 2\pi)$, and $z \in (-\infty, \infty)$. The inverse transformation is given by
$$r = \sqrt{x^2 + y^2}, \qquad \theta = \tan^{-1}(y/x), \qquad z = z,$$
where the value of $\tan^{-1}$ is taken according to the quadrant in which the pair $(x, y)$ belongs (see the discussion of polar coordinates). It maps any region $E$ in the Euclidean space spanned by $(x, y, z)$ to the image region $E'$.
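The quadrant rule for $\tan^{-1}$ in the inverse transformation can be illustrated numerically. The sketch below is only an illustration under stated assumptions: it takes the cylindrical transformation to be $x = r\cos\theta$, $y = r\sin\theta$, $z = z$ (polar coordinates in the $xy$ plane), and the function names `to_cylindrical` and `from_cylindrical` are hypothetical. The standard-library function `math.atan2` selects the quadrant from the signs of $x$ and $y$; its result in $(-\pi, \pi]$ is then shifted toward $[0, 2\pi)$.

```python
import math


def to_cylindrical(x, y, z):
    """Rectangular (x, y, z) -> cylindrical (r, theta, z).

    atan2 picks the quadrant of (x, y); the modulo moves negative
    angles into [0, 2*pi). For inputs extremely close to the positive
    x axis from below, floating-point rounding may return a value equal
    to 2*pi; this sketch does not treat that edge case specially.
    """
    r = math.hypot(x, y)
    theta = math.atan2(y, x) % (2.0 * math.pi)
    return r, theta, z


def from_cylindrical(r, theta, z):
    """Cylindrical (r, theta, z) -> rectangular (x, y, z)."""
    return r * math.cos(theta), r * math.sin(theta), z


# A point in the third quadrant of the xy plane: the plain arctangent
# of y/x would give pi/4, whereas the quadrant rule gives 5*pi/4.
r, theta, z = to_cylindrical(-1.0, -1.0, 2.0)
print(r, theta, z)
print(from_cylindrical(r, theta, z))
```

On the $z$ axis ($x = y = 0$) the angle is not determined by the point; `math.atan2(0.0, 0.0)` returns 0 there, which is merely a convention of the library.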
To find the shape of $E'$, as well as its algebraic description, the same strategy as in the two-variable case should be used: boundaries of $E$ $\leftrightarrow$ boundaries of $E'$ under the transformation (14.16) and its inverse. It is particularly important to investigate the shape of coordinate surfaces of cylindrical coordinates, that is, surfaces on which each of the cylindrical coordinates has a constant value. If $E$ is bounded by coordinate surfaces only, then it is an image of a rectangular box $E'$, which is the simplest, most desirable shape when evaluating a multiple integral.

The coordinate surfaces of $r$ are cylinders, $r = \sqrt{x^2 + y^2} = r_0$ or $x^2 + y^2 = r_0^2$. In the $xy$ plane, the equation $\theta = \theta_0$ defines a ray from the origin at the angle $\theta_0$ to the positive $x$ axis counted counterclockwise. Since $\theta$ depends only on $x$ and $y$, the coordinate surface of $\theta$ is the half-plane bounded by the $z$ axis that makes an angle $\theta_0$ with the $xz$ plane (it is swept by the ray when the latter is moved parallel up and down along the $z$ axis). Since the $z$ coordinate is not changed, neither are its coordinate surfaces; they are planes parallel to the $xy$ plane. So the coordinate surfaces of cylindrical coordinates are
$r = r_0 \iff x^2 + y^2 = r_0^2$ (cylinder),
$\theta = \theta_0 \iff y\cos\theta_0 = x\sin\theta_0$ (half-plane),
$z = z_0 \iff z = z_0$ (plane).
The coordinate surfaces of cylindrical coordinates are shown in Figure 14.25. A point in space corresponding to an ordered triple $(r_0, \theta_0, z_0)$ is an intersection point of a cylinder, a half-plane bounded by the cylinder axis, and a plane perpendicular to the cylinder axis.

EXAMPLE 14.24. Find the region $E'$ whose image under the transformation (14.16) is the solid region $E$ that is bounded by the paraboloid $z = x^2 + y^2$ and the planes $z = 4$, $y = x$, and $y = 0$ in the first octant.

SOLUTION: In cylindrical coordinates, the equations of boundaries become, respectively, $z = r^2$, $z = 4$, $\theta = \pi/4$, and $\theta = 0$.
Since $E$ lies below the plane $z = 4$ and above the paraboloid $z = r^2$, the range of $r$ is determined by their intersection $4 = r^2$ or $r = 2$ as $r \ge 0$. Thus,
$$E' = \{(r, \theta, z) \mid r^2 \le z \le 4,\ (r, \theta) \in [0, 2] \times [0, \pi/4]\}.$$

104.2. Triple Integrals in Cylindrical Coordinates. To change variables in a triple integral to cylindrical coordinates, one has to consider a partition of the integration region $E$ by coordinate surfaces, that is, by cylinders, half-planes, and horizontal planes, which corresponds to a rectangular partition of $E'$ (the image of $E$ under the transformation from rectangular to cylindrical coordinates). Then the limit of the corresponding Riemann sum (14.13) has to be evaluated. In the case of cylindrical coordinates, this task can be accomplished by simpler means.

Suppose $E$ is $z$ simple so that, by Theorem 14.13, the triple integral can be written as an iterated integral consisting of a double integral over $D_{xy}$ and an ordinary integral with respect to $z$. The transformation (14.16) merely defines polar coordinates in the region $D_{xy}$. So, if $D_{xy}$ is the image of $D'_{xy}$ in the polar plane spanned by pairs $(r, \theta)$, then, by converting the double integral to polar coordinates, one infers that
$$\iiint_E f(x, y, z)\,dV = \iint_{D'_{xy}} \int_{z_{\text{bot}}(r, \theta)}^{z_{\text{top}}(r, \theta)} f(r\cos\theta, r\sin\theta, z)\,r\,dz\,dA' = \iiint_{E'} f(r\cos\theta, r\sin\theta, z)\,r\,dV', \qquad (14.17)$$
where the region $E'$ is the image of $E$ under the transformation from rectangular to cylindrical coordinates,
$$E' = \{(r, \theta, z) \mid z_{\text{bot}}(r, \theta) \le z \le z_{\text{top}}(r, \theta),\ (r, \theta) \in D'_{xy}\},$$
and $z = z_{\text{bot}}(r, \theta)$, $z = z_{\text{top}}(r, \theta)$ are equations of the bottom and top boundaries of $E$ written in polar coordinates by substituting (14.16) into the equations for boundaries written in rectangular coordinates. Note that $dV' = dz\,dr\,d\theta = dz\,dA'$ is the volume of an infinitesimal rectangle in the space spanned by the triples $(r, \theta, z)$.
Its image in the space spanned by $(x, y, z)$ lies between two cylinders whose radii differ by $dr$, between two half-planes with the angle $d\theta$ between them, and between two horizontal planes separated by the distance $dz$ as shown in the left panel of Figure 14.26. So its volume is the product of the area $dA$ of the base and the height $dz$, $dV = dz\,dA = r\,dz\,dA'$, according to the area transformation law for polar coordinates, $dA = r\,dA'$. So the volume transformation law for cylindrical coordinates reads
$$dV = J\,dV', \qquad J = r,$$
where $J = r$ is the Jacobian of transformation to cylindrical coordinates.

Cylindrical coordinates are advantageous when the boundaries of $E$ contain cylinders, half-planes, horizontal planes, or any surfaces with axial symmetry. A set in space is said to be axially symmetric if there is an axis such that any rotation about it maps the set onto itself. For example, circular cones, circular paraboloids, and spheres are axially symmetric. Note also that the axis of cylindrical coordinates may be chosen to be the $x$ or $y$ axis, which would correspond to polar coordinates in the $yz$ or $xz$ plane.

EXAMPLE 14.25. Evaluate the triple integral of $f(x, y, z) = x^2z$ over the region $E$ bounded by the cylinder $x^2 + y^2 = 1$, the paraboloid $z = x^2 + y^2$, and the plane $z = 0$.

SOLUTION: The solid $E$ is axially symmetric because it is bounded from below by the plane $z = 0$, by the circular paraboloid from above, and the side boundary is the cylinder. Hence, $D_{xy}$ is a disk of unit radius, and $D'_{xy}$ is a rectangle, $(r, \theta) \in [0, 1] \times [0, 2\pi]$. The top and bottom boundaries are $z = z_{\text{top}}(r, \theta) = r^2$ and $z = z_{\text{bot}}(r, \theta) = 0$. Hence,
$$\iiint_E x^2z\,dV = \int_0^{2\pi} \int_0^1 \int_0^{r^2} r^2\cos^2\theta\; z\,r\,dz\,dr\,d\theta = \frac{1}{2} \int_0^{2\pi} \cos^2\theta\,d\theta \int_0^1 r^7\,dr = \frac{\pi}{16},$$
where the double-angle formula, $\cos^2\theta = (1 + \cos(2\theta))/2$, has been used to evaluate the integral.

FIGURE 14.26. Left: A partition element of the partition of $E$ by cylinders, half-planes, and horizontal planes (coordinate surfaces of cylindrical coordinates). The partition is the image of a rectangular partition of $E'$. Keeping only terms linear in the differentials $dr = \Delta r$, $d\theta = \Delta\theta$, $dz = \Delta z$, the volume of the partition element is $dV = dA\,dz = r\,dr\,d\theta\,dz = r\,dV'$, where $dA = r\,dr\,d\theta$ is the area element in the polar coordinates. So the Jacobian of cylindrical coordinates is $J = r$. Right: An illustration to Example 14.25.

104.3. Spherical Coordinates. Spherical coordinates are introduced by the following geometrical procedure. Let $(x, y, z)$ be a point in space. Consider a ray from the origin through this point. Any such ray lies in the half-plane corresponding to a fixed value of the polar angle $\theta$. Therefore, the ray is uniquely determined by the polar angle $\theta$ and the angle $\phi$ between the ray and the positive $z$ axis. If $\rho$ is the distance from the origin to the point $(x, y, z)$, then the ordered triple of numbers $(\rho, \phi, \theta)$ defines uniquely any point in space. The triples $(\rho, \phi, \theta)$ are called spherical coordinates in space.

FIGURE 14.27. Spherical coordinates and their relation to the rectangular coordinates. A point $P$ in space is defined by its distance to the origin $\rho$, the angle $\phi$ between the positive $z$ axis and the ray $OP$, and the polar angle $\theta$.

To find the transformation law from spherical to rectangular coordinates, consider the plane that contains the $z$ axis and the ray from the origin through $P = (x, y, z)$ and the rectangle with vertices $(0, 0, 0)$, $(0, 0, z)$, $P' = (x, y, 0)$, and $(x, y, z)$ in this plane (see Figure 14.27). The diagonal of this rectangle has length $\rho$ (the distance between $(0, 0, 0)$ and $(x, y, z)$). Therefore, its vertical side has length $z = \rho\cos\phi$ because the angle between this side and the diagonal is $\phi$. Its horizontal side has length $\rho\sin\phi$.
On the other hand, it is also the distance between $(0, 0, 0)$ and $(x, y, 0)$, that is, $r = \rho\sin\phi$, where $r = \sqrt{x^2 + y^2}$. Since $x = r\cos\theta$ and $y = r\sin\theta$, it is concluded that
$$x = \rho\sin\phi\cos\theta, \qquad y = \rho\sin\phi\sin\theta, \qquad z = \rho\cos\phi. \qquad (14.18)$$
The inverse transformation follows from the geometrical interpretation of the spherical coordinates:
$$\rho = \sqrt{x^2 + y^2 + z^2}, \qquad \cot\phi = \frac{z}{r} = \frac{z}{\sqrt{x^2 + y^2}}, \qquad \tan\theta = \frac{y}{x}. \qquad (14.19)$$
If $(x, y, z)$ span the entire space, the maximal range of the variable $\rho$ is the half-axis $\rho \in [0, \infty)$. The variable $\theta$ ranges over the interval $[0, 2\pi)$ as it coincides with the polar angle. To determine the range of the angle $\phi$, note that an angle between the positive $z$ axis and any ray from the origin must be in the interval $[0, \pi]$. If $\phi = 0$, the ray coincides with the positive $z$ axis. If $\phi = \pi$, the ray is the negative $z$ axis. Any ray with $\phi = \pi/2$ lies in the $xy$ plane.

Coordinate Surfaces of Spherical Coordinates. All points that have the same value of $\rho = \rho_0$ form a sphere of radius $\rho_0$ centered at the origin because they are at the same distance $\rho_0$ from the origin. Naturally, the coordinate surfaces of $\theta$ are the half-planes described earlier when discussing cylindrical coordinates. Consider a ray from the origin that has the angle $\phi = \phi_0$ with the positive $z$ axis. By rotating this ray about the $z$ axis, all rays with the fixed value of $\phi$ are obtained. Therefore, the coordinate surface $\phi = \phi_0$ is a circular cone whose axis is the $z$ axis. For small values of $\phi$, the cone is a narrow cone about the positive $z$ axis. The cone becomes wider as $\phi$ increases so that it coincides with the $xy$ plane when $\phi = \pi/2$. For $\phi > \pi/2$, the cone lies below the $xy$ plane, and it eventually collapses into the negative $z$ axis as soon as $\phi$ reaches the value $\pi$. The algebraic equations of the coordinate surfaces follow from (14.19):
$\rho = \rho_0 \iff x^2 + y^2 + z^2 = \rho_0^2$ (sphere),
$\phi = \phi_0 \iff z = \cot(\phi_0)\sqrt{x^2 + y^2}$ (cone),
$\theta = \theta_0 \iff y\cos\theta_0 = x\sin\theta_0$ (half-plane).
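The transformation (14.18) and its inverse (14.19) can be sketched in a few lines of code. This is a minimal illustration, not part of the original text: the function names `to_spherical` and `from_spherical` are hypothetical, and the angle $\phi$ is computed as `atan2(r, z)` with $r = \sqrt{x^2 + y^2} \ge 0$, which is one way to solve $\cot\phi = z/r$ with $\phi$ in $[0, \pi]$ without dividing by zero on the $z$ axis.

```python
import math


def from_spherical(rho, phi, theta):
    """Spherical (rho, phi, theta) -> rectangular (x, y, z), as in (14.18)."""
    x = rho * math.sin(phi) * math.cos(theta)
    y = rho * math.sin(phi) * math.sin(theta)
    z = rho * math.cos(phi)
    return x, y, z


def to_spherical(x, y, z):
    """Rectangular (x, y, z) -> spherical (rho, phi, theta), as in (14.19).

    phi is the angle between the ray and the positive z axis; atan2(r, z)
    with r >= 0 returns a value in [0, pi], so cot(phi) = z / r holds
    whenever r > 0. theta is the polar angle, shifted toward [0, 2*pi).
    """
    rho = math.sqrt(x * x + y * y + z * z)
    r = math.hypot(x, y)
    phi = math.atan2(r, z)
    theta = math.atan2(y, x) % (2.0 * math.pi)
    return rho, phi, theta


# Round trip for an arbitrary sample point (an illustrative choice).
point = (-1.0, 2.0, -3.0)
rho, phi, theta = to_spherical(*point)
print(rho, phi, theta)
print(from_spherical(rho, phi, theta))
```

At the origin and on the $z$ axis the angles are not all determined by the point; the values returned there by `math.atan2` are conventions of the library rather than properties of the coordinates.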
So any point in space can be viewed as the point of intersection of three coordinate surfaces: the sphere, cone, and half-plane. Under the transformation (14.19), any region $E$ is mapped onto a region $E'$ in the space spanned by the ordered triples $(\rho, \phi, \theta)$. If $E$ is bounded by spheres, cones, and half-planes only, then its image $E'$ is a rectangular box. Thus, a change of variables in a triple integral to spherical coordinates is advantageous when $E$ is bounded by spheres, cones, and half-planes.

EXAMPLE 14.26. Let $E$ be the portion of the solid bounded by the sphere $x^2 + y^2 + z^2 = 4$ and the cone $z^2 = 3(x^2 + y^2)$ that lies in the first octant. Find the region $E'$ that is mapped onto $E$ by the transformation $(\rho, \phi, \theta) \to (x, y, z)$.

SOLUTION: The region $E$ has four boundaries: the sphere, the cone $z = \sqrt{3(x^2 + y^2)}$, the $xz$ plane ($x \ge 0$), and the $yz$ plane ($y \ge 0$). These boundaries are the images of $\rho = 2$, $\cot\phi = \sqrt{3}$ or $\phi = \pi/6$, $\theta = 0$, and $\theta = \pi/2$, respectively. So $E'$ is the rectangular box $[0, 2] \times [0, \pi/6] \times [0, \pi/2]$. The region $E$ is intersected by all spheres with radii $0 < \rho \le 2$, all cones with angles $0 < \phi$ ...

FIGURE 14.28. Coordinate surfaces of spherical coordinates: spheres $\rho = \rho_0$, circular cones $\phi = \phi_0$, and half-planes $\theta = \theta_0$ bounded by the $z$ axis. In particular, $\phi = 0$ and $\phi = \pi$ describe the positive and negative $z$ axes, respectively, and the cone with the angle $\phi = \pi/2$ becomes the $xy$ plane.

FIGURE 14.29. Left: The base of a partition element in spherical coordinates is a portion of a sphere of radius $\rho$ cut out by two cones with the angles $\phi$ and $\phi + d\phi = \phi + \Delta\phi$ and by two half-planes with the angles $\theta$ and $\theta + d\theta = \theta + \Delta\theta$. Its area is $dA = (\rho\,d\phi) \cdot (r\,d\theta) = \rho^2\sin\phi\,d\phi\,d\theta$. Right: A partition element has the height $d\rho = \Delta\rho$ as it lies between two spheres whose radii differ by $d\rho$. So its volume is $dV = dA\,d\rho = \rho^2\sin\phi\,d\rho\,d\phi\,d\theta = J\,dV'$, and the Jacobian of spherical coordinates is $J = \rho^2\sin\phi$.
... mapped onto a region $E$ under the transformation (14.18). Consider a rectangular partition of $E'$ by equispaced planes $\rho = \rho_i$, $\phi = \phi_j$, and $\theta = \theta_k$ such that $\rho_{i+1} - \rho_i = \Delta\rho$, $\phi_{j+1} - \phi_j = \Delta\phi$, and $\theta_{k+1} - \theta_k = \Delta\theta$, where $\Delta\rho$, $\Delta\phi$, and $\Delta\theta$ are small numbers that can be regarded as differentials (or infinitesimal variations) of the spherical coordinates. Each partition element has volume $\Delta V' = \Delta\rho\,\Delta\phi\,\Delta\theta$. The rectangular partition of $E'$ induces a partition of $E$ by spheres, cones, and half-planes. Each partition element is bounded by two spheres whose radii differ by $\Delta\rho$, by two cones whose angles differ by $\Delta\phi$, and by two half-planes with the angle $\Delta\theta$ between them, as shown in Figure 14.29. The volume of any such partition element can be written as $\Delta V = J\,\Delta V'$ because only terms linear in the variations $\Delta\rho = d\rho$, $\Delta\phi = d\phi$, and $\Delta\theta = d\theta$ have to be retained. The value of $J$ depends on the partition element (e.g., partition elements closer to the origin should have smaller volumes by the geometry of the partition). The function $J$ is the Jacobian for spherical coordinates.

By means of (14.18), an integrable function $f(x, y, z)$ can be written in spherical coordinates. According to (14.13), in the three-variable limit $(\Delta\rho, \Delta\phi, \Delta\theta) \to (0, 0, 0)$, the Riemann sum for $f$ for the partition constructed converges to a triple integral of $fJ$ expressed in the variables $(\rho, \phi, \theta)$ over the region $E'$ and thereby defines the triple integral of $f$ over $E$ in spherical coordinates.

To find $J$, consider the image of the rectangular box $\rho \in [\rho_0, \rho_0 + \Delta\rho]$, $\phi \in [\phi_0, \phi_0 + \Delta\phi]$, $\theta \in [\theta_0, \theta_0 + \Delta\theta]$ under the transformation (14.18). Since it lies between two spheres of radii $\rho_0$ and $\rho_0 + \Delta\rho$, its volume can be written as $\Delta V = \Delta\rho\,\Delta A$, where $\Delta A$ is the area of the portion of the sphere of radius $\rho_0$ that lies between two cones and two half-planes. Any half-plane $\theta = \theta_0$ intersects the sphere $\rho = \rho_0$ along a half-circle of radius $\rho_0$.
The arc length of the portion of this circle that lies between the two cones $\phi = \phi_0$ and $\phi = \phi_0 + \Delta\phi$ is therefore $\Delta a = \rho_0\,\Delta\phi$. The cone $\phi = \phi_0$ intersects the sphere $\rho = \rho_0$ along a circle of radius $r_0 = \rho_0\sin\phi_0$ (see the text above (14.18)). Hence, the arc length of the portion of this circle of intersection that lies between the half-planes $\theta = \theta_0$ and $\theta = \theta_0 + \Delta\theta$ is $\Delta b = r_0\,\Delta\theta = \rho_0\sin\phi_0\,\Delta\theta$. The area $\Delta A$ can be approximated by the area of a rectangle with adjacent sides $\Delta a$ and $\Delta b$. Since only terms linear in $\Delta\phi$ and $\Delta\theta$ are to be retained, one can write $\Delta A = \Delta a\,\Delta b = \rho_0^2\sin\phi_0\,\Delta\phi\,\Delta\theta$. Thus, the volume transformation law reads
$$dV = J\,dV', \qquad J = \rho^2\sin\phi.$$
In a Euclidean space spanned by ordered triples $(\rho, \phi, \theta)$, the Jacobian vanishes on the "planes" $\rho = 0$, $\phi = 0$, and $\phi = \pi$. The transformation (14.18) is not one-to-one on them. All points $(0, \phi, \theta)$ are mapped to a single point $(0, 0, 0)$ by (14.18), and all points $(\rho, 0, \theta)$ and $(\rho, \pi, \theta)$ are mapped onto the $z$ axis, that is, the line $(0, 0, z)$, $-\infty < z < \infty$.

By the continuity of the Jacobian, the difference between the values of $J$ at any two sample points in a partition rectangle in $E'$ vanishes in the limit $(\Delta\rho, \Delta\phi, \Delta\theta) \to (0, 0, 0)$; that is, the value of the Jacobian in $\Delta V = J\,\Delta V'$ can be taken at any point within the partition element when evaluating the limit of a Riemann sum. Therefore, for any choice of sample points, the limit of the Riemann sum (14.13) for the constructed partition is
$$\iiint_E f(x, y, z)\,dV = \iiint_{E'} f(\rho\sin\phi\cos\theta, \rho\sin\phi\sin\theta, \rho\cos\phi)\,\rho^2\sin\phi\,dV'.$$
This relation defines the triple integral of $f$ over $E$ in spherical coordinates. The triple integral over $E'$ has to be evaluated by converting it to a suitable iterated integral.

EXAMPLE 14.27. Find the volume of the solid $E$ bounded by the sphere $x^2 + y^2 + z^2 = 2z$ and the cone $z = \sqrt{x^2 + y^2}$.
SOLUTION: By completing the squares, the equation $x^2 + y^2 + z^2 = 2z$ is written in the standard form $x^2 + y^2 + (z - 1)^2 = 1$, which describes a sphere of unit radius centered at $(0, 0, 1)$. So $E$ is bounded from the top by this sphere, while the bottom boundary of $E$ is the cone, and $E$ has no other boundaries. In spherical coordinates, the top boundary becomes $\rho^2 = 2\rho\cos\phi$ or $\rho = 2\cos\phi$. The bottom boundary is $\phi = \pi/4$. The solid is shown in Figure 14.30. The boundaries of $E$ impose no restriction on $\theta$, which can therefore be taken over its full range. Hence, the region $E'$ whose image is $E$ admits the following algebraic description:
$$E' = \{(\rho, \phi, \theta) \mid 0 \le \rho \le 2\cos\phi,\ (\phi, \theta) \in [0, \pi/4] \times [0, 2\pi]\}.$$
Since the range of $\rho$ depends on the other variables, the integration with respect to it must be carried out first when converting the triple integral over $E'$ into an iterated integral ($E'$ is $\rho$ simple, and the projection of $E'$ onto the $\phi\theta$ plane is the rectangle $[0, \pi/4] \times [0, 2\pi]$). The order in which the integration with respect to $\theta$ and $\phi$ is carried out is irrelevant because the angular variables range over a rectangle.

FIGURE 14.30. An illustration to Example 14.27. Any ray in space is defined as the intersection of a cone with an angle $\phi$ and a half-plane with an angle $\theta$. To find $E'$ whose image is the depicted solid $E$, note that any such ray intersects $E$ along a single straight line segment if $0 \le \phi \le \pi/4$, where the cone $\phi = \pi/4$ is a part of the boundary of $E$. Due to the axial symmetry of $E$, there is no restriction on the range of $\theta$, that is, $0 \le \theta \le 2\pi$ in $E'$. The range of $\rho$ is determined by the length of the segment of intersection of the ray at fixed $\phi$ and $\theta$ with $E$: $0 \le \rho \le 2\cos\phi$, where $\rho = 2\cos\phi$ is the equation of the top boundary of $E$ in spherical coordinates.
One has
\[ V(E) = \iiint_E dV = \iiint_{E'} \rho^2\sin\phi\,dV' = \int_0^{2\pi}\!\int_0^{\pi/4}\sin\phi\int_0^{2\cos\phi}\rho^2\,d\rho\,d\phi\,d\theta = \frac{8}{3}\int_0^{2\pi}d\theta\int_0^{\pi/4}\cos^3\phi\,\sin\phi\,d\phi = \frac{16\pi}{3}\int_{1/\sqrt{2}}^{1}u^3\,du = \pi, \]
where the change of variables $u = \cos\phi$ has been carried out in the last integral. □

104.5. Exercises. (1) Sketch the solid E onto which the specified region E' is mapped by the transformation (r, 0, z) -- (x, y, z): (i) 0 r 3, 7/4 w/4, 0 < z < 1 (ii) 0 r<1,0<0<2,r-1O0. (6) Evaluate the triple integral by converting it to spherical coordinates: (i) fffE(x2 + y2 + z2)3 dV, where E is the ball of radius a centered at the origin (ii) fffE y2 dV, where E is bounded by the yz plane and the hemispheres x = 1 - y2 - z2 andcx= z= 4-y2 - z2 (iii) fffE cyz dV, where E is enclosed by the cone z = v/53z2 + y2 and the spheres x2 +y2 +z2 =a2, a= 1, 2 (iv) fffE z dV, where E is the part of the ball x22+ y2 + z2 < 1 that lies below the cone z = 3x2 + 3y2 (v) fffE z dV, where Elies in the first octant between the planes y = 0 and = v/3y and is also bounded by the surfaces z= =vz2+y2 andcx2+ y2+ z2 4 (vi) fffE cv2 + y2 + z2 dV, where E is bounded by the sphere x22+ y2 + z2 = z (7) Sketch the region of integration, write the triple integral in spherical coordinates, and then evaluate it: (i) fof lx2+fl zdz dy dz (ii) f 10f1-x2f2-x2-y2 z2 dz dy dz 0 0 x2+y2 (8) Sketch the solid whose volume is given by the iterated integral in the spherical coordinates: f/2 f07402/ cos4 p2 s d # o'p sin O dp dOd Write the integral in the cylindrical coordinates and then compute it.
(9) Sketch the domain of integration, write the triple integral in cylin- drical coordinates, and then evaluate it: S f 1_ 2 fo2Y2z dz dy dc (10) Convert the triple integral fffE f(cv2 + y2 + z2) dV to iterated in- tegrals in cylindrical and spherical coordinates if E is bounded by the surfaces: (i) z = X2 + y2, y = X, z = 1, yJ = 0, z = 0 (ii) z2= c2+y2,c2+y2+z2=2z,cv=y/v/5,vz = ,where v ;> 0 and y ;> 0 (11) Use spherical coordinates to find the volume of a solid bounded by the surfaces: (i) c2+ y2+ z2 =a2,c2+y2+z2 =b2, z cv22+ y2  382 14. MULTIPLE INTEGRALS (ii) (x2 + y2 + z2)3 = a6z2/(x2 + y2) (iii) (x2 + y2 + z2)2 = a2(x2 + y2 - z2) (iv) (x2 + y2 + z2)3 = 3xyz (12) Find the volume of a solid bounded by the surfaces x2+ z2 = a2, x2 + z2 = b2, x2 + y2 = z2, where x> 0. 105. Change of Variables in Triple Integrals Consider a transformation of an open region E' in space into a re- gion E defined by x = x(u, v, w), y =_y(u, v, z), and z = z(u, v, w); that is, for every point (u, v, w) E E', these functions define an im- age point (x, y, z) E E. If no two points in E' have the same image point, the transformation is one-to-one, and there is a one-to-one cor- respondence between points of E and E'. The inverse transformation exists and is defined by the functions ut= u(x, y, z), o = v(x, y, z), and w = w(x, y, z). Suppose that these functions have continuous partial derivatives so that the gradient of these functions does not van- ish. Then, as shown in Section 103.1, the equations u(x, y, z) = uo, v(x, y, z) = vo, and w(x, y, z) = wo define smooth surfaces, called co- ordinate surfaces of the new variables. A point (zo, Yo, zo) = ro is the intersection point of three coordinate planes x = zo, y =_Yo, and z = zo. Alternatively, it can be viewed as the point of intersection of three co- ordinate surfaces, u(X',y, z) = uo, v(x, y, z) = vo, and w(x, y, z) = wo, where the point (uo, vo, wo) in E' is mapped to ro by the coordinate transformation. DEFINITION 14.15. 
(Jacobian of a Transformation). Suppose that a one-to-one transformation of an open set $E'$ onto $E$ has continuous first-order partial derivatives. The quantity
\[ \frac{\partial(x, y, z)}{\partial(u, v, w)} = \det\begin{pmatrix} x'_u & y'_u & z'_u \\ x'_v & y'_v & z'_v \\ x'_w & y'_w & z'_w \end{pmatrix} \]
is called the Jacobian of the transformation.

If the determinant is expanded over the first column, then it can also be written as the triple product:
\[ \frac{\partial(x, y, z)}{\partial(u, v, w)} = \nabla x \cdot (\nabla y \times \nabla z), \]
where the gradients are taken with respect to $(u, v, w)$. The technical details are left to the reader as an exercise.

DEFINITION 14.16. (Change of Variables). Let a transformation of an open set $E'$ onto $E$ have continuous partial derivatives. It is called a change of variables (or a change of coordinates) if its Jacobian does not vanish in $E'$.

The inverse function theorem (Theorem 14.10) holds for a transformation $(u, v, w) \to (x, y, z)$. If the Jacobian of the transformation does not vanish on $E'$, then the inverse transformation $(x, y, z) \to (u, v, w)$ exists and has continuous partial derivatives. As in the case of double integrals, a change of variables in space can be used to simplify the evaluation of triple integrals. For example, if there is a change of variables whose coordinate surfaces form a boundary of the integration region $E$, then the new integration region $E'$ is a rectangular box, and the limits in the corresponding iterated integral are greatly simplified in accordance with Fubini's theorem.

105.1. The Volume Transformation Law. It is convenient to introduce the following notations: $(u, v, w) = \mathbf{r}'$ and $(x, y, z) = \mathbf{r}$ so that the change of variables is written as
\[ \mathbf{r} = (x(\mathbf{r}'),\ y(\mathbf{r}'),\ z(\mathbf{r}')) \quad\text{or}\quad \mathbf{r}' = (u(\mathbf{r}),\ v(\mathbf{r}),\ w(\mathbf{r})). \tag{14.20} \]
Let $E'_0$ be a rectangular box in $E'$, $u \in [u_0, u_0 + \Delta u]$, $v \in [v_0, v_0 + \Delta v]$, and $w \in [w_0, w_0 + \Delta w]$. Under the transformation $(u, v, w) \to (x, y, z)$, its image $E_0$ is bounded by smooth surfaces if the transformation is a change of variables.
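As an illustration of Definition 14.15, the Jacobian of the spherical coordinates can be estimated numerically and compared with $\rho^2\sin\phi$. This is a minimal sketch under stated assumptions: the partial derivatives are approximated by central differences with a hypothetical step `h`, and the helper names `spherical_to_cartesian` and `numerical_jacobian` are introduced here for illustration only.

```python
import math

def spherical_to_cartesian(rho, phi, theta):
    """(rho, phi, theta) -> (x, y, z); phi is measured from the z axis."""
    return (rho * math.sin(phi) * math.cos(theta),
            rho * math.sin(phi) * math.sin(theta),
            rho * math.cos(phi))

def numerical_jacobian(transform, point, h=1e-6):
    """Determinant of the matrix whose rows are (x'_k, y'_k, z'_k),
    k = first, second, third new variable (central differences)."""
    rows = []
    for k in range(3):
        plus, minus = list(point), list(point)
        plus[k] += h
        minus[k] -= h
        f_plus, f_minus = transform(*plus), transform(*minus)
        rows.append([(f_plus[i] - f_minus[i]) / (2 * h) for i in range(3)])
    (a1, a2, a3), (b1, b2, b3), (c1, c2, c3) = rows
    # Expansion of the 3x3 determinant over the first row.
    return (a1 * (b2 * c3 - b3 * c2)
            - a2 * (b1 * c3 - b3 * c1)
            + a3 * (b1 * c2 - b2 * c1))

if __name__ == "__main__":
    rho, phi, theta = 2.0, 0.7, 1.3
    print("finite differences:", numerical_jacobian(spherical_to_cartesian, (rho, phi, theta)))
    print("rho^2 sin(phi)    :", rho ** 2 * math.sin(phi))
```

The same helper can be applied to any other smooth transformation of three variables; only the function passed as `transform` changes.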
If the values of $\Delta u$, $\Delta v$, and $\Delta w$ are infinitesimally small, that is, they can be viewed as differentials of the new variables, then the boundary surfaces of $E_0$ can be well approximated by tangent planes to them, and the volume of $E_0$ is then approximated by the volume of the polyhedron bounded by these planes. This implies, in particular, that when calculating the volume, only terms linear in $\Delta u$, $\Delta v$, and $\Delta w$ are to be retained, while their higher powers are neglected. Therefore, the volumes of $E_0$ and $E'_0$ must be proportional:
\[ \Delta V = J\,\Delta V', \qquad \Delta V' = \Delta u\,\Delta v\,\Delta w . \]
The objective is to calculate $J$. By the examples of cylindrical and spherical coordinates, $J$ is a function of the point $(u_0, v_0, w_0)$ at which the rectangular box $E'_0$ is taken. The derivation of $J$ is fully analogous to the two-variable case.

An infinitesimal rectangular box $E'_0$ and its image under the coordinate transformation are shown in Figure 14.31. Let $O'$, $A'$, $B'$, and $C'$ have the coordinates, respectively,
\[ \mathbf{r}'_0 = (u_0, v_0, w_0), \]
\[ \mathbf{r}'_a = (u_0 + \Delta u, v_0, w_0) = \mathbf{r}'_0 + \hat{\mathbf{e}}_1\,\Delta u, \]
\[ \mathbf{r}'_b = (u_0, v_0 + \Delta v, w_0) = \mathbf{r}'_0 + \hat{\mathbf{e}}_2\,\Delta v, \]
\[ \mathbf{r}'_c = (u_0, v_0, w_0 + \Delta w) = \mathbf{r}'_0 + \hat{\mathbf{e}}_3\,\Delta w, \]
where $\hat{\mathbf{e}}_{1,2,3}$ are unit vectors along the first, second, and third coordinate axes. In other words, the segments $O'A'$, $O'B'$, and $O'C'$ are the adjacent sides of the rectangular box $E'_0$. Let $O$, $A$, $B$, and $C$ be the images of $O'$, $A'$, $B'$, and $C'$ in the region $E$.

FIGURE 14.31. Left: A rectangular box in the region $E'$ with infinitesimal sides $du = \Delta u$, $dv = \Delta v$, $dw = \Delta w$ so that its volume is $\Delta V' = du\,dv\,dw$. Right: The image of the rectangular box under a change of variables. The position vectors $\mathbf{r}_p$, where $p = 0, a, b, c$, are images of the position vectors $\mathbf{r}'_p$. The volume $\Delta V$ of the image is approximated by the volume of the parallelepiped with adjacent sides $OA$, $OB$, and $OC$. It is computed by linearization of $\Delta V$ in $du$, $dv$, and $dw$ so that $\Delta V = J\,du\,dv\,dw = J\,\Delta V'$, where $J > 0$ is the absolute value of the Jacobian of the change of variables.
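Owing to the smoothness of the boundaries of $E_0$, the volume $\Delta V$ of $E_0$ can be approximated by the volume of the parallelepiped with adjacent sides $\mathbf{a} = \overrightarrow{OA}$, $\mathbf{b} = \overrightarrow{OB}$, and $\mathbf{c} = \overrightarrow{OC}$. Then
\[ \mathbf{a} = \big(x(\mathbf{r}'_a) - x(\mathbf{r}'_0),\ y(\mathbf{r}'_a) - y(\mathbf{r}'_0),\ z(\mathbf{r}'_a) - z(\mathbf{r}'_0)\big) = (x'_u, y'_u, z'_u)\,\Delta u, \]
\[ \mathbf{b} = \big(x(\mathbf{r}'_b) - x(\mathbf{r}'_0),\ y(\mathbf{r}'_b) - y(\mathbf{r}'_0),\ z(\mathbf{r}'_b) - z(\mathbf{r}'_0)\big) = (x'_v, y'_v, z'_v)\,\Delta v, \]
\[ \mathbf{c} = \big(x(\mathbf{r}'_c) - x(\mathbf{r}'_0),\ y(\mathbf{r}'_c) - y(\mathbf{r}'_0),\ z(\mathbf{r}'_c) - z(\mathbf{r}'_0)\big) = (x'_w, y'_w, z'_w)\,\Delta w, \]
where all the differences have been linearized, for instance, $x(\mathbf{r}'_a) - x(\mathbf{r}'_0) = x(\mathbf{r}'_0 + \hat{\mathbf{e}}_1\,\Delta u) - x(\mathbf{r}'_0) \approx x'_u(\mathbf{r}'_0)\,\Delta u$. Because of differentiability of the functions $x(\mathbf{r}')$, $y(\mathbf{r}')$, and $z(\mathbf{r}')$, the error of this approximation decreases to 0 faster than $\Delta u$, $\Delta v$, $\Delta w$ as the latter approach zero. This justifies the approach based on retaining only terms linear in $\Delta u$, $\Delta v$, $\Delta w$ when calculating the volume. The volume of the parallelepiped is given by the absolute value of the triple product:
\[ \Delta V = |\mathbf{a} \cdot (\mathbf{b} \times \mathbf{c})| = \left|\det\begin{pmatrix} x'_u & y'_u & z'_u \\ x'_v & y'_v & z'_v \\ x'_w & y'_w & z'_w \end{pmatrix}\right| \Delta u\,\Delta v\,\Delta w = J\,\Delta V', \tag{14.21} \]
where the derivatives are evaluated at $(u_0, v_0, w_0)$. The function $J$ in (14.21) is the absolute value of the Jacobian. The first-order partial derivatives are continuous for a change of variables and so are the Jacobian and its absolute value. If the Jacobian of the transformation does not vanish, then by the inverse function theorem (Theorem 14.10) there exists an inverse transformation, and, similarly to the two-dimensional case (compare with (14.11)), it can be proved that
\[ J = \left|\frac{\partial(x, y, z)}{\partial(u, v, w)}\right| = \left|\frac{\partial(u, v, w)}{\partial(x, y, z)}\right|^{-1} = \left|\det\begin{pmatrix} u'_x & u'_y & u'_z \\ v'_x & v'_y & v'_z \\ w'_x & w'_y & w'_z \end{pmatrix}\right|^{-1} = \big|\nabla u \cdot (\nabla v \times \nabla w)\big|^{-1}. \tag{14.22} \]
This expression defines $J$ as a function of the old variables $(x, y, z)$.

105.2. Triple Integral in Curvilinear Coordinates. Consider a partition of $E'$ by equispaced planes $u = u_i$, $v = v_j$, and $w = w_k$: $u_{i+1} - u_i = \Delta u$, $v_{j+1} - v_j = \Delta v$, and $w_{k+1} - w_k = \Delta w$. The indices $(i, j, k)$ enumerate planes that intersect $E'$.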
This rectangular partition of $E'$ corresponds to a partition of $E$ by the coordinate surfaces $u(\mathbf{r}) = u_i$, $v(\mathbf{r}) = v_j$, and $w(\mathbf{r}) = w_k$. If $E'_{ijk}$ is the rectangular box $u \in [u_i, u_{i+1}]$, $v \in [v_j, v_{j+1}]$, and $w \in [w_k, w_{k+1}]$, then its image, being the corresponding partition element of $E$, is denoted by $E_{ijk}$. A Riemann sum can be constructed for this partition of $E$ (assuming, as before, that $f$ is defined by zeros outside $E$). The triple integral of $f$ over $E$ is the limit (14.13), which is understood as the three-variable limit $(\Delta u, \Delta v, \Delta w) \to (0, 0, 0)$. The volume $\Delta V_{ijk}$ of $E_{ijk}$ is related to the volume of the rectangular box $E'_{ijk}$ by (14.21). By the continuity of $J$, its value in (14.21) can be taken at any sample point in $E'_{ijk}$. According to the definition of the triple integral, the limit of the Riemann sum is the triple integral of $fJ$ over the region $E'$. The above qualitative consideration suggests that the following theorem holds.

THEOREM 14.15. (Change of Variables in a Triple Integral). Let a transformation $E' \to E$ defined by functions $(u, v, w) \to (x, y, z)$ with continuous partial derivatives have a nonvanishing Jacobian, except perhaps on the boundary of $E'$. Suppose that $f$ is continuous on $E$ and $E$ is bounded by piecewise-smooth surfaces. Then
\[ \iiint_E f(\mathbf{r})\,dV = \iiint_{E'} f\big(x(\mathbf{r}'), y(\mathbf{r}'), z(\mathbf{r}')\big)\,J(\mathbf{r}')\,dV', \qquad J(\mathbf{r}') = \left|\frac{\partial(x, y, z)}{\partial(u, v, w)}\right| . \]

Evaluation of a triple integral in curvilinear coordinates follows the same steps as for a double integral in curvilinear coordinates.

EXAMPLE 14.28. (Volume of an Ellipsoid). Find the volume of a solid region $E$ bounded by an ellipsoid $x^2/a^2 + y^2/b^2 + z^2/c^2 = 1$.

SOLUTION: The integration domain can be simplified by a scaling transformation $x = au$, $y = bv$, and $z = cw$ under which the ellipsoid is mapped onto a sphere of unit radius $u^2 + v^2 + w^2 = 1$. The image $E'$ of $E$ is a ball of unit radius. The Jacobian of this transformation is
\[ J = \det\begin{pmatrix} a & 0 & 0 \\ 0 & b & 0 \\ 0 & 0 & c \end{pmatrix} = abc. \]
Therefore,
\[ V(E) = \iiint_E dV = \iiint_{E'} J\,dV' = abc \iiint_{E'} dV' = abc\,V(E') = \frac{4\pi}{3}\,abc. \]
When $a = b = c = R$, the ellipsoid becomes a ball of radius $R$, and a familiar expression for the volume is recovered: $V = (4\pi/3)R^3$. □

FIGURE 14.32. An illustration to Example 14.28. The ellipsoidal region $x^2/a^2 + y^2/b^2 + z^2/c^2 \le 1$ is mapped onto the ball $u^2 + v^2 + w^2 \le 1$ by the coordinate transformation $u = x/a$, $v = y/b$, $w = z/c$ with the Jacobian $J = abc$.

EXAMPLE 14.29. Let $\mathbf{a}$, $\mathbf{b}$, and $\mathbf{c}$ be non-coplanar vectors. Find the volume of a solid $E$ bounded by the surface $(\mathbf{a} \cdot \mathbf{r})^2 + (\mathbf{b} \cdot \mathbf{r})^2 + (\mathbf{c} \cdot \mathbf{r})^2 = R^2$, where $\mathbf{r} = (x, y, z)$.

SOLUTION: Define new variables by the transformation $u = \mathbf{a} \cdot \mathbf{r}$, $v = \mathbf{b} \cdot \mathbf{r}$, $w = \mathbf{c} \cdot \mathbf{r}$. The Jacobian of this transformation is obtained by (14.22):
\[ J = \left|\frac{\partial(x, y, z)}{\partial(u, v, w)}\right| = \left|\frac{\partial(u, v, w)}{\partial(x, y, z)}\right|^{-1} = \big|\nabla u \cdot (\nabla v \times \nabla w)\big|^{-1} = \big|\mathbf{a} \cdot (\mathbf{b} \times \mathbf{c})\big|^{-1}. \]
The vectors $\mathbf{a}$, $\mathbf{b}$, and $\mathbf{c}$ are non-coplanar, and hence their triple product is not 0. So the transformation is a genuine change of variables. Under this transformation, the boundary of $E$ becomes a sphere $u^2 + v^2 + w^2 = R^2$. So
\[ V(E) = \iiint_E dV = \iiint_{E'} J\,dV' = \frac{1}{|\mathbf{a} \cdot (\mathbf{b} \times \mathbf{c})|} \iiint_{E'} dV' = \frac{V(E')}{|\mathbf{a} \cdot (\mathbf{b} \times \mathbf{c})|} = \frac{4\pi R^3}{3\,|\mathbf{a} \cdot (\mathbf{b} \times \mathbf{c})|}, \]
where $V(E') = 4\pi R^3/3$ is the volume of a ball of radius $R$. □

105.3. Study Problems.

Problem 14.8. (Volume of a Tetrahedron). A tetrahedron is a solid with four vertices and four triangular faces. Let the vectors $\mathbf{a}$, $\mathbf{b}$, and $\mathbf{c}$ be three adjacent sides of the tetrahedron. Find its volume.

FIGURE 14.33. An illustration to Study Problem 14.8. A general tetrahedron is transformed to a tetrahedron whose faces lie in the coordinate planes by a change of variables.

SOLUTION: Consider first a tetrahedron whose adjacent sides are along the coordinate axes and have the same length $q$. From the geometry, it is clear that six such tetrahedrons form a cube of volume $q^3$. Therefore, the volume of each tetrahedron is $q^3/6$ (if so desired this can also be established by evaluating the corresponding triple integral; this is left to the reader).
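The idea is to make a change of variables such that a generic tetrahedron is mapped onto a tetrahedron whose adjacent faces lie in the three coordinate planes. The adjacent faces are portions of the planes through the origin. The face containing vectors $\mathbf{a}$ and $\mathbf{b}$ is perpendicular to the vector $\mathbf{n} = \mathbf{a} \times \mathbf{b}$, so the equation of this boundary is $\mathbf{n} \cdot \mathbf{r} = 0$. The other adjacent faces are similar:
\[ \mathbf{n} \cdot \mathbf{r} = 0 \quad\text{or}\quad n_1 x + n_2 y + n_3 z = 0, \qquad \mathbf{n} = \mathbf{a} \times \mathbf{b}, \]
\[ \mathbf{l} \cdot \mathbf{r} = 0 \quad\text{or}\quad l_1 x + l_2 y + l_3 z = 0, \qquad \mathbf{l} = \mathbf{c} \times \mathbf{a}, \]
\[ \mathbf{m} \cdot \mathbf{r} = 0 \quad\text{or}\quad m_1 x + m_2 y + m_3 z = 0, \qquad \mathbf{m} = \mathbf{b} \times \mathbf{c}, \]
where $\mathbf{r} = (x, y, z)$. So, by putting $u = \mathbf{m} \cdot \mathbf{r}$, $v = \mathbf{l} \cdot \mathbf{r}$, and $w = \mathbf{n} \cdot \mathbf{r}$, the images of these planes become the coordinate planes, $w = 0$, $v = 0$, and $u = 0$. A linear equation in the old variables becomes a linear equation in the new variables under a linear transformation. Therefore, an image of a plane is a plane. So the fourth boundary of $E'$ is a plane through the points $\mathbf{a}'$, $\mathbf{b}'$, and $\mathbf{c}'$, which are the images of $\mathbf{r} = \mathbf{a}$, $\mathbf{r} = \mathbf{b}$, and $\mathbf{r} = \mathbf{c}$, respectively. One has $\mathbf{a}' = (u(\mathbf{a}), v(\mathbf{a}), w(\mathbf{a})) = (q, 0, 0)$, where $q = \mathbf{a} \cdot \mathbf{m} = \mathbf{a} \cdot (\mathbf{b} \times \mathbf{c})$, because $\mathbf{a} \cdot \mathbf{n} = 0$ and $\mathbf{a} \cdot \mathbf{l} = 0$ by the geometrical properties of the cross product. Similarly, $\mathbf{b}' = (0, q, 0)$ and $\mathbf{c}' = (0, 0, q)$. Thus, the volume of the image region $E'$ is $V(E') = |q|^3/6$ (the absolute value is needed because the triple product can be negative).

To find the volume $V(E)$, the Jacobian of the transformation has to be found. It is convenient to use the representation (14.22):
\[ J = \left|\det\begin{pmatrix} m_1 & m_2 & m_3 \\ l_1 & l_2 & l_3 \\ n_1 & n_2 & n_3 \end{pmatrix}\right|^{-1}. \]
Therefore,
\[ V(E) = \iiint_E dV = \iiint_{E'} J\,dV' = J \iiint_{E'} dV' = J\,V(E') = \frac{|q|^3 J}{6}. \]
The volume $V(E)$ is independent of the orientation of the coordinate axes. It is convenient to direct the $x$ axis along the vector $\mathbf{a}$. The $y$ axis is directed so that $\mathbf{b}$ is in the $xy$ plane. With this choice, $\mathbf{a} = (a_1, 0, 0)$, $\mathbf{b} = (b_1, b_2, 0)$, and $\mathbf{c} = (c_1, c_2, c_3)$. A straightforward calculation shows that $q = a_1 b_2 c_3$ and $J = (a_1 b_2 c_3)^{-2}$. Hence, $V(E) = |a_1 b_2 c_3|/6$.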
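The steps of this solution can be traced on a concrete example. The sketch below is an illustration only: the vectors `a`, `b`, `c` are hypothetical sample data chosen in the special orientation used above ($\mathbf{a}$ along the $x$ axis, $\mathbf{b}$ in the $xy$ plane), and the helper names `dot`, `cross`, and `triple` are introduced here. It computes $q$, the determinant with rows $\mathbf{m}$, $\mathbf{l}$, $\mathbf{n}$ (whose inverse absolute value is $J$), and the resulting volume.

```python
def dot(p, q):
    return p[0] * q[0] + p[1] * q[1] + p[2] * q[2]

def cross(p, q):
    return (p[1] * q[2] - p[2] * q[1],
            p[2] * q[0] - p[0] * q[2],
            p[0] * q[1] - p[1] * q[0])

def triple(p, q, r):
    """Triple product p . (q x r), i.e. the determinant with rows p, q, r."""
    return dot(p, cross(q, r))

# Hypothetical sample sides: a = (a1, 0, 0), b = (b1, b2, 0), c = (c1, c2, c3).
a = (2.0, 0.0, 0.0)
b = (1.0, 3.0, 0.0)
c = (0.5, -1.0, 4.0)

q = triple(a, b, c)                      # equals a1 * b2 * c3 in this orientation
m, l, n = cross(b, c), cross(c, a), cross(a, b)
det_mln = triple(m, l, n)                # determinant in (14.22); J = 1 / |det_mln|
J = 1.0 / abs(det_mln)
volume = abs(q) ** 3 / 6.0 * J           # V(E) = J * V(E') with V(E') = |q|^3 / 6

if __name__ == "__main__":
    print("q             =", q)
    print("det(m, l, n)  =", det_mln, " q^2 =", q * q)
    print("V(E)          =", volume, " |a1 b2 c3| / 6 =", abs(a[0] * b[1] * c[2]) / 6.0)
```

For these sample vectors the determinant coincides with $q^2$, which is why the volume reduces to $|q|/6$.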
Finally, note that $|c_3| = h$ is the height of the tetrahedron, that is, the distance from a vertex $\mathbf{c}$ to the opposite face (to the $xy$ plane). The area of that face is $A = \|\mathbf{a} \times \mathbf{b}\|/2 = |a_1 b_2|/2$. Thus,
\[ V(E) = \frac{1}{3}\,hA; \]
that is, the volume of a tetrahedron is one-third the distance from a vertex to the opposite face, times the area of that face. □

105.4. Exercises.

(1) Find the Jacobian of the following transformations:
(i) $x = u/v$, $y = v/w$, $z = w/u$
(ii) $x = v + w^2$, $y = w + u^2$, $z = u + v^2$
(iii) $x = uv\cos w$, $y = uv\sin w$, $z = (u^2 - v^2)/2$ (these coordinates are called parabolic coordinates)
(iv) $x + y + z = u$, $y + z = uv$, $z = uvw$

(2) Find the region $E'$ whose image $E$ under the transformation defined in exercise 1, part (iv), is bounded by the coordinate planes and by the plane $x + y + z = 1$. In particular, investigate the image of those points in $E'$ at which the Jacobian of the transformation vanishes.

(3) Let $E$ be the solid region in the first octant defined by the inequality $\sqrt{x} + \sqrt{y} + \sqrt{z} \le a$, where $a > 0$. Find its volume using the triple integral in the new variables $u = \sqrt{x}$, $v = \sqrt{y}$, $w = \sqrt{z}$.

(4) Use a suitable change of variables in the triple integral to find the volume of a solid bounded by the surfaces:
(i) $(x/a)^{2/3} + (y/b)^{2/3} + (z/c)^{2/3} = 1$
(ii) $(x/a)^{1/3} + (y/b)^{1/3} + (z/c)^{1/3} = 1$, where $x \ge 0$, $y \ge 0$, $z \ge 0$
(iii) $x = 0$, $y = 0$, $z = 0$, and $(x/a)^n + (y/b)^m + (z/c)^k = 1$, where the numbers $n$, $m$, and $k$ are positive
(iv) $(x + y + z)^2 = ax + by$, where $(x, y, z)$ lie in the first octant and $a$ and $b$ are positive
(v) $(x + y)^2 + z^2 = R^2$, where $(x, y, z)$ lie in the first octant

(5) Evaluate the triple integral $\iiint_E z\,dV$, where $E$ lies above the cone $z = c\sqrt{x^2/a^2 + y^2/b^2}$ and is bounded from above by the ellipsoid $x^2/a^2 + y^2/b^2 + z^2/c^2 = 1$.

(6) Evaluate the triple integral $\iiint_E (4x^2 - 9y^2)\,dV$, where $E$ is enclosed by the paraboloid $z = x^2/9 + y^2/4$ and the plane $z = 10$.
(7) Consider a linear transformation of the coordinates x = a -r', y = b - r', z = c - r', where r' = (u, v, w) and the vectors a, b, and c have constant components. Show that this transformation is volume preserving if la - (b x c) 1. The transformation is said to be volume preserving if the image E of any E' has the same volume as E', that is, V(E') = V(E). (8) If a, b, and c are constant vectors, r = (x, y, z), and E is given by the inequalities 0 < a - r < a, 0 < b -r < 3, and 0 < c - r ,, show that fffE(a r)(b. r)(c" r) dV $a/3'7)2/ a - (b x c). (9) Consider parabolic coordinates x = uv cos w, y = uv sin w, and z = (22-v2). Show that 2z = (x2+y2)/v2-v2, 2z = -(X2+y2)2+u2 and tan w = y/x. Use these relations to sketch the coordinate surfaces u(x, y, z) = uo, v(x, y, z) = vo, and w(x, y, z) = wo. Evaluate the triple integral of f (x, y, z) = zyz over the region E that lies in the first octant beneath the paraboloid 2z - 1 = -(x2 + y2) and above the paraboloid 2z+1 = x2+ v2 by converting to parabolic coordinates. (10) Use a suitable change of variables to find the volume of a solid that is bounded by the surface x2 2 n 2n zx 2 2 n-2 a2+b2} + h a2+b2 ) 12n>1. (11) (Generalized Spherical Coordinates) Generalized spherical coordi- nates (p, #, 0) are defined by the equations x =ap sin" # cos"mO6, y =bp sin" # sin"mO6, z =cp cos" #, where 0 p < 00, 0 < 0 < 2wr 0 # wr, and a, b, c, n, and in are parameters. Find the Jacobian of the generalized spherical coordinates.  105. CHANGE OF VARIABLES IN TRIPLE INTEGRALS 391 (12) Use generalized spherical coordinates with a suitable choice of parameters to find the volume of a solid bounded by the surfaces: (i) [(x/a)2 + (y/b)2 + (z/c)2]2 =-(x/a)2 + (y/b)2 (ii) [(x/a)2 + (y/b)2 + (z/c)2]2 =-(x/a)2 + (y/b)2 - (z/c)2 (iii) (/a)2 + (y/b)2 + (z/c)4 = 1 (iv) [(x/a)2 + (y/b)2]2 + (z/c)4 = 1 (13) (Dirichlet's Integral) Let n, m, p, and s be positive integers. 
Use the transformation defined by x + y + z = u, y + z = uv, z = uvw to show that /// n!rm!p! s! ffxhY"zP(1 - x - y - z)sdV EI(mHE+p+ s + 3)! where E is the tetrahedron bounded by the coordinate planes and the plane x + y + z = 1. (14) (Orthogonal Curvilinear Coordinates) Curvilinear coordinates (u, v, w) are called orthogonal if the normals to their coordinate surfaces are mutually orthogonal at any point of their intersection. In other words, the gradients Vu(x, y, x), Vv(x, y, z), and Vw(x, y, z) are mutually or- thogonal. One can define unit vectors orthogonal to the coordinate surfaces: Vu Vv Vw (14.23) evV=we = ew = . Note that the Jacobian of a change of variables does not vanish and the relation (14.22) guarantees that these unit vectors are not coplanar and form a basis in space (any vector can be uniquely expanded into a linear combination of them). (i) Show that (14.24)| 1, ||VO||= - , ||Vz|| = 1, r 1 1 (14.25) ||Vp|| = 1, ||V|I|S= - |VO =psin for the cylindrical (r, 0, z) and spherical (p, #, 0) coordinates. (ii) Show that the spherical and cylindrical coordinates are orthog- onal coordinates and, in particular, (14.26) er = (cos,sin,0), eo=(-sinB, cos8,0), ez = (0,0,1) for the cylindrical coordinates, and er=(sin # cos 0, sin # sin 0, cos#) (14.27) eg=(cos #cos 0, cos #sinO, - sin#), e- (- sin0, cos,O0) for the spherical coordinates.  392 14. MULTIPLE INTEGRALS 106. Improper Multiple Integrals In the case of one-variable integration, improper integrals occur when the integrand is not defined at a boundary point of the integration interval or the integration interval is not bounded. For example, /1dx f dx. 1 - a-"- 1 (42)= lim - = lim =vyC 1, in -0 a a a-0 1 - v 1 - or /* 1 f4 1 dz = lim dz = lim t an- a = - . J2oo 1+2 a-+2o 2 Improper multiple integrals are quite common in many practical appli- cations. 106.1. Multiple Integrals of Unbounded Functions. 
Suppose a function $f(\mathbf{r})$ is not defined at a point $\mathbf{r}_0$ that is a limit point of the domain of $f$ (any neighborhood of $\mathbf{r}_0$ contains points of the domain of $f$). Here $\mathbf{r} = (x, y, z) \in E$ or $\mathbf{r} = (x, y) \in D$. For definiteness, the three-dimensional case is considered, while the two-dimensional case can be treated analogously. If, in any small ball $B_\varepsilon$ of radius $\varepsilon$ centered at $\mathbf{r}_0$, the values of $|f(\mathbf{r})|$ are not bounded, then the function $f$ is said to be singular at $\mathbf{r}_0$. If a closed bounded region $E$ contains singular points of a function $f$, then the upper and lower sums cannot be defined because, for partition rectangles containing a singular point, $\sup f$ or $\inf f$ or both do not exist, and hence a multiple integral of $f$ is not defined either.

Let $B_\varepsilon$ be an open ball of radius $\varepsilon$ centered at a point $\mathbf{r}_0$. Suppose that the function $f$ is singular at $\mathbf{r}_0$. Define the region $E_\varepsilon$ by removing all points of $E$ that also lie in the ball $B_\varepsilon$. Suppose that $f$ is integrable on $E_\varepsilon$ for any $\varepsilon > 0$ (e.g., it is continuous). Then, by analogy with the one-variable case, a multiple integral of $f$ over $E$ is defined as the limit
\[ \iiint_E f\,dV = \lim_{\varepsilon \to 0} \iiint_{E_\varepsilon} f\,dV \quad\text{or}\quad \iint_D f\,dA = \lim_{\varepsilon \to 0} \iint_{D_\varepsilon} f\,dA, \tag{14.29} \]
provided, of course, the limit exists. If $f$ is singular in a point set $S$, then one can construct a set $S_\varepsilon$ that is the union of balls of radius $\varepsilon$ centered at each point of $S$. Then $E_\varepsilon$ is obtained by removing $S_\varepsilon$ from $E$. The regularization procedure in two dimensions is illustrated in Figure 14.34.

FIGURE 14.34. A regularization of an improper integral. Left: $B_\varepsilon$ is a ball centered at a singular point of the integrand. $B_\varepsilon^D$ is the intersection of $B_\varepsilon$ with $D$. The integration is carried out over the region $D$ with $B_\varepsilon^D$ removed. Then the limit $\varepsilon \to 0$ is taken. Middle: The same regularization procedure when the singular point is an interior point of $D$. Right: A regularization procedure when singular points form a curve $S$. By removing the set $S_\varepsilon$ from $D$, the region $D_\varepsilon$ is obtained. The distance between any point of $D_\varepsilon$ and the set $S$ is no less than $\varepsilon$.

Although this definition seems a rather natural generalization of the one-variable case, there are subtleties that are specific to multivariable integrals. This is illustrated by the following example. Suppose that
\[ f(x, y) = \frac{y^2 - x^2}{(x^2 + y^2)^2} \tag{14.30} \]
is to be integrated over the sector $0 \le \theta \le \theta_0$ of a disk $x^2 + y^2 \le 1$, where $\theta$ is the polar angle. If the definition (14.29) is applied, then $D_\varepsilon$ is the portion of the ring $\varepsilon^2 \le x^2 + y^2 \le 1$ corresponding to $0 \le \theta \le \theta_0$. Then, by evaluating the integral in polar coordinates, one finds that
\[ \iint_{D_\varepsilon} \frac{y^2 - x^2}{(x^2 + y^2)^2}\,dA = -\int_0^{\theta_0} \cos(2\theta)\,d\theta \int_\varepsilon^1 \frac{dr}{r} = \frac{1}{2}\sin(2\theta_0)\ln\varepsilon . \]
The limit $\varepsilon \to 0$ does not exist for all $\theta_0$ such that $\sin(2\theta_0) \ne 0$, whereas the integral vanishes if $\theta_0 = k\pi/2$, $k = 1, 2, 3, 4$, for any $\varepsilon > 0$. Let $\theta_0 = \pi/2$. The integral vanishes because of symmetry, $(x, y) \to (y, x)$, $f(y, x) = -f(x, y)$, while the integration region is invariant under this transformation. The integrand is positive in the part of the domain where $x^2 < y^2$ and negative where $x^2 > y^2$, and there is a mutual cancellation of contributions from these regions. If the improper integral of the absolute value $|f(x, y)|$ is considered, then no such cancellation can occur, and the improper integral always diverges.

Furthermore, if $D$ in the above example is the sector $x^2 + y^2 \le 1$, $x \ge 0$, $y \ge 0$ or, in polar coordinates, $0 \le r \le 1$, $0 \le \theta \le \pi/2$, the improper integral could also be regularized by reducing the integration region to $\varepsilon \le r \le 1$, $0 \le \theta \le \theta_0$ with the subsequent two-variable limit $(\varepsilon, \theta_0) \to (0, \pi/2)$. Evidently, this limit does not exist. Recall that even though the two-variable limit does not exist, the limit along a particular curve may still exist. For example, if the limit $\theta_0 \to \pi/2$ is taken first, then the limit is 0, whereas the limit is infinite if the limit $\varepsilon \to 0$ is taken first. This observation suggests that the value of the improper integral may depend on the way a regularization is introduced.
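The dependence on the regularization can be seen numerically. The following is a minimal sketch, with hypothetical function names and grid sizes: it integrates the function (14.30) over the region $\varepsilon \le r \le 1$, $0 \le \theta \le \theta_0$ by a midpoint rule in the variables $s = \ln r$ and $\theta$ (the substitution $r = e^s$ is a choice made here so that the grid resolves small $r$), and compares the result with $\tfrac12 \sin(2\theta_0)\ln\varepsilon$.

```python
import math

def f(x, y):
    """The integrand (14.30)."""
    return (y * y - x * x) / (x * x + y * y) ** 2

def regularized_integral(eps, theta0, n_s=200, n_theta=200):
    """Midpoint estimate of the integral of f over eps <= r <= 1,
    0 <= theta <= theta0, using r = exp(s), so dA = r^2 ds dtheta."""
    s_min = math.log(eps)
    ds = -s_min / n_s
    dth = theta0 / n_theta
    total = 0.0
    for i in range(n_s):
        r = math.exp(s_min + (i + 0.5) * ds)
        for j in range(n_theta):
            th = (j + 0.5) * dth
            total += f(r * math.cos(th), r * math.sin(th)) * r * r * ds * dth
    return total

def closed_form(eps, theta0):
    return 0.5 * math.sin(2.0 * theta0) * math.log(eps)

if __name__ == "__main__":
    for theta0 in (math.pi / 4, math.pi / 2):
        for eps in (1e-1, 1e-3, 1e-6):
            print(theta0, eps, regularized_integral(eps, theta0), closed_form(eps, theta0))
```

For $\theta_0 = \pi/4$ the values grow in magnitude like $\ln\varepsilon$, while for $\theta_0 = \pi/2$ they stay at zero up to rounding, in line with the cancellation described above.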
Integrability of Unbounded Functions. Let $E$ be a region in space (possibly unbounded). An exhaustion of $E$ is a sequence of bounded simple regions $E_k$, $k = 1, 2, \ldots$, such that $E_1 \subset E_2 \subset \cdots \subset E$ and the union of all $E_k$ coincides with $E$. If the function $f$ defined on $E$ is singular at a limit point $\mathbf{r}_0$ of $E$, then one can construct an exhaustion of $E$ such that none of the $E_k$ contain $\mathbf{r}_0$. For example, one can take $E_k$ to be the regions obtained from $E$ by removing balls centered at $\mathbf{r}_0$ of radii $\varepsilon = 1/k$. It can be proved that the sequence of volumes $V(E_k)$ converges to the volume of $E$. Owing to the observation that the value of an improper integral may depend on the regularization, the following definition is adopted.

DEFINITION 14.17. (Integrability of an Unbounded Function). Let $E_k$ be an exhaustion of $E$. Suppose that a function $f$ on $E$ is integrable on each $E_k$. Then the function $f$ is integrable on $E$ if the limit $\lim_{k\to\infty} \iiint_{E_k} f\,dV$ exists and is independent of the choice of $E_k$. The value of the limit is called an improper integral of $f$ over $E$.

An improper double integral is defined in the same way. The condition that the limit should not depend on the choice of an exhaustion means that the improper integral should not depend on its regularization. According to this definition, the function (14.30) is not integrable on any region containing the origin because the limit depends on the way the regularization is imposed. Although Definition 14.17 eliminates a potential ambiguity of the relation (14.29) noted above, it is rather difficult to use. A simplification useful in practice is achieved with the help of the concept of absolute integrability.

THEOREM 14.16. Let $E_k$ and $E'_k$ be two exhaustions of $E$. Let $f$ be a function on $E$ such that $|f|$ is integrable on each $E_k$ and each $E'_k$. Then
\[ \lim_{k\to\infty} \iiint_{E_k} |f|\,dV = \lim_{k\to\infty} \iiint_{E'_k} |f|\,dV, \]
where the limit may be $+\infty$. In other words, the value of the improper integral $\iiint_E |f|\,dV < \infty$, if it exists, is independent of the regularization.
The same statement holds for double integrals.

DEFINITION 14.18. (Absolute Integrability). If the improper integral of the absolute value $|f|$ over $E$ exists, then $f$ is called absolutely integrable on $E$.

THEOREM 14.17. (Sufficient Condition for Integrability). Let $f$ be a continuous function on $E$. If $f$ is absolutely integrable on $E$, then it is integrable on $E$.

This theorem implies that if the limit (14.29) exists for the absolute value $|f|$, then the improper integral of a continuous function $f$ exists and can be calculated by the rule (14.29). The latter comprises a practical way to treat improper integrals.

EXAMPLE 14.30. Evaluate the triple integral of $f(x, y, z) = (x^2 + y^2 + z^2)^{-1}$ over a ball of radius $R$ centered at the origin if it exists.

SOLUTION: The function is singular only at the origin and continuous elsewhere. Let the restricted region $E_\varepsilon$ lie between two spheres: $\varepsilon^2 \le x^2 + y^2 + z^2 \le R^2$. Since $|f| = f > 0$ in $E$, the convergence of the integral over $E_\varepsilon$ as $\varepsilon \to 0$ also implies the absolute integrability of $f$ and hence the existence of the improper integral (Theorem 14.17). By making use of the spherical coordinates, one obtains
\[ \iiint_{E_\varepsilon} \frac{dV}{x^2 + y^2 + z^2} = \int_0^{2\pi}\!\int_0^{\pi}\!\int_\varepsilon^R \frac{\rho^2 \sin\phi}{\rho^2}\,d\rho\,d\phi\,d\theta = 4\pi(R - \varepsilon) \to 4\pi R \]
as $\varepsilon \to 0$. So the improper integral exists and equals $4\pi R$. □

The following theorem is useful to assess the integrability.

THEOREM 14.18. (Absolute Integrability Test). If $|f(\mathbf{r})| \le g(\mathbf{r})$ for all $\mathbf{r}$ in $E$ and $g(\mathbf{r})$ is integrable on $E$, then $f$ is absolutely integrable on $E$.

EXAMPLE 14.31. Investigate the integrability of $f(x, y) = x/(x^2 + y^2)^{\nu/2}$, $\nu > 0$, on a bounded region $D$. Find the integral, if it exists, over $D$ that is the part of the disk of unit radius in the first quadrant.

SOLUTION: The function is singular at the origin. Since $f$ is continuous everywhere except the origin, it is sufficient to investigate the integrability on a disk centered at the origin. Put $r = \sqrt{x^2 + y^2}$ (the polar radial coordinate).
Then Iz| < r and hence |f l r/r" = rl-" = g. In the polar coordinates, the improper integral (14.29) of g over a disk of unit radius is /fdBfg(r)r dr 2wf r2-" d =27re{v  396 14. MULTIPLE INTEGRALS The limit c - 0 is finite if v < 3. By the integrability test (Theorem 14.18), the function f is absolutely integrable if v < 3. For v < 3 and D being the part of the unit disk in the first quadrant, one infers that r/2 1 1 //f//rrcosO lim f dA =limr dr dO lim r2-'dr =1. 0 E The two examples studied exhibit a common feature of how the function should change with the distance from the point of singularity in order to be integrable. THEOREM 14.19. Let a function f be continuous on a bounded re- gion D of a Euclidean space and let f be singular at a limit point ro of D. Suppose that |ff(r) <; Mr - ro||-v for all r in D such that 0 < ||r - ro|| < R for some R > 0 and M > 0. Then f is absolutely integrable on D if v ab-0b(x2 _ 2 2 a->0O a b->0 b Oy x2 _+ 1 lim lim( - 2 dx a->O a b->0 1+x2 x2+b 1 dx 1 dx = lim = = - a-0 I 1+x2 01+x2 4 Here (a, b) = (Ax, Ay). Alternatively, the limit Ax -- 0 can be taken first and then Ay - 0, which results in the iterated integral in the reverse order:  398 14. MULTIPLE INTEGRALS 398 14 M LTP E 2_GR L lim lim d dy = -lim limx dy d b-0b a 0 a (X2+y2 Y2 - b-0 J ba0 a 2 =-m l im (1 -2a dy b-0 a b a-0 1 y22+a2 1 dy b-0o ib +y2 This shows that the limit of the Riemann sum as a function of two variables Ax and Ay does not exist because it depends on a path along which the limit point is approached (the function is not integrable). 106.3. Multiple Integrals Over Unbounded Regions. The treatment of multiple integrals over unbounded regions follows the same steps intro- duced when discussing the integrability of unbounded functions. DEFINITION 14.19. Let E be an unbounded region and let Ek be an exhaustion of E where each Ek is bounded. Suppose that f is integrable on each Ek. 
Then
\[ \iiint_E f\,dV = \lim_{k\to\infty} \iiint_{E_k} f\,dV \]
if the limit exists and is independent of the choice of $E_k$.

Double integrals over unbounded regions are defined in the same way. Theorems 14.16, 14.17, and 14.18 hold for unbounded regions. The following practical approach may be used to evaluate improper integrals over unbounded regions. Let $D_R$ be the intersection of $D$ with a disk of radius $R$ centered at the origin and let $E_R$ be the intersection of $E$ with a ball of radius $R$ centered at the origin. Let $f$ be a continuous function that is absolutely integrable on $D$ (or $E$). The integral of $f$ over $D$ (or $E$) is evaluated by the rule
\[ \iint_D f(\mathbf{r})\,dA = \lim_{R\to\infty} \iint_{D_R} f(\mathbf{r})\,dA \quad\text{or}\quad \iiint_E f(\mathbf{r})\,dV = \lim_{R\to\infty} \iiint_{E_R} f(\mathbf{r})\,dV . \]
The absolute integrability of $f$ means that these limits exist and are finite for the absolute value $|f|$. The asymptotic behavior of a function sufficient for absolute integrability on an unbounded region is stated in the following theorem, which is an analog of Theorem 14.19.

THEOREM 14.20. Suppose $f$ is a continuous function on an unbounded region $D$ of a Euclidean space such that $|f(\mathbf{r})| \le M\|\mathbf{r}\|^{-\nu}$ for all $\|\mathbf{r}\| \ge R$ in $D$ and some $R > 0$ and $M > 0$. Then $f$ is absolutely integrable on $D$ if $\nu > n$, where $n$ is the dimension of the space.

PROOF. Let $R > 0$. Consider the following one-dimensional improper integral:
\[ \int_R^\infty \frac{dx}{x^\nu} = \lim_{a\to\infty} \int_R^a \frac{dx}{x^\nu} = \lim_{a\to\infty} \frac{x^{1-\nu}}{1-\nu}\bigg|_R^a = -\frac{R^{1-\nu}}{1-\nu} + \lim_{a\to\infty} \frac{a^{1-\nu}}{1-\nu} \quad\text{if } \nu \ne 1. \]
The limit is finite if $\nu > 1$. When $\nu = 1$, the integral diverges as $\ln a$. Let $D'_R$ be the part of $D$ that lies outside the ball $B_R$ of radius $R$ and let $B'_R$ be the part of the space outside $B_R$ (see Figure 14.36, left panel). Note that $B'_R$ includes $D'_R$. In the two-variable case, the use of the polar coordinates gives
\[ \iint_{D'_R} |f|\,dA \le M \iint_{B'_R} \frac{dA}{r^\nu} = 2\pi M \int_R^\infty \frac{r\,dr}{r^\nu} = 2\pi M \int_R^\infty \frac{dr}{r^{\nu-1}}, \]
which converges if $\nu - 1 > 1$ or $\nu > 2$. The case of triple integrals is proved similarly by means of the spherical coordinates. The volume element is $dV = \rho^2 \sin\phi\,d\rho\,d\phi\,d\theta$.
The integration over the spherical angles yields the factor $4\pi$ as $0 \le \phi \le \pi$ and $0 \le \theta \le 2\pi$ for the region $B'_R$ so that

$$\iiint_{D'_R} |f|\, dV \le M \iiint_{B'_R} \rho^{-\nu}\, dV = 4\pi M \int_R^\infty \frac{\rho^2\, d\rho}{\rho^{\nu}} = 4\pi M \int_R^\infty \frac{d\rho}{\rho^{\nu-2}},$$

which converges if $\nu > 3$. □

FIGURE 14.36. Left: An unbounded region $D$ is split into two parts: $D_R$ lies inside the ball $B_R$ of radius $R$, and $D'_R$ is the part of $D$ that lies outside the ball $B_R$. The region $B'_R$ is the entire space with the ball $B_R$ removed. The region $D'_R$ is contained in $B'_R$. Right: A regularization procedure for the integral in Study Problem 14.9. The integration region $E$ contains singular points along the $z$ axis. The integral is regularized by removing the ball $\rho < \epsilon$ and the solid cone $\phi < \epsilon$ from $E$. After the evaluation of the integral, the limit $\epsilon \to 0$ is taken.

EXAMPLE 14.32. Evaluate the double integral of $f(x, y) = \exp(-x^2 - y^2)$ over the entire plane.

SOLUTION: In polar coordinates, $f = e^{-r^2}$. So, as $r \to \infty$, $f$ decreases faster than any inverse power $r^{-n}$, $n > 0$, and by virtue of Theorem 14.20, $f$ is absolutely integrable on the plane. By making use of the polar coordinates,

$$\iint_D e^{-x^2-y^2}\, dA = \lim_{R\to\infty} \int_0^{2\pi}\!\!\int_0^R e^{-r^2}\, r\, dr\, d\theta = \pi \lim_{R\to\infty} \int_0^{R^2} e^{-u}\, du = \pi \lim_{R\to\infty} \left(1 - e^{-R^2}\right) = \pi,$$

where the substitution $u = r^2$ has been made. □

It is interesting to observe the following. As the function is absolutely integrable, the double integral can also be evaluated by Fubini's theorem in rectangular coordinates:

$$\iint_D e^{-x^2-y^2}\, dA = \int_{-\infty}^{\infty} e^{-x^2}\, dx \int_{-\infty}^{\infty} e^{-y^2}\, dy = I^2 \quad \Longrightarrow \quad I = \int_{-\infty}^{\infty} e^{-x^2}\, dx = \sqrt{\pi}$$

because $I^2 = \pi$ by the value of the double integral. A direct evaluation of $I$ by means of the fundamental theorem of calculus is problematic as an antiderivative of $e^{-x^2}$ cannot be expressed in elementary functions.

106.4. Study Problem.

Problem 14.9. Evaluate the triple integral of $f(x, y, z) = (x^2 + y^2)^{-1/2}(x^2 + y^2 + z^2)^{-1/2}$ over $E$, which is bounded by the cone $z = \sqrt{x^2 + y^2}$ and the sphere $x^2 + y^2 + z^2 = 1$, if it exists.
SOLUTION: The function is singular at all points on the $z$ axis. Consider $E_\epsilon$ obtained from $E$ by eliminating from the latter a solid cone $\phi < \epsilon$ and a ball $\rho < \epsilon$, where $\rho$ and $\phi$ are spherical coordinates. To investigate the integrability, consider $|f|\, dV = f\, dV$ in the spherical coordinates:

$$f\, dV = (\rho^2 \sin\phi)^{-1} \rho^2 \sin\phi\, d\rho\, d\phi\, d\theta = d\rho\, d\phi\, d\theta,$$

which is regular. So the function $f$ is integrable as the image $E'$ of $E$ in the spherical coordinates is a rectangle (i.e., it is bounded). Hence,

$$\iiint_E f\, dV = \lim_{\epsilon\to 0} \iiint_{E_\epsilon} f\, dV = \lim_{\epsilon\to 0} \iiint_{E'_\epsilon} d\rho\, d\phi\, d\theta = \int_0^{2\pi} d\theta \int_0^{\pi/4} d\phi \int_0^1 d\rho = \frac{\pi^2}{2}.$$

So the Jacobian cancels out all the singularities of the function. □

106.5. Exercises.
(1) Let the function g(x, y) be bounded so that 0 < m < g(x, y) < M for all (x, y). Investigate the convergence of the following double integrals:
(i) ffD g(x, y)(Xz22)-1 dA, where D is defined by the conditions |y x2, x2 + y2 < 1
(ii) ffD g(x, y)(I -+ + q)-1 dA, p > 0, q > 0, where D is defined by the condition |zl + ly| < 1
(iii) ff g( , y)(1 -x - y2)-pdA, where D is defined by the condition x2 + y2 < 1
(iv) ffD g(x, y)|cc - yl- dA, where D is the square [0, a] x [0, a]
(v) ffD e-(x+y) dA, where D is defined by 0 < x y
(2) Let the function g(x, y, z) be bounded so that 0 -' dV, where £ is defined by xc2 + y2 + z2>
(ii) fiff gc, y z)(c2 + y2 -+ z2>-' dV, where £ is defined by xc2 + y2+z
(iii) fff g(§,y,z)(§P +lyl -+zs)-dV, where p, q, and s are positive numbers and E is defined by XI|+ ly|+ Iz| > 1
(iv) fffE g(x, y, z)|x + y - z|-dV, where E [-1,11 x [-1, 1] x [-1, 1]
(3) Evaluate the improper integral if it exists. Use appropriate coordinates when needed.
(i) fffE(x2+y2+z2)1/2(x2+y2)1/2 dV, where E is the region in the first octant bounded from above by the sphere x2+y2+z2 2z and from below by the cone z = v/3/z2 + y2 (ii) fffE z(x2+y2)-1/2dV, where E is in the first octant and bounded from above by the cone z = 2 - x/2 + y2 and from below by the paraboloid z = x2 + y2 (iii) fffE xy(x2 + y2-1 (X2 + y2 + z2)-1 dV, where E is the portion of the ball x2 + y2 + z2 < a2 above the plane z = 0 (iv) fffE ex2y2-22 (2 +y2+z2)-1/2 dV, where E is the entire space (v) ffD(x2 + y2)1/2 dA, where D lies between the two circles x2+ y2 = 4 and (x - 1)2 + y2 = 1in the first quadrant, x, y > 0 (vi) ffD ln( 2 + y2) dA, where D is the disk x2 + y2 < a2 (vii) fffE(x2 + y2 + z2) ln(c2 + y2 + z2) dV, where E is the ball Xc2+ y2 + z2 < a2 and v is real. Does the integral exist for all v? (viii) ffD(x2 + y2) ln(c2 + y2) dA, where D is defined by x2+ y2 > a2 > 0 and v is real. Does the integral exist for all v? (ix) fffE(x2 + y2 + z2< ln(c2 + y2 + z2) dV, where E is defined by x2+ y2 + z2 > a2 > 0 and v is real. Does the integral exist for all v? (x) ffD[(a - x)(xc - y)]1/2dA, where D is the triangle bounded by the lines y = 0, y = x, and xc= a (xi) ffD ln sin(x - y) dA, where D is bounded by the lines y = 0, y =c, and c =7 (xii) ffD(x2 + y2)-1 dA, where D is defined by x2 + y2 < x (xiii) fffE z-Py-z-S dV, where E= [0, 1] x [0, 1] x [0, 1] (xiv) fffE(x2+22_z2)z-3 dV, where E is defined by x22+y2+z2 ; 1 (xv) fffE(1-z2-y2-z2)-" dV, where E is defined by x2+y2+z2 G 1 (xvi) fff, ex22_2-2z2 dV, where E is the entire space (xvii) liD e- -2 sin(cc2 + y2) dA, where D is the entire plane (xviii) liD e-(x/a)2_Iy/b)2 dA, where D is the entire plane (xix) liD eax2+2bxy~cy2 dA, where a < 0, ac - 62> 0, and D is the entire plane (Hint: Find a rotation that transforms cc and y so  107. LINE INTEGRALS 403 107. LINE INTEGRALS 403 that in the new variables the bilinear term "xy" is absent in the exponential.) 
(xx) fffE e(x/a)2-(y/b)2-(z/c)2+ax+y+7z dV, where E is the entire space
(4) Let n be an integer. Show that lim sin(x2 + y2) dA= r, D: Ix 1, y > 1, diverges, whereas the iterated integrals in both orders converge.
(6) Show that the following improper integrals converge. Use the geometric series to show that their values are given by the specified convergent series:
(i) $\lim_{a\to 1^-} \iint_{D_a} (1 - xy)^{-1}\, dA = \sum_{n=1}^{\infty} \frac{1}{n^2}$, where $D_a = [0, a] \times [0, a]$
(ii) $\lim_{a\to 1^-} \iiint_{E_a} (1 - xyz)^{-1}\, dV = \sum_{n=1}^{\infty} \frac{1}{n^3}$, where $E_a = [0, a] \times [0, a] \times [0, a]$

107. Line Integrals

Consider a wire made of a nonhomogeneous material. The inhomogeneity means that if one takes a small piece of the wire of length $\Delta s$ at a point $\mathbf{r}$, then its mass $\Delta m$ depends on the point $\mathbf{r}$. It can therefore be characterized by a linear mass density (the mass per unit length at a point $\mathbf{r}$):

$$\sigma(\mathbf{r}) = \lim_{\Delta s\to 0} \frac{\Delta m(\mathbf{r})}{\Delta s}.$$

Suppose that the linear mass density is known as a function of $\mathbf{r}$. What is the total mass of the wire that occupies a space curve $C$? If the curve $C$ has a length $L$, then it can be partitioned into $N$ small segments of length $\Delta s = L/N$. If $\mathbf{r}_p^*$ is a sample point in the $p$th segment, then the total mass reads

$$M = \lim_{N\to\infty} \sum_{p=1}^{N} \sigma(\mathbf{r}_p^*)\, \Delta s,$$

where the mass of the $p$th segment is approximated by $\Delta m_p \approx \sigma(\mathbf{r}_p^*)\, \Delta s$ and the limit is required because this approximation becomes exact only in the limit $\Delta s \to 0$. The expression for $M$ resembles the limit of a Riemann sum and leads to the concept of a line integral of $\sigma$ along a curve $C$.

107.1. Line Integral of a Function. Let $f$ be a bounded function in $E$ and let $C$ be a smooth (or piecewise-smooth) curve in $E$. Suppose $C$ has a finite arc length. Consider a partition of $C$ by its $N$ pieces $C_p$ of length $\Delta s_p$, $p = 1, 2, ..., N$, which is the arc length of $C_p$ (it exists for a smooth curve!). Put $m_p = \inf_{C_p} f$ and $M_p = \sup_{C_p} f$; that is, $m_p$ is the largest lower bound of values of $f$ for all $\mathbf{r} \in C_p$, and $M_p$ is the smallest upper bound on the values of $f$ for all $\mathbf{r} \in C_p$.
The upper and lower sums are defined by $U(f, N) = \sum_{p=1}^{N} M_p\, \Delta s_p$ and $L(f, N) = \sum_{p=1}^{N} m_p\, \Delta s_p$.

DEFINITION 14.20. (Line Integral of a Function). The line integral of a function $f$ along a piecewise-smooth curve $C$ is

$$\int_C f(\mathbf{r})\, ds = \lim_{N\to\infty} U(f, N) = \lim_{N\to\infty} L(f, N),$$

provided the limits of the upper and lower sums exist and coincide. The limit is understood in the sense that $\max_p \Delta s_p \to 0$ as $N \to \infty$ (the partition element of the maximal length becomes smaller as $N$ increases).

The line integral can also be represented by the limit of a Riemann sum:

$$\int_C f(\mathbf{r})\, ds = \lim_{N\to\infty} \sum_{p=1}^{N} f(\mathbf{r}_p^*)\, \Delta s_p = \lim_{N\to\infty} R(f, N).$$

If the line integral exists, it follows from the inequality $m_p \le f(\mathbf{r}) \le M_p$ for all $\mathbf{r} \in C_p$ that $L(f, N) \le R(f, N) \le U(f, N)$, and by the squeeze principle the limit of the Riemann sum is independent of the choice of sample points $\mathbf{r}_p^*$ (see the left panel of Figure 14.37).

It is also interesting to establish a relation of the line integral with a triple (or double) integral. Suppose that $f$ is integrable on a region that contains a smooth curve $C$. Let $E_a$ be a neighborhood of $C$ that is defined as the set of points whose distance (in the sense of Definition 11.14) to $C$ cannot exceed $a > 0$. So, for $a$ small enough, the cross section of $E_a$ by a plane normal to $C$ is a disk of radius $a$ whose area is $\Delta A = \pi a^2$ (see the right panel of Figure 14.37). Then, in the limit $a \to 0$,

$$\frac{1}{\pi a^2} \iiint_{E_a} f(\mathbf{r})\, dV \to \int_C f(\mathbf{r})\, ds. \tag{14.31}$$

FIGURE 14.37. Left: A partition of a smooth curve $C$ by segments of arc length $\Delta s_p$ used in the definition of the line integral and its Riemann sum. Right: The region $E_a$ is a neighborhood of a smooth curve $C$. It consists of points whose distance to $C$ cannot exceed $a > 0$ (recall Definition 11.14). For $a$ and $\Delta s_p$ small enough, planes normal to $C$ through the points $\mathbf{r}_p$ partition $E_a$ into elements whose volume is $\Delta V = \Delta A\, \Delta s_p$, where $\Delta A = \pi a^2$ is the area of the cross section of $E_a$.
This partition is used to establish the relation (14.31) between the triple and line integrals.

In other words, line integrals can be viewed as the limiting case of triple integrals when two dimensions of the integration region become infinitesimally small. This follows from (14.13) by taking a partition of $E_a$ by volume elements $\Delta V = \Delta A\, \Delta s_p$ and sample points along the curve $C$ in $E_a$. In the Riemann sum for the left side of (14.31), the factor $\Delta A = \pi a^2$ in $\Delta V$ cancels the same factor in the denominator so that the Riemann sum becomes a Riemann sum for the line integral on the right side of (14.31). In particular, it can be concluded that the line integral exists for any $f$ that is continuous or has only a finite number of bounded jump discontinuities along $C$. Also, the line integral inherits all the properties of multiple integrals.

The evaluation of a line integral is based on the following theorem.

THEOREM 14.21. (Evaluation of a Line Integral). Suppose that $f$ is continuous in a region that contains a smooth curve $C$. Let a vector function $\mathbf{r}(t)$, $t \in [a, b]$, trace out the curve $C$ just once. Then

$$\int_C f(\mathbf{r})\, ds = \int_a^b f(\mathbf{r}(t))\, \|\mathbf{r}'(t)\|\, dt. \tag{14.32}$$

PROOF. Consider a partition of $[a, b]$, $t_p = a + p\, \Delta t$, $p = 0, 1, 2, ..., N$, where $\Delta t = (b - a)/N$. It induces a partition of $C$ by pieces $C_p$ so that $\mathbf{r}(t)$ traces out $C_p$ when $t \in [t_{p-1}, t_p]$, $p = 1, 2, ..., N$. The arc length of $C_p$ is $\int_{t_{p-1}}^{t_p} \|\mathbf{r}'(t)\|\, dt = \Delta s_p$. Since $C$ is smooth, the tangent vector $\mathbf{r}'(t)$ is a continuous function and so is its length $\|\mathbf{r}'(t)\|$. By the integral mean value theorem, there is $t_p^* \in [t_{p-1}, t_p]$ such that $\Delta s_p = \|\mathbf{r}'(t_p^*)\|\, \Delta t$. Since $f$ is integrable along $C$, the limit of its Riemann sum is independent of the choice of sample points and a partition of $C$. Choose the sample points to be $\mathbf{r}_p^* = \mathbf{r}(t_p^*)$. Therefore,

$$\int_C f\, ds = \lim_{N\to\infty} \sum_{p=1}^{N} f(\mathbf{r}(t_p^*))\, \|\mathbf{r}'(t_p^*)\|\, \Delta t = \int_a^b f(\mathbf{r}(t))\, \|\mathbf{r}'(t)\|\, dt.$$
Note that the Riemann sum for the line integral becomes a Riemann sum of the function $F(t) = f(\mathbf{r}(t))\, \|\mathbf{r}'(t)\|$ over an interval $t \in [a, b]$. Its limit exists by the continuity of $F$ and equals the integral of $F$ over $[a, b]$. □

The conclusion of the theorem still holds if $f$ has a finite number of bounded jump discontinuities and $C$ is piecewise smooth. The latter implies that the tangent vector may only have a finite number of discontinuities and so does $\|\mathbf{r}'(t)\|$. Therefore, $F(t)$ has only a finite number of bounded jump discontinuities and hence is integrable.

107.2. Evaluation of a Line Integral.
Step 1. Find the parametric equation of a curve $C$, $\mathbf{r}(t) = (x(t), y(t), z(t))$.
Step 2. Restrict the range of the parameter $t$ to an interval $[a, b]$ so that $\mathbf{r}(t)$ traces out $C$ only once when $t \in [a, b]$.
Step 3. Calculate the derivative $\mathbf{r}'(t)$ and its norm $\|\mathbf{r}'(t)\|$.
Step 4. Substitute $x = x(t)$, $y = y(t)$, and $z = z(t)$ into $f(x, y, z)$ and evaluate the integral (14.32).

Remark. A curve $C$ may be traced out by different vector functions. The value of the line integral is independent of the choice of parametric equations because its definition is given only in parameterization-invariant terms (the arc length and values of the function on the curve). The integrals (14.32) written for two different parameterizations of $C$ are related by a change of the integration variable (recall the concept of reparameterization of a spatial curve).

EXAMPLE 14.33. Evaluate the line integral of $f(x, y) = x^2 y$ over a circle of radius $R$ centered at the point $(0, a)$.

SOLUTION: The equation of a circle of radius $R$ centered at the origin is $x^2 + y^2 = R^2$. It has familiar parametric equations $x = R\cos t$ and $y = R\sin t$, where $t$ is the angle between $\mathbf{r}(t)$ and the positive $x$ axis counted counterclockwise. The equation of the circle in question is $x^2 + (y - a)^2 = R^2$. So, by analogy, one can put $x = R\cos t$ and $y - a = R\sin t$ (by shifting the origin to the point $(0, a)$).
The parametric equation of the circle can be taken in the form $\mathbf{r}(t) = (R\cos t,\ a + R\sin t)$. The range of $t$ must be restricted to the interval $t \in [0, 2\pi]$ so that $\mathbf{r}(t)$ traces the circle only once. Then $\mathbf{r}'(t) = (-R\sin t,\ R\cos t)$ and $\|\mathbf{r}'(t)\| = \sqrt{R^2\sin^2 t + R^2\cos^2 t} = R$. Therefore,

$$\int_C x^2 y\, ds = \int_0^{2\pi} (R\cos t)^2 (a + R\sin t)\, R\, dt = R^3 a \int_0^{2\pi} \cos^2 t\, dt = \pi R^3 a,$$

where the integral of $\cos^2 t \sin t$ over $[0, 2\pi]$ vanishes by the periodicity of the cosine function. The last integral is evaluated with the help of the double-angle formula $\cos^2 t = (1 + \cos(2t))/2$. □

EXAMPLE 14.34. Evaluate the line integral of $f(x, y, z) = \sqrt{3x^2 + 3y^2 - z^2}$ over the curve of intersection of the cylinder $x^2 + y^2 = 1$ and the plane $x + y + z = 0$.

SOLUTION: Since the curve lies on the cylinder, one can always put $x = \cos t$, $y = \sin t$, and $z = z(t)$, where $z(t)$ is to be found from the condition that the curve also lies in the plane: $x(t) + y(t) + z(t) = 0$ or $z(t) = -\cos t - \sin t$. The interval of $t$ is $[0, 2\pi]$ as the curve winds about the cylinder. Therefore, $\mathbf{r}'(t) = (-\sin t,\ \cos t,\ \sin t - \cos t)$ and $\|\mathbf{r}'(t)\| = \sqrt{2 - 2\sin t\cos t} = \sqrt{2 - \sin(2t)}$. The values of the function along the curve are $f = \sqrt{3 - (\cos t + \sin t)^2} = \sqrt{2 - \sin(2t)}$. Note that the function is defined only in the region $3(x^2 + y^2) \ge z^2$ (outside the double cone). It happens that the curve $C$ lies in the domain of $f$ because its values along $C$ are well defined as $2 > \sin(2t)$ for any $t$. Hence,

$$\int_C f\, ds = \int_0^{2\pi} \sqrt{2 - \sin(2t)}\, \sqrt{2 - \sin(2t)}\, dt = \int_0^{2\pi} (2 - \sin(2t))\, dt = 4\pi. \qquad \Box$$

107.3. Exercises.
(1) Evaluate the line integral:
(i) $\int_C xy^2\, ds$, where $C$ is the right half of the circle $x^2 + y^2 = 4$
(ii) $\int_C x\sin y\, ds$, where $C$ is the line segment from $(0, a)$ to $(b, 0)$
(iii) $\int_C xyz\, ds$, where $C$ is the helix $x = 2\cos t$, $y = t$, $z = -2\sin t$, 0 …

… $\Delta A$. Since the derivatives of $f$ are continuous, the function $J(x, y)$ is continuous on $D$, and the Riemann sum converges to the double integral of $J$ over $D$.

FIGURE 14.38.
Left: The rectangle with adjacent sides $O'A'$ and $O'B'$ is an element of a rectangular partition of $D$ and $P'$ is a sample point. The point $P$ is the point on the graph $z = f(x, y)$ for $(x, y) = P'$. The linearization of $f$ at $P$ defines the tangent plane $z = L(x, y)$ to the graph through $P$. The surface area of the portion of the graph above the partition rectangle is approximated by the area of the portion of the tangent plane above the partition rectangle, which is the area of the parallelogram with adjacent sides $OA$ and $OB$. It equals $\|\mathbf{a} \times \mathbf{b}\|$. Right: An illustration to Example 14.36. The part of the paraboloid whose area is to be evaluated is obtained by restricting $(x, y)$ to the part $D$ of the disk of radius 2 that lies in the first quadrant.

DEFINITION 14.21. (Surface Area). Suppose that $f(x, y)$ has continuous first-order partial derivatives on $D$. Then the surface area of the graph $z = f(x, y)$ is given by

$$A(S) = \iint_D \sqrt{1 + (f'_x)^2 + (f'_y)^2}\, dA.$$

If $z = \text{const}$, then $f'_x = f'_y = 0$ and $A(S) = A(D)$ as required because $S$ is $D$ moved parallel into the plane $z = \text{const}$.

EXAMPLE 14.35. Show that the surface area of a sphere of radius $R$ is $4\pi R^2$.

SOLUTION: The hemisphere is the graph $z = f(x, y) = \sqrt{R^2 - x^2 - y^2}$ on the disk $x^2 + y^2 \le R^2$ of radius $R$. The area of the sphere is twice the area of this graph. One has $f'_x = -x/f$ and $f'_y = -y/f$. Therefore, $J = (1 + x^2/f^2 + y^2/f^2)^{1/2} = (f^2 + x^2 + y^2)^{1/2}/f = R/f$. Hence,

$$A(S) = 2R \iint_D \frac{dA}{\sqrt{R^2 - x^2 - y^2}} = 2R \int_0^{2\pi} d\theta \int_0^R \frac{r\, dr}{\sqrt{R^2 - r^2}} = 4\pi R \int_0^R \frac{r\, dr}{\sqrt{R^2 - r^2}} = 2\pi R \int_0^{R^2} \frac{du}{\sqrt{u}} = 4\pi R^2,$$

where the double integral has been converted to polar coordinates and the substitution $u = R^2 - r^2$ has been used to evaluate the last integral. □

EXAMPLE 14.36. Find the area of the part of the paraboloid $z = x^2 + y^2$ in the first octant and below the plane $z = 4$.

SOLUTION: The surface in question is the graph $z = f(x, y) = x^2 + y^2$. Next, the region $D$ must be specified (it determines the part of the graph whose area is to be found).
One can view $D$ as the vertical projection of the surface onto the $xy$ plane. The plane $z = 4$ intersects the paraboloid along the circle $4 = x^2 + y^2$ of radius 2. Since the surface also lies in the first octant, $D$ is the part of the disk $x^2 + y^2 \le 4$ in the first quadrant. Then $f'_x = 2x$, $f'_y = 2y$, and $J = (1 + 4x^2 + 4y^2)^{1/2}$. The surface area is

$$A(S) = \iint_D \sqrt{1 + 4x^2 + 4y^2}\, dA = \int_0^{\pi/2} d\theta \int_0^2 \sqrt{1 + 4r^2}\, r\, dr = \frac{\pi}{16} \int_1^{17} \sqrt{u}\, du = \frac{\pi}{24}\left(17^{3/2} - 1\right),$$

where the double integral has been converted to polar coordinates and the substitution $u = 1 + 4r^2$ has been used to evaluate the last integral. □

108.2. Surface Integral of a Function. An intuitive idea of the concept of the surface integral of a function can be understood from the following example. Suppose one wants to find the total human population on the globe. The data about the population are usually supplied as the population density (i.e., the number of people per unit area). The population density is not a constant function on the globe. It is high in cities and low in deserts and jungles. Therefore, the surface of the globe must be partitioned by surface elements of area $\Delta S_p$. If $\sigma(\mathbf{r})$ is the population density as a function of position $\mathbf{r}$ on the globe, then the population on each partition element is approximately $\sigma(\mathbf{r}_p^*)\, \Delta S_p$, where $\mathbf{r}_p^*$ is a sample point in the partition element. The approximation neglects variations of $\sigma$ within each partition element. The total population is approximately the Riemann sum $\sum_p \sigma(\mathbf{r}_p^*)\, \Delta S_p$. To get an exact value, the partition has to be refined so that the size of each partition element becomes smaller. The limit is the surface integral of $\sigma$ over the surface of the globe, which is the total population. In general, one can think of some quantity distributed over a surface with some density (the amount of this quantity per unit area as a function of position on the surface). The total amount is the surface integral of the density over the surface.
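The Riemann-sum picture of a surface integral can be tried numerically. The following Python sketch is only an illustration under stated assumptions: it sums density times $\Delta S_p$ over the piece of the paraboloid $z = x^2 + y^2$ from Example 14.36, using $dS = J\, dA$ with $J = \sqrt{1 + 4x^2 + 4y^2}$ and $dA = r\, dr\, d\theta$. The function name `paraboloid_surface_sum`, the polar grid, and the midpoint sample points are choices made here for illustration and do not come from the text. With a constant density equal to 1, the sum should approach the surface area $\pi(17^{3/2} - 1)/24$.

```python
import math


def paraboloid_surface_sum(density, n_r=200, n_theta=200):
    """Riemann sum of density * dS over the part of z = x^2 + y^2 above the
    quarter disk x^2 + y^2 <= 4, x >= 0, y >= 0 (hypothetical helper).

    The quarter disk is partitioned by a polar grid; each sample point is the
    midpoint of a grid cell.
    """
    dr = 2.0 / n_r
    dtheta = (math.pi / 2.0) / n_theta
    total = 0.0
    for i in range(n_r):
        r = (i + 0.5) * dr
        for j in range(n_theta):
            theta = (j + 0.5) * dtheta
            x, y = r * math.cos(theta), r * math.sin(theta)
            z = x * x + y * y
            # Area transformation factor for the graph: dS = J dA.
            J = math.sqrt(1.0 + 4.0 * x * x + 4.0 * y * y)
            # Area element of the polar cell: dA = r dr dtheta.
            total += density(x, y, z) * J * r * dr * dtheta
    return total


# Constant density 1 gives the surface area of Example 14.36.
area_exact = math.pi * (17.0 * math.sqrt(17.0) - 1.0) / 24.0
area_approx = paraboloid_surface_sum(lambda x, y, z: 1.0)
print(area_approx, area_exact)
```

Replacing the constant density by any other continuous function of $(x, y, z)$ gives an approximation of the corresponding surface integral over the same piece of the paraboloid; refining the grid plays the role of refining the partition.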
Let $f$ be a bounded function in an open region $E$ and let $S$ be a surface in $E$ that has a finite surface area. Consider a partition of $S$ by $N$ pieces $S_p$, $p = 1, 2, ..., N$, which have surface area $\Delta S_p$. Suppose that $S$ is defined as a level surface $g(x, y, z) = k$ of a function $g$ that has continuous partial derivatives on $E$ and whose gradient does not vanish. It was shown in Section 93.2 that given a point $P$ on $S$ there is a function of two variables whose graph coincides with $S$ in a neighborhood of $P$. So the surface area $\Delta S_p$ of a sufficiently small partition element $S_p$ can be found by Definition 14.21. Put $m_p = \inf_{S_p} f$ and $M_p = \sup_{S_p} f$; that is, $m_p$ is the largest lower bound of values of $f$ for all $\mathbf{r} \in S_p$ and $M_p$ is the smallest upper bound on the values of $f$ for all $\mathbf{r} \in S_p$. The upper and lower sums are defined by $U(f, N) = \sum_{p=1}^{N} M_p\, \Delta S_p$ and $L(f, N) = \sum_{p=1}^{N} m_p\, \Delta S_p$. Let $R_p$ be the radius of the smallest ball that contains $S_p$ and $\max_p R_p = R_N$. A partition of $S$ is said to be refined if $R_{N'} < R_N$ for $N' > N$. In other words, under the refinement, the sizes $R_p$ of partition elements become uniformly smaller.

DEFINITION 14.22. (Surface Integral of a Function). The surface integral of a bounded function $f$ over a surface $S$ is

$$\iint_S f(\mathbf{r})\, dS = \lim_{N\to\infty} U(f, N) = \lim_{N\to\infty} L(f, N),$$

provided the limits of the upper and lower sums exist and coincide. The limit is understood in the sense $R_N \to 0$ as $N \to \infty$.

The surface integral can also be represented by the limit of a Riemann sum:

$$\iint_S f(\mathbf{r})\, dS = \lim_{N\to\infty} \sum_{p=1}^{N} f(\mathbf{r}_p^*)\, \Delta S_p = \lim_{N\to\infty} R(f, N). \tag{14.33}$$

If the surface integral exists, it follows from the inequality $m_p \le f(\mathbf{r}) \le M_p$ for all $\mathbf{r} \in S_p$ that $L(f, N) \le R(f, N) \le U(f, N)$, and by the

FIGURE 14.39. Left: A partition of a surface $S$ by elements with surface area $\Delta S_p$. It is used in the definition of the surface integral and also to construct its Riemann sums.
Right: A neighborhood $E_a$ of a smooth surface $S$ defined as the set of points whose distance to $S$ cannot exceed $a/2 > 0$. For sufficiently fine partition of $S$ and small $a$, the region $E_a$ is partitioned by elements of volume $\Delta V = a\, \Delta S_p$.

squeeze principle the limit of the Riemann sum is independent of the choice of sample points $\mathbf{r}_p^*$. Riemann sums can be used in numerical approximations of the surface integral.

Similar to line integrals, surface integrals are related to triple integrals. Consider a neighborhood $E_a$ of a smooth surface $S$ that is defined as the set of points whose distance to $S$ cannot exceed $a/2 > 0$ (in the sense of Definition 11.14). The region $E_a$ looks like a shell with thickness $a$ (see the right panel of Figure 14.39). Suppose that $f$ is integrable on $E_a$. Then, in the limit $a \to 0$,

$$\frac{1}{a} \iiint_{E_a} f(\mathbf{r})\, dV \to \iint_S f(\mathbf{r})\, dS. \tag{14.34}$$

This relation can be understood by considering the Riemann sum (14.13) for the triple integral in (14.34) in which $E_a$ is partitioned by volume elements $\Delta V_p = a\, \Delta S_p$ with sample points taken on $S$. Partition elements are cylinders of height $a$ along the normal to the surface and with the area of the cross section $\Delta S_p$ defined by the tangent plane. The factor $1/a$ on the left side of (14.34) cancels the common factor $a$ in $\Delta V_p$, and the Riemann sum turns into a Riemann sum for a surface integral. Hence, the surface integral exists for any $f$ that is continuous or has bounded jump discontinuities along a finite number of smooth curves on $S$, and it inherits all the properties of multiple integrals.

108.3. Evaluation of a Surface Integral.

THEOREM 14.22. (Evaluation of a Surface Integral). Suppose that $f$ is continuous in a region that contains a surface $S$ defined by the graph $z = g(x, y)$ on $D$. Suppose that $g$ has continuous first-order partial derivatives on an open region that contains $D$. Then

$$\iint_S f(x, y, z)\, dS = \iint_D f(x, y, g(x, y)) \sqrt{1 + (g'_x)^2 + (g'_y)^2}\, dA. \tag{14.35}$$
PROOF. Consider a partition of $D$ by elements $D_p$ of area $\Delta A_p$, $p = 1, 2, ..., N$. Let $J(x, y) = \sqrt{1 + (g'_x)^2 + (g'_y)^2}$. By the continuity of $g'_x$ and $g'_y$, $J$ is continuous on $D$. By the integral mean value theorem, the area of the part of the graph $z = g(x, y)$ over $D_p$ is given by

$$\Delta S_p = \iint_{D_p} J(x, y)\, dA = J(x_p^*, y_p^*)\, \Delta A_p$$

for some $(x_p^*, y_p^*) \in D_p$. In the Riemann sum for the surface integral (14.33), take the sample points to be $\mathbf{r}_p^* = (x_p^*, y_p^*, g(x_p^*, y_p^*)) \in S_p$. The Riemann sum becomes the Riemann sum (14.3) of the function $F(x, y) = f(x, y, g(x, y))\, J(x, y)$ on $D$. By the continuity of $F$, it converges to the double integral of $F$ over $D$. The argument given here is based on a tacit assumption that the surface integral exists according to Definition 14.22, and hence the limit of the Riemann sum exists and is independent of the choice of sample points. It can be proved that under the hypothesis of the theorem the surface integral exists.

The evaluation of the surface integral involves the following steps:
Step 1. Represent $S$ as a graph $z = g(x, y)$ (i.e., find the function $g$ using a geometrical description of $S$).
Step 2. Find the region $D$ that defines the part of the graph that coincides with $S$ (if $S$ is not the entire graph).
Step 3. Calculate the derivatives $g'_x$ and $g'_y$ and the area transformation function $J$, $dS = J\, dA$.
Step 4. Evaluate the double integral (14.35).

EXAMPLE 14.37. Evaluate the integral of $f(x, y, z) = z$ over the part of the saddle surface $z = xy$ that lies inside the cylinder $x^2 + y^2 = 1$ in the first octant.

SOLUTION: The surface is a part of the graph $z = g(x, y) = xy$. Since it lies within the cylinder, its projection onto the $xy$ plane is bounded by the circle of unit radius, $x^2 + y^2 = 1$. Thus, $D$ is the quarter of the disk $x^2 + y^2 \le 1$ in the first quadrant. One has $g'_x = y$, $g'_y = x$, and $J(x, y) = (1 + x^2 + y^2)^{1/2}$.
The surface integral is

$$\iint_S z\, dS = \iint_D xy\sqrt{1 + x^2 + y^2}\, dA = \int_0^{\pi/2} \cos\theta\sin\theta\, d\theta \int_0^1 r^2\sqrt{1 + r^2}\, r\, dr = \frac{\sin^2\theta}{2}\bigg|_0^{\pi/2} \cdot \frac{1}{2}\int_1^2 (u - 1)\sqrt{u}\, du = \frac{1}{2}\left(\frac{u^{5/2}}{5} - \frac{u^{3/2}}{3}\right)\bigg|_1^2 = \frac{\sqrt{2} + 1}{15},$$

where the double integral has been converted to polar coordinates and the last integral is evaluated by the substitution $u = 1 + r^2$. □

108.4. Parametric Equations of a Surface. The graph $z = g(x, y)$ of a continuous function $g$, where $(x, y) \in D$, defines a surface $S$ in space. Consider the vectors $\mathbf{r}(u, v) = (u, v, g(u, v))$ where the pair of parameters $(u, v)$ spans the region $D$. For every pair $(u, v)$, the rule $\mathbf{r}(u, v) = (u, v, g(u, v))$ defines a vector in space, which is the position vector of a point on the surface. One can make a continuous transformation from $D$ to $D'$ by changing variables $(u, v) \to (u', v')$. Then the components of position vectors of points of $S$ become general continuous functions of the new variables $(u', v')$. This observation suggests that a surface in space can be defined by specifying three continuous functions of two variables, $x(u, v)$, $y(u, v)$, and $z(u, v)$, on a region $D$ that are viewed as components of the position vector $\mathbf{r}(u, v) = (x(u, v), y(u, v), z(u, v))$. A mapping of $D$ into space defined by this rule is called a vector function on $D$. The range of this mapping is called a parametric surface in space, and the equations $x = x(u, v)$, $y = y(u, v)$, and $z = z(u, v)$ are called parametric equations of the surface. For example, the equations

$$x = R\cos v\sin u, \quad y = R\sin v\sin u, \quad z = R\cos u \tag{14.36}$$

are parametric equations of a sphere of radius $R$. Indeed, by comparing these equations with the spherical coordinates, one finds that $(\rho, \phi, \theta) = (R, u, v)$; that is, when $(u, v)$ range over the rectangle $[0, \pi] \times [0, 2\pi]$, the vector $(x, y, z) = \mathbf{r}(u, v)$ traces out the sphere $\rho = R$. An apparent advantage of using parametric equations of a surface is that the surface no longer needs to be represented as the union of graphs. For example, the whole sphere is described by the single vector-valued
function (14.36) of two variables instead of the union of two graphs $z = \pm\sqrt{R^2 - x^2 - y^2}$.

DEFINITION 14.23. Let $\mathbf{r}(u, v)$ be a vector function on an open region $D$ that has continuous partial derivatives $\mathbf{r}'_u$ and $\mathbf{r}'_v$ on $D$. The range $S$ of the vector function is called a smooth surface if $S$ is covered just once as $(u, v)$ ranges throughout $D$ and the vector $\mathbf{r}'_u \times \mathbf{r}'_v$ is not $\mathbf{0}$.

An analogy can be made with parametric equations of a curve in space. A curve in space is a mapping of an interval $[a, b]$ into space defined by a vector function of one variable $\mathbf{r}(t)$. If $\mathbf{r}'(t)$ is continuous and $\mathbf{r}'(t) \ne \mathbf{0}$, then the curve has a continuous tangent vector and the curve is smooth. Similarly, the condition $\mathbf{r}'_u \times \mathbf{r}'_v \ne \mathbf{0}$ ensures that the surface has a continuous normal vector just like a graph of a continuously differentiable function of two variables. This will be explained shortly after the discussion of a few examples.

EXAMPLE 14.38. Find the parametric equations of the double cone $z^2 = x^2 + y^2$.

SOLUTION: Suppose $z \ne 0$. Then $(x/z)^2 + (y/z)^2 = 1$. A solution of this equation is $x/z = \cos u$ and $y/z = \sin u$, where $u \in [0, 2\pi)$. Therefore, the parametric equations are

$$x = v\cos u, \quad y = v\sin u, \quad z = v,$$

where $(u, v) \in [0, 2\pi) \times (-\infty, \infty)$ for the whole double cone. Of course, there are many different parameterizations of the same surface. They are related by a change of variables $(u, v) \in D \leftrightarrow (s, t) \in D'$, where $s = s(u, v)$ and $t = t(u, v)$. □

EXAMPLE 14.39. A torus is a surface obtained by rotating a circle about an axis outside the circle and parallel to its diameter. Find the parametric equations of a torus.

SOLUTION: Let the rotation axis be the $z$ axis. Let $a$ be the distance from the $z$ axis to the center of the rotated circle and let $R$ be the radius of the latter, $R < a$. In the $xz$ plane, the rotated circle is $z^2 + (x - a)^2 = R^2$. Let $(x_0, 0, z_0)$ be a solution to this equation. The point $(x_0, 0, z_0)$ traces out the circle of radius $x_0$ upon the rotation about the $z$ axis.
All such points are $(x_0\cos v,\ x_0\sin v,\ z_0)$, where $v \in [0, 2\pi]$. Since all points $(x_0, 0, z_0)$ are on the circle $z^2 + (x - a)^2 = R^2$, they can be parameterized as $x_0 - a = R\cos u$, $z_0 = R\sin u$, where $u \in [0, 2\pi]$. Thus, the parametric equations of a torus are

$$x = (a + R\cos u)\cos v, \quad y = (a + R\cos u)\sin v, \quad z = R\sin u, \tag{14.37}$$

where $(u, v) \in [0, 2\pi] \times [0, 2\pi]$. An alternative (geometrical) derivation of these parametric equations is given in the caption of Figure 14.40. □

FIGURE 14.40. A torus. Consider a circle of radius $R$ in the $zx$ plane whose center is positioned on the positive $x$ axis at a distance $a > R$. Any point $(x_0, 0, z_0)$ on the circle is obtained from the point $(a + R, 0, 0)$ by rotation about the center of the circle through an angle $0 \le u \le 2\pi$ so that $x_0 = a + R\cos u$ and $z_0 = R\sin u$. A torus is a surface swept by the circle when the $xz$ plane is rotated about the $z$ axis. A generic point $(x, y, z)$ on the torus is obtained from $(x_0, 0, z_0)$ by rotating the latter about the $z$ axis through an angle $0 \le v \le 2\pi$. Under this rotation, $z_0$ does not change and $z = z_0$, while the pair $(x_0, 0)$ in the $xy$ plane changes to $(x, y) = (x_0\cos v,\ x_0\sin v)$. Parametric equations of a torus are $x = (a + R\cos u)\cos v$, $y = (a + R\cos u)\sin v$, $z = R\sin u$, where $(u, v)$ ranges over the rectangle $[0, 2\pi] \times [0, 2\pi]$.

A Tangent Plane to a Parametric Surface. The line $v = v_0$ in $D$ is mapped onto the curve $\mathbf{r} = \mathbf{r}(u, v_0)$ in $S$ (see Figure 14.41). The derivative $\mathbf{r}'_u(u, v_0)$ is tangent to the curve. Similarly, the line $u = u_0$ in $D$ is mapped to the curve $\mathbf{r} = \mathbf{r}(u_0, v)$ in $S$, and the derivative $\mathbf{r}'_v(u_0, v)$ is tangent to it. If the cross product $\mathbf{r}'_u \times \mathbf{r}'_v$ does not vanish in $D$, then one can define a plane normal to the cross product at any point of $S$. Furthermore, if $\mathbf{r}'_u \times \mathbf{r}'_v \ne \mathbf{0}$ in a neighborhood of $(u_0, v_0)$, then,
without loss of generality, one can assume that, say, the $z$ component of the cross product is not 0: $x'_u y'_v - x'_v y'_u = \partial(x, y)/\partial(u, v) \ne 0$. This shows that the transformation $x = x(u, v)$, $y = y(u, v)$ with continuous partial derivatives has a nonvanishing Jacobian. By the inverse function theorem (Theorem 14.10), there exists an inverse transformation $u = u(x, y)$, $v = v(x, y)$ that also has continuous partial derivatives. So the vector function $\mathbf{r}(u, v)$ can be written in the new variables $(x, y)$ as

$$\mathbf{R}(x, y) = \mathbf{r}(u(x, y), v(x, y)) = (x,\ y,\ z(u(x, y), v(x, y))) = (x,\ y,\ g(x, y)),$$

which is a vector function that traces out the graph $z = g(x, y)$. Thus, a smooth parametric surface near any of its points can always be represented as the graph of a function of two variables. By the chain rule, the function $g$ has continuous partial derivatives. Therefore, its linearization near $(x_0, y_0) = (x(u_0, v_0), y(u_0, v_0))$ defines the tangent plane to the graph and hence to the parametric surface at the point $\mathbf{r}_0 = \mathbf{r}(u_0, v_0)$. In particular, the vectors $\mathbf{r}'_u$ and $\mathbf{r}'_v$ must lie in this plane as they are tangent to two curves in the graph. Thus, the vector $\mathbf{r}'_u \times \mathbf{r}'_v$ is normal to the tangent plane. So Definition 14.23 of a smooth parametric surface agrees with the notion of a smooth surface introduced in Section 103.1 and the following theorem holds.

THEOREM 14.23. (Normal to a Smooth Parametric Surface). Let $\mathbf{r} = \mathbf{r}(u, v)$ be a smooth parametric surface. Then the vector $\mathbf{n} = \mathbf{r}'_u \times \mathbf{r}'_v$ is normal to the surface.

Area of a Smooth Parametric Surface. Owing to the definition of the surface area element of the graph and the established relation between graphs and smooth parametric surfaces, the area of a smooth surface can be found using the tangent planes to it (see Figure 14.38, left panel). Let a region $D$ spanned by the parameters $(u, v)$ be partitioned by rectangles of area $\Delta A = \Delta u\, \Delta v$.
Then the vector function $\mathbf r(u,v)$ defines a partition of the surface (a partition element of the surface is the image of a partition rectangle in $D$). Consider a rectangle $R_0=[u_0,u_0+\Delta u]\times[v_0,v_0+\Delta v]$. Let its vertices $O'$, $A'$, and $B'$ have the coordinates $(u_0,v_0)$, $(u_0+\Delta u,v_0)$, and $(u_0,v_0+\Delta v)$, respectively. The segments $O'A'$ and $O'B'$ are the adjacent sides of the rectangle $R_0$. Let $O$, $A$, and $B$ be the images of these points in the surface. Their position vectors are $\mathbf r_0=\mathbf r(u_0,v_0)$, $\mathbf r_a=\mathbf r(u_0+\Delta u,v_0)$, and $\mathbf r_b=\mathbf r(u_0,v_0+\Delta v)$, respectively. The area $\Delta S$ of the image of the rectangle $R_0$ can be approximated by the area $\|\mathbf a\times\mathbf b\|$ of the parallelogram with adjacent sides
$$\mathbf a=\overrightarrow{OA}=\mathbf r_a-\mathbf r_0=\mathbf r(u_0+\Delta u,v_0)-\mathbf r(u_0,v_0)\approx\mathbf r'_u(u_0,v_0)\,\Delta u,$$
$$\mathbf b=\overrightarrow{OB}=\mathbf r_b-\mathbf r_0=\mathbf r(u_0,v_0+\Delta v)-\mathbf r(u_0,v_0)\approx\mathbf r'_v(u_0,v_0)\,\Delta v.$$
The last (approximate) equalities are obtained by the linearization of the components of $\mathbf r(u,v)$ near $(u_0,v_0)$, which is justified because the surface has a tangent plane at any point. The area transformation law is now easy to find: $\Delta S=\|\mathbf a\times\mathbf b\|=\|\mathbf r'_u\times\mathbf r'_v\|\,\Delta A$. Thus, the surface area of a smooth parametric surface is given by the double integral
$$A(S)=\iint_D\|\mathbf r'_u\times\mathbf r'_v\|\,dA.$$
Accordingly, the surface integral of a function $f(\mathbf r)$ over a smooth parametric surface is
$$\iint_S f(\mathbf r)\,dS=\iint_D f(\mathbf r(u,v))\,\|\mathbf r'_u\times\mathbf r'_v\|\,dA.$$

FIGURE 14.41. The lines $u=u_0$ and $v=v_0$ in $D$ are mapped onto the curves in $S$ that are traced out by the vector functions $\mathbf r=\mathbf r(u_0,v)$ and $\mathbf r=\mathbf r(u,v_0)$, respectively. The curves intersect at the point $O$ with the position vector $\mathbf r(u_0,v_0)$. The derivatives $\mathbf r'_v(u_0,v_0)$ and $\mathbf r'_u(u_0,v_0)$ are tangential to the curves. If they do not vanish and are not parallel, then their cross product is normal to the plane through $O$ that contains $\mathbf r'_u$ and $\mathbf r'_v$. If the parametric surface is smooth, this plane is tangent to it.

EXAMPLE 14.40. Find the surface area of the torus (14.37).
SOLUTION: To shorten the notation, put $w=a+R\cos u$. One has
$$\mathbf r'_u=(-R\sin u\cos v,\ -R\sin u\sin v,\ R\cos u),$$
$$\mathbf r'_v=(-(a+R\cos u)\sin v,\ (a+R\cos u)\cos v,\ 0)=w(-\sin v,\ \cos v,\ 0),$$
$$\mathbf n=\mathbf r'_u\times\mathbf r'_v=w(-R\cos v\cos u,\ -R\sin v\cos u,\ -R\sin u),$$
$$J=\|\mathbf r'_u\times\mathbf r'_v\|=Rw=R(a+R\cos u).$$
The surface area is
$$A(S)=\iint_D J(u,v)\,dA=\int_0^{2\pi}\int_0^{2\pi}R(a+R\cos u)\,dv\,du=4\pi^2Ra.\ \square$$

EXAMPLE 14.41. Evaluate the surface integral of $f(x,y,z)=z^2(x^2+y^2)$ over a sphere of radius $R$ centered at the origin.

SOLUTION: Using the parametric equations (14.36), one finds
$$\mathbf r'_u=(R\cos v\cos u,\ R\sin v\cos u,\ -R\sin u),$$
$$\mathbf r'_v=(-R\sin v\sin u,\ R\cos v\sin u,\ 0)=R\sin u\,(-\sin v,\ \cos v,\ 0),$$
$$\mathbf n=\mathbf r'_u\times\mathbf r'_v=R\sin u\,(R\sin u\cos v,\ R\sin u\sin v,\ R\cos u)=R\sin u\ \mathbf r(u,v),$$
$$J=\|\mathbf r'_u\times\mathbf r'_v\|=R\sin u\,\|\mathbf r(u,v)\|=R^2\sin u,$$
$$f(\mathbf r(u,v))=(R\cos u)^2R^2\sin^2u=R^4\cos^2u\,(1-\cos^2u).$$
Note that $\sin u\ge0$ because $u\in[0,\pi]$ ($u=\phi$ and $v=\theta$ in spherical coordinates). Therefore, the normal vector $\mathbf n$ is outward (parallel to the position vector; the inward normal would be opposite to the position vector). The surface integral is
$$\iint_S f\,dS=\iint_D f(\mathbf r(u,v))J(u,v)\,dA=R^6\int_0^{2\pi}dv\int_0^{\pi}\cos^2u\,(1-\cos^2u)\sin u\,du=2\pi R^6\int_{-1}^{1}w^2(1-w^2)\,dw=\frac{8\pi R^6}{15},$$
where the substitution $w=\cos u$ has been made to evaluate the last integral. □

108.5. Exercises.
(1) Find the surface area of the specified surface:
(i) The part of the plane in the first octant that intersects the coordinate axes at $(a,0,0)$, $(0,b,0)$, and $(0,0,c)$, where $a$, $b$, and $c$ are positive numbers
(ii) The part of the plane $3x+2y+z=1$ that lies inside the cylinder $x^2+y^2=4$
(iii) The part of the hyperbolic paraboloid $z=y^2-x^2$ that lies between the cylinders $x^2+y^2=1$ and $x^2+y^2=4$
(iv) The part of the paraboloid $z=x^2+y^2$ that lies between the two planes $z=1$ and $z=9$
(v) The part of the surface $y=4x+z^2$ that lies between the planes $x=0$, $x=1$, $z=0$, and $z=1$
(2) Evaluate the integral over the specified surface:
(i) $\iint_S yz\,dS$, where $S$ is the part of the plane $x+y+z=1$ that lies in the first octant
(ii) $\iint_S x^2z^2\,dS$, where $S$ is the part of the cone $z^2=x^2+y^2$ that lies between the planes $z=1$ and $z=2$
(iii) $\iint_S xz\,dS$, where $S$ is the boundary of the solid region enclosed by the cylinder $y^2+z^2=1$ and the planes $x=0$ and $x+y=3$ (Hint: Use the additivity of the surface integral.)
(iv) $\iint_S z\,dS$, where $S$ is the part of the sphere $x^2+y^2+z^2=2$ that lies above the plane $z=1$
(v) $\iint_S z(\sin(x^2)-\sin(y^2))\,dS$, where $S$ is the part of the paraboloid $z=1-x^2-y^2$ that lies in the first octant (Hint: Use the symmetry.)
(3) Suppose that $f(\mathbf r)=g(\|\mathbf r\|)$, where $\mathbf r=(x,y,z)$. If $g(a)=2$, use the geometrical interpretation of the surface integral to find $\iint_S f\,dS$, where $S$ is the sphere of radius $a$ centered at the origin.
(4) Identify and sketch the parametric surface:
(i) $\mathbf r(u,v)=(u+v,\ 2-v,\ 2-2v+3u)$
(ii) $\mathbf r(u,v)=(a\cos u,\ b\sin u,\ v)$
(5) For the given parametric surface, sketch the curves $\mathbf r(u,v_0)$ for several fixed values $v=v_0$ and the curves $\mathbf r(u_0,v)$ for several fixed values $u=u_0$. Use them to visualize the parametric surface if
(i) $\mathbf r(u,v)=(\sin v,\ u\sin v,\ \sin u\sin(2v))$
(ii) $\mathbf r(u,v)=(u\cos v\sin\theta,\ u\sin v\sin\theta,\ u\cos\theta)$, where $0<\theta<\pi/2$ is a parameter
(6) Find a parametric representation for the following surfaces:
(i) The plane through $\mathbf r_0$ that contains two nonzero and nonparallel vectors $\mathbf a$ and $\mathbf b$
(ii) The elliptic cylinder $y^2/a^2+z^2/b^2=1$
(iii) The part of the sphere $x^2+y^2+z^2=a^2$ that lies below the cone $z=\sqrt{x^2+y^2}$
(iv) The ellipsoid $x^2/a^2+y^2/b^2+z^2/c^2=1$
(7) Find an equation of the tangent plane to the given parametric surface at the specified point $P$:
(i) $\mathbf r(u,v)=(u^2,\ u-v,\ u+v)$ at $P=(1,-1,3)$
(ii) $\mathbf r(u,v)=(\sin v,\ u\sin v,\ \sin u\sin(2v))$ at $P=(1,\pi/2,0)$
(8) Evaluate the surface integral over the specified parametric surface:
(i) $\iint_S z^2\,dS$, where $S$ is the torus (14.37) with $R=1$ and $a=2$
(ii) $\iint_S(1+x^2+y^2)^{1/2}\,dS$, where $S$ is the helicoid with parametric equations $\mathbf r(u,v)=(u\cos v,\ u\sin v,\ v)$ and $(u,v)\in[0,1]\times[0,\pi]$
(iii) $\iint_S z\,dS$, where $S$ is the part of the helicoid $\mathbf r(u,v)=(u\cos v,\ u\sin v,\ v)$, $(u,v)\in[0,a]\times[0,2\pi]$
(iv) $\iint_S z^2\,dS$, where $S$ is the part of the cone $x=u\cos v\sin\theta$, $y=u\sin v\sin\theta$, $z=u\cos\theta$, and $(u,v)\in[0,a]\times[0,2\pi]$, and $0<\theta$ … $R>0$.

109. Moments of Inertia and Center of Mass

An important application of multiple integrals is finding the center of mass and moments of inertia of an extended object. The laws of mechanics say that the center of mass of an extended object on which no external force acts moves along a straight line with a constant speed. In other words, the center of mass is a particular point of an extended object that defines the trajectory of the object as a whole. The motion of an extended object can be viewed as a combination of the motion of its center of mass and rotation about its center of mass. The kinetic energy of the object is
$$K=\frac{Mv^2}{2}+K_{\rm rot},$$
where $M$ is the total mass of the object, $v$ is the speed of its center of mass, and $K_{\rm rot}$ is the kinetic energy of rotation of the object about its center of mass; $K_{\rm rot}$ is determined by moments of inertia discussed later.
For example, when docking a spacecraft to a space station, one needs to know exactly how long the engine should be fired to achieve the required position of its center of mass and the orientation of the craft relative to it, that is, how exactly its kinetic energy has to be changed by firing the engines. So its center of mass and moments of inertia must be known to accomplish the task.

109.1. Center of Mass. Consider a point mass $m$ fixed at an endpoint of a rod that can rotate about its other end. If the rod has length $L$ and the gravitational force is normal to the rod, then the quantity $gmL$ is called the rotational moment of the gravitational force $mg$, where $g$ is the free-fall acceleration. If the rotation is clockwise (the mass is at the right endpoint), the moment is assumed to be positive, and it is negative, $-gmL$, for a counterclockwise rotation (the mass is at the left endpoint). More generally, if the mass has a position $x$ on the $x$ axis, then its rotational moment about a point $x_c$ is $M=(x-x_c)m$ (omitting the constant $g$). It is negative if $x<x_c$ and positive when $x>x_c$.

FIGURE 14.42. Left: Two masses connected by a rigid massless rod (or its mass is much smaller than the masses $m_1$ and $m_2$) are positioned at $x_1$ and $x_2$. The gravitational force is perpendicular to the rod. The center of mass $x_c$ is determined by the condition that the system does not rotate about $x_c$ under the gravitational forces. Right: An extended object consisting of point masses with fixed distances between them. If the position vectors of the masses relative to the center of mass $C$ are $\mathbf r_i$, then $m_1\mathbf r_1+m_2\mathbf r_2+\cdots+m_N\mathbf r_N=\mathbf 0$.

The center of mass is understood through the concept of rotational moments. The simplest extended object consists of two point masses $m_1$ and $m_2$ connected by a massless rod. It is shown in the left panel of Figure 14.42.
Suppose that one point of the rod is fixed so that it can only rotate about that point. The center of mass is the point on the rod such that the object would not rotate about it under a uniform gravitational force applied along the direction perpendicular to the rod. Evidently, the position of the center of mass is determined by the condition that the total rotational moment about it vanishes. Suppose that the rod lies on the $x$ axis so that the masses have the coordinates $x_1$ and $x_2$. The total rotational moment of the object about the point $x_c$ is $M=M_1+M_2=(x_1-x_c)m_1+(x_2-x_c)m_2$. If $x_c$ is such that $M=0$, then
$$m_1(x_1-x_c)+m_2(x_2-x_c)=0\quad\Longrightarrow\quad x_c=\frac{m_1x_1+m_2x_2}{m_1+m_2}.$$

The center of mass $(x_c,y_c)$ of point masses $m_i$, $i=1,2,\dots,N$, positioned on a plane at $(x_i,y_i)$ can be understood as follows. Think of the plane as a plate on which the masses are positioned. The gravitational force is normal to the plane. If a rod (a line) is put underneath the plane, then due to an uneven distribution of masses, the plane can rotate about the rod. When the rod is aligned along either the line $x=x_c$ or the line $y=y_c$, the plane with distributed masses on it does not rotate under the gravitational pull. In other words, the rotational moments about the lines $x=x_c$ and $y=y_c$ vanish. The rotational moment about the line $x=x_c$ or $y=y_c$ is determined by the distances of the masses from this line:
$$\sum_{i=1}^N(x_i-x_c)m_i=0\quad\Longrightarrow\quad x_c=\frac1m\sum_{i=1}^Nm_ix_i=\frac{M_y}{m},$$
$$\sum_{i=1}^N(y_i-y_c)m_i=0\quad\Longrightarrow\quad y_c=\frac1m\sum_{i=1}^Nm_iy_i=\frac{M_x}{m},\qquad m=\sum_{i=1}^Nm_i,$$
where $m$ is the total mass. The quantity $M_y$ is the moment about the $y$ axis (the line $x=0$), whereas $M_x$ is the moment about the $x$ axis (the line $y=0$).

Consider an extended object that is a collection of point masses shown in the right panel of Figure 14.42. Its center of mass is defined similarly by assuming that the total moments about any of the planes $x=x_c$, or $y=y_c$, or $z=z_c$ vanish.
Thus, if $\mathbf r_c$ is the position vector of the center of mass, it satisfies the condition
$$\sum_{i=1}^Nm_i(\mathbf r_i-\mathbf r_c)=\mathbf 0,$$
where the vectors $\mathbf r_i-\mathbf r_c$ are position vectors of masses relative to the center of mass.

DEFINITION 14.24. (Center of Mass). Suppose that an extended object consists of $N$ point masses $m_i$, $i=1,2,\dots,N$, whose position vectors are $\mathbf r_i$. Then its center of mass is a point with the position vector
$$\mathbf r_c=\frac1m\sum_{i=1}^Nm_i\mathbf r_i,\qquad m=\sum_{i=1}^Nm_i, \tag{14.38}$$
where $m$ is the total mass of the object. The quantities
$$M_{yz}=\sum_{i=1}^Nm_ix_i,\qquad M_{xz}=\sum_{i=1}^Nm_iy_i,\qquad M_{xy}=\sum_{i=1}^Nm_iz_i$$
are called the moments about the coordinate planes.

If an extended object contains continuously distributed masses, then the object can be partitioned into $N$ small pieces. Let $B_i$ be the smallest ball of radius $R_i$ within which the $i$th partition piece lies. Although all the partition pieces are small, they still have finite sizes $R_i$, and the definition (14.38) cannot be used because the point $\mathbf r_i$ could be any point in $B_i$. By the usual trick of integral calculus, this uncertainty can be eliminated by taking the limit $N\to\infty$ in the sense that all the partition sizes tend to 0 uniformly, $R_i\le\max_iR_i=R_N^*\to0$ as $N\to\infty$. In this limit, the position of each partition piece can be described by any sample point $\mathbf r_i^*\in B_i$. The limit of the Riemann sum is given by the integral over the region $E$ in space occupied by the object. If $\sigma(\mathbf r)$ is the mass density of the object, then $\Delta m_i=\sigma(\mathbf r_i^*)\,\Delta V_i$, where $\Delta V_i$ is the volume of the $i$th partition element and
$$\mathbf r_c=\frac1m\lim_{N\to\infty}\sum_{i=1}^N\mathbf r_i^*\,\Delta m_i=\frac1m\iiint_E\mathbf r\,\sigma(\mathbf r)\,dV,\qquad m=\iiint_E\sigma(\mathbf r)\,dV.$$
In practical applications, one often encounters extended objects whose one or two dimensions are small relative to the other (e.g., shell-like objects or wirelike objects). In this case, the triple integral is simplified to either a surface (or double) integral for shell-like $E$, according to (14.34), or to a line integral, according to (14.31).
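As a numerical illustration of Definition 14.24 and of its continuous limit, the following minimal sketch may help. The function names `center_of_mass` and `half_ball_partition`, the grid size, and the choice of test body are illustrative assumptions, not part of the text. The same formula (14.38) is applied first to two point masses and then to a partition of a homogeneous solid half-ball of radius 1 ($z\ge0$) into small cubes, each treated as a point mass $\sigma\,\Delta V$ placed at its midpoint. The standard value $z_c=3/8$ for the half-ball is used only for comparison.

```python
def center_of_mass(masses, positions):
    """Return r_c = (1/m) * sum(m_i * r_i) for point masses, as in (14.38)."""
    m = sum(masses)
    return tuple(
        sum(mi * r[k] for mi, r in zip(masses, positions)) / m
        for k in range(3)
    )


def half_ball_partition(n=40, sigma=1.0):
    """Partition the box [-1,1] x [-1,1] x [0,1] into cubes of side h = 1/n and
    keep the cubes whose midpoints lie in the half-ball x^2+y^2+z^2 <= 1, z >= 0."""
    h = 1.0 / n
    masses, positions = [], []
    for i in range(-n, n):
        x = (i + 0.5) * h
        for j in range(-n, n):
            y = (j + 0.5) * h
            for k in range(n):
                z = (k + 0.5) * h
                if x * x + y * y + z * z <= 1.0:
                    masses.append(sigma * h ** 3)  # delta m = sigma * delta V
                    positions.append((x, y, z))    # sample point = cube midpoint
    return masses, positions


# Two point masses on the x axis: x_c = (m1*x1 + m2*x2) / (m1 + m2)
two_masses = center_of_mass([1.0, 3.0], [(0.0, 0.0, 0.0), (4.0, 0.0, 0.0)])

# Riemann-sum approximation for the homogeneous half-ball
masses, positions = half_ball_partition()
rc = center_of_mass(masses, positions)
total_mass = sum(masses)
```

Refining the partition (larger `n`) moves `rc` closer to $(0,0,3/8)$ and `total_mass` closer to $2\pi/3$, which mirrors the limit $N\to\infty$ described above.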
For two- and one-dimensional extended objects, the center of mass can be written as, respectively,
$$\mathbf r_c=\frac1m\iint_S\mathbf r\,\sigma(\mathbf r)\,dS,\quad m=\iint_S\sigma(\mathbf r)\,dS,\qquad \mathbf r_c=\frac1m\int_C\mathbf r\,\sigma(\mathbf r)\,ds,\quad m=\int_C\sigma(\mathbf r)\,ds,$$
where, accordingly, $\sigma$ is the surface mass density or the line mass density for two- or one-dimensional objects. In particular, when $S$ is a planar, flat surface, the surface integral turns into a double integral.

The concept of rotational moments is also useful for finding the center of mass using the symmetries of the mass distribution of an extended object. For example, the center of mass of a disk with a uniform mass distribution coincides with the disk center (the disk would not rotate about its diameter under the gravitational pull).

EXAMPLE 14.42. Find the center of mass of the half-disk $x^2+y^2\le R^2$, $y\ge0$, if the mass density at any point is proportional to the distance of that point from the $x$ axis.

SOLUTION: The mass is distributed evenly to the left and right from the $y$ axis because the mass density is independent of $x$, $\sigma(x,y)=ky$ ($k$ is a constant). So the rotational moment about the $y$ axis vanishes; $M_y=0$ by symmetry and hence $x_c=M_y/m=0$. The total mass is
$$m=\iint_D\sigma\,dA=k\iint_Dy\,dA=k\int_0^{\pi}\int_0^Rr\sin\theta\ r\,dr\,d\theta=2k\int_0^Rr^2\,dr=\frac{2kR^3}{3},$$
where the integral has been converted to polar coordinates. The moment about the $x$ axis (about the line $y=0$) is
$$M_x=\iint_Dy\sigma\,dA=k\int_0^{\pi}\int_0^R(r\sin\theta)^2\,r\,dr\,d\theta=\frac{\pi k}{2}\int_0^Rr^3\,dr=\frac{\pi kR^4}{8}.$$
So $y_c=M_x/m=3\pi R/16$. □

EXAMPLE 14.43. Find the center of mass of the solid that lies between spheres of radii $a<b$ centered at the origin and is bounded by the cone $z=\sqrt{x^2+y^2}/\sqrt3$ if the mass density is constant.

SOLUTION: The mass is evenly distributed about the $xz$ and $yz$ planes. So the moments $M_{xz}$ and $M_{yz}$ about them vanish, and hence $y_c=M_{xz}/m=0$ and $x_c=M_{yz}/m=0$. The center of mass lies on the $z$ axis. Put $\sigma=k=\text{const}$.
The total mass is
$$m=\iiint_E\sigma\,dV=k\int_0^{2\pi}\int_0^{\pi/3}\int_a^b\rho^2\sin\phi\,d\rho\,d\phi\,d\theta=\frac{\pi k}{3}(b^3-a^3),$$
where the triple integral has been converted to spherical coordinates. The boundaries of $E$ are the spheres $\rho=a$ and $\rho=b$ and the cone defined by the condition $\cot\phi=1/\sqrt3$ or $\phi=\pi/3$. Therefore, the image $E'$ of $E$ under the transformation to spherical coordinates is the rectangle $(\rho,\phi,\theta)\in E'=[a,b]\times[0,\pi/3]\times[0,2\pi]$. The full range is taken for the polar angle $\theta$ as the equations of the boundaries impose no condition on it. The moment about the $xy$ plane is
$$M_{xy}=\iiint_Ez\sigma\,dV=k\int_0^{2\pi}\int_0^{\pi/3}\int_a^b\rho\cos\phi\ \rho^2\sin\phi\,d\rho\,d\phi\,d\theta=\frac{3\pi k}{16}(b^4-a^4).$$
So $z_c=M_{xy}/m=(9/16)(a+b)(a^2+b^2)/(a^2+ab+b^2)$. □

Centroid. The center of mass of an extended object with a constant mass density is called the centroid. The centroid of a region depends only on the shape of the region. In this sense, the centroid is an intrinsic (geometrical) characteristic of the region.

FIGURE 14.43. Left: The moment of inertia of a point mass about an axis $\gamma$. A point mass rotates about an axis $\gamma$ with the rate $\omega$, called the angular velocity. Its linear velocity is $v=\omega R$, where $R$ is the distance from $\gamma$. So the kinetic energy of the rotational motion is $mv^2/2=mR^2\omega^2/2=I\omega^2/2$, where $I=mR^2$ is the moment of inertia. Right: The moment of inertia of an extended object $E$ about an axis $\gamma$ is defined as the sum of moments of inertia of partition elements of $E$: $\Delta I_i=\Delta m(\mathbf r_i)R_\gamma^2(\mathbf r_i)$, where $R_\gamma(\mathbf r_i)$ is the distance to the axis $\gamma$ and $\Delta m(\mathbf r_i)$ is the mass of the partition element.

109.2. Moments of Inertia. Consider a point mass $m$ rotating about an axis $\gamma$ at a constant rate of $\omega$ rad/s (called the angular velocity). The system is shown in Figure 14.43 (left panel). If the radius of the circular trajectory is $R$, then the linear velocity of the object is $v=\omega R$. The object has the kinetic energy
$$K_{\rm rot}=\frac{mv^2}{2}=\frac{mR^2\omega^2}{2}=\frac{I\omega^2}{2}.$$
The constant $I=mR^2$ is called the moment of inertia of the point mass $m$ about the axis $\gamma$.
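The relation $mv^2/2=I\omega^2/2$ with $I=mR^2$ can be checked numerically. The sketch below is only an illustration under stated assumptions: the function name `kinetic_energy_on_circle`, the sample values of $m$, $R$, $\omega$, and the finite-difference step are hypothetical choices. It takes the circular trajectory $\mathbf r(t)=(R\cos\omega t,\ R\sin\omega t,\ 0)$, estimates the velocity by a central difference, and compares the resulting kinetic energy with $I\omega^2/2$.

```python
import math


def kinetic_energy_on_circle(m, R, omega, t=0.3, dt=1e-6):
    """Estimate m*v^2/2 for r(t) = (R cos(omega t), R sin(omega t), 0),
    using a central finite difference for the velocity."""
    def r(s):
        return (R * math.cos(omega * s), R * math.sin(omega * s), 0.0)

    r_plus, r_minus = r(t + dt), r(t - dt)
    v = [(p - q) / (2 * dt) for p, q in zip(r_plus, r_minus)]
    speed_sq = sum(c * c for c in v)
    return 0.5 * m * speed_sq


m, R, omega = 2.0, 1.5, 4.0        # illustrative values
I = m * R ** 2                     # moment of inertia of the point mass
k_direct = kinetic_energy_on_circle(m, R, omega)
k_formula = 0.5 * I * omega ** 2   # I * omega^2 / 2
```

The two numbers agree up to the finite-difference error, and the agreement does not depend on the time `t` at which the velocity is sampled, since the speed is constant on the circle.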
Similarly, consider an extended solid object consisting of $N$ point masses. The distances between the masses do not change when the object moves (the object is solid). So, if the object rotates about an axis $\gamma$ at a constant rate $\omega$, then each point mass rotates at the same rate and hence has kinetic energy $m_iR_i^2\omega^2/2$, where $R_i$ is the distance from the mass $m_i$ to the axis $\gamma$. The total kinetic energy is $K_{\rm rot}=I\omega^2/2$, where the constant
$$I=\sum_{i=1}^Nm_iR_i^2$$
is called the moment of inertia of the object about the axis $\gamma$. It is independent of the motion itself and determined solely by the mass distribution and distances of the masses from the rotation axis.

Suppose that the mass is continuously distributed in a region $E$ with the mass density $\sigma(\mathbf r)$ (see the right panel of Figure 14.43). Let $R_\gamma(\mathbf r)$ be the distance from a point $\mathbf r\in E$ to an axis (line) $\gamma$. Consider a partition of $E$ by small elements $E_i$ of volume $\Delta V_i$. The mass of each partition element is $\Delta m_i=\sigma(\mathbf r_i^*)\,\Delta V_i$ for some sample point $\mathbf r_i^*\in E_i$ in the limit when all the sizes of partition elements tend to 0 uniformly. The moment of inertia about the axis $\gamma$ is
$$I_\gamma=\lim_{N\to\infty}\sum_{i=1}^NR_\gamma^2(\mathbf r_i^*)\,\sigma(\mathbf r_i^*)\,\Delta V_i=\iiint_ER_\gamma^2(\mathbf r)\,\sigma(\mathbf r)\,dV$$
in accordance with the Riemann sum for triple integrals (14.13). In particular, the distance of a point $(x,y,z)$ from the $x$, $y$, and $z$ axes is, respectively, $R_x=\sqrt{y^2+z^2}$, $R_y=\sqrt{x^2+z^2}$, and $R_z=\sqrt{x^2+y^2}$. So the moments of inertia about the coordinate axes are
$$I_x=\iiint_E(y^2+z^2)\,\sigma\,dV,\qquad I_y=\iiint_E(x^2+z^2)\,\sigma\,dV,\qquad I_z=\iiint_E(x^2+y^2)\,\sigma\,dV.$$
In general, if the axis $\gamma$ goes through the origin parallel to a unit vector $\hat{\mathbf n}$, then by the distance formula between a point $\mathbf r$ and the line,
$$R_\gamma^2(\mathbf r)=\|\hat{\mathbf n}\times\mathbf r\|^2=(\hat{\mathbf n}\times\mathbf r)\cdot(\hat{\mathbf n}\times\mathbf r)=\hat{\mathbf n}\cdot(\mathbf r\times(\hat{\mathbf n}\times\mathbf r))=r^2-(\hat{\mathbf n}\cdot\mathbf r)^2, \tag{14.39}$$
where the bac − cab rule (11.9) has been used to transform the double cross product.
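The distance formula (14.39) and the sum $I=\sum m_iR_i^2$ can be tried out on a few numbers. The following is a minimal sketch, assuming hypothetical helper names (`cross`, `dot`, `dist_sq_cross`, `dist_sq_formula`, `moment_of_inertia`) and arbitrarily chosen sample data. It evaluates the squared distance to the axis both as $\|\hat{\mathbf n}\times\mathbf r\|^2$ and as $r^2-(\hat{\mathbf n}\cdot\mathbf r)^2$, and then sums $m_iR_i^2$ over point masses.

```python
def cross(a, b):
    """Cross product of two 3-vectors."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def dot(a, b):
    """Dot product of two 3-vectors."""
    return sum(x * y for x, y in zip(a, b))


def dist_sq_cross(r, n):
    """Squared distance from r to the axis through the origin along unit n: ||n x r||^2."""
    c = cross(n, r)
    return dot(c, c)


def dist_sq_formula(r, n):
    """The same squared distance written as r^2 - (n . r)^2, as in (14.39)."""
    return dot(r, r) - dot(n, r) ** 2


def moment_of_inertia(masses, positions, n):
    """I = sum of m_i * R_i^2 for point masses about the axis along unit n."""
    return sum(m * dist_sq_formula(r, n) for m, r in zip(masses, positions))


n_hat = (1 / 3, 2 / 3, 2 / 3)   # a unit vector: (1 + 4 + 4) / 9 = 1
r = (3.0, -1.0, 2.0)            # sample point
d1 = dist_sq_cross(r, n_hat)
d2 = dist_sq_formula(r, n_hat)

# Two point masses about the z axis: I_z = sum of m_i * (x_i^2 + y_i^2)
I_z = moment_of_inertia([1.0, 2.0], [(1.0, 0.0, 5.0), (0.0, 2.0, -1.0)], (0.0, 0.0, 1.0))
```

Note that the $z$ coordinates of the masses do not affect `I_z`, in agreement with $R_z^2=x^2+y^2$.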
If one or two dimensions of the object are small relative to the other, the triple integral is reduced to either a surface integral or a line integral, respectively, in accordance with (14.34) or (14.31); that is, for two- or one-dimensional objects, the moment of inertia becomes, respectively,
$$I_\gamma=\iint_SR_\gamma^2(\mathbf r)\,\sigma(\mathbf r)\,dS,\qquad I_\gamma=\int_CR_\gamma^2(\mathbf r)\,\sigma(\mathbf r)\,ds,$$
where $\sigma$ is either the surface or linear mass density.

EXAMPLE 14.44. A rocket tip is made of thin plates with a constant surface mass density $\sigma=k$. It has a circular conic shape with base diameter $2a$ and distance $h$ from the tip to the base. Find the moment of inertia of the tip about its axis of symmetry.

FIGURE 14.44. Left: An illustration to Example 14.44. Right: An illustration to the proof of the parallel axis theorem for moments of inertia (Study Problem 14.11). The axis $\gamma_c$ is parallel to $\gamma$ and goes through the center of mass with the position vector $\mathbf r_c$. The vectors $\mathbf r$ and $\mathbf r-\mathbf r_c$ are position vectors of a partition element of mass $\Delta m$ relative to the origin and the center of mass, respectively.

SOLUTION: Set up the coordinate system so that the tip is at the origin and the base lies in the plane $z=h$; that is, the symmetry axis coincides with the $z$ axis. If $\phi$ is the angle between the $z$ axis and the surface of the cone, then $\cot\phi=h/a$ and the equation of the cone is $z=\cot\phi\,\sqrt{x^2+y^2}$. Thus, the object in question is the surface (graph) $z=g(x,y)=(h/a)\sqrt{x^2+y^2}$ over the region $D$: $x^2+y^2\le a^2$. To evaluate the needed surface integral, the area transformation law $dS=J\,dA$ should be established. One has $g'_x=(hx/a)(x^2+y^2)^{-1/2}$ and $g'_y=(hy/a)(x^2+y^2)^{-1/2}$ so that
$$J=\sqrt{1+(g'_x)^2+(g'_y)^2}=\sqrt{1+(h/a)^2}=\frac{\sqrt{h^2+a^2}}{a}.$$
The moment of inertia about the $z$ axis is
$$I_z=\iint_S(x^2+y^2)\,\sigma\,dS=k\iint_D(x^2+y^2)\,J\,dA=kJ\int_0^{2\pi}d\theta\int_0^ar^3\,dr=\frac{\pi k}{2}\,a^3\sqrt{h^2+a^2}.\ \square$$

EXAMPLE 14.45. Find the moment of inertia of a homogeneous ball of radius $a$ and mass $m$ about its diameter.
SOLUTION: Set up the coordinate system so that the origin is at the center of the ball. Then the moment of inertia about the $z$ axis has to be evaluated. Since the ball is homogeneous, its mass density is constant, $\sigma=m/V$, where $V=4\pi a^3/3$ is the volume of the ball. One has
$$I_z=\iiint_E(x^2+y^2)\,\sigma\,dV=\frac{3m}{4\pi a^3}\int_0^{2\pi}\int_0^{\pi}\int_0^a(\rho\sin\phi)^2\rho^2\sin\phi\,d\rho\,d\phi\,d\theta=\frac{3}{10}ma^2\int_0^{\pi}\sin^3\phi\,d\phi=\frac{3}{10}ma^2\int_{-1}^{1}(1-u^2)\,du=\frac25ma^2,$$
where the substitution $u=\cos\phi$ has been made to evaluate the integral. It is noteworthy that the problem admits a smarter solution by noting that $I_z=I_x=I_y$ owing to the rotational symmetry of the mass distribution. By the identity $I_z=(I_x+I_y+I_z)/3$, the triple integral can be simplified:
$$I_z=\frac13\iiint_E2\sigma(x^2+y^2+z^2)\,dV=\frac{8\pi\sigma}{3}\int_0^a\rho^4\,d\rho=\frac25ma^2.\ \square$$

EXAMPLE 14.46. Find the center of mass and the moment of inertia of a homogeneous rod of mass $m$ bent into a half-circle of radius $R$ about the line through the endpoints of the rod.

SOLUTION: Set up the coordinate system so the half-circle lies above the $x$ axis: $x^2+y^2=R^2$, $y\ge0$. The linear mass density is constant, $\sigma=m/(\pi R)$, where $\pi R$ is the length of the rod. By the symmetry of the mass distribution, the center of mass lies on the $y$ axis, $x_c=0$, and $y_c=(1/m)\int_Cy\sigma\,ds$. To evaluate the line integral, choose the following parametric equation of the half-circle: $\mathbf r(t)=(R\cos t,\ R\sin t)$, $0\le t\le\pi$. Then $\mathbf r'(t)=(-R\sin t,\ R\cos t)$ and $ds=\|\mathbf r'(t)\|\,dt=R\,dt$. Therefore,
$$y_c=\frac1m\int_Cy\sigma\,ds=\frac{1}{\pi R}\int_0^{\pi}R\sin t\ R\,dt=\frac{2R}{\pi}.$$
If $R_\gamma$ is the distance from a point of the rod to the line connecting the endpoints of the rod, then, in the chosen coordinate system, $R_\gamma=y$. Therefore, the moment of inertia in question is
$$I_\gamma=\int_CR_\gamma^2\,\sigma\,ds=\frac{m}{\pi R}\int_Cy^2\,ds=\frac{mR^2}{\pi}\int_0^{\pi}\sin^2t\,dt=\frac{mR^2}{2}.\ \square$$

109.3. Study Problems.

Problem 14.10. Find the center of mass of the shell described in Example 14.44.

SOLUTION: By the symmetry of the mass distribution about the axis of the conic shell, the center of mass must be on that axis.
Using the algebraic description of the shell given in Example 14.44, the total mass of the shell is
$$m=\iint_S\sigma\,dS=k\iint_SdS=kJ\iint_DdA=kJA(D)=\pi ka\sqrt{h^2+a^2}.$$
The moment about the $xy$ plane is
$$M_{xy}=\iint_Sz\sigma\,dS=k\iint_D\frac ha\sqrt{x^2+y^2}\,J\,dA=\frac{kJh}{a}\int_0^{2\pi}\int_0^ar^2\,dr\,d\theta=\frac{2\pi kha}{3}\sqrt{h^2+a^2}.$$
Thus, the center of mass is at the distance $z_c=M_{xy}/m=2h/3$ from the tip of the cone. □

Problem 14.11. (Parallel Axis Theorem). Let $I_\gamma$ be the moment of inertia of an extended object about an axis $\gamma$ and let $\gamma_c$ be a parallel axis through the center of mass of the object. Prove that
$$I_\gamma=I_{\gamma_c}+mR_c^2,$$
where $R_c$ is the distance between the axis $\gamma$ and the center of mass and $m$ is the total mass.

SOLUTION: Choose the coordinate system so that the axis $\gamma$ goes through the origin (see the right panel of Figure 14.44). Let it be parallel to a unit vector $\hat{\mathbf n}$. The difference $I_\gamma-I_{\gamma_c}$ is to be investigated. If $\mathbf r_c$ is the position vector of the center of mass, then the axis $\gamma_c$ is obtained from $\gamma$ by parallel transport of the latter along the vector $\mathbf r_c$. Therefore, the distance $R_{\gamma_c}^2(\mathbf r)$ is obtained from $R_\gamma^2(\mathbf r)$ (see (14.39)) by changing the position vector $\mathbf r$ in the latter to the position vector relative to the center of mass, $\mathbf r-\mathbf r_c$. In particular, $R_\gamma^2(\mathbf r_c)=R_c^2$ by the definition of the function $R_\gamma(\mathbf r)$. Hence, by (14.39),
$$R_\gamma^2(\mathbf r)-R_{\gamma_c}^2(\mathbf r)=R_\gamma^2(\mathbf r)-R_\gamma^2(\mathbf r-\mathbf r_c)=2\mathbf r_c\cdot\mathbf r-r_c^2-(\hat{\mathbf n}\cdot\mathbf r_c)(2\hat{\mathbf n}\cdot\mathbf r-\hat{\mathbf n}\cdot\mathbf r_c)=R_c^2-2\mathbf a\cdot(\mathbf r_c-\mathbf r),$$
where $\mathbf a=\mathbf r_c-(\hat{\mathbf n}\cdot\mathbf r_c)\hat{\mathbf n}$. Therefore,
$$I_\gamma-I_{\gamma_c}=\iiint_E\bigl(R_\gamma^2(\mathbf r)-R_{\gamma_c}^2(\mathbf r)\bigr)\sigma(\mathbf r)\,dV=R_c^2\iiint_E\sigma(\mathbf r)\,dV-2\mathbf a\cdot\iiint_E(\mathbf r_c-\mathbf r)\,\sigma(\mathbf r)\,dV=R_c^2m,$$
where the second integral vanishes by the definition of the center of mass. □

Problem 14.12. Find the moment of inertia of a homogeneous ball of radius $a$ and mass $m$ about an axis that is at a distance $R$ from the ball center.

SOLUTION: The center of mass of the ball coincides with its center because the mass distribution is invariant under rotations about the center. The moment of inertia of the ball about its diameter is $I_{\gamma_c}=(2/5)ma^2$ by Example 14.45.
By the parallel axis theorem, for any axis $\gamma$ at a distance $R$ from the center of mass, $I_\gamma=I_{\gamma_c}+mR^2=m(R^2+2a^2/5)$. □

109.4. Exercises.
(1) Find the center of mass of the specified extended object:
(i) A homogeneous thin rod of length $L$
(ii) A homogeneous thin wire that occupies the part of a circle of radius $R$ that lies in the first quadrant
(iii) A homogeneous thin wire bent into one turn of the helix of radius $R$ that rises by the distance $h$ per each turn
(iv) A homogeneous thin shell that occupies a hemisphere of radius $R$
(v) A homogeneous thin disk of radius $R$ that has a circular hole of radius $a<R/2$ and whose center is at the distance $R/2$ from the disk center
(vi) A homogeneous solid enclosed by the ellipsoid $x^2/a^2+y^2/b^2+z^2/c^2=1$ that has a square box cavity $[0,h]\times[0,h]\times[0,h]$
(vii) The part of the ball $x^2+y^2+z^2\le4$ that lies above the cone $z\sqrt3=\sqrt{x^2+y^2}$ and whose mass density at any point is proportional to its distance from the origin
(viii) The part of the spherical shell $a^2\le x^2+y^2+z^2\le b^2$ that lies above the $xy$ plane and whose mass density at any point is proportional to its distance from the $z$ axis
(ix) The part of the disk $x^2+y^2\le a^2$ in the first quadrant bounded by the lines $y=x$ and $y=\sqrt3\,x$ if the mass density at any point is proportional to its distance from the origin
(x) The part of the solid enclosed by the paraboloid $z=2-x^2-y^2$ and the cone $z=\sqrt{x^2+y^2}$ that lies in the first octant and whose mass density at any point is proportional to its distance from the $z$ axis
(xi) A homogeneous surface cut from the cone $z=\sqrt{x^2+y^2}$ by the cylinder $x^2+y^2=ax$
(xii) The part of a homogeneous sphere defined by $z=\sqrt{a^2-x^2-y^2}$, $x\ge0$, $y\ge0$, $x+y\le a$
(xiii) The arc of the homogeneous cycloid $x=a(t-\sin t)$, $y=a(1-\cos t)$, $0$ … $R$ that has mass $m$. The moment of inertia about the symmetry axis of the torus.
(iv) The moments of inertia $I_x$ and $I_y$ of the part of the disk of radius $a$ that lies in the first quadrant and whose mass density at any point is proportional to its distance from the $y$ axis.
(v) The solid homogeneous cone with height $h$ and radius of the base $a$. The moments of inertia about its symmetry axis, the axis through its vertex and perpendicular to the symmetry axis, and an axis that contains a diameter of the base.
(vi) The part of the homogeneous plane $x+y+z=a$; the moments of inertia about the coordinate axes.
(vii) The homogeneous triangle of mass $m$ whose vertices in polar coordinates are $(r,\theta)=(a,0)$, $(a,2\pi/3)$, $(a,4\pi/3)$; the polar moment of inertia $I_0=I_x+I_y$.
(viii) The homogeneous solid cylinder $x^2+y^2\le a^2$, $-h\le z\le h$ of mass $m$; the moment of inertia about the line parallel to the $z$ axis through the point $(a,0,0)$.
(ix) The homogeneous solid of mass density $\sigma_0$ bounded by the surface $(x^2+y^2+z^2)^2=a^2(x^2+y^2)$; the sum of the moments of inertia $I_x+I_y+I_z$.
(x) The lamina with a constant mass density $\sigma_0$ bounded by the circle $(x-a)^2+(y-a)^2=a^2$ and by the segments $0\le y\le a$, $0\le x\le a$; the moments of inertia about the coordinate axes.
(xi) The lamina with a constant mass density $\sigma_0$ bounded by the curves $xy=a^2$, $xy=2a^2$, $x=2y$, $2x=y$; the moments of inertia about the coordinate axes.
(xii) The solid that has a constant mass density $\sigma_0$ and is bounded by the ellipsoid $(x/a)^2+(y/b)^2+(z/c)^2=1$; moments of inertia about the coordinate axes.
(xiii) The spherical homogeneous shell of mass $m$ and radius $R$; the moment of inertia about a diameter of the sphere.

CHAPTER 15
Vector Calculus

110. Line Integrals of a Vector Field

110.1. Vector Fields. Consider an airflow in the atmosphere. The air velocity varies from point to point.
In order to describe the motion of the air, the air velocity must be defined as a function of position, which means that a velocity vector has to be assigned to every point in space. In other words, in contrast to ordinary functions, the air velocity is a vector-valued function of the position vector in space.

DEFINITION 15.1. (Vector Field). Let $E$ be a subset in space. A vector field on $E$ is a function $\mathbf F$ that assigns to each point $\mathbf r=(x,y,z)$ a vector $\mathbf F(\mathbf r)=(F_1(\mathbf r),F_2(\mathbf r),F_3(\mathbf r))$. The functions $F_1$, $F_2$, and $F_3$ are called the components of the vector field $\mathbf F$.

A vector field is continuous if its components are continuous. A vector field is differentiable if its components are differentiable. A simple example of a vector field is the gradient of a function, $\mathbf F(\mathbf r)=\nabla f(\mathbf r)$. The components of this vector field are the first-order partial derivatives:
$$\mathbf F(\mathbf r)=\nabla f(\mathbf r)\quad\Longleftrightarrow\quad F_1(\mathbf r)=f'_x(\mathbf r),\quad F_2(\mathbf r)=f'_y(\mathbf r),\quad F_3(\mathbf r)=f'_z(\mathbf r).$$

Many physical quantities are described by vector fields. Electric and magnetic fields are vector fields. All modern communication devices (radio, TV, cell phones, etc.) use electromagnetic waves. Visible light is also an electromagnetic wave. The propagation of electromagnetic waves in space is described by differential equations that relate electromagnetic fields at each point in space and each moment of time to a distribution of electric charges and currents (e.g., antennas). The gravitational force looks constant near the surface of the Earth, but on the scale of the solar system this is not so. If one thinks about a planet as a homogeneous ball of mass $M$, then the gravitational force exerted by it on a point mass $m$ depends on the position of the point mass relative to the planet's center according to Newton's law of gravity:
$$\mathbf F(\mathbf r)=-\frac{GMm}{r^3}\,\mathbf r=\left(-GMm\frac{x}{r^3},\ -GMm\frac{y}{r^3},\ -GMm\frac{z}{r^3}\right),$$

FIGURE 15.1. Left: Flow lines of a vector field $\mathbf F$ are curves to which the vector field is tangential. The flow lines are oriented by the direction of the vector field.
Right: Flow lines of the vector field $\mathbf F=(-y,x,0)$ in Example 15.1 are concentric circles oriented counterclockwise. The magnitude $\|\mathbf F\|=\sqrt{x^2+y^2}$ is constant along the flow lines and linearly increases with increasing distance from the origin.

where $G$ is Newton's gravitational constant, $\mathbf r$ is the position vector relative to the planet's center, and $r=\|\mathbf r\|$ is its length. The force is proportional to the position vector and hence parallel to it at each point. The minus sign indicates that $\mathbf F$ is directed opposite to $\mathbf r$, that is, the force is attractive; the gravitational force pulls toward its source (the planet). The magnitude $\|\mathbf F\|=GMmr^{-2}$ decreases with increasing distance $r$. So the gravitational vector field can be visualized by plotting vectors of length $\|\mathbf F\|$ at each point in space pointing toward the origin. The magnitudes of these vectors become smaller for points farther away from the origin. This observation leads to the concept of flow lines of a vector field.

110.2. Flow Lines of a Vector Field.

DEFINITION 15.2. (Flow Lines of a Vector Field). The flow line of a vector field $\mathbf F$ is a curve in space such that, at any point $\mathbf r$, the vector field $\mathbf F(\mathbf r)$ is tangent to it.

The direction of $\mathbf F$ defines the orientation of flow lines. The direction of a tangent vector $\mathbf F$ is shown by arrows on the flow lines as depicted in the left panel of Figure 15.1. For example, the flow lines of the planet's gravitational field are straight lines oriented toward the center of the planet. Flow lines of a gradient vector field $\mathbf F=\nabla f\ne\mathbf 0$ are normal to level surfaces of the function $f$ and oriented in the direction in which $f$ increases most rapidly (Theorem 13.16). They are the curves of steepest ascent of the function $f$. Flow lines of the velocity vector field of the air are often shown in weather forecasts to indicate the wind direction over large areas.
For example, flow lines of the air velocity in a hurricane would look like closed loops around the eye of the hurricane.

The qualitative behavior of flow lines may be understood by plotting vectors $\mathbf{F}$ at several points $\mathbf{r}_i$ and sketching curves through them so that the vectors $\mathbf{F}_i = \mathbf{F}(\mathbf{r}_i)$ are tangent to the curves. Finding the exact shape of the flow lines requires solving differential equations. If $\mathbf{r} = \mathbf{r}(t)$ is a parametric equation of a flow line, then $\mathbf{r}'(t)$ is parallel to $\mathbf{F}(\mathbf{r}(t))$. So the derivative $\mathbf{r}'(t)$ must be proportional to $\mathbf{F}(\mathbf{r}(t))$, which defines a system of differential equations for the components of the vector function $\mathbf{r}(t)$, for example, $\mathbf{r}'(t) = \mathbf{F}(\mathbf{r}(t))$. To find a flow line through a particular point $\mathbf{r}_0$, the differential equations must be supplemented by initial conditions, for example, $\mathbf{r}(t_0) = \mathbf{r}_0$. If the equations have a unique solution, then the flow line through $\mathbf{r}_0$ exists and is given by the solution.

EXAMPLE 15.1. Analyze flow lines of the planar vector field $\mathbf{F} = (-y, x, 0)$.

SOLUTION: By noting that $\mathbf{F} \cdot \mathbf{r} = 0$, it is concluded that, at any point, $\mathbf{F}$ is perpendicular to the position vector $\mathbf{r} = (x, y, 0)$ in the plane. So flow lines are curves whose tangent vector is perpendicular to the position vector. If $\mathbf{r} = \mathbf{r}(t)$ is a parametric equation of such a curve, then $\mathbf{r}(t) \cdot \mathbf{r}'(t) = 0$ or $(d/dt)\|\mathbf{r}(t)\|^2 = 0$ and hence $\|\mathbf{r}(t)\|^2 = \text{const}$, which is a circle centered at the origin. So flow lines are concentric circles. At the point $(1, 0, 0)$, the vector field is directed along the $y$ axis: $\mathbf{F}(1, 0, 0) = (0, 1, 0) = \hat{\mathbf{e}}_2$. Therefore, the flow lines are oriented counterclockwise. The magnitude $\|\mathbf{F}\| = \sqrt{x^2 + y^2}$ remains constant on each circle and increases with increasing circle radius. The flow lines are shown in the right panel of Figure 15.1. □

110.3. Line Integral of a Vector Field.

The work done by a constant force $\mathbf{F}$ in moving an object along a straight line is given by $W = \mathbf{F} \cdot \mathbf{d}$, where $\mathbf{d}$ is the displacement vector (Section 73.5).
Suppose that the force varies in space and the displacement trajectory is no longer a straight line. What is the work done by the force? This question is evidently of great practical significance. To answer it, the concept of the line integral of a vector field was developed.

FIGURE 15.2. Left: To calculate the work done by a continuous force $\mathbf{F}(\mathbf{r})$ in moving a point object along a smooth curve $C$, the latter is partitioned into segments $C_i$ with arc length $\Delta s$. The work done by the force along a partition segment is $\mathbf{F}(\mathbf{r}_i^*) \cdot \mathbf{d}_i$, where the displacement vector is approximated by the oriented segment of length $\Delta s$ that is tangent to the curve at a sample point $\mathbf{r}_i^*$, that is, $\mathbf{d}_i = \hat{\mathbf{T}}(\mathbf{r}_i^*)\,\Delta s$, where $\hat{\mathbf{T}}$ is the unit tangent vector along the curve. Right: An illustration to Example 15.2. The closed contour of integration in the line integral consists of two smooth pieces, one turn of the helix $C_1$ and the straight line segment $C_2$. The line integral is the sum of line integrals along $C_1$ and $C_2$.

Let $C$ be a smooth curve that goes from a point $\mathbf{r}_a$ to a point $\mathbf{r}_b$ and has length $L$. Consider a partition of $C$ by segments $C_i$, $i = 1, 2, \ldots, N$, of length $\Delta s = L/N$. Following the discussion of smooth curves in Sections 80.3 and 84.3, each segment can be approximated by a straight line segment of length $\Delta s$ oriented along the unit tangent vector $\hat{\mathbf{T}}(\mathbf{r}_i^*)$ at a sample point $\mathbf{r}_i^* \in C_i$ (see the left panel of Figure 15.2). The work along the segment $C_i$ can therefore be approximated by $\Delta W_i = \mathbf{F}(\mathbf{r}_i^*) \cdot \hat{\mathbf{T}}(\mathbf{r}_i^*)\,\Delta s$ so that the total work is approximately the sum $W \approx \Delta W_1 + \Delta W_2 + \cdots + \Delta W_N$. The actual work should not depend on the choice of sample points. This problem is resolved by the usual trick of integral calculus: refining the partition, finding the lower and upper sums, and taking their limits. If these limits exist and coincide, the limiting value does not depend on the choice of sample points and is the sought-after work.
The technicalities involved may be spared by noting that $\Delta W_i = f(\mathbf{r}_i^*)\,\Delta s$, where $f(\mathbf{r}) = \mathbf{F}(\mathbf{r}) \cdot \hat{\mathbf{T}}(\mathbf{r})$ and $\hat{\mathbf{T}}(\mathbf{r})$ denotes the unit tangent vector at a point $\mathbf{r} \in C$. The approximate total work is then a Riemann sum of $f$ along $C$. So, if the function $f$ is integrable on the curve $C$, then the work is the line integral of the tangential component $\mathbf{F} \cdot \hat{\mathbf{T}}$ of the force.

DEFINITION 15.3. (Line Integral of a Vector Field). The line integral of a vector field $\mathbf{F}$ along a smooth curve $C$ is

$$\int_C \mathbf{F} \cdot d\mathbf{r} = \int_C \mathbf{F} \cdot \hat{\mathbf{T}}\,ds,$$

where $\hat{\mathbf{T}}$ is the unit tangent vector to $C$, provided the tangential component $\mathbf{F} \cdot \hat{\mathbf{T}}$ of the vector field is integrable on $C$.

The integrability of $\mathbf{F} \cdot \hat{\mathbf{T}}$ is defined in the sense of line integrals for ordinary functions (see Definition 14.20). In particular, the line integral of a continuous vector field over a smooth curve of a finite length always exists.

110.4. Evaluation of Line Integrals of Vector Fields.

The line integral of a vector field is evaluated in much the same way as the line integral of a function.

THEOREM 15.1. (Evaluation of Line Integrals). Let $\mathbf{F} = (F_1, F_2, F_3)$ be a continuous vector field on $E$ and let $C$ be a smooth curve in $E$ that originates from a point $\mathbf{r}_a$ and terminates at a point $\mathbf{r}_b$. Suppose that $\mathbf{r}(t) = (x(t), y(t), z(t))$, $t \in [a, b]$, is a vector function that traces out the curve $C$ only once so that $\mathbf{r}(a) = \mathbf{r}_a$ and $\mathbf{r}(b) = \mathbf{r}_b$. Then

$$\int_C \mathbf{F}(\mathbf{r}) \cdot d\mathbf{r} = \int_C \mathbf{F} \cdot \hat{\mathbf{T}}\,ds = \int_a^b \mathbf{F}(\mathbf{r}(t)) \cdot \mathbf{r}'(t)\,dt = \int_a^b \Big(F_1(\mathbf{r}(t))\,x'(t) + F_2(\mathbf{r}(t))\,y'(t) + F_3(\mathbf{r}(t))\,z'(t)\Big)\,dt. \tag{15.1}$$

PROOF. The unit tangent vector reads $\hat{\mathbf{T}} = \mathbf{r}'/\|\mathbf{r}'\|$ and $ds = \|\mathbf{r}'\|\,dt$. Therefore, $\hat{\mathbf{T}}\,ds = \mathbf{r}'(t)\,dt$. For a smooth curve, $\mathbf{r}'(t)$ is continuous on $[a, b]$. Therefore, by the continuity of the vector field, the function $f(t) = \mathbf{F}(\mathbf{r}(t)) \cdot \mathbf{r}'(t)$ is continuous on $[a, b]$, and the conclusion of the theorem follows from Theorem 14.21. □
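As a numerical illustration of (15.1), the following Python sketch approximates the integral of $\mathbf{F}(\mathbf{r}(t)) \cdot \mathbf{r}'(t)$ over $[a, b]$ by the midpoint rule. The names `line_integral`, `field`, `circle`, and `circle_prime`, the number of subintervals, and the choice of the planar field $\mathbf{F} = (-y, x, 0)$ on the unit circle are assumptions made for this example only; the derivative $\mathbf{r}'(t)$ is supplied by hand.

```python
import math

def line_integral(F, r, r_prime, a, b, n=2000):
    """Midpoint-rule approximation of the integral of F(r(t)) . r'(t) over [a, b]."""
    h = (b - a) / n
    total = 0.0
    for k in range(n):
        t = a + (k + 0.5) * h          # midpoint of the k-th subinterval
        Fx, Fy, Fz = F(*r(t))          # field evaluated on the curve
        dx, dy, dz = r_prime(t)        # tangent vector r'(t)
        total += (Fx * dx + Fy * dy + Fz * dz) * h
    return total

def field(x, y, z):
    # Illustrative planar field F = (-y, x, 0)
    return (-y, x, 0.0)

def circle(t):
    # Unit circle traced counterclockwise
    return (math.cos(t), math.sin(t), 0.0)

def circle_prime(t):
    return (-math.sin(t), math.cos(t), 0.0)

def circle_reversed(t):
    # Same circle traced clockwise
    return circle(2.0 * math.pi - t)

def circle_reversed_prime(t):
    dx, dy, dz = circle_prime(2.0 * math.pi - t)
    return (-dx, -dy, -dz)             # chain rule: d/dt of r(2*pi - t)

two_pi = 2.0 * math.pi
ccw = line_integral(field, circle, circle_prime, 0.0, two_pi)
cw = line_integral(field, circle_reversed, circle_reversed_prime, 0.0, two_pi)
print(ccw, cw)
```

On the counterclockwise circle $\mathbf{F}(\mathbf{r}(t)) \cdot \mathbf{r}'(t) = \sin^2 t + \cos^2 t = 1$, so the first value should be close to $2\pi$; tracing the same circle clockwise gives a value close to $-2\pi$, which illustrates that the line integral of a vector field depends on the orientation of the curve.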
Equation (15.1) also holds if $C$ is piecewise smooth and $\mathbf{F}$ has a finite number of bounded jump discontinuities along $C$, much like in the case of the line integral of ordinary functions. Owing to the representation (15.1) and the relations $dx = x'\,dt$, $dy = y'\,dt$, and $dz = z'\,dt$, the line integral is often written in the form

$$\int_C \mathbf{F} \cdot d\mathbf{r} = \int_C F_1\,dx + F_2\,dy + F_3\,dz. \tag{15.2}$$

For a smooth curve traversed by a vector function $\mathbf{r}(t)$, the differential $d\mathbf{r}(t)$ is tangent to the curve.

In contrast to the line integral of ordinary functions, the line integral of a vector field depends on the orientation of $C$. The orientation of $C$ is fixed by the conditions $\mathbf{r}(a) = \mathbf{r}_a$ and $\mathbf{r}(b) = \mathbf{r}_b$ for a vector function $\mathbf{r}(t)$, where $a \le t \le b$, provided the vector function traces out the curve only once. If $\mathbf{r}(t)$ traces out $C$ from $\mathbf{r}_b$ to $\mathbf{r}_a$, then the orientation is reversed, and such a curve is denoted by $-C$. The line integral changes its sign when the orientation of the curve is reversed:

$$\int_{-C} \mathbf{F} \cdot d\mathbf{r} = -\int_C \mathbf{F} \cdot d\mathbf{r} \tag{15.3}$$

because the direction of the derivative $\mathbf{r}'(t)$ is reversed for all $t$. If $C$ is piecewise smooth (e.g., the union of smooth curves $C_1$ and $C_2$), then the additivity of the integral should be used:

$$\int_C \mathbf{F} \cdot d\mathbf{r} = \int_{C_1} \mathbf{F} \cdot d\mathbf{r} + \int_{C_2} \mathbf{F} \cdot d\mathbf{r}.$$

Line Integral Along a Parametric Curve. A parametric curve is defined by a vector function $\mathbf{r}(t)$ on $[a, b]$ (recall Definition 12.4). The vector function $\mathbf{r}(t)$ may trace its range (as a point set in space) or some parts of it several times as $t$ changes from $a$ to $b$. Furthermore, two different vector functions $\mathbf{r}_1(t)$ and $\mathbf{r}_2(t)$ on $[a, b]$ may have the same range. For example, $\mathbf{r}_1(t) = (\cos t, \sin t, 0)$ and $\mathbf{r}_2(t) = (\cos(2t), \sin(2t), 0)$ have the same range on $[0, 2\pi]$, which is the circle of unit radius, but $\mathbf{r}_2(t)$ traces out the circle twice. The line integral over a parametric curve is defined by the relation (15.1). A parametric curve is much like the trajectory of a particle that can pass through the same points multiple times.
So the relation (15.1) defines the work done by a nonconstant force $\mathbf{F}$ along a particle's trajectory $\mathbf{r} = \mathbf{r}(t)$.

The evaluation of a line integral includes the following steps:

Step 1. If the curve $C$ is defined as a point set in space by some geometrical means, then find its parametric equations $\mathbf{r} = \mathbf{r}(t)$ that agree with the orientation of $C$. Here it is useful to remember that, if $\mathbf{r}(t)$ corresponds to the orientation opposite to the required one, then it can still be used according to (15.3).

Step 2. Restrict the range of $t$ to an interval $[a, b]$ so that $C$ is traced out only once by $\mathbf{r}(t)$.

Step 3. Substitute $\mathbf{r} = \mathbf{r}(t)$ into the arguments of $\mathbf{F}$ to obtain the values of $\mathbf{F}$ on $C$ and calculate the derivative $\mathbf{r}'(t)$ and the dot product $\mathbf{F}(\mathbf{r}(t)) \cdot \mathbf{r}'(t)$.

Step 4. Evaluate the (ordinary) integral (15.1).

EXAMPLE 15.2. Evaluate the line integral of $\mathbf{F} = (-y, x, z^2)$ along a closed contour $C$ that consists of two parts. The first part is one turn of a helix of radius $R$, which winds about the $z$ axis counterclockwise as viewed from the top of the $z$ axis, starting from the point $\mathbf{r}_a = (R, 0, 0)$ and ending at the point $\mathbf{r}_b = (R, 0, 2\pi h)$. The second part is a straight line segment from $\mathbf{r}_b$ to $\mathbf{r}_a$.

SOLUTION: Let $C_1$ be one turn of the helix and let $C_2$ be the straight line segment. Two line integrals have to be evaluated. The parametric equations of the helix are $\mathbf{r}(t) = (R\cos t, R\sin t, ht)$ so that $\mathbf{r}(0) = (R, 0, 0)$ and $\mathbf{r}(2\pi) = (R, 0, 2\pi h)$ as required by the orientation of $C_1$. Note the positive signs at $\cos t$ and $\sin t$ in the parametric equations that are necessary to make the helix wind about the $z$ axis counterclockwise (see Study Problem 12.1). The range of $t$ has to be restricted to $[0, 2\pi]$. Then $\mathbf{r}'(t) = (-R\sin t, R\cos t, h)$. Therefore,

$$\mathbf{F}(\mathbf{r}(t)) \cdot \mathbf{r}'(t) = (-R\sin t, R\cos t, h^2t^2) \cdot (-R\sin t, R\cos t, h) = R^2 + h^3t^2,$$

$$\int_{C_1} \mathbf{F} \cdot d\mathbf{r} = \int_0^{2\pi} \mathbf{F}(\mathbf{r}(t)) \cdot \mathbf{r}'(t)\,dt = \int_0^{2\pi} (R^2 + h^3t^2)\,dt = 2\pi R^2 + \frac{(2\pi h)^3}{3}.$$

The parametric equations of the line through two points $\mathbf{r}_a$ and $\mathbf{r}_b$ are $\mathbf{r}(t) = \mathbf{r}_a + \mathbf{v}t$, where $\mathbf{v} = \mathbf{r}_b - \mathbf{r}_a$ is the vector parallel to the line, or, in the components, $\mathbf{r}(t) = (R, 0, 0) + t(0, 0, 2\pi h) = (R, 0, 2\pi ht)$. Then $\mathbf{r}(0) = \mathbf{r}_a$ and $\mathbf{r}(1) = \mathbf{r}_b$ so that the orientation is reversed if $t \in [0, 1]$. These parametric equations describe the curve $-C_2$. One has $\mathbf{r}'(t) = (0, 0, 2\pi h)$ and hence

$$\mathbf{F}(\mathbf{r}(t)) \cdot \mathbf{r}'(t) = (0, R, (2\pi h)^2t^2) \cdot (0, 0, 2\pi h) = (2\pi h)^3t^2,$$

$$\int_{C_2} \mathbf{F} \cdot d\mathbf{r} = -\int_{-C_2} \mathbf{F} \cdot d\mathbf{r} = -(2\pi h)^3 \int_0^1 t^2\,dt = -\frac{(2\pi h)^3}{3}.$$

The line integral along $C$ is the sum of these integrals, which is equal to $2\pi R^2$. □

110.5. Study Problem.

Problem 15.1. Find the work done by the force $\mathbf{F} = (2x, 3y^2, 4z^3)$ along any smooth curve originating from the point $(0, 0, 0)$ and ending at the point $(1, 1, 1)$.

SOLUTION: For any infinitesimal part of the curve, the work is

$$\mathbf{F} \cdot d\mathbf{r} = 2x\,dx + 3y^2\,dy + 4z^3\,dz = d(x^2 + y^3 + z^4).$$

If $\mathbf{r}(t) = (x(t), y(t), z(t))$ is a parametric equation of a smooth curve, $a \le t \le b$, such that $\mathbf{r}(a) = (0, 0, 0)$ and $\mathbf{r}(b) = (1, 1, 1)$, then the total work is

$$\int_C \mathbf{F} \cdot d\mathbf{r} = \int_a^b d\big(x^2(t) + y^3(t) + z^4(t)\big) = \big(x^2(t) + y^3(t) + z^4(t)\big)\Big|_a^b = 3.$$

110.6. Exercises.
(1) Sketch flow lines of the given planar vector field:
(i) $\mathbf{F} = (ax, by)$, where $a$ and $b$ are positive
(ii) $\mathbf{F} = (ay, bx)$, where $a$ and $b$ are positive
(iii) $\mathbf{F} = (ay, bx)$, where $a$ and $b$ have different signs
(iv) $\mathbf{F} = \nabla u$, $u = \tan^{-1}(y/x)$
(v) $\mathbf{F} = \nabla u$, $u = \ln[(x^2 + y^2)^{-1/2}]$
(vi) $\mathbf{F} = \nabla u$, $u = \ln[(x - a)^2 + (y - b)^2]$

(2) Sketch flow lines of the given vector field in space:
(i) $\mathbf{F} = (ax, by, cz)$, where $a$, $b$, and $c$ are positive
(ii) $\mathbf{F} = (ax, by, cz)$, where $a$ and $b$ are positive, while $c$ is negative
(iii) $\mathbf{F} = (y, -x, a)$, where $a$ is a constant
(iv) $\mathbf{F} = \nabla\|\mathbf{r}\|$, $\mathbf{r} = (x, y, z)$
(v) $\mathbf{F} = \nabla\|\mathbf{r}\|^{-1}$, $\mathbf{r} = (x, y, z)$
(vi) $\mathbf{F} = \nabla u$, $u = (x/a)^2 + (y/b)^2 + (z/c)^2$
(vii) $\mathbf{F} = \nabla u$, $u = \sqrt{x^2 + y^2 + (z + c)^2} + \sqrt{x^2 + y^2 + (z - c)^2}$, where $c$ is positive
(viii) $\mathbf{F} = \mathbf{a} \times \mathbf{r}$, where $\mathbf{a}$ is a constant vector and $\mathbf{r} = (x, y, z)$
(ix) $\mathbf{F} = \nabla u$, $u = z/\sqrt{x^2 + y^2 + z^2}$

(3) A ball rotates at a constant rate $\omega$ about its diameter parallel to a unit vector $\hat{\mathbf{n}}$. If the origin of the coordinate system is set at the center of the ball, find the velocity vector field as a function of the position vector $\mathbf{r}$ of a point of the ball.

(4) Evaluate the line integral $\int_C \mathbf{F} \cdot d\mathbf{r}$ for the given vector field $\mathbf{F}$ and the specified curve $C$:
(i) $\mathbf{F} = (y, xy, 0)$ and $C$ is the parametric curve $\mathbf{r}(t) = (t^2, t^3, 0)$ for $t \in [0, 1]$
(ii) $\mathbf{F} = (z, yx, zy)$ and $C$ is the ellipse $x^2/a^2 + y^2/b^2 = 1$ oriented clockwise
(iii) $\mathbf{F} = (z, yx, zy)$ and $C$ is the parametric curve $\mathbf{r}(t) = (2t, t + t^2, 1 + t^3)$ from the point $(-2, 0, 0)$ to the point $(2, 2, 2)$
(iv) $\mathbf{F} = (-y, x, z)$ and $C$ is the boundary of the part of the paraboloid $z = a^2 - x^2 - y^2$ that lies in the first octant; $C$ is oriented counterclockwise as viewed from the top of the $z$ axis
(v) $\mathbf{F} = (-z, 0, x)$ and $C$ is the boundary of the part of the sphere $x^2 + y^2 + z^2 = a^2$ that lies in the first octant; $C$ is oriented clockwise as viewed from the top of the $z$ axis
(vi) $\mathbf{F} = \mathbf{a} \times \mathbf{r}$, where $\mathbf{a}$ is a constant vector and $\mathbf{r} = (x, y, z)$; $C$ is the straight line segment from $\mathbf{r}_1$ to $\mathbf{r}_2$
(vii) $\mathbf{F} = (y\sin z, z\sin x, x\sin y)$ and $C$ is the parametric curve $\mathbf{r}(t) = (\cos t, \sin t, \sin(5t))$ for $t \in [0, \pi]$
(viii) $\mathbf{F} = (y, -xz, y(x^2 + z^2))$ and $C$ is the intersection of the cylinder $x^2 + z^2 = 1$ with the plane $x + y + z = 1$ that is oriented counterclockwise as viewed from the top of the $y$ axis
(ix) $\mathbf{F} = (-y\sin(\pi z^2), x\cos(\pi z^2), e^{xyz})$ and $C$ is the intersection of the cone $z = \sqrt{x^2 + y^2}$ and the sphere $x^2 + y^2 + z^2 = 2$; $C$ is oriented counterclockwise as viewed from the top of the $z$ axis
(x) F = (ema, ex, 0) and $C$ is the parabola in the $xy$ plane from the origin to the point $(1, 1)$
(xi) $\mathbf{F} = (x, y, z)$ and $C$ is an elliptic helix $\mathbf{r}(t) = (a\cos t, b\sin t, ct)$, $0 \le t \le 2\pi$
(xii) $\mathbf{F} = (y^{-1}, z^{-1}, x^{-1})$ and $C$ is the straight line segment from the point $(1, 1, 1)$ to the point $(2, 4, 8)$
(xiii) $\mathbf{F} = (e^{y-z}, e^{z-x}, e^{x-y})$ and $C$ is the straight line segment from the origin to the point $(1, 3, 5)$
(xiv) $\mathbf{F} = (y + z, z + x, x + y)$ and $C$ is the shortest arc on the sphere $x^2 + y^2 + z^2 = 25$ from the point $(3, 4, 0)$ to the point $(0, 0, 5)$

(5) Find the work done by the constant force $\mathbf{F}$ in moving a point object along a smooth path from a point $\mathbf{r}_a$ to a point $\mathbf{r}_b$.
(6) Find the work done by the force $\mathbf{F} = f'(r)\,\mathbf{r}/r$ in moving a point object along a smooth path from a point $\mathbf{r}_a$ to a point $\mathbf{r}_b$, where the derivative $f'$ of $f$ is a continuous function of $r = \|\mathbf{r}\|$.

(7) Find the work done by the force $\mathbf{F} = (-y, x, c)$, where $c$ is a constant, in moving a point object along:
(i) The circle $x^2 + y^2 = 1$, $z = 0$
(ii) The circle $(x - 2)^2 + y^2 = 1$, $z = 0$

(8) The force acting on a charged particle that moves in a magnetic field $\mathbf{B}$ and an electric field $\mathbf{E}$ is $\mathbf{F} = e\mathbf{E} + (e/c)\,\mathbf{v} \times \mathbf{B}$, where $\mathbf{v}$ is the velocity of the particle, $e$ is its electric charge, and $c$ is the speed of light in a vacuum. Find the work done by the force along a trajectory originating from a point $\mathbf{r}_a$ and ending at the point $\mathbf{r}_b$ if
(i) The electric and magnetic fields are constant.
(ii) The electric field vanishes, but the magnetic field is a continuous function of the position vector, $\mathbf{B} = \mathbf{B}(\mathbf{r})$.

111. Fundamental Theorem for Line Integrals

Recall the fundamental theorem of calculus, which asserts that, if the derivative $f'(x)$ is continuous on an interval $[a, b]$, then

$$\int_a^b f'(x)\,dx = f(b) - f(a).$$

It turns out that there is an analog of this theorem for line integrals.

111.1. Conservative Vector Fields.

DEFINITION 15.4. (Conservative Vector Field and Its Potential). A vector field $\mathbf{F}$ in a region $E$ is said to be conservative if there is a function $f$, called a potential of $\mathbf{F}$, such that $\mathbf{F} = \nabla f$ in $E$.

Conservative vector fields play a significant role in many practical applications. It has been proved earlier (see Study Problem 13.14) that if a particle moves along a trajectory $\mathbf{r} = \mathbf{r}(t)$ under the force $\mathbf{F} = -\nabla U$, then its energy $E = mv^2/2 + U(\mathbf{r})$, where $\mathbf{v} = \mathbf{r}'$ is the velocity and $v = \|\mathbf{v}\|$, is conserved along the trajectory, $dE/dt = 0$. In particular, Newton's gravitational force is conservative, $\mathbf{F} = -\nabla U$, where $U(\mathbf{r}) = -GMm\,\|\mathbf{r}\|^{-1}$. A static electric field (the Coulomb field) created by a distribution of static electric charges is also conservative.
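The statement that Newton's gravitational force equals $-\nabla U$ with $U(\mathbf{r}) = -GMm\,\|\mathbf{r}\|^{-1}$ can be spot-checked numerically. The Python sketch below compares the force $-GMm\,\mathbf{r}/r^3$ with a central-difference approximation of $-\nabla U$ at one sample point. The value `GMm = 1.0`, the step `h`, the sample point, and all function names are assumptions chosen only for illustration.

```python
import math

GMm = 1.0  # assumed value of the product G*M*m, for illustration only

def U(x, y, z):
    """Potential energy U(r) = -GMm / ||r||."""
    return -GMm / math.sqrt(x * x + y * y + z * z)

def force(x, y, z):
    """Newton's gravitational force F(r) = -GMm r / ||r||^3."""
    r = math.sqrt(x * x + y * y + z * z)
    c = -GMm / r ** 3
    return (c * x, c * y, c * z)

def minus_grad_U(x, y, z, h=1e-5):
    """Central-difference approximation of -grad U."""
    return (
        -(U(x + h, y, z) - U(x - h, y, z)) / (2.0 * h),
        -(U(x, y + h, z) - U(x, y - h, z)) / (2.0 * h),
        -(U(x, y, z + h) - U(x, y, z - h)) / (2.0 * h),
    )

point = (1.0, 2.0, 2.0)              # sample point with ||r|| = 3
exact = force(*point)
approx = minus_grad_U(*point)
max_error = max(abs(e - a) for e, a in zip(exact, approx))
print(exact, approx, max_error)
```

The two vectors should agree up to the small finite-difference error, which is consistent with $\mathbf{F} = -\nabla U$ away from the origin.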
Continuous conservative vector fields have a remarkable property.

THEOREM 15.2. (Fundamental Theorem for Line Integrals). Let $C$ be a smooth curve in a region $E$ with initial and terminal points $\mathbf{r}_a$ and $\mathbf{r}_b$, respectively. Let $f$ be a function on $E$ whose gradient $\nabla f$ is continuous on $C$. Then

$$\int_C \nabla f \cdot d\mathbf{r} = f(\mathbf{r}_b) - f(\mathbf{r}_a). \tag{15.4}$$

PROOF. Let $\mathbf{r} = \mathbf{r}(t)$, $t \in [a, b]$, be the parametric equations of $C$ such that $\mathbf{r}(a) = \mathbf{r}_a$ and $\mathbf{r}(b) = \mathbf{r}_b$. Then, by (15.1) and the chain rule,

$$\int_C \nabla f \cdot d\mathbf{r} = \int_a^b (f'_x x' + f'_y y' + f'_z z')\,dt = \int_a^b \frac{d}{dt} f(\mathbf{r}(t))\,dt = f(\mathbf{r}_b) - f(\mathbf{r}_a).$$

The latter equality holds by the fundamental theorem of calculus and the continuity of the partial derivatives of $f$ and of $\mathbf{r}'(t)$ for a smooth curve. □

111.2. Path Independence of Line Integrals.

DEFINITION 15.5. (Path Independence of Line Integrals). A continuous vector field $\mathbf{F}$ has path-independent line integrals if

$$\int_{C_1} \mathbf{F} \cdot d\mathbf{r} = \int_{C_2} \mathbf{F} \cdot d\mathbf{r}$$

for any two simple, piecewise-smooth curves $C_1$ and $C_2$ in the domain of $\mathbf{F}$ with the same endpoints.

Recall that a curve is simple if it does not intersect itself (see Section 79.3). An important consequence of the fundamental theorem for line integrals is that the work done by a continuous conservative force, $\mathbf{F} = \nabla f$, is path-independent. So a criterion for a vector field to be conservative would be advantageous for evaluating line integrals because for a conservative vector field a curve may be deformed at convenience without changing the value of the integral.

THEOREM 15.3. (Path-Independent Property). Let $\mathbf{F}$ be a continuous vector field on an open region $E$. Then $\mathbf{F}$ has path-independent line integrals if and only if its line integral vanishes along every piecewise-smooth, simple, closed curve $C$ in $E$. In that case, there exists a function $f$ such that $\mathbf{F} = \nabla f$:

$$\mathbf{F} = \nabla f \iff \oint_C \mathbf{F} \cdot d\mathbf{r} = 0.$$

The symbol $\oint$ is often used to denote line integrals along a closed curve.

PROOF. Pick a point $\mathbf{r}_0$ in $E$ and consider any smooth curve $C$ from $\mathbf{r}_0$ to a point $\mathbf{r} = (x, y, z) \in E$.
The idea is to prove that the function

$$f(\mathbf{r}) = \int_C \mathbf{F} \cdot d\mathbf{r} \tag{15.5}$$

is a potential of $\mathbf{F}$, that is, to prove that $\nabla f = \mathbf{F}$ under the condition that the line integral of $\mathbf{F}$ vanishes for every closed curve in $E$. This "guess" for $f$ is motivated by the fundamental theorem for line integrals (15.4), where $\mathbf{r}_b$ is replaced by a generic point $\mathbf{r} \in E$. The potential is defined up to an additive constant ($\nabla(f + \text{const}) = \nabla f$) so the choice of a fixed point $\mathbf{r}_0$ is irrelevant.

First, note that the value of $f$ is independent of the choice of $C$. Consider two such curves $C_1$ and $C_2$. Then the union of $C_1$ and $-C_2$ (the curve $C_2$ whose orientation is reversed) is a closed curve, and the line integral along it vanishes by the hypothesis. On the other hand, this line integral is the sum of line integrals along $C_1$ and $-C_2$. By the property (15.3), the line integrals along $C_1$ and $C_2$ coincide.

To calculate the derivative $f'_x(\mathbf{r}) = \lim_{h \to 0} (f(\mathbf{r} + h\hat{\mathbf{e}}_1) - f(\mathbf{r}))/h$, where $\hat{\mathbf{e}}_1 = (1, 0, 0)$, let us express the difference $f(\mathbf{r} + h\hat{\mathbf{e}}_1) - f(\mathbf{r})$ via a line integral. Note that $E$ is open, which means that a ball of sufficiently small radius centered at any point in $E$ is contained in $E$ (i.e., $\mathbf{r} + h\hat{\mathbf{e}}_1 \in E$ for a sufficiently small $h$). Since the value of $f$ is path-independent, for the point $\mathbf{r} + h\hat{\mathbf{e}}_1$, the curve can be chosen so that it goes from $\mathbf{r}_0$ to $\mathbf{r}$ and then from $\mathbf{r}$ to $\mathbf{r} + h\hat{\mathbf{e}}_1$ along the straight line segment. Denote the latter by $\Delta C$. Therefore,

$$f(\mathbf{r} + h\hat{\mathbf{e}}_1) - f(\mathbf{r}) = \int_{\Delta C} \mathbf{F} \cdot d\mathbf{r}$$

because the line integral of $\mathbf{F}$ from $\mathbf{r}_0$ to $\mathbf{r}$ is path-independent. A vector function that traces out $\Delta C$ is $\mathbf{r}(t) = (t, y, z)$ if $x \le t \le x + h$. Therefore, $\mathbf{r}'(t) = \hat{\mathbf{e}}_1$ and $\mathbf{F}(\mathbf{r}(t)) \cdot \mathbf{r}'(t) = F_1(t, y, z)$. Thus, for a fixed number $a$,

$$f'_x(\mathbf{r}) = \lim_{h \to 0} \frac{1}{h} \int_x^{x+h} F_1(t, y, z)\,dt = \lim_{h \to 0} \frac{1}{h} \left( \int_a^{x+h} - \int_a^x \right) F_1(t, y, z)\,dt = \frac{\partial}{\partial x} \int_a^x F_1(t, y, z)\,dt = F_1(x, y, z) = F_1(\mathbf{r})$$

by the continuity of $F_1$. The equalities $f'_y = F_2$ and $f'_z = F_3$ are established similarly. The details are omitted. □
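The construction (15.5) can be tried out numerically on a field whose potential is already known. The Python sketch below uses the force $\mathbf{F} = (2x, 3y^2, 4z^3)$ of Study Problem 15.1, for which $x^2 + y^3 + z^4$ is a potential, and evaluates the line integral from $\mathbf{r}_0 = (0, 0, 0)$ to a point $\mathbf{r}$ along two different piecewise-straight paths. The function names, the midpoint rule, the endpoint $\mathbf{r}$, and the intermediate point `via` are assumptions made for this illustration.

```python
def segment_integral(F, p, q, n=2000):
    """Midpoint-rule line integral of F along the straight segment from p to q."""
    v = tuple(qi - pi for pi, qi in zip(p, q))   # r'(t) = q - p for r(t) = p + t (q - p)
    h = 1.0 / n
    total = 0.0
    for k in range(n):
        t = (k + 0.5) * h
        point = tuple(pi + t * vi for pi, vi in zip(p, v))
        total += sum(Fi * vi for Fi, vi in zip(F(*point), v)) * h
    return total

def F(x, y, z):
    return (2.0 * x, 3.0 * y ** 2, 4.0 * z ** 3)

def f_exact(x, y, z):
    # Known potential of F, used only for comparison
    return x ** 2 + y ** 3 + z ** 4

r0 = (0.0, 0.0, 0.0)
r = (1.0, 2.0, -1.0)
via = (3.0, -1.0, 2.0)   # arbitrary intermediate point for the second path

f_direct = segment_integral(F, r0, r)
f_bent = segment_integral(F, r0, via) + segment_integral(F, via, r)
print(f_direct, f_bent, f_exact(*r))
```

Both paths should give values close to $f(\mathbf{r}) - f(\mathbf{r}_0) = 10$, in agreement with the path independence used in the proof.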
Although the path independence property does provide a necessary and sufficient condition for a vector field to be conservative, it is rather impractical to verify (one cannot evaluate line integrals along every closed curve!). A more feasible and practical criterion is needed, which is established next. It is worth noting that (15.5) gives a practical method of finding a potential if the vector field is found to be conservative (see Study Problem 15.3).

111.3. The Curl of a Vector Field.

According to the rules of vector algebra, the product of a vector $\mathbf{a} = (a_1, a_2, a_3)$ and a number $s$ is defined by $s\mathbf{a} = (sa_1, sa_2, sa_3)$. By analogy, the gradient $\nabla f$ can be viewed as the product of the vector $\nabla = (\partial/\partial x, \partial/\partial y, \partial/\partial z)$ and a scalar $f$:

$$\nabla f = \left(\frac{\partial}{\partial x}, \frac{\partial}{\partial y}, \frac{\partial}{\partial z}\right) f = \left(\frac{\partial f}{\partial x}, \frac{\partial f}{\partial y}, \frac{\partial f}{\partial z}\right).$$

The components of $\nabla$ are not ordinary numbers, but rather they are operators (i.e., symbols standing for a specified operation that has to be carried out). For example, $(\partial/\partial x)f$ means that the operator $\partial/\partial x$ is applied to a function $f$ and the result of its action on $f$ is the partial derivative of $f$ with respect to $x$. The directional derivative $D_{\mathbf{u}}f$ can be viewed as the result of the action of the operator $D_{\mathbf{u}} = \hat{\mathbf{u}} \cdot \nabla = u_1(\partial/\partial x) + u_2(\partial/\partial y) + u_3(\partial/\partial z)$ on a function $f$. In what follows, the formal vector $\nabla$ is viewed as an operator whose action obeys the rules of vector algebra.

DEFINITION 15.6. (Curl of a Vector Field). The curl of a differentiable vector field $\mathbf{F}$ is

$$\operatorname{curl} \mathbf{F} = \nabla \times \mathbf{F}.$$

The curl of a vector field is a vector field whose components can be computed according to the definition of the cross product:

$$\nabla \times \mathbf{F} = \det \begin{pmatrix} \hat{\mathbf{e}}_1 & \hat{\mathbf{e}}_2 & \hat{\mathbf{e}}_3 \\ \dfrac{\partial}{\partial x} & \dfrac{\partial}{\partial y} & \dfrac{\partial}{\partial z} \\ F_1 & F_2 & F_3 \end{pmatrix} = \left(\frac{\partial F_3}{\partial y} - \frac{\partial F_2}{\partial z}\right)\hat{\mathbf{e}}_1 + \left(\frac{\partial F_1}{\partial z} - \frac{\partial F_3}{\partial x}\right)\hat{\mathbf{e}}_2 + \left(\frac{\partial F_2}{\partial x} - \frac{\partial F_1}{\partial y}\right)\hat{\mathbf{e}}_3.$$

When calculating the components of the curl, the product of a component of $\nabla$ and a component of $\mathbf{F}$ means that the component of $\nabla$ operates on the component of $\mathbf{F}$, producing the corresponding partial derivative.

EXAMPLE 15.3.
Find the curl of the vector field $\mathbf{F} = (yz, xyz, x^2)$.

SOLUTION:

$$\nabla \times \mathbf{F} = \det \begin{pmatrix} \hat{\mathbf{e}}_1 & \hat{\mathbf{e}}_2 & \hat{\mathbf{e}}_3 \\ \dfrac{\partial}{\partial x} & \dfrac{\partial}{\partial y} & \dfrac{\partial}{\partial z} \\ yz & xyz & x^2 \end{pmatrix} = \big((x^2)'_y - (xyz)'_z,\ -(x^2)'_x + (yz)'_z,\ (xyz)'_x - (yz)'_y\big) = (-xy,\ y - 2x,\ yz - z).$$

□

The geometrical significance of the curl of a vector field will be discussed in Section 114.4. Here the curl is used to formulate sufficient conditions for a vector field to be conservative.

On the Use of the Operator $\nabla$. The rules of vector algebra are useful to simplify algebraic operations involving the operator $\nabla$. For example,

$$\operatorname{curl} \nabla f = \nabla \times (\nabla f) = (\nabla \times \nabla) f = \mathbf{0}$$

because the cross product of a vector with itself vanishes. However, this formal algebraic manipulation should be adopted with caution because it contains a tacit assumption that the action of the components of $\nabla \times \nabla$ on $f$ vanishes. The latter imposes conditions on the class of functions for which such formal algebraic manipulations are justified. Indeed, according to the definition,

$$\nabla \times \nabla f = \det \begin{pmatrix} \hat{\mathbf{e}}_1 & \hat{\mathbf{e}}_2 & \hat{\mathbf{e}}_3 \\ \dfrac{\partial}{\partial x} & \dfrac{\partial}{\partial y} & \dfrac{\partial}{\partial z} \\ f'_x & f'_y & f'_z \end{pmatrix} = \big(f''_{zy} - f''_{yz},\ f''_{xz} - f''_{zx},\ f''_{yx} - f''_{xy}\big).$$

This vector vanishes, provided the order of differentiation does not matter (i.e., Clairaut's theorem holds for $f$). Thus, the rules of vector algebra can be used to simplify the action of an operator involving $\nabla$ if the partial derivatives of a function on which this operator acts are continuous up to the order determined by that action.

111.4. Test for a Vector Field to Be Conservative.

A conservative vector field with continuous partial derivatives in a region $E$ has been shown to have the vanishing curl:

$$\mathbf{F} = \nabla f \implies \operatorname{curl} \mathbf{F} = \mathbf{0}.$$

Unfortunately, the converse is not true in general. In other words, the vanishing of the curl of a vector field does not guarantee that the vector field is conservative. The converse is true only if the region in which the curl vanishes belongs to a special class. A region $E$ is said to be connected if any two points in it can be connected by a path that lies in $E$.
In other words, a connected region cannot be represented as the union of two or more nonintersecting (disjoint) regions.

DEFINITION 15.7. (Simply Connected Region). A connected region $E$ is simply connected if every simple closed curve in $E$ can be continuously shrunk to a point in $E$ while remaining in $E$ throughout the deformation.

FIGURE 15.3. From left to right: A planar connected region (any two points in it can be connected by a continuous curve that lies in the region); a planar disconnected region (there are points in it that cannot be connected by a continuous curve that lies in the region); a planar simply connected region (every simple closed curve in it can be continuously shrunk to a point in it while remaining in the region throughout the deformation); a planar region that is not simply connected (it has holes).

Naturally, the entire Euclidean space is simply connected. A ball in space is also simply connected. If $E$ is the region outside a ball, then it is also simply connected. However, if $E$ is obtained by removing a line (or a cylinder) from the entire space, then $E$ is not simply connected. Indeed, take a circle such that the line pierces through the disk bounded by the circle. There is no way this circle can be continuously contracted to a point of $E$ without crossing the line. A solid torus is not simply connected. (Explain why!) A simply connected region $D$ in a plane cannot have "holes" in it.

THEOREM 15.4. (Test for a Vector Field to Be Conservative). Suppose $\mathbf{F}$ is a vector field whose components have continuous partial derivatives on a simply connected open region $E$. Then $\mathbf{F}$ is conservative in $E$ if and only if its curl vanishes for all points of $E$:

$$\operatorname{curl} \mathbf{F} = \mathbf{0} \text{ on simply connected } E \iff \mathbf{F} = \nabla f \text{ on } E.$$

This theorem follows from Stokes' theorem discussed later in Section 114 and has two useful consequences.
First, the test for the path independence of line integrals:

$$\operatorname{curl} \mathbf{F} = \mathbf{0} \text{ on simply connected } E \iff \int_{C_1} \mathbf{F} \cdot d\mathbf{r} = \int_{C_2} \mathbf{F} \cdot d\mathbf{r}$$

for any two paths $C_1$ and $C_2$ in $E$ originating from a point $\mathbf{r}_a \in E$ and terminating at another point $\mathbf{r}_b \in E$. It follows from Theorem 15.3 for the curve $C$ that is the union of $C_1$ and $-C_2$. Second, the test for vanishing line integrals along closed paths:

$$\operatorname{curl} \mathbf{F} = \mathbf{0} \text{ on simply connected } E \iff \oint_C \mathbf{F} \cdot d\mathbf{r} = 0,$$

where $C$ is a closed curve in $E$. The condition that $E$ is simply connected is crucial here. If $\operatorname{curl} \mathbf{F} = \mathbf{0}$ but $E$ is not simply connected, the line integral of $\mathbf{F}$ may still depend on the path and the line integral along a closed path may not vanish! An example is given in Study Problem 15.2.

Newton's gravitational force can be written as the gradient $\mathbf{F} = -\nabla U$, where $U(\mathbf{r}) = -GMm\,\|\mathbf{r}\|^{-1}$, everywhere except the origin. Therefore, its curl vanishes in $E$, which is the entire space with one point removed; it is simply connected. Hence, the work done by the gravitational force is independent of the path traveled by the object and is determined by the difference in values of its potential $U$ (also called potential energy) at the initial and terminal points of the path. More generally, since the work done by a force equals the change in kinetic energy (see Section 73.5), the motion under a conservative force $\mathbf{F} = -\nabla U$ has the fundamental property that the sum of kinetic and potential energies, $mv^2/2 + U(\mathbf{r})$, is conserved along a trajectory of the motion (recall Study Problem 13.14).

EXAMPLE 15.4. Evaluate the line integral of the vector field $\mathbf{F} = (F_1, F_2, F_3) = (yz,\ xz + z + 2y,\ xy + y + 2z)$ along the path $C$ that consists of straight line segments $AB_1$, $B_1B_2$, and $B_2D$, where the initial point is $A = (0, 0, 0)$, $B_1 = (2010, 2011, 2012)$, $B_2 = (102, 1102, 2102)$, and the terminal point is $D = (1, 1, 1)$.
SOLUTION: The path is complicated enough that it is worth checking whether $\mathbf{F}$ is conservative before evaluating the line integral using the parametric equations of $C$. First, note that the components of $\mathbf{F}$ are polynomials and hence have continuous partial derivatives in the entire space. Therefore, if its curl vanishes, then $\mathbf{F}$ is conservative in the entire space by Theorem 15.4 as the entire space is simply connected:

$$\nabla \times \mathbf{F} = \det \begin{pmatrix} \hat{\mathbf{e}}_1 & \hat{\mathbf{e}}_2 & \hat{\mathbf{e}}_3 \\ \dfrac{\partial}{\partial x} & \dfrac{\partial}{\partial y} & \dfrac{\partial}{\partial z} \\ F_1 & F_2 & F_3 \end{pmatrix} = \det \begin{pmatrix} \hat{\mathbf{e}}_1 & \hat{\mathbf{e}}_2 & \hat{\mathbf{e}}_3 \\ \dfrac{\partial}{\partial x} & \dfrac{\partial}{\partial y} & \dfrac{\partial}{\partial z} \\ yz & xz + z + 2y & xy + y + 2z \end{pmatrix}$$

$$= \big((F_3)'_y - (F_2)'_z,\ -(F_3)'_x + (F_1)'_z,\ (F_2)'_x - (F_1)'_y\big) = \big((x + 1) - (x + 1),\ -y + y,\ z - z\big) = \mathbf{0}.$$

Thus, $\mathbf{F}$ is conservative. Now there are two options to finish the problem.

Option 1. One can use the path independence of the line integral, which means that one can pick any other path $C_1$ connecting the initial point $A$ and the terminal point $D$ to evaluate the line integral in question. For example, a straight line segment connecting $A$ and $D$ is simple enough to evaluate the line integral. Its parametric equations are $\mathbf{r} = \mathbf{r}(t) = (t, t, t)$, where $t \in [0, 1]$. Therefore,

$$\mathbf{F}(\mathbf{r}(t)) \cdot \mathbf{r}'(t) = (t^2,\ t^2 + 3t,\ t^2 + 3t) \cdot (1, 1, 1) = 3t^2 + 6t$$

and hence

$$\int_C \mathbf{F} \cdot d\mathbf{r} = \int_{C_1} \mathbf{F} \cdot d\mathbf{r} = \int_0^1 (3t^2 + 6t)\,dt = 4.$$

Option 2. The procedure of Section 89.1 may be used to find a potential $f$ of $\mathbf{F}$ (see also the study problems at the end of this section for an alternative procedure). The line integral is then found by the fundamental theorem for line integrals. Put $\nabla f = \mathbf{F}$. Then the problem is reduced to finding $f$ from its first-order partial derivatives (the existence of $f$ has already been established). Following the procedure of Section 89.1,

$$f'_x = F_1 = yz \implies f(x, y, z) = xyz + g(y, z),$$

where $g(y, z)$ is arbitrary. The substitution of $f$ into the second equation $f'_y = F_2$ yields

$$xz + g'_y(y, z) = xz + z + 2y \implies g(y, z) = y^2 + zy + h(z),$$

where $h(z)$ is arbitrary. The substitution of $f = xyz + y^2 + zy + h(z)$ into the third equation $f'_z = F_3$ yields

$$xy + y + h'(z) = xy + y + 2z \implies h(z) = z^2 + c,$$

where $c$ is a constant.
Thus, $f(x, y, z) = xyz + yz + z^2 + y^2 + c$ and

$$\int_C \mathbf{F} \cdot d\mathbf{r} = f(1, 1, 1) - f(0, 0, 0) = 4$$

by the fundamental theorem for line integrals. □

111.5. Study Problems.

Problem 15.2. Verify that

$$\mathbf{F} = \left(-\frac{y}{x^2 + y^2},\ \frac{x}{x^2 + y^2},\ 2z\right) = \nabla f, \qquad f(x, y, z) = \theta(x, y) + z^2,$$

where $\theta(x, y)$ is the polar angle in the $xy$ plane, and $\operatorname{curl} \mathbf{F} = \mathbf{0}$ in the domain of $\mathbf{F}$. Evaluate the line integral of $\mathbf{F}$ along the circular path $C$: $x^2 + y^2 = R^2$ in the plane $z = a$. The path is oriented counterclockwise as viewed from the top of the $z$ axis. Does the result contradict the fundamental theorem for line integrals? Explain.

FIGURE 15.4. Left: An illustration to Study Problem 15.2. Right: An illustration to Study Problem 15.3. To find the potential of a conservative vector field, one can evaluate its line integral from any point $(x_0, y_0, z_0)$ to a generic point $(x, y, z)$ along the rectangular contour $C$ that is the union of the straight line segments $C_1$, $C_2$, and $C_3$ parallel to the coordinate axes.

SOLUTION: A straightforward differentiation of $f$ shows that indeed $\nabla f = \mathbf{F}$, and therefore $\operatorname{curl} \mathbf{F} = \mathbf{0}$ everywhere except the line $x = y = 0$, where $\mathbf{F}$ is not defined. The path $C$ is traced out by $\mathbf{r}(t) = (R\cos t, R\sin t, a)$, where $t \in [0, 2\pi]$. Then $\mathbf{F}(\mathbf{r}(t)) = (-R^{-1}\sin t, R^{-1}\cos t, 2a)$ and $\mathbf{r}'(t) = (-R\sin t, R\cos t, 0)$. Therefore, $\mathbf{F}(\mathbf{r}(t)) \cdot \mathbf{r}'(t) = 1$ and

$$\oint_C \mathbf{F} \cdot d\mathbf{r} = \int_0^{2\pi} dt = 2\pi.$$

So the integral over the closed contour does not vanish despite the fact that $\mathbf{F} = \nabla f$, which seems to be in conflict with the fundamental theorem for line integrals, as by the latter the integral should have vanished. Consider the values of $f$ along the circle. By construction, $f(x, y, a) = \theta(x, y) + a^2$, where $\theta(x, y)$ is the polar angle in any plane $z = a$. It is 0 on the positive $x$ axis and increases as the point moves about the origin. As the point arrives back to the positive $x$ axis, the angle reaches the value $2\pi$; that is, $f$ is not really a function on the closed contour because
The only way to make f a function is to remove the half-plane 0 = 0 from the domain of f. Think of a cut in space along the half-plane. But, in this case, any closed path that intersects the half-plane becomes nonclosed as it has two distinct endpoints on the opposite edges of the cut. If the fundamental theorem for line integrals is applied to such a path, then no contradiction arises because the values of f on the edges of the cut differ exactly by 27 in full accordance with the conclusion of the theorem. Alternatively, the issue can be analyzed by studying whether F is conservative in its domain E. The vector field is defined everywhere in space except the line x = y = 0 (the z axis). So E is not simply con- nected. Therefore, the condition curlF = 0 is not sufficient to claim that the vector field is conservative on its domain. Indeed, the evalu- ated line integral along the closed path (which cannot be continuously contracted, staying within E, to a point in E) shows that the vector field cannot be conservative on E. If the half-plane 0 = 0 is removed from E, then F is conservative on this "reduced" region because the latter is simply connected. Naturally, the line integral along any closed path that does not cross the half-plane 0 = 0 (i.e., it lies within the reduced domain) vanishes. D Problem 15.3. Prove that if F = (F1, F2, F3) is conservative, then its potential is / x y z f(x, y, z) = F1(t, yo, zo) dt + F2(x, t, zo) dt + F3i, y, t) dt, xo Yo zo where (zo, yo, zo) is any point in the domain of F. Use this equation to find a potential of F from Example 15.4. SOLUTION: In (15.5), assume C consists of three straight line segments, (xo,yo,zo) - (x,yo,zo) - (x,y,zo) - (x,y,z), as depicted in the right panel of Figure 15.4. The parametric equation of the first line C1 is r(t) = (t, yo, zo), where xo < t < x. Therefore, r'(t) = (1, 0, 0) and F(r(t)) - r'(t) = F1(t, yo, zo). So the line integral of F along C1 gives the first term in the above expression for f. 
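Similarly, the second term is the line integral of $\mathbf{F}$ along the second line $\mathbf{r}(t) = (x, t, z_0)$, where $y_0 \le t \le y$, so that $\mathbf{r}'(t) = (0, 1, 0)$. The third term is the line integral of $\mathbf{F}$ along the third line $\mathbf{r}(t) = (x, y, t)$, where $z_0 \le t \le z$.

In Example 15.4, it was established that $\mathbf{F} = (F_1, F_2, F_3) = (yz,\ xz + z + 2y,\ xy + y + 2z)$ is conservative. For simplicity, choose $(x_0, y_0, z_0) = (0, 0, 0)$. Then
\[
f(x, y, z) = \int_0^x F_1(t, 0, 0)\,dt + \int_0^y F_2(x, t, 0)\,dt + \int_0^z F_3(x, y, t)\,dt = 0 + y^2 + (xyz + yz + z^2) = xyz + yz + z^2 + y^2,
\]
which naturally coincides with $f$ found by a different (longer) method. □

Problem 15.4. (Operator $\nabla$ in Curvilinear Coordinates). Let the transformation $(u, v, w) \to (x, y, z)$ be a change of variables. If $\hat{\mathbf{e}}_u$, $\hat{\mathbf{e}}_v$, and $\hat{\mathbf{e}}_w$ are unit vectors normal to the coordinate surfaces (see exercise 14 in Section 105.4), show that
\[
\nabla = \|\nabla u\|\,\hat{\mathbf{e}}_u \frac{\partial}{\partial u} + \|\nabla v\|\,\hat{\mathbf{e}}_v \frac{\partial}{\partial v} + \|\nabla w\|\,\hat{\mathbf{e}}_w \frac{\partial}{\partial w}.
\]
In particular, find the $\nabla$ operator in the cylindrical and spherical coordinates.

SOLUTION: By the chain rule,
\[
\frac{\partial}{\partial x} = \frac{\partial u}{\partial x}\frac{\partial}{\partial u} + \frac{\partial v}{\partial x}\frac{\partial}{\partial v} + \frac{\partial w}{\partial x}\frac{\partial}{\partial w},
\]
and similarly for $\partial/\partial y$ and $\partial/\partial z$. Then
\[
\begin{aligned}
\nabla &= \hat{\mathbf{e}}_1 \frac{\partial}{\partial x} + \hat{\mathbf{e}}_2 \frac{\partial}{\partial y} + \hat{\mathbf{e}}_3 \frac{\partial}{\partial z} \\
&= \Bigl(\frac{\partial u}{\partial x}\hat{\mathbf{e}}_1 + \frac{\partial u}{\partial y}\hat{\mathbf{e}}_2 + \frac{\partial u}{\partial z}\hat{\mathbf{e}}_3\Bigr)\frac{\partial}{\partial u}
+ \Bigl(\frac{\partial v}{\partial x}\hat{\mathbf{e}}_1 + \frac{\partial v}{\partial y}\hat{\mathbf{e}}_2 + \frac{\partial v}{\partial z}\hat{\mathbf{e}}_3\Bigr)\frac{\partial}{\partial v}
+ \Bigl(\frac{\partial w}{\partial x}\hat{\mathbf{e}}_1 + \frac{\partial w}{\partial y}\hat{\mathbf{e}}_2 + \frac{\partial w}{\partial z}\hat{\mathbf{e}}_3\Bigr)\frac{\partial}{\partial w} \\
&= \nabla u\,\frac{\partial}{\partial u} + \nabla v\,\frac{\partial}{\partial v} + \nabla w\,\frac{\partial}{\partial w}
= \|\nabla u\|\,\hat{\mathbf{e}}_u \frac{\partial}{\partial u} + \|\nabla v\|\,\hat{\mathbf{e}}_v \frac{\partial}{\partial v} + \|\nabla w\|\,\hat{\mathbf{e}}_w \frac{\partial}{\partial w},
\end{aligned}
\]
where the unit vectors are defined in (14.23). Making use of (14.24), (14.25), (14.26), and (14.27), the operator $\nabla$ is obtained in the cylindrical and spherical coordinates:
\[
\nabla = \hat{\mathbf{e}}_r \frac{\partial}{\partial r} + \frac{1}{r}\,\hat{\mathbf{e}}_\theta \frac{\partial}{\partial \theta} + \hat{\mathbf{e}}_z \frac{\partial}{\partial z},
\qquad
\nabla = \hat{\mathbf{e}}_\rho \frac{\partial}{\partial \rho} + \frac{1}{\rho}\,\hat{\mathbf{e}}_\phi \frac{\partial}{\partial \phi} + \frac{1}{\rho\sin\phi}\,\hat{\mathbf{e}}_\theta \frac{\partial}{\partial \theta}.
\]
□

111.6. Exercises.

(1) Calculate the curl of the given vector field:
(i) $\mathbf{F} = (xyz, -y^2x, 0)$
(ii) $\mathbf{F} = (\cos(xz), \sin(yz), 2)$
(iii) $\mathbf{F} = (f(x), g(y), h(z))$, where the functions $f$, $g$, and $h$ are differentiable
(iv) $\mathbf{F} = (\ln(xyz), \ln(yz), \ln z)$
(v) $\mathbf{F} = \mathbf{a} \times \mathbf{r}$, where $\mathbf{a}$ is a constant vector and $\mathbf{r} = (x, y, z)$

(2) Suppose that a vector field $\mathbf{F}(\mathbf{r})$ and a function $f(\mathbf{r})$ are differentiable. Show that $\nabla \times (f\mathbf{F}) = f(\nabla \times \mathbf{F}) + \nabla f \times \mathbf{F}$.

(3) Find $\nabla \times (\mathbf{c} \times \mathbf{r} f(r))$, where $r = \|\mathbf{r}\|$, $f$ is differentiable, and $\mathbf{c}$ is a constant vector.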
(4) A fluid, filling the entire space, rotates at a constant rate $\omega$ about an axis parallel to a unit vector $\hat{\mathbf{n}}$. Find the curl of the velocity vector field at a generic point $\mathbf{r}$. Assume that the position vector $\mathbf{r}$ originates from a point on the axis of rotation.

(5) Determine whether the vector field is conservative and, if it is, find its potential:
(i) $\mathbf{F} = (2xy,\ x^2 + 2yz^3,\ 3z^2y^2 + 1)$
(ii) $\mathbf{F} = (yz,\ xz + 2y\cos z,\ xy - y^2\sin z)$
(iii) $\mathbf{F} = (e^y,\ xe^y - z^2,\ -2yz)$
(iv) $\mathbf{F} = (6xy + z^4y,\ 3x^2 + z^4x,\ 4z^3xy)$
(v) $\mathbf{F} = (yz(2x + y + z),\ xz(x + 2y + z),\ xy(x + y + 2z))$
(vi) $\mathbf{F} = (-y(x^2 + y^2)^{-1} + z,\ x(x^2 + y^2)^{-1},\ x)$
(vii) $\mathbf{F} = (y\cos(xy),\ x\cos(xy),\ z + y)$
(viii) $\mathbf{F} = (-yz/x^2,\ z/x,\ y/x)$

(6) Determine first whether the vector field is conservative and then evaluate the line integral $\int_C \mathbf{F}\cdot d\mathbf{r}$:
(i) $\mathbf{F}(x, y, z) = (y^2z^2 + 2x + 2y,\ 2xyz^2 + 2x,\ 2xy^2z + 1)$ and $C$ consists of three line segments: $(1, 1, 1) \to (a, b, c) \to (1, 2, 3)$
(ii) $\mathbf{F} = (xz, yz, z^2)$ and $C$ is the part of the helix $\mathbf{r}(t) = (2\sin t, -2\cos t, t)$ that lies inside the ellipsoid $x^2 + y^2 + 2z^2 = 6$
(iii) $\mathbf{F} = (y - z^2,\ x + \sin z,\ y\cos z - 2xz)$ and $C$ is one turn of a helix of radius $a$ from $(a, 0, 0)$ to $(a, 0, b)$
(iv) $\mathbf{F} = g(r^2)\,\mathbf{r}$, where $\mathbf{r} = (x, y, z)$, $r = \|\mathbf{r}\|$, $g$ is continuous, and $C$ is a smooth curve from a point on the sphere $x^2 + y^2 + z^2 = a^2$ to a point on the sphere $x^2 + y^2 + z^2 = b^2$. What is the work done by the force $\mathbf{F}$ if $g = -1/r^3$?
(v) $\mathbf{F} = (2(y + z)^{-1/2},\ -x(y + z)^{-3/2},\ -x(y + z)^{-3/2})$ and $C$ is a smooth curve from the point $(1, 1, 3)$ to the point $(2, 4, 5)$

(7) Suppose that $\mathbf{F}$ and $\mathbf{G}$ are continuous on $E$. Show that $\oint_C \mathbf{F}\cdot d\mathbf{r} = \oint_C \mathbf{G}\cdot d\mathbf{r}$ for any smooth closed curve $C$ in $E$ if there is a function $f$ with continuous partial derivatives in $E$ such that $\mathbf{F} - \mathbf{G} = \nabla f$.

(8) Use the properties of the gradient to show that the vectors $\hat{\mathbf{e}}_r = (\cos\theta, \sin\theta)$ and $\hat{\mathbf{e}}_\theta = (-\sin\theta, \cos\theta)$ are unit vectors orthogonal to the coordinate curves $r(x, y) = \text{const}$ and $\theta(x, y) = \text{const}$ of polar coordinates. Given a planar vector field, put $\mathbf{F} = F_r\hat{\mathbf{e}}_r + F_\theta\hat{\mathbf{e}}_\theta$.
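Use the chain rule to express the curl of a planar vector field $\mathbf{F}(r, \theta)$ in polar coordinates $(r, \theta)$ as a linear combination of $\hat{\mathbf{e}}_r$ and $\hat{\mathbf{e}}_\theta$.

(9) Evaluate the pairwise cross products of the unit vectors (14.27) and the pairwise cross products of the unit vectors (14.26). Use the obtained relations and the result of Study Problem 15.4 to express the curl of a vector field in spherical and cylindrical coordinates:
\[
\nabla \times \mathbf{F} = \frac{1}{\rho\sin\phi}\Bigl(\frac{\partial(\sin\phi\,F_\theta)}{\partial\phi} - \frac{\partial F_\phi}{\partial\theta}\Bigr)\hat{\mathbf{e}}_\rho
+ \frac{1}{\rho}\Bigl(\frac{1}{\sin\phi}\frac{\partial F_\rho}{\partial\theta} - \frac{\partial(\rho F_\theta)}{\partial\rho}\Bigr)\hat{\mathbf{e}}_\phi
+ \frac{1}{\rho}\Bigl(\frac{\partial(\rho F_\phi)}{\partial\rho} - \frac{\partial F_\rho}{\partial\phi}\Bigr)\hat{\mathbf{e}}_\theta,
\]
\[
\nabla \times \mathbf{F} = \Bigl(\frac{1}{r}\frac{\partial F_z}{\partial\theta} - \frac{\partial F_\theta}{\partial z}\Bigr)\hat{\mathbf{e}}_r
+ \Bigl(\frac{\partial F_r}{\partial z} - \frac{\partial F_z}{\partial r}\Bigr)\hat{\mathbf{e}}_\theta
+ \frac{1}{r}\Bigl(\frac{\partial(rF_\theta)}{\partial r} - \frac{\partial F_r}{\partial\theta}\Bigr)\hat{\mathbf{e}}_z,
\]
where the field $\mathbf{F}$ is decomposed over the bases (14.27) and (14.26): $\mathbf{F} = F_\rho\hat{\mathbf{e}}_\rho + F_\phi\hat{\mathbf{e}}_\phi + F_\theta\hat{\mathbf{e}}_\theta$ and $\mathbf{F} = F_r\hat{\mathbf{e}}_r + F_\theta\hat{\mathbf{e}}_\theta + F_z\hat{\mathbf{e}}_z$. Hint: Show that $\partial\hat{\mathbf{e}}_\rho/\partial\phi = \hat{\mathbf{e}}_\phi$, $\partial\hat{\mathbf{e}}_\rho/\partial\theta = \sin\phi\,\hat{\mathbf{e}}_\theta$, and similar relations for the partial derivatives of other unit vectors.

112. Green's Theorem

Green's theorem should be regarded as the counterpart of the fundamental theorem of calculus for the double integral.

DEFINITION 15.8. (Orientation of Planar Closed Curves). A simple closed curve $C$ in a plane whose single traversal is counterclockwise (clockwise) is said to be positively (negatively) oriented.

A simple closed curve divides the plane into two connected regions. If a planar region $D$ is bounded by a simple closed curve, then the positively oriented boundary of $D$ is denoted by the symbol $\partial D$ (see the left panel of Figure 15.5).

FIGURE 15.5. Left: A simple closed planar curve encloses a connected region $D$ in the plane. The positive orientation of the boundary of $D$ means that the boundary curve $\partial D$ is traversed counterclockwise. Right: A simple region $D$ is bounded by four smooth curves: two graphs $C_1$ and $C_3$ and two vertical lines $C_2$ ($x = b$) and $C_4$ ($x = a$). The boundary $\partial D$ is the union of these curves oriented counterclockwise.

Recall that a simple closed curve can be regarded as a continuous vector function $\mathbf{r}(t) = (x(t), y(t))$ on $[a, b]$ such that $\mathbf{r}(a) = \mathbf{r}(b)$ and,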
for any $t_1 \ne t_2$ in the open interval $(a, b)$, $\mathbf{r}(t_1) \ne \mathbf{r}(t_2)$; that is, $\mathbf{r}(t)$ traces out $C$ only once without self-intersection. A positive orientation means that $\mathbf{r}(t)$ traces out its range counterclockwise. For example, the vector functions $\mathbf{r}(t) = (\cos t, \sin t)$ and $\mathbf{r}(t) = (\cos t, -\sin t)$ on the interval $[0, 2\pi]$ define the positively and negatively oriented circles of unit radius, respectively.

THEOREM 15.5. (Green's Theorem). Let $C$ be a positively oriented, piecewise-smooth, simple, closed curve in the plane and let $D$ be the region bounded by $C = \partial D$. If the functions $F_1$ and $F_2$ have continuous partial derivatives in an open region that contains $D$, then
\[
\iint_D \Bigl(\frac{\partial F_2}{\partial x} - \frac{\partial F_1}{\partial y}\Bigr)\,dA = \oint_{\partial D} F_1\,dx + F_2\,dy .
\]

Just like the fundamental theorem of calculus, Green's theorem relates the derivatives of $F_1$ and $F_2$ in the integrand to the values of $F_1$ and $F_2$ on the boundary of the integration region. A proof of Green's theorem is rather involved. Here it is limited to the case when the region $D$ is simple.

PROOF (FOR SIMPLE REGIONS). A simple region $D$ admits two equivalent algebraic descriptions:
\[
D = \{(x, y) \mid y_{\rm bot}(x) \le y \le y_{\rm top}(x),\ x \in [a, b]\}, \tag{15.6}
\]
\[
D = \{(x, y) \mid x_{\rm bot}(y) \le x \le x_{\rm top}(y),\ y \in [c, d]\}. \tag{15.7}
\]
The idea of the proof is to establish the equalities
\[
\oint_{\partial D} F_1\,dx = -\iint_D \frac{\partial F_1}{\partial y}\,dA, \qquad \oint_{\partial D} F_2\,dy = \iint_D \frac{\partial F_2}{\partial x}\,dA \tag{15.8}
\]
using, respectively, (15.6) and (15.7). The conclusion of the theorem is then obtained by adding these equations.

The line integral is transformed into an ordinary integral first. The boundary $\partial D$ contains four curves, denoted $C_1$, $C_2$, $C_3$, and $C_4$ (see the right panel of Figure 15.5). The curve $C_1$ is the graph $y = y_{\rm bot}(x)$ whose parametric equations are $\mathbf{r}(t) = (t, y_{\rm bot}(t))$, where $t \in [a, b]$. So $C_1$ is traced out from left to right as required by the positive orientation of $\partial D$. The curve $C_3$ is the top boundary $y = y_{\rm top}(x)$, and, similarly, its parametric equations are $\mathbf{r}(t) = (t, y_{\rm top}(t))$, where $t \in [a, b]$. This vector function traverses $C_3$ from left to right.
So the orientation of $C_3$ must be reversed to obtain the corresponding part of $\partial D$. The boundary curves $C_2$ and $C_4$ (the sides of $D$) are segments of the vertical lines $x = b$ (oriented upward) and $x = a$ (oriented downward), which may collapse to a single point if the graphs $y = y_{\rm bot}(x)$ and $y = y_{\rm top}(x)$ intersect at $x = a$ or $x = b$ or both. The line integrals along $C_2$ and $C_4$ do not contribute to the line integral with respect to $x$ along $\partial D$ because $dx = 0$ along $C_2$ and $C_4$. By construction, $x = t$ and $dx = dt$ for the curves $C_1$ and $C_3$. Hence,
\[
\oint_{\partial D} F_1\,dx = \int_{C_1} F_1\,dx + \int_{-C_3} F_1\,dx = \int_a^b \bigl(F_1(x, y_{\rm bot}(x)) - F_1(x, y_{\rm top}(x))\bigr)\,dx,
\]
where the property (15.3) has been used. Next, the double integral is transformed into an ordinary integral by converting it to an iterated integral:
\[
\iint_D \frac{\partial F_1}{\partial y}\,dA = \int_a^b \int_{y_{\rm bot}(x)}^{y_{\rm top}(x)} \frac{\partial F_1}{\partial y}\,dy\,dx = \int_a^b \bigl(F_1(x, y_{\rm top}(x)) - F_1(x, y_{\rm bot}(x))\bigr)\,dx,
\]
where the latter equality follows from the fundamental theorem of calculus and the continuity of $\partial F_1/\partial y$ on an open interval that contains $[y_{\rm bot}(x), y_{\rm top}(x)]$ for any $x \in [a, b]$ (the hypothesis of Green's theorem). Comparing the expressions of the line and double integrals via ordinary integrals, the validity of the first relation in (15.8) is established. The second equality in (15.8) is proved analogously by using (15.7). The details are omitted. □

FIGURE 15.6. Left: A region $D$ is split into two regions by a curve $C$. If the boundary of the upper part of $D$ has positive orientation, then the positively oriented boundary of the lower part of $D$ has the curve $-C$. Right: Green's theorem holds for nonsimply connected regions. The orientation of the boundaries of "holes" in $D$ is obtained by making cuts along curves $C_1$ and $C_2$ so that $D$ becomes simply connected. The positive orientation of the outer boundary of $D$ induces the orientation of the boundaries of the "holes."

Suppose that a smooth, oriented curve $C$ divides a region $D$ into two simple regions $D_1$ and $D_2$ (see the left panel of Figure 15.6).
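If the boundary $\partial D_1$ contains $C$ (i.e., the orientation of $C$ coincides with the positive orientation of $\partial D_1$), then $\partial D_2$ must contain the curve $-C$ and vice versa. Using the conventional notation $F_1\,dx + F_2\,dy = \mathbf{F}\cdot d\mathbf{r}$, where $\mathbf{F} = (F_1, F_2)$, one infers that
\[
\begin{aligned}
\oint_{\partial D} \mathbf{F}\cdot d\mathbf{r} &= \oint_{\partial D_1} \mathbf{F}\cdot d\mathbf{r} + \oint_{\partial D_2} \mathbf{F}\cdot d\mathbf{r} \\
&= \iint_{D_1} \Bigl(\frac{\partial F_2}{\partial x} - \frac{\partial F_1}{\partial y}\Bigr)\,dA + \iint_{D_2} \Bigl(\frac{\partial F_2}{\partial x} - \frac{\partial F_1}{\partial y}\Bigr)\,dA \\
&= \iint_{D} \Bigl(\frac{\partial F_2}{\partial x} - \frac{\partial F_1}{\partial y}\Bigr)\,dA .
\end{aligned}
\]
The first equality holds because of the cancellation of the line integrals along $C$ and $-C$ according to (15.3). The validity of the second equality follows from the proof of Green's theorem for simple regions. Finally, the last equality is established by the additivity property of double integrals. By making use of similar arguments, the proof can be extended to a region $D$ that can be represented as the union of a finite number of simple regions.

Green's Theorem for Nonsimply Connected Regions. Let the regions $D_1$ and $D_2$ be bounded by simple, piecewise-smooth, closed curves and let $D_2$ lie in the interior of $D_1$ (see the right panel of Figure 15.6). Consider the region $D$ that is obtained from $D_1$ by removing $D_2$ (the region $D$ has a hole of the shape $D_2$). Making use of Green's theorem, one finds
\[
\begin{aligned}
\iint_D \Bigl(\frac{\partial F_2}{\partial x} - \frac{\partial F_1}{\partial y}\Bigr)\,dA
&= \iint_{D_1} \Bigl(\frac{\partial F_2}{\partial x} - \frac{\partial F_1}{\partial y}\Bigr)\,dA - \iint_{D_2} \Bigl(\frac{\partial F_2}{\partial x} - \frac{\partial F_1}{\partial y}\Bigr)\,dA \\
&= \oint_{\partial D_1} \mathbf{F}\cdot d\mathbf{r} - \oint_{\partial D_2} \mathbf{F}\cdot d\mathbf{r}
= \oint_{\partial D_1} \mathbf{F}\cdot d\mathbf{r} + \oint_{-\partial D_2} \mathbf{F}\cdot d\mathbf{r} \\
&= \oint_{\partial D} \mathbf{F}\cdot d\mathbf{r} .
\end{aligned}
\tag{15.9}
\]
This establishes the validity of Green's theorem for regions that are not simply connected. The boundary $\partial D$ consists of $\partial D_1$ and $-\partial D_2$; that is, the outer boundary has a positive orientation, while the inner boundary is negatively oriented. A similar line of reasoning leads to the conclusion that Green's theorem holds for any number of holes in $D$: all inner boundaries of $D$ must be negatively oriented. Such orientation of the boundaries can also be understood as follows. Let a curve $C$ connect a point of the outer boundary with a point of the inner boundary. Let us make a cut of the region $D$ along $C$. Then the region $D$ becomes simply connected, and $\partial D$ consists of a continuous curve (the inner and outer boundaries, and the curves $C$ and $-C$).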
The boundary $\partial D$ can always be positively oriented. The latter requires that the outer boundary be traced counterclockwise, while the inner boundary is traced clockwise (the orientation of $C$ and $-C$ is chosen accordingly). By applying Green's theorem to $\partial D$, one can see that the line integrals over $C$ and $-C$ cancel and (15.9) follows from the additivity of the double integral.

112.1. Evaluating Line Integrals via Double Integrals. Green's theorem provides a technically convenient tool to evaluate line integrals along planar closed curves. It is especially beneficial when the curve consists of several smooth pieces that are defined by different vector functions; that is, the line integral must be split into a sum of line integrals to be converted into ordinary integrals. Sometimes, the line integral turns out to be much more difficult to evaluate than the double integral.

EXAMPLE 15.5. Evaluate the line integral of $\mathbf{F} = (y^2 + e^{\cos x},\ 3xy - \sin(y^4))$ along the curve $C$ that is the boundary of the half of the ring: $1 \le x^2 + y^2 \le 4$, $y \ge 0$; $C$ is oriented clockwise.

FIGURE 15.7. Left: The integration curve in the line integral discussed in Example 15.5. Right: A general polygon. Its area is evaluated in Example 15.7 by representing the area via a line integral.

The curve $C$ consists of four smooth pieces, the half-circles of radii 1 and 2 and two straight line segments of the $x$ axis, $[-2, -1]$ and $[1, 2]$, as shown in the left panel of Figure 15.7. Each curve can be easily parameterized, and the line integral in question can be transformed into the sum of four ordinary integrals, which are then evaluated. The reader is advised to pursue this avenue to appreciate the following alternative based on Green's theorem (this is not impossible to accomplish if one figures out how to handle the integration of the functions $e^{\cos x}$ and $\sin(y^4)$ whose antiderivatives are not expressible in elementary functions).
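The direct approach described above can at least be carried out numerically. The Python sketch below is an illustration only, with hypothetical helper names; it assumes the half-ring lies in the upper half-plane $y \ge 0$ and splits the clockwise curve $C$ into its four pieces, integrating $\mathbf{F}(\mathbf{r}(t))\cdot\mathbf{r}'(t)$ over each piece with the composite Simpson rule. Its output can be compared with the value obtained from Green's theorem.

```python
import math

def F(x, y):
    """F = (y^2 + e^{cos x}, 3xy - sin(y^4))."""
    return (y * y + math.exp(math.cos(x)), 3.0 * x * y - math.sin(y ** 4))

def simpson(g, a, b, n=2000):
    """Composite Simpson rule; b < a gives the correctly signed integral."""
    h = (b - a) / n
    s = g(a) + g(b)
    for k in range(1, n):
        s += (4.0 if k % 2 else 2.0) * g(a + k * h)
    return s * h / 3.0

def piece(r, dr, a, b):
    """Line integral of F along r(t) as t runs from a to b."""
    def integrand(t):
        x, y = r(t)
        dx, dy = dr(t)
        fx, fy = F(x, y)
        return fx * dx + fy * dy
    return simpson(integrand, a, b)

def arc(R):
    """Circle of radius R and its derivative with respect to the angle t."""
    return (lambda t: (R * math.cos(t), R * math.sin(t)),
            lambda t: (-R * math.sin(t), R * math.cos(t)))

def example_15_5_direct():
    outer = arc(2.0)
    inner = arc(1.0)
    segment = (lambda t: (t, 0.0), lambda t: (1.0, 0.0))  # along the x axis
    return (piece(*outer, math.pi, 0.0)      # outer arc (-2,0) -> (2,0), clockwise
            + piece(*segment, 2.0, 1.0)      # segment (2,0) -> (1,0)
            + piece(*inner, 0.0, math.pi)    # inner arc (1,0) -> (-1,0)
            + piece(*segment, -1.0, -2.0))   # segment (-1,0) -> (-2,0)

if __name__ == "__main__":
    print(example_15_5_direct())
```

The contributions of $e^{\cos x}\,dx$ and $\sin(y^4)\,dy$ cancel over the closed curve, which is why only the terms $y^2\,dx$ and $3xy\,dy$ affect the final value.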
SOLUTION: The curve $C$ is a simple, piecewise-smooth, closed curve, and the components of $\mathbf{F}$ have continuous partial derivatives everywhere. Thus, Green's theorem applies if $\partial D = -C$ (because the orientation of $C$ is negative) and $D$ is the half-ring. One has $\partial F_1/\partial y = 2y$ and $\partial F_2/\partial x = 3y$. By Green's theorem,
\[
\oint_C \mathbf{F}\cdot d\mathbf{r} = -\oint_{\partial D} \mathbf{F}\cdot d\mathbf{r} = -\iint_D \Bigl(\frac{\partial F_2}{\partial x} - \frac{\partial F_1}{\partial y}\Bigr)\,dA = -\iint_D y\,dA
= -\int_0^{\pi}\!\!\int_1^2 r\sin\theta\; r\,dr\,d\theta = -\int_0^{\pi} \sin\theta\,d\theta \int_1^2 r^2\,dr = -\frac{14}{3},
\]
where the double integral has been transformed to polar coordinates. The region $D$ is the image of the rectangle $D' = [1, 2] \times [0, \pi]$ in the polar plane under the transformation $(r, \theta) \to (x, y)$. □

Changing the Integration Curve in a Line Integral. If a planar vector field is not conservative, then its line integral along a curve $C$ originating from a point $A$ and terminating at a point $B$ depends on $C$. If $C'$ is another curve outgoing from $A$ and terminating at $B$, what is the relation between the line integrals of $\mathbf{F}$ over $C$ and $C'$? Green's theorem allows us to establish such a relation. Suppose that $C$ and $C'$ have no self-intersections and do not intersect. Then their union is a boundary of a simply connected region $D$. Let us reverse the orientation of one of the curves so that their union is the positively oriented boundary $\partial D$, where $\partial D$ is the union of $C$ and $-C'$. Then
\[
\oint_{\partial D} \mathbf{F}\cdot d\mathbf{r} = \int_C \mathbf{F}\cdot d\mathbf{r} + \int_{-C'} \mathbf{F}\cdot d\mathbf{r} = \int_C \mathbf{F}\cdot d\mathbf{r} - \int_{C'} \mathbf{F}\cdot d\mathbf{r} .
\]
By Green's theorem,
\[
\int_C \mathbf{F}\cdot d\mathbf{r} = \int_{C'} \mathbf{F}\cdot d\mathbf{r} + \iint_D \Bigl(\frac{\partial F_2}{\partial x} - \frac{\partial F_1}{\partial y}\Bigr)\,dA, \tag{15.10}
\]
which establishes the relation between line integrals of a nonconservative planar vector field over two different curves that have common endpoints.

EXAMPLE 15.6. Evaluate the line integral of the vector field $\mathbf{F} = (2y + \cos(x^2),\ x^2 + y^3)$ over the curve $C$, which consists of the line segments $(0, 0) \to (1, 1)$ and $(1, 1) \to (0, 2)$.

SOLUTION: Let $C'$ be the line segment $(0, 0) \to (0, 2)$. Then the union of $C$ and $-C'$ is the boundary $\partial D$ (positively oriented) of the triangular region $D$ with vertices $(0, 0)$, $(1, 1)$, and $(0, 2)$. The relation (15.10) can be applied to evaluate the line integral over $C$.
The parametric equations of $C'$ are $x = 0$, $y = t$, $0 \le t \le 2$. Hence, along $C'$, $\mathbf{F}\cdot d\mathbf{r} = F_2(0, t)\,dt = t^3\,dt$ and
\[
\int_{C'} \mathbf{F}\cdot d\mathbf{r} = \int_0^2 t^3\,dt = 4 .
\]
Then $\partial F_2/\partial x = 2x$ and $\partial F_1/\partial y = 2$. The region $D$ admits an algebraic description as a vertically simple region: $x \le y \le 2 - x$, $0 \le x \le 1$. Hence,
\[
\iint_D \Bigl(\frac{\partial F_2}{\partial x} - \frac{\partial F_1}{\partial y}\Bigr)\,dA = \iint_D (2x - 2)\,dA = 2\int_0^1 (x - 1)\int_x^{2-x} dy\,dx = -4\int_0^1 (x - 1)^2\,dx = -\frac{4}{3} .
\]
Therefore, by (15.10),
\[
\int_C \mathbf{F}\cdot d\mathbf{r} = 4 - \frac{4}{3} = \frac{8}{3} .
\]
□

112.2. Area of a Planar Region as a Line Integral. Put $F_2 = x$ and $F_1 = 0$. Then
\[
\iint_D \Bigl(\frac{\partial F_2}{\partial x} - \frac{\partial F_1}{\partial y}\Bigr)\,dA = \iint_D dA = A(D) .
\]
The area $A(D)$ can also be obtained if $\mathbf{F} = (-y, 0)$ or $\mathbf{F} = (-y/2, x/2)$. By Green's theorem, the area of $D$ can be expressed by line integrals:
\[
A(D) = \oint_{\partial D} x\,dy = -\oint_{\partial D} y\,dx = \frac{1}{2}\oint_{\partial D} x\,dy - y\,dx, \tag{15.11}
\]
assuming, of course, that the boundary of $D$ is a simple, piecewise-smooth, closed curve (or several such curves if $D$ has holes). The reason the values of these line integrals coincide is simple. The difference of any two vector fields involved is the gradient of a function, and the line integral of a gradient along a closed curve vanishes owing to the fundamental theorem for line integrals. For example, for $\mathbf{F} = (0, x)$ and $\mathbf{G} = (-y, 0)$, the difference is $\mathbf{F} - \mathbf{G} = (y, x) = \nabla f$, where $f(x, y) = xy$, so that
\[
\oint_{\partial D} \mathbf{F}\cdot d\mathbf{r} - \oint_{\partial D} \mathbf{G}\cdot d\mathbf{r} = \oint_{\partial D} (\mathbf{F} - \mathbf{G})\cdot d\mathbf{r} = \oint_{\partial D} \nabla f\cdot d\mathbf{r} = 0 .
\]
The representation (15.11) of the area of a planar region as the line integral along its boundary is quite useful when the shape of $D$ is too complicated for the area to be computed using a double integral (e.g., when $D$ is not simple and/or a representation of boundaries of $D$ by graphs becomes technically difficult).

EXAMPLE 15.7. (Area of a Polygon). Consider an arbitrary polygon whose vertices in counterclockwise order are $(x_1, y_1)$, $(x_2, y_2)$, ..., $(x_n, y_n)$. Find its area.

SOLUTION: Evidently, a generic polygon is not a simple region (e.g., it may have a starlike shape). So the double integral is not at all suitable for finding the area.
In contrast, the line integral approach seems far more feasible as the boundary of the polygon consists of $n$ straight line segments connecting neighboring vertices as shown in the right panel of Figure 15.7. If $C_i$ is such a segment oriented from $(x_i, y_i)$ to $(x_{i+1}, y_{i+1})$ for $i = 1, 2, ..., n - 1$, then $C_n$ goes from $(x_n, y_n)$ to $(x_1, y_1)$. A vector function that traces out a straight line segment from a point $\mathbf{r}_a$ to a point $\mathbf{r}_b$ is $\mathbf{r}(t) = \mathbf{r}_a + (\mathbf{r}_b - \mathbf{r}_a)t$, where $0 \le t \le 1$. For the segment $C_i$, take $\mathbf{r}_a = (x_i, y_i)$ and $\mathbf{r}_b = (x_{i+1}, y_{i+1})$. Hence, $x(t) = x_i + (x_{i+1} - x_i)t = x_i + \Delta x_i\,t$ and $y(t) = y_i + (y_{i+1} - y_i)t = y_i + \Delta y_i\,t$. For the vector field $\mathbf{F} = (-y, x)$ on $C_i$, one has
\[
\mathbf{F}(\mathbf{r}(t))\cdot\mathbf{r}'(t) = (-y(t), x(t))\cdot(\Delta x_i, \Delta y_i) = x_i\,\Delta y_i - y_i\,\Delta x_i = x_i y_{i+1} - y_i x_{i+1};
\]
that is, the $t$ dependence cancels out. Therefore, taking into account that $C_n$ goes from $(x_n, y_n)$ to $(x_1, y_1)$, the area is
\[
\begin{aligned}
A &= \frac{1}{2}\oint_{\partial D} x\,dy - y\,dx = \frac{1}{2}\sum_{i=1}^{n}\int_{C_i} x\,dy - y\,dx \\
&= \frac{1}{2}\sum_{i=1}^{n-1}\int_0^1 (x_i y_{i+1} - y_i x_{i+1})\,dt + \frac{1}{2}\int_0^1 (x_n y_1 - y_n x_1)\,dt \\
&= \frac{1}{2}\Bigl(\sum_{i=1}^{n-1} (x_i y_{i+1} - y_i x_{i+1}) + (x_n y_1 - y_n x_1)\Bigr).
\end{aligned}
\]
□

So Green's theorem offers an elegant way to find the area of a general polygon if the coordinates of its vertices are known. A simple, piecewise-smooth, closed curve $C$ in a plane can always be approximated by a polygon. The area of the region enclosed by $C$ can therefore be approximated by the area of a polygon with a large enough number of vertices, an approach often used in practical applications.

112.3. The Test for Planar Vector Fields to Be Conservative. Green's theorem can be used to prove Theorem 15.4 for planar vector fields. Consider a planar vector field $\mathbf{F} = (F_1(x, y), F_2(x, y), 0)$. Its curl has only one component:
\[
\nabla \times \mathbf{F} = \det\begin{pmatrix} \hat{\mathbf{e}}_1 & \hat{\mathbf{e}}_2 & \hat{\mathbf{e}}_3 \\ \dfrac{\partial}{\partial x} & \dfrac{\partial}{\partial y} & \dfrac{\partial}{\partial z} \\ F_1(x, y) & F_2(x, y) & 0 \end{pmatrix} = \hat{\mathbf{e}}_3\Bigl(\frac{\partial F_2}{\partial x} - \frac{\partial F_1}{\partial y}\Bigr).
\]
Suppose that the curl of $\mathbf{F}$ vanishes throughout a simply connected open region $D$, $\nabla \times \mathbf{F} = \mathbf{0}$.
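As an aside on Example 15.7, the polygon area formula obtained there translates directly into a few lines of code. The Python sketch below is an illustration only (the function name `polygon_area` is hypothetical): it sums $x_i y_{i+1} - y_i x_{i+1}$ over the edges, with the last edge returning to the first vertex, and halves the result. The value is positive for vertices listed counterclockwise and negative for clockwise order, in line with the orientation convention of Green's theorem.

```python
def polygon_area(vertices):
    """Signed area of a polygon from the line integral (1/2) * (x dy - y dx).

    vertices: list of (x, y) pairs in traversal order.
    Positive for counterclockwise order, negative for clockwise order.
    """
    n = len(vertices)
    total = 0.0
    for i in range(n):
        x_i, y_i = vertices[i]
        x_next, y_next = vertices[(i + 1) % n]  # the last edge closes the polygon
        total += x_i * y_next - y_i * x_next
    return 0.5 * total

if __name__ == "__main__":
    square = [(0, 0), (1, 0), (1, 1), (0, 1)]
    l_shape = [(0, 0), (2, 0), (2, 1), (1, 1), (1, 2), (0, 2)]  # not convex
    print(polygon_area(square))         # 1.0
    print(polygon_area(l_shape))        # 3.0
    print(polygon_area(l_shape[::-1]))  # -3.0 (clockwise order)
```

The L-shaped polygon is not convex, which illustrates that the formula does not require the region to be simple.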
By definition, any simple closed curve $C$ in a simply connected region $D$ can be shrunk to a point of $D$ while remaining in $D$ throughout the deformation (i.e., any such $C$ bounds a subregion $D_s$ of $D$). By Green's theorem, where $C = \partial D_s$,
\[
\oint_C \mathbf{F}\cdot d\mathbf{r} = \iint_{D_s} \Bigl(\frac{\partial F_2}{\partial x} - \frac{\partial F_1}{\partial y}\Bigr)\,dA = \iint_{D_s} 0\,dA = 0
\]
for any closed simple curve $C$ in $D$. By the path-independence property, the vector field $\mathbf{F}$ is conservative in $D$.

FIGURE 15.8. Left: An illustration to Study Problem 15.5. Right: An illustration to Study Problem 15.6. The region $D_a$ is bounded by a curve $C$ and the circle $C_a$.

112.4. Study Problems.

Problem 15.5. Evaluate the line integral of $\mathbf{F} = (y + e^{x^2},\ 3x - \sin(y^2))$ along the counterclockwise-oriented boundary of $D$ that is enclosed by the parabolas $y = x^2$ and $x = y^2$.

SOLUTION: One has $\partial F_1/\partial y = 1$ and $\partial F_2/\partial x = 3$. By Green's theorem,
\[
\oint_{\partial D} \mathbf{F}\cdot d\mathbf{r} = \iint_D 2\,dA = 2\int_0^1\!\!\int_{x^2}^{\sqrt{x}} dy\,dx = 2\int_0^1 (\sqrt{x} - x^2)\,dx = \frac{2}{3} .
\]
The integration region $D$ is shown in the left panel of Figure 15.8. □

Problem 15.6. Prove that the line integral of the planar vector field
\[
\mathbf{F} = \Bigl(-\frac{y}{x^2 + y^2},\ \frac{x}{x^2 + y^2}\Bigr)
\]
along any positively oriented, simple, smooth, closed curve $C$ that encircles the origin is $2\pi$ and that it vanishes for any such curve that does not encircle the origin.

SOLUTION: It has been established (see Study Problem 15.2) that the curl of this vector field vanishes in the domain that is the entire plane with the origin removed. If $C$ does not encircle the origin, then $\partial F_2/\partial x - \partial F_1/\partial y = 0$ throughout the region encircled by $C$, and the line integral along $C$ vanishes by Green's theorem. Given a closed curve $C$ that encircles the origin, but does not go through it, one can always find a disk of a small enough radius $a$ such that the curve $C$ does not intersect it. Let $D_a$ be the region bounded by the circle $C_a$ of radius $a$ and the curve $C$. Then $\partial F_2/\partial x - \partial F_1/\partial y = 0$ throughout $D_a$. Let $C$ be oriented counterclockwise, while $C_a$ is oriented clockwise. Then $\partial D_a$ is the union of $C$ and $C_a$.
By Green's theorem,
\[
\oint_{\partial D_a} \mathbf{F}\cdot d\mathbf{r} = 0 \quad\Longrightarrow\quad \oint_C \mathbf{F}\cdot d\mathbf{r} = -\oint_{C_a} \mathbf{F}\cdot d\mathbf{r} = \oint_{-C_a} \mathbf{F}\cdot d\mathbf{r} = 2\pi
\]
because $-C_a$ is the circle oriented counterclockwise and for such a circle the line integral has been found to be $2\pi$ (see Study Problem 15.2). □

Problem 15.7. (Volume of Axially Symmetric Solids). Let $D$ be a region in the upper part of the $xy$ plane ($y \ge 0$). Consider the solid $E$ obtained by rotation of $D$ about the $x$ axis. Show that the volume of the solid is given by
\[
V(E) = -\pi\oint_{\partial D} y^2\,dx .
\]

SOLUTION: Let $dA$ be the area of a partition element of $D$ that contains a point $(x, y)$. If the partition element is rotated about the $x$ axis, the point $(x, y)$ traverses the circle of radius $y$ (the distance from the point $(x, y)$ to the $x$ axis). The length of the circle is $2\pi y$. Consequently, the volume of the solid ring swept by the partition element is $dV = 2\pi y\,dA$. Taking the sum over the partition of $D$, the volume is expressed via the double integral over $D$:
\[
V(E) = 2\pi\iint_D y\,dA .
\]
In Green's theorem, put $\partial F_1/\partial y = 2y$ and $\partial F_2/\partial x = 0$ so that the above double integral is proportional to the left side of Green's equation. In particular, $F_1 = y^2$ and $F_2 = 0$ satisfy these conditions. By Green's theorem,
\[
V(E) = -\pi\iint_D \Bigl(\frac{\partial F_2}{\partial x} - \frac{\partial F_1}{\partial y}\Bigr)\,dA = -\pi\oint_{\partial D} F_1\,dx = -\pi\oint_{\partial D} y^2\,dx
\]
as required. □

112.5. Exercises.

(1) Evaluate the line integral by two methods: (a) directly and (b) using Green's theorem:
(i) $\oint_C y^2\,dx - y^2x\,dy$, where $C$ is the triangle with vertices $(0, 0)$, $(1, 0)$, and $(1, 2)$; $C$ is oriented counterclockwise
(ii) $\oint_C 2yx\,dx + x^2\,dy$, where $C$ consists of the line segments from $(0, 1)$ to $(0, 0)$ and from $(0, 0)$ to $(1, 0)$ and the parabola $y = 1 - x^2$ from $(1, 0)$ to $(0, 1)$

(2) Evaluate the line integral using Green's theorem:
(i) $\oint_C x\sin(x^2)\,dx + (x^2y^2 - x^8)\,dy$, where $C$ is the positively oriented boundary of the region between two circles $x^2 + y^2 = 1$ and $x^2 + y^2 = 4$
(ii) $\oint_C (y^3\,dx - x^3\,dy)$, where $C$ is the positively oriented circle $x^2 + y^2 = a^2$
(iii) $\oint_C (x + y^3)\,dx + (x^2 + \sqrt{y})\,dy$, where $C$ consists of the arc of the curve $y = \cos x$ from $(-\pi/2, 0)$ to $(\pi/2, 0)$ and the line segment from $(\pi/2, 0)$ to $(-\pi/2, 0)$
(iv) $\oint_C (y^4 - \ln(x^2 + y^2))\,dx + 2\tan^{-1}(y/x)\,dy$, where $C$ is the positively oriented circle of radius $a > 0$ with center $(x_0, y_0)$ such that $x_0 > a$ and $y_0 > a$
(v) $\oint_C (x + y)^2\,dx - (x^2 + y^2)\,dy$, where $C$ is a positively oriented triangle with vertices $(1, 1)$, $(3, 2)$, and $(2, 5)$
(vi) $\oint_C xy^2\,dx - x^2y\,dy$, where $C$ is the negatively oriented circle $x^2 + y^2 = a^2$
(vii) $\oint_C (x + y)\,dx - (x - y)\,dy$, where $C$ is the positively oriented ellipse $(x/a)^2 + (y/b)^2 = 1$
(viii) $\oint_C e^x[(1 - \cos y)\,dx - (y - \sin y)\,dy]$, where $C$ is the positively oriented boundary of the region 0 <

(vi) $D$ is bounded by one loop of the curve $x^3 + y^3 = 3axy$, $a > 0$ (Hint: Put $y = tx$.)
(vii) $D$ is bounded by the curve $(x^2 + y^2)^2 = a^2(x^2 - y^2)$ (Hint: Put $y = x\tan t$.)
(viii) $D$ is bounded by $(x/a)^n + (y/b)^n = 1$, $n > 0$ (Hint: $x = a\cos^{2/n} t$, $y = b\sin^{2/n} t$.)

(6) Let a curve $C$ have fixed endpoints. Under what condition on the function $g(x, y)$ is the line integral $\int_C g(x, y)\,(y\,dx + x\,dy)$ independent of $C$?

(7) Let $D$ be a planar region bounded by a simple closed curve. If $A$ is the area of $D$, show that the coordinates $(x_c, y_c)$ of the centroid of $D$ are
\[
x_c = \frac{1}{2A}\oint_{\partial D} x^2\,dy, \qquad y_c = -\frac{1}{2A}\oint_{\partial D} y^2\,dx .
\]
Hint: Use an approach similar to the derivation of (15.11).

(8) Let a lamina with a constant surface mass density $\sigma$ occupy a planar region $D$ enclosed by a simple piecewise smooth curve. Show that its moments of inertia about the $x$ and $y$ axes are
\[
I_x = -\frac{\sigma}{3}\oint_{\partial D} y^3\,dx, \qquad I_y = \frac{\sigma}{3}\oint_{\partial D} x^3\,dy .
\]
Hint: Use an approach similar to the derivation of (15.11).

113. Flux of a Vector Field

The idea of a flux of a vector field stems from an engineering problem of mass transfer across a surface.
Suppose there is a flow of a fluid or gas with a constant velocity $\mathbf{v}$ and a constant mass density $\sigma$ (mass per unit volume). Let $\Delta A$ be a planar area element placed into the flow. At what rate is the fluid or gas carried by the flow across the area $\Delta A$? In other words, what is the mass of fluid transferred across $\Delta A$ per unit time? This quantity is called a flux of the mass flow across the area $\Delta A$. Suppose first that the mass flow is normal to the area element. Consider the cylinder with an axis parallel to $\mathbf{v}$ with cross-sectional

FIGURE 15.9. Left: A mass transferred by a homogeneous mass flow with a constant velocity $\mathbf{v}$ across an area element $\Delta A$ in time $\Delta t$ is $\Delta m = \sigma\,\Delta V$, where $\Delta V = h\,\Delta A$ is the volume of the cylinder with cross-sectional area $\Delta A$ and height $h = \Delta t\,v_n$; $v_n$ is the scalar projection of $\mathbf{v}$ onto the normal $\hat{\mathbf{n}}$. Right: A partition of a smooth surface $S$ by elements $S_i$. If $\mathbf{r}_i^*$ is a sample point in $S_i$, $\hat{\mathbf{n}}_i$ is the unit normal to $S$ at $\mathbf{r}_i^*$, and $\Delta S_i$ is the surface area of the partition element, then the flux of a continuous vector field $\mathbf{F}(\mathbf{r})$ across $S_i$ is approximated by

Since $S$ is oriented downward, $\mathbf{n} = (g'_x, g'_y, -1) = (-2x, -2y, -1)$ and the normal component of $\mathbf{F}$ is
\[
F_n(x, y) = (xg, yg, g)\cdot(-2x, -2y, -1) = -(1 - x^2 - y^2)(1 + 2x^2 + 2y^2).
\]
Converting the double integral of $F_n$ to polar coordinates,
\[
\Phi = \iint_D F_n(x, y)\,dA = -\iint_{D'} (1 - r^2)(1 + 2r^2)\,r\,dr\,d\theta .
\]
The negative value of the downward flux means that the actual transfer of a quantity (like a mass), whose flow is described by the vector field $\mathbf{F}$, occurs in the upward direction across $S$. □

113.4. Parametric Surfaces. If the surface $S$ in the flux integral is defined by the parametric equations $\mathbf{r} = \mathbf{r}(u, v)$, where $(u, v) \in D$, then, by Theorem 14.23, the normal vector to $S$ is $\mathbf{n} = \mathbf{r}'_u \times \mathbf{r}'_v$ (or $-\mathbf{n}$; the sign is chosen according to the geometrical description of the orientation of $S$).
Since $\|\mathbf{n}\| = J$, where $J$ determines the area transformation law $dS = J\,dA$ ($dA = du\,dv$), the flux of a vector field $\mathbf{F}$ across the surface area $dS$ reads
\[
\mathbf{F}(\mathbf{r}(u, v))\cdot\hat{\mathbf{n}}\,dS = \mathbf{F}(\mathbf{r}(u, v))\cdot\mathbf{n}\,dA = \mathbf{F}(\mathbf{r}(u, v))\cdot(\mathbf{r}'_u \times \mathbf{r}'_v)\,dA = F_n(u, v)\,dA,
\]
and the flux is given by the double integral
\[
\iint_S \mathbf{F}\cdot\hat{\mathbf{n}}\,dS = \iint_D \mathbf{F}(\mathbf{r}(u, v))\cdot(\mathbf{r}'_u \times \mathbf{r}'_v)\,dA = \iint_D F_n(u, v)\,dA .
\]
Naturally, a graph $z = g(x, y)$ is described by the parametric equations $\mathbf{r}(u, v) = (u, v, g(u, v))$, which is a particular case of the above expression; it coincides with that given in Theorem 15.6 ($x = u$ and $y = v$). A description of surfaces by parametric equations is especially convenient for closed surfaces (i.e., when the surface cannot be represented as a graph of a single function).

EXAMPLE 15.9. Evaluate the outward flux of the vector field $\mathbf{F} = (z^2x, z^2y, z^3)$ across the sphere of unit radius centered at the origin.

SOLUTION: The parametric equations of the sphere of radius $R = 1$ are given in (14.36), and the normal vector is computed in Example 14.41: $\mathbf{n} = \sin(u)\,\mathbf{r}(u, v)$, where $\mathbf{r}(u, v) = (\cos v\sin u, \sin v\sin u, \cos u)$ and $(u, v) \in D = [0, \pi] \times [0, 2\pi]$; it is an outward normal because $\sin u \ge 0$. It is convenient to represent $\mathbf{F} = z^2\mathbf{r}$ so that
\[
F_n(u, v) = \mathbf{F}(\mathbf{r}(u, v))\cdot\mathbf{n} = \cos^2 u\,\sin u\;\mathbf{r}(u, v)\cdot\mathbf{r}(u, v) = \cos^2 u\,\sin u\,\|\mathbf{r}(u, v)\|^2 = \cos^2 u\,\sin u
\]
because $\|\mathbf{r}(u, v)\|^2 = R^2 = 1$. The outward flux reads

Indeed, since the function $f(\mathbf{r}) = \mathbf{B}\cdot\hat{\mathbf{n}}$ is continuous on $S_a$, there is a point $\mathbf{r}_a \in S_a$ such that the surface integral of $f$ equals $\Delta S_a\,f(\mathbf{r}_a)$. As $a \to 0$, $\mathbf{r}_a \to \mathbf{r}_0$ and, by the continuity of $f$, $f(\mathbf{r}_a) \to f(\mathbf{r}_0)$. Thus, the circulation of a vector field per unit area is maximal if the normal to the area element is in the same direction as the curl of the vector field, and the maximal circulation equals the magnitude of the curl.

This observation has the following mechanical interpretation illustrated in the left panel of Figure 15.15. Let $\mathbf{F}$ describe a fluid flow $\mathbf{F} = \mathbf{v}$, where $\mathbf{v}$ is the fluid velocity vector field.
Imagine a tiny paddle wheel in the fluid at a point $\mathbf{r}_0$ whose axis of rotation is directed along $\hat{\mathbf{n}}$. The fluid exerts pressure on the paddles, causing the paddle wheel to rotate. The work done by the pressure force is determined by the line integral along the loop $\partial S_a$ through the paddles. The more work done by the pressure force, the faster the wheel rotates. The wheel rotates fastest (maximal work) when its axis $\hat{\mathbf{n}}$ is parallel to $\operatorname{curl}\mathbf{v}$ because, in this case, the normal component of the curl, $(\nabla \times \mathbf{v})\cdot\hat{\mathbf{n}} = \|\nabla \times \mathbf{v}\|$, is maximal. For this reason, the curl is often called the rotation of a vector field and also denoted as $\operatorname{rot}\mathbf{F} = \nabla \times \mathbf{F}$.

DEFINITION 15.11. (Rotational Vector Field). A vector field $\mathbf{F}$ that can be represented as the curl of another vector field $\mathbf{A}$, that is, $\mathbf{F} = \nabla \times \mathbf{A}$, is called a rotational vector field.

The following theorem holds (the proof is omitted).

THEOREM 15.8. (Helmholtz's Theorem). Let $\mathbf{F}$ be a vector field on a bounded domain $E$ whose components have continuous second-order partial derivatives. Then $\mathbf{F}$ can be decomposed into the sum of a conservative and a rotational vector field; that is, there is a function $f$ and a vector field $\mathbf{A}$ such that
\[
\mathbf{F} = \nabla f + \nabla \times \mathbf{A} .
\]

The vector field $\mathbf{A}$ is called a vector potential of the field $\mathbf{F}$. The vector potential is not unique. It can be changed by adding the gradient of a function, $\mathbf{A} \to \mathbf{A} + \nabla g$, because
\[
\nabla \times (\mathbf{A} + \nabla g) = \nabla \times \mathbf{A} + \nabla \times (\nabla g) = \nabla \times \mathbf{A}
\]
for any $g$ that has continuous second-order partial derivatives. Electromagnetic waves are rotational components of electromagnetic fields, while the Coulomb field created by static charges is conservative. The velocity vector field of an incompressible fluid (like water) is a rotational vector field.

114.5. Test for a Vector Field to Be Conservative. The test for a vector field to be conservative (Theorem 15.4) follows from Stokes' theorem.
Indeed, in a simply connected region $E$, any simple closed curve can be shrunk to a point while remaining in $E$ throughout the deformation. Therefore, for any such curve $C$, one can always find a surface $S$ in $E$ such that $\partial S = C$ (e.g., $C$ can be shrunk to a point along such $S$). If $\operatorname{curl}\mathbf{F} = \mathbf{0}$ throughout $E$, then, by Stokes' theorem,
\[
\oint_C \mathbf{F}\cdot d\mathbf{r} = \iint_S \operatorname{curl}\mathbf{F}\cdot\hat{\mathbf{n}}\,dS = 0
\]
for any simple closed curve $C$ in $E$. By the path independence property, $\mathbf{F}$ is conservative. The hypothesis that $E$ is simply connected is crucial. For example, if $E$ is the entire space with the $z$ axis removed (see Study Problem 15.2), then the $z$ axis always pierces through any surface $S$ bounded by a closed simple curve encircling the $z$ axis, and one cannot claim that the curl vanishes everywhere on $S$.

114.6. Study Problem.

Problem 15.8. Prove that the flux of a continuous rotational vector field $\mathbf{F}$ vanishes across any smooth, closed, and orientable surface. What can be said about a flux in a flow of an incompressible fluid?

SOLUTION: A continuous rotational vector field can be written as the curl of a vector field $\mathbf{A}$ whose components have continuous partial derivatives, $\mathbf{F} = \nabla \times \mathbf{A}$. Consider a smooth closed simple contour $C$ in a surface $S$. It cuts $S$ into two pieces $S_1$ and $S_2$. Suppose that $S$ is oriented outward. Then the induced orientations of the boundaries $\partial S_1$ and $\partial S_2$ are opposite: $\partial S_1 = -\partial S_2$. The latter also holds if $S$ is oriented inward. By virtue of Stokes' theorem,
\[
\begin{aligned}
\iint_S (\nabla \times \mathbf{A})\cdot\hat{\mathbf{n}}\,dS &= \iint_{S_1} (\nabla \times \mathbf{A})\cdot\hat{\mathbf{n}}\,dS + \iint_{S_2} (\nabla \times \mathbf{A})\cdot\hat{\mathbf{n}}\,dS \\
&= \oint_{\partial S_1} \mathbf{A}\cdot d\mathbf{r} + \oint_{\partial S_2} \mathbf{A}\cdot d\mathbf{r} = \oint_{\partial S_1} \mathbf{A}\cdot d\mathbf{r} + \oint_{-\partial S_1} \mathbf{A}\cdot d\mathbf{r} = 0 .
\end{aligned}
\]
Recall that the line integral changes its sign when the orientation of the curve is reversed. Since the flow of an incompressible fluid is described by a rotational vector field, the flux across a closed surface always vanishes in such a flow. □

114.7. Exercises.
(1) Verify Stokes' theorem for the given vector field $\mathbf{F}$ and surface $S$ by calculating the circulation of $\mathbf{F}$ along $\partial S$ and the flux of $\nabla\times\mathbf{F}$ across $S$:
  (i) $\mathbf{F} = (y, -x, z)$ and $S$ is the part of the sphere $x^2+y^2+z^2 = 2$ that lies above the plane $z = 1$
  (ii) $\mathbf{F} = (x, y, xyz)$ and $S$ is the part of the plane $2x + y + z = 4$ in the first octant
  (iii) $\mathbf{F} = (y, z, xz)$ and $S$ is the part of the plane $x + y + z = 0$ inside the sphere $x^2 + y^2 + z^2 = a^2$
(2) Use Stokes' theorem to evaluate the line integral of the vector field $\mathbf{F}$ along the specified closed contour $C$:
  (i) $\mathbf{F} = (x + y^2, y + z^2, z + x^2)$ and $C$ is the triangle traversed as $(1,0,0)\to(0,1,0)\to(0,0,1)\to(1,0,0)$
  (ii) $\mathbf{F} = (yz, 2xz, e^{xy})$ and $C$ is the intersection of the cylinder $x^2+y^2 = 1$ and the plane $z = 3$ oriented clockwise as viewed from above
  (iii) $\mathbf{F} = (xy, 3z, 3y)$ and $C$ is the intersection of the plane $x + y = 1$ and the cylinder $y^2 + z^2 = 1$
  (iv) $\mathbf{F} = (z, y^2, 2x)$ and $C$ is the intersection of the plane $x+y+z = 5$ and the cylinder $x^2 + y^2 = 1$; the contour $C$ is oriented counterclockwise as viewed from the top of the $z$ axis
  (v) $\mathbf{F} = (-yz, xz, 0)$ and $C$ is the intersection of the hyperbolic paraboloid $z = y^2 - x^2$ and the cylinder $x^2+y^2 = 1$; $C$ is oriented clockwise as viewed from the top of the $z$ axis
  (vi) $\mathbf{F} = (z^2y/2, -z^2x/2, 0)$ and $C$ is the boundary of the part of the cone $z = 1 - \sqrt{x^2 + y^2}$ that lies in the first quadrant; $C$ is oriented counterclockwise as viewed from the top of the $z$ axis
  (vii) $\mathbf{F} = (y - z, -x, x)$ and $C$ is the intersection of the cylinder $x^2+y^2 = 1$ and the paraboloid $z = x^2 + (y-1)^2$; $C$ is oriented counterclockwise as viewed from the top of the $z$ axis
  (viii) $\mathbf{F} = (y - z, z - x, x - y)$ and $C$ is the ellipse $x^2 + y^2 = a^2$, $(x/a) + (z/b) = 1$, $a > 0$, $b > 0$, oriented positively when viewed from the top of the $z$ axis
  (ix) $\mathbf{F} = (y + z, z + x, x + y)$ and $C$ is the ellipse $x = a\sin^2 t$, $y = 2a\sin t\cos t$, $z = a\cos^2 t$, $0\le t\le\pi$, oriented in the direction of increasing $t$
  (x) $\mathbf{F} = (y^2 - z^2, z^2 - x^2, x^2 - y^2)$ and $C$ is the intersection of the surface of the cube $[0,a]\times[0,a]\times[0,a]$ by the plane $x + y + z = 3a/2$, oriented counterclockwise when viewed from the top of the $x$ axis
  (xi) $\mathbf{F} = (y^2z^2, x^2z^2, x^2y^2)$, where $C$ is the closed curve traced out by the vector function $\mathbf{r}(t) = (a\cos t, a\cos(2t), a\cos(3t))$ in the direction of increasing $t$
(3) Let $C$ be a closed curve in the plane $\mathbf{n}\cdot\mathbf{r} = d$ that bounds a region of area $A$. Find $\oint_C (\mathbf{n}\times\mathbf{r})\cdot d\mathbf{r}$.
(4) Use Stokes' theorem to find the work done by the force $\mathbf{F}$ in moving a particle along the specified closed path $C$:
  (i) $\mathbf{F} = (-yz, zx, yz)$ and $C$ is the triangle $(0,0,6)\to(2,0,0)\to(0,3,0)\to(0,0,6)$
  (ii) $\mathbf{F} = (-yz, xz, z^2)$ and $C$ is the boundary of the part of the paraboloid $z = 1 - x^2 - y^2$ in the first octant that is traversed clockwise as viewed from the top of the $z$ axis
  (iii) $\mathbf{F} = (y + \sin x, z^2 + \cos y, x^3)$ and $C$ is traversed by $\mathbf{r}(t) = (\sin t, \cos t, \sin(2t))$ for $0\le t\le 4\pi$ (Hint: Observe that $C$ lies in the surface $z = 2xy$.)
(5) Find the line integral of $\mathbf{F} = (e^{x^2} - yz, e^{y^2} - xz, z^2 - xy)$ along $C$, which is the helix $x = a\cos t$, $y = a\sin t$, $z = ht/(2\pi)$ from the point $(a, 0, 0)$ to the point $(a, 0, h)$. Hint: Supplement $C$ by the straight line segment $BA$ to make a closed curve and then use Stokes' theorem.
(6) Suppose that a surface $S$ satisfies the hypotheses of Stokes' theorem and the functions $f$ and $g$ have continuous partial derivatives.
Show that
\[
\oint_{\partial S} (f\nabla g)\cdot d\mathbf{r} = \iint_S (\nabla f\times\nabla g)\cdot\mathbf{n}\,dS.
\]
Use the result to show that the circulation of the vector fields of the form $\mathbf{F} = f\nabla f$ and $\mathbf{F} = f\nabla g + g\nabla f$ vanishes along $\partial S$.
(7) Consider a rotationally symmetric solid. Let the solid be rotating about the symmetry axis at a constant rate $\omega$ (angular velocity). Let $\boldsymbol{\omega}$ be the vector parallel to the symmetry axis such that $\|\boldsymbol{\omega}\| = \omega$ and the rotation is counterclockwise as viewed from the top of $\boldsymbol{\omega}$. If the origin is on the symmetry axis, show that the linear velocity vector field in the solid is given by $\mathbf{v} = \boldsymbol{\omega}\times\mathbf{r}$, where $\mathbf{r}$ is the position vector of a point in the solid. Next, show that $\nabla\times\mathbf{v} = 2\boldsymbol{\omega}$. This gives another relation between the curl of a vector field and rotations.

115. Gauss-Ostrogradsky (Divergence) Theorem

115.1. Divergence of a Vector Field.

DEFINITION 15.12. (Divergence of a Vector Field). Suppose that a vector field $\mathbf{F} = (F_1, F_2, F_3)$ is differentiable. Then the scalar function
\[
\operatorname{div}\mathbf{F} = \nabla\cdot\mathbf{F} = \frac{\partial F_1}{\partial x} + \frac{\partial F_2}{\partial y} + \frac{\partial F_3}{\partial z}
\]
is called the divergence of a vector field.

EXAMPLE 15.12. Find the divergence of the vector field $\mathbf{F} = (x^3 + \cos(yz),\ y + \sin(x^2z),\ xyz)$.

SOLUTION: One has
\[
\operatorname{div}\mathbf{F} = \bigl(x^3 + \cos(yz)\bigr)'_x + \bigl(y + \sin(x^2z)\bigr)'_y + (xyz)'_z = 3x^2 + 1 + xy.
\]

COROLLARY 15.1. A rotational vector field whose components have continuous partial derivatives is divergence free, $\operatorname{div}\operatorname{curl}\mathbf{A} = 0$.

PROOF. By definition, a rotational vector field has the form $\mathbf{F} = \operatorname{curl}\mathbf{A} = \nabla\times\mathbf{A}$, where the components of $\mathbf{A}$ have continuous second-order partial derivatives because, by the hypothesis, the components of $\mathbf{F}$ have continuous first-order partial derivatives. Therefore,
\[
\operatorname{div}\mathbf{F} = \operatorname{div}\operatorname{curl}\mathbf{A} = \nabla\cdot\operatorname{curl}\mathbf{A} = \nabla\cdot(\nabla\times\mathbf{A}) = 0
\]
by the rules of vector algebra (the triple product vanishes if any two vectors in it coincide). These rules are applicable because the components of $\mathbf{A}$ have continuous second-order partial derivatives (Clairaut's theorem holds for its components; see Section 111.4). $\square$

Laplace Operator. Let $\mathbf{F} = \nabla f$.
Then $\operatorname{div}\mathbf{F} = \nabla\cdot\nabla f = f''_{xx} + f''_{yy} + f''_{zz}$. The operator $\nabla\cdot\nabla = \nabla^2$ is called the Laplace operator.

115.2. Another Vector Form of Green's Theorem. Green's theorem relates a line integral along a closed curve of the tangential component of a planar vector field to the flux of the curl across the region bounded by the curve. Let us investigate the line integral of the normal component. If the vector function $\mathbf{r}(t) = (x(t), y(t))$, $a\le t\le b$, traces out the boundary $C$ of $D$ in the positive (counterclockwise) direction, then
\[
\mathbf{T}(t) = \frac{1}{\|\mathbf{r}'(t)\|}\bigl(x'(t),\, y'(t)\bigr), \qquad
\mathbf{n}(t) = \frac{1}{\|\mathbf{r}'(t)\|}\bigl(y'(t),\, -x'(t)\bigr)
\]
are the unit tangent vector and the outward unit normal vector to the curve $C$, respectively. Consider the line integral $\oint_C \mathbf{F}\cdot\mathbf{n}\,ds$ of the normal component of a planar vector field along $C$. One has $ds = \|\mathbf{r}'(t)\|\,dt$, and hence
\[
\mathbf{F}\cdot\mathbf{n}\,ds = F_1y'\,dt - F_2x'\,dt = F_1\,dy - F_2\,dx = \mathbf{G}\cdot d\mathbf{r},
\]
where $\mathbf{G} = (-F_2, F_1)$. By Green's theorem applied to the line integral of the vector field $\mathbf{G}$,
\[
\oint_{\partial D}\mathbf{F}\cdot\mathbf{n}\,ds = \oint_{\partial D}\mathbf{G}\cdot d\mathbf{r}
= \iint_D\Bigl(\frac{\partial G_2}{\partial x} - \frac{\partial G_1}{\partial y}\Bigr)dA
= \iint_D\Bigl(\frac{\partial F_1}{\partial x} + \frac{\partial F_2}{\partial y}\Bigr)dA.
\]
The integrand in the double integral is the divergence of $\mathbf{F}$. Thus, another vector form of Green's theorem has been obtained:
\[
\oint_{\partial D}\mathbf{F}\cdot\mathbf{n}\,ds = \iint_D \operatorname{div}\mathbf{F}\,dA.
\]
For a planar vector field (think of a mass flow on a plane), the line integral on the left side can be viewed as the outward flux of $\mathbf{F}$ across the boundary of a region $D$ (e.g., the mass transfer by a planar flow across the boundary of $D$). An extension of this form of Green's theorem to three-dimensional vector fields is known as the divergence, or Gauss-Ostrogradsky, theorem.

115.3. The Divergence Theorem. Let a solid region $E$ be bounded by a closed surface $S$. If the surface is oriented outward (the normal vector points outside of $E$), then it is denoted $S = \partial E$.

THEOREM 15.9. (Gauss-Ostrogradsky (Divergence) Theorem). Suppose $E$ is a bounded, closed region in space that has a piecewise-smooth boundary $S = \partial E$ oriented outward.
If components of a vector field $\mathbf{F}$ have continuous partial derivatives in an open region that contains $E$, then
\[
\iint_{\partial E}\mathbf{F}\cdot\mathbf{n}\,dS = \iiint_E \operatorname{div}\mathbf{F}\,dV.
\]

The divergence theorem states that the outward flux of a vector field across a closed surface $S$ is given by the triple integral of the divergence of the vector field over the solid region bounded by $S$. It provides a convenient technical tool to evaluate the flux of a vector field across a closed surface.

Remark. It should be noted that the boundary $\partial E$ may contain several disjoint pieces. For example, let $E$ be a solid region with a cavity. Then $\partial E$ consists of two pieces, the outer boundary and the cavity boundary. Both pieces are oriented outward in the divergence theorem.

EXAMPLE 15.13. Evaluate the flux of the vector field $\mathbf{F} = (4xy^2z + e^z,\ 4yx^2z,\ z^4 + \sin(xy))$ across the closed surface oriented outward that is the boundary of the part of the ball $x^2 + y^2 + z^2\le R^2$ in the first octant ($x, y, z\ge 0$).

SOLUTION: The divergence of the vector field is
\[
\operatorname{div}\mathbf{F} = (4xy^2z + e^z)'_x + (4yx^2z)'_y + \bigl(z^4 + \sin(xy)\bigr)'_z = 4z(x^2 + y^2 + z^2).
\]
By the divergence theorem,
\[
\iint_{\partial E}\mathbf{F}\cdot\mathbf{n}\,dS = \iiint_E 4z(x^2+y^2+z^2)\,dV
= \int_0^{\pi/2}\!\!\int_0^{\pi/2}\!\!\int_0^R 4\rho^3\cos\phi\,\rho^2\sin\phi\,d\rho\,d\phi\,d\theta
= \frac{\pi R^6}{6},
\]
where the triple integral has been converted to spherical coordinates. The reader is advised to evaluate the flux without using the divergence theorem to appreciate the power of the latter! $\square$

The divergence theorem can be used to change (simplify) the surface in the flux integral.

COROLLARY 15.2. Let the boundary $\partial E$ of a solid region $E$ be the union of two surfaces $S_1$ and $S_2$. Suppose that all the hypotheses of the divergence theorem hold. Then
\[
\iint_{S_2}\mathbf{F}\cdot\mathbf{n}\,dS = \iiint_E \operatorname{div}\mathbf{F}\,dV - \iint_{S_1}\mathbf{F}\cdot\mathbf{n}\,dS.
\]

This establishes a relation between the flux across $S_1$ and the flux across $S_2$ with a common boundary curve (see Figure 15.15, right panel). Indeed, since $\partial E$ is the union of two disjoint pieces $S_1$ and $S_2$, the surface integral over $\partial E$ is the sum of the integrals over $S_1$ and $S_2$.
On the other hand, the integral over $\partial E$ can be expressed as a triple integral by the divergence theorem, which establishes the stated relation between the fluxes across $S_1$ and $S_2$. Note that $S_1$ and $S_2$ must be oriented so that their union is $\partial E$; that is, it has outward orientation.

EXAMPLE 15.14. Evaluate the upward flux of the vector field $\mathbf{F} = (z^2\tan^{-1}(y^2+1),\ z^4\ln(x^2+1),\ z)$ across the part of the paraboloid $z = 2 - x^2 - y^2$ that lies above the plane $z = 1$.

SOLUTION: Consider a solid $E$ bounded by the paraboloid and the plane $z = 1$. Let $S_2$ be the part of the paraboloid that bounds $E$ and let $S_1$ be the part of the plane $z = 1$ that bounds $E$. If $S_2$ is oriented upward and $S_1$ is oriented downward, then the boundary of $E$ is oriented outward, and Corollary 15.2 applies. The surface $S_1$ is the part of the plane $z = 1$ bounded by the intersection curve of the paraboloid and the plane: $1 = 2 - x^2 - y^2$ or $x^2 + y^2 = 1$. So $S_1$ is the graph $z = g(x, y) = 1$ over $D$, which is the disk $x^2 + y^2\le 1$. The downward normal vector to $S_1$ is $\mathbf{n} = (g'_x, g'_y, -1) = (0, 0, -1)$, and hence $F_n = \mathbf{F}\cdot\mathbf{n} = -F_3(x, y, g) = -1$ on $S_1$ and
\[
\iint_{S_1}\mathbf{F}\cdot\mathbf{n}\,dS = \iint_D F_n(x,y)\,dA = -\iint_D dA = -A(D) = -\pi.
\]

FIGURE 15.16. Left: An illustration to Example 15.14. The solid region $E$ is enclosed by the paraboloid $z = 2 - x^2 - y^2$ and the plane $z = 1$. By Corollary 15.2, the flux of a vector field across the part $S_2$ of the paraboloid that has upward orientation can be converted to the flux across the part $S_1$ of the plane that has downward orientation. The union of $S_1$ and $S_2$ has outward orientation. Right: The divergence of a vector field $\mathbf{F}$ determines the density of sources of $\mathbf{F}$. If $\operatorname{div}\mathbf{F} > 0$ at a point, then the flux of $\mathbf{F}$ across a surface that encloses a small region $E_a$ containing the point is positive (a "faucet"). If $\operatorname{div}\mathbf{F} < 0$ at a point, then the flux of $\mathbf{F}$ across a surface that encloses a small region $E_a$ containing the point is negative (a "sink").
Next, the divergence of $\mathbf{F}$ is
\[
\operatorname{div}\mathbf{F} = \bigl(z^2\tan^{-1}(y^2 + 1)\bigr)'_x + \bigl(z^4\ln(x^2 + 1)\bigr)'_y + (z)'_z = 0 + 0 + 1 = 1.
\]
Hence,
\[
\iiint_E \operatorname{div}\mathbf{F}\,dV = \iiint_E dV = \int_0^{2\pi}\!\!\int_0^1\!\!\int_1^{2-r^2} r\,dz\,dr\,d\theta = 2\pi\int_0^1 (1 - r^2)\,r\,dr = \frac{\pi}{2},
\]
where the triple integral has been transformed into cylindrical coordinates for
\[
E = \{(x, y, z)\ |\ z_{bot} = 1\le z\le 2 - x^2 - y^2 = z_{top},\ (x, y)\in D\}.
\]
The upward flux of $\mathbf{F}$ across the paraboloid is now easy to find by Corollary 15.2:
\[
\iint_{S_2}\mathbf{F}\cdot\mathbf{n}\,dS = \iiint_E \operatorname{div}\mathbf{F}\,dV - \iint_{S_1}\mathbf{F}\cdot\mathbf{n}\,dS = \frac{\pi}{2} + \pi = \frac{3\pi}{2}.
\]
The reader is again advised to try to evaluate the flux directly via the surface integral to appreciate the power of the divergence theorem! $\square$

COROLLARY 15.3. The flux of a rotational vector field, whose components have continuous partial derivatives, across an orientable, closed, piecewise-smooth surface $S$ vanishes:
\[
\iint_S \operatorname{curl}\mathbf{A}\cdot\mathbf{n}\,dS = 0.
\]

PROOF. The hypotheses of the divergence theorem are satisfied. Therefore,
\[
\iint_S \operatorname{curl}\mathbf{A}\cdot\mathbf{n}\,dS = \iiint_E \operatorname{div}\operatorname{curl}\mathbf{A}\,dV = 0
\]
by Corollary 15.1. $\square$

By Helmholtz's theorem, a vector field can always be decomposed into the sum of conservative and rotational vector fields. It follows then that only the conservative component of the vector field contributes to the flux across a closed surface:
\[
\operatorname{div}(\nabla f + \nabla\times\mathbf{A}) = \nabla^2 f + \nabla\cdot(\nabla\times\mathbf{A}) = \nabla^2 f.
\]
So the divergence of a vector field is determined by the action of the Laplace operator on the scalar potential $f$ of the vector field. This observation is further elucidated with the help of the concept of vector field sources.

115.4. Sources of a Vector Field. Consider a simple region $E_a$ of volume $\Delta V_a$ and an interior point $\mathbf{r}_0$ of $E_a$. Let $a$ be the radius of the smallest ball that contains $E_a$ and is centered at $\mathbf{r}_0$. Let us calculate the outward flux per unit volume of a vector field $\mathbf{F}$ across the boundary $\partial E_a$, which is defined by the ratio $\iint_{\partial E_a}\mathbf{F}\cdot\mathbf{n}\,dS/\Delta V_a$ in the limit $a\to 0$, that is, when $E_a$ shrinks to $\mathbf{r}_0$. Suppose that components of $\mathbf{F}$ have continuous partial derivatives.
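The flux-per-unit-volume ratio defined above can be estimated numerically. The following Python sketch is only an illustration under stated assumptions: the sample field $\mathbf{F} = (x^2, yz, \sin z)$, the point $\mathbf{r}_0 = (1, 2, 0.5)$, and the choice of the shrinking region as a cube of half-side `half_side` centered at $\mathbf{r}_0$ are hypothetical and are not taken from the text. Each face integral is approximated by a midpoint sum.

```python
import math

def F(x, y, z):
    # Hypothetical sample field (an assumption for illustration only).
    return (x * x, y * z, math.sin(z))

def div_F(x, y, z):
    # Divergence of the sample field, computed by hand: 2x + z + cos z.
    return 2 * x + z + math.cos(z)

def flux_per_volume(r0, half_side, n=40):
    """Outward flux of F across the surface of the cube centered at r0,
    divided by the volume of the cube (midpoint rule on each face)."""
    x0, y0, z0 = r0
    a = half_side
    h = 2 * a / n                      # width of one cell on a face
    total = 0.0
    for i in range(n):
        u = -a + (i + 0.5) * h         # cell midpoint, first face coordinate
        for j in range(n):
            v = -a + (j + 0.5) * h     # cell midpoint, second face coordinate
            # Faces x = x0 +/- a, outward normals (+/-1, 0, 0).
            total += F(x0 + a, y0 + u, z0 + v)[0] - F(x0 - a, y0 + u, z0 + v)[0]
            # Faces y = y0 +/- a, outward normals (0, +/-1, 0).
            total += F(x0 + u, y0 + a, z0 + v)[1] - F(x0 + u, y0 - a, z0 + v)[1]
            # Faces z = z0 +/- a, outward normals (0, 0, +/-1).
            total += F(x0 + u, y0 + v, z0 + a)[2] - F(x0 + u, y0 + v, z0 - a)[2]
    return total * h * h / (2 * a) ** 3

r0 = (1.0, 2.0, 0.5)
for half_side in (0.5, 0.1, 0.02):
    # Each row: cube half-side, flux per unit volume, divergence at r0.
    print(half_side, flux_per_volume(r0, half_side), div_F(*r0))
```

Running the loop prints the ratio for three shrinking cubes next to the value of the divergence at $\mathbf{r}_0$, so the two columns can be compared as the cube gets smaller. A cube is used here only because its faces are easy to parameterize; any other family of simple regions shrinking to $\mathbf{r}_0$ could be substituted.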
By virtue of the divergence theorem and the integral mean value theorem,
\[
\lim_{a\to 0}\frac{1}{\Delta V_a}\iint_{\partial E_a}\mathbf{F}\cdot\mathbf{n}\,dS = \lim_{a\to 0}\frac{1}{\Delta V_a}\iiint_{E_a}\operatorname{div}\mathbf{F}\,dV = \operatorname{div}\mathbf{F}(\mathbf{r}_0).
\]
Indeed, by the continuity of $\operatorname{div}\mathbf{F}$ and the integral mean value theorem, there is a point $\mathbf{r}_a\in E_a$ such that the triple integral equals $\Delta V_a\operatorname{div}\mathbf{F}(\mathbf{r}_a)$. In the limit $a\to 0$, $\mathbf{r}_a\to\mathbf{r}_0$ and $\operatorname{div}\mathbf{F}(\mathbf{r}_a)\to\operatorname{div}\mathbf{F}(\mathbf{r}_0)$. Thus, if the divergence is positive, $\operatorname{div}\mathbf{F}(\mathbf{r}_0) > 0$, the flux of the vector field across any small surface around $\mathbf{r}_0$ is positive. This, in turn, means that the flow lines of $\mathbf{F}$ are outgoing from $\mathbf{r}_0$ as if there is a source creating a flow at $\mathbf{r}_0$. Following the analogy with water flow, such a source is called a faucet. If $\operatorname{div}\mathbf{F}(\mathbf{r}_0) < 0$, the flow lines disappear at $\mathbf{r}_0$ (the inward flow is positive). Such a source is called a sink. Thus, the divergence of a vector field determines the density of the sources of a vector field. For example, flow lines of a static electric field originate from positive electric charges and end on negative electric charges. The divergence of the electric field determines the electric charge density in space.

The divergence theorem states that the outward flux of a vector field across a closed surface is determined by the total source of the vector field in the region bounded by the surface. In particular, the flux of the electric field $\mathbf{E}$ across a closed surface $S$ is determined by the total electric charge in the region enclosed by $S$. In contrast, the magnetic field $\mathbf{B}$ is a rotational vector field and hence is divergence free. So there are no magnetic charges, also known as magnetic monopoles. These two laws of physics are stated in the form
\[
\operatorname{div}\mathbf{E} = 4\pi\sigma, \qquad \operatorname{div}\mathbf{B} = 0,
\]
where $\sigma$ is the density of electric charges. Flow lines of the magnetic field are closed, while flow lines of the electric field end at points where electric charges are located (as indicated by the arrows in the right panel of Figure 15.16).

115.5. Study Problem.

Problem 15.9. (Volume of a Solid as the Surface Integral).
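Let $E$ be bounded by a piecewise smooth surface $S = \partial E$ oriented by an outward unit normal vector $\mathbf{n}$. Prove that the volume of $E$ is
\[
V(E) = \frac{1}{3}\iint_S \mathbf{n}\cdot\mathbf{r}\,dS.
\]

SOLUTION: Consider three vector fields $\mathbf{F}_1 = (x, 0, 0)$, $\mathbf{F}_2 = (0, y, 0)$, and $\mathbf{F}_3 = (0, 0, z)$. Then
\[
\operatorname{div}\mathbf{F}_1 = 1, \qquad \operatorname{div}\mathbf{F}_2 = 1, \qquad \operatorname{div}\mathbf{F}_3 = 1.
\]
Then, by virtue of the equality $\mathbf{r} = \mathbf{F}_1 + \mathbf{F}_2 + \mathbf{F}_3$ and by the divergence theorem,
\[
\iint_S \mathbf{n}\cdot\mathbf{r}\,dS = \iint_S \mathbf{n}\cdot\mathbf{F}_1\,dS + \iint_S \mathbf{n}\cdot\mathbf{F}_2\,dS + \iint_S \mathbf{n}\cdot\mathbf{F}_3\,dS
= \iiint_E (\operatorname{div}\mathbf{F}_1 + \operatorname{div}\mathbf{F}_2 + \operatorname{div}\mathbf{F}_3)\,dV = 3\iiint_E dV = 3V(E),
\]
and the required result follows. $\square$

115.6. Exercises.

(1) Find the divergence of the specified vector field:
  (i) $\mathbf{F} = \nabla f$, where $f = x^2 + y^2 + z^2$
  (ii) $\mathbf{F} = \mathbf{r}/r$, where $r = \|\mathbf{r}\|$
  (iii) $\mathbf{F} = \mathbf{a}f(r)$, where $r = \|\mathbf{r}\|$ and $\mathbf{a}$ is a constant vector
  (iv) $\mathbf{F} = \mathbf{r}f(r)$, where $r = \|\mathbf{r}\|$. When does the divergence vanish?
  (v) $\mathbf{F} = \mathbf{a}g$, where $\mathbf{a}$ is a constant vector and $g$ is a differentiable function. When does the divergence vanish?
  (vi) $\mathbf{F} = \mathbf{a}\times\mathbf{r}$, where $\mathbf{a}$ is a constant vector
  (vii) $\mathbf{F} = \mathbf{a}\times\nabla g$, where $\mathbf{a}$ is a constant vector. When does the divergence vanish?
  (viii) $\mathbf{F} = \mathbf{a}\times\mathbf{G}$, where $\mathbf{a}$ is a constant vector. When does the divergence vanish?
(2) Prove the following identities, assuming that the appropriate partial derivatives of vector fields and functions exist and are continuous:
  (i) $\operatorname{div}(f\mathbf{F}) = f\operatorname{div}\mathbf{F} + \mathbf{F}\cdot\nabla f$
  (ii) $\operatorname{div}(\mathbf{F}\times\mathbf{G}) = \mathbf{G}\cdot\operatorname{curl}\mathbf{F} - \mathbf{F}\cdot\operatorname{curl}\mathbf{G}$
  (iii) $\operatorname{div}(\nabla f\times\nabla g) = 0$
  (iv) $\operatorname{curl}\operatorname{curl}\mathbf{F} = \nabla(\operatorname{div}\mathbf{F}) - \nabla^2\mathbf{F}$
(3) Let $\mathbf{a}$ be a fixed vector and let $\mathbf{n}$ be the unit normal to a planar closed curve $C$ directed outward from the region bounded by $C$. Show that $\oint_C \mathbf{a}\cdot\mathbf{n}\,ds = 0$.
(4) Let $C$ be a simple closed curve in the $xy$ plane and let $\mathbf{n}$ be the unit normal to $C$ directed outward from the region $D$ bounded by $C$. If $A(D)$ is the area of $D$, find $\oint_C \mathbf{r}\cdot\mathbf{n}\,ds$.
(5) Verify the divergence theorem for the given vector field $\mathbf{F}$ on the region $E$:
  (i) $\mathbf{F} = (3x, yz, 3xz)$ and $E$ is the rectangular box $[0, a]\times[0, b]\times[0, c]$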
  (ii) $\mathbf{F} = (3x, 2y, z)$ and $E$ is the solid bounded by the paraboloid $z = a^2 - x^2 - y^2$ and the plane $z = 0$
(6) Let $\mathbf{a}$ be a constant vector and let $S$ be a closed smooth surface oriented outward by the unit normal vector $\mathbf{n}$. Prove that $\iint_S \mathbf{a}\cdot\mathbf{n}\,dS = 0$.
(7) Evaluate the flux of the given vector field across the specified closed surface $S$. In each case, determine the kind of source of $\mathbf{F}$ in the region enclosed by $S$ (sink or faucet):
  (i) $\mathbf{F} = (x^2, y^2, z^2)$ and $S$ is the boundary of the rectangular box $[0, a]\times[0, b]\times[0, c]$ oriented outward
  (ii) $\mathbf{F} = (x^3, y^3, z^3)$ and $S$ is the sphere $x^2+y^2+z^2 = R^2$ oriented inward
  (iii) $\mathbf{F} = (xy, y^2 + \sin(xz), \cos(yz))$ and $S$ is bounded by the parabolic cylinder $z = 1 - x^2$ and the planes $z = 0$, $y = 0$, $y + z = 2$; $S$ is oriented outward
  (iv) $\mathbf{F} = (-xy^2, -yz^2, zx^2)$ and $S$ is the sphere $x^2 + y^2 + z^2 = 1$ with inward orientation
  (v) $\mathbf{F} = (xy, z^2y, zx)$ and $S$ is the boundary of the solid region inside the cylinder $x^2+y^2 = 4$ and between the planes $z = \pm 2$; $S$ is oriented outward
  (vi) $\mathbf{F} = (xz^2, y^3/3, zy^2 + xy)$ and $S$ is the boundary of the part of the ball $x^2 + y^2 + z^2\le 1$ in the first octant; $S$ is oriented inward
  (vii) $\mathbf{F} = (yz, z^2x + y, z - xy)$ and $S$ is the boundary of the solid enclosed by the cone $z = \sqrt{x^2 + y^2}$ and the sphere $x^2 + y^2 + z^2 = 1$; $S$ is oriented outward
  (viii) $\mathbf{F} = (x + \tan(yz), \cos(xz) - y, \sin(xy) + z)$ and $S$ is the boundary of the solid region between the sphere $x^2 + y^2 + z^2 = 2z$ and the cone $z = \sqrt{x^2+y^2}$
  (ix) $\mathbf{F} = (\tan(yz), \ln(1 + z^2x^2), z^2 + e^{yx})$ and $S$ is the boundary of the smaller part of the ball $x^2 + y^2 + z^2\le a^2$ between two half-planes $y = x/\sqrt{3}$ and $y = \sqrt{3}\,x$, $x\ge 0$; $S$ is oriented inward
  (x) $\mathbf{F} = (xy^2, xz, zx^2)$ and $S$ is the boundary of the solid bounded by two paraboloids $z = x^2 + y^2$ and $z = 1 + x^2 + y^2$ and the cylinder $x^2 + y^2 = 4$; $S$ is oriented outward
  (xi) $\mathbf{F} = (x, y, z)$ and $S$ is the boundary of the solid obtained from the box $[0, 2a]\times[0, 2b]\times[0, 2c]$ by removing the smaller box $[0, a]\times[0, b]\times[0, c]$; $S$ is oriented inward
  (xii) $\mathbf{F} = (x - y + z, y - z + x, z - x + y)$ and $S$ is the surface $|x - y + z| + |y - z + x| + |z - x + y| = 1$ oriented outward
  (xiii) $\mathbf{F} = (x^3, y^3, z^3)$ and $S$ is the sphere $x^2 + y^2 + z^2 = x$ oriented outward
(8) Let $S_1$ and $S_2$ be two smooth orientable surfaces that have the same boundary. Suppose that $\mathbf{F} = \operatorname{curl}\mathbf{A}$ and the components of $\mathbf{F}$ have continuous partial derivatives. Compare the fluxes of $\mathbf{F}$ across $S_1$ and $S_2$.
(9) Use the divergence theorem to find the flux of the given vector field $\mathbf{F}$ across the specified surface $S$ by an appropriate deformation of $S$:
  (i) $\mathbf{F} = (xy^2, yz^2, zy^2 + x^2)$ and $S$ is the top half of the sphere $x^2 + y^2 + z^2 = 4$ oriented toward the origin
  (ii) $\mathbf{F} = (z\cos(y^2), z^2\ln(1 + x^2), z)$ and $S$ is the part of the paraboloid $z = 2 - x^2 - y^2$ above the plane $z = 1$; $S$ is oriented upward
  (iii) $\mathbf{F} = (yz, xz, zy)$ and $S$ is the cylinder $x^2+y^2 = a^2$, $0\le z\le b$, oriented outward from its axis of symmetry
  (iv) $\mathbf{F} = (yz + x^3, x^2z^3, zy)$ and $S$ is the part of the cone $z = 1 - \sqrt{x^2 + y^2}$ oriented upward
(10) The electric field $\mathbf{E}$ and the charge density $\sigma$ are related by the Gauss law $\operatorname{div}\mathbf{E} = 4\pi\sigma$. Suppose the charge density is constant, $\sigma = k > 0$, inside the sphere $x^2 + y^2 + z^2 = R^2$ and 0 otherwise. Find the outward flux of the electric field across the ellipsoid $x^2/a^2 + y^2/b^2 + z^2/c^2 = 1$ in the two following cases: first, when $R$ is greater than any of $a$, $b$, $c$; second, when $R$ is less than any of $a$, $b$, $c$.
(11) Let $\mathbf{F}$ be a vector field such that $\operatorname{div}\mathbf{F} = \sigma_0 = \text{const}$ in a solid bounded region $E$ and $\operatorname{div}\mathbf{F} = 0$ otherwise. Let $S$ be a closed smooth surface oriented outward. Consider all possible relative positions of $E$ and $S$ in space (the solid region bounded by $S$ may or may not have an overlap with $E$). If $V$ is the volume of $E$, what are all possible values of the flux of $\mathbf{F}$ across $S$?
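(12) Use the vector form of Green's theorem to prove Green's first and second identities:
\[
\iint_D f\nabla^2 g\,dA = \oint_{\partial D} (f\nabla g)\cdot\mathbf{n}\,ds - \iint_D \nabla f\cdot\nabla g\,dA,
\]
\[
\iint_D (f\nabla^2 g - g\nabla^2 f)\,dA = \oint_{\partial D} (f\nabla g - g\nabla f)\cdot\mathbf{n}\,ds,
\]
where $D$ satisfies the hypotheses of Green's theorem and the appropriate partial derivatives of $f$ and $g$ exist and are continuous.
(13) Use the result of Study Problem 15.9 to find the volume of a solid bounded by the specified surfaces:
  (i) The planes $z = \pm c$ and the parametric surface $x = a\cos u\cos v + b\sin u\sin v$, $y = a\cos u\sin v - b\sin u\cos v$, $z = c\sin u$
  (ii) The planes $x = 0$ and $z = 0$ and the parametric surface $x = u\cos v$, $y = u\sin v$, $z = -u + a\cos v$, where $u\ge 0$ and $a > 0$
  (iii) The torus $x = (R + a\cos u)\cos v$, $y = (R + a\cos u)\sin v$, $z = a\sin u$
(14) Use the results of Study Problem 15.4 to express the divergence of a vector field in cylindrical and spherical coordinates:
\[
\operatorname{div}\mathbf{F} = \frac{1}{r}\frac{\partial (rF_r)}{\partial r} + \frac{1}{r}\frac{\partial F_\theta}{\partial\theta} + \frac{\partial F_z}{\partial z},
\]
\[
\operatorname{div}\mathbf{F} = \frac{1}{\rho^2}\frac{\partial (\rho^2F_\rho)}{\partial\rho} + \frac{1}{\rho\sin\phi}\frac{\partial (\sin\phi\,F_\phi)}{\partial\phi} + \frac{1}{\rho\sin\phi}\frac{\partial F_\theta}{\partial\theta}.
\]
Hint: Show $\partial\hat{\mathbf{e}}_\rho/\partial\phi = \hat{\mathbf{e}}_\phi$, $\partial\hat{\mathbf{e}}_\rho/\partial\theta = \sin\phi\,\hat{\mathbf{e}}_\theta$, and similar relations for the partial derivatives of other unit vectors.
(15) Use the results of Study Problem 15.4 to express the Laplace operator in cylindrical and spherical coordinates:
\[
\nabla^2 f = \frac{1}{r}\frac{\partial}{\partial r}\Bigl(r\frac{\partial f}{\partial r}\Bigr) + \frac{1}{r^2}\frac{\partial^2 f}{\partial\theta^2} + \frac{\partial^2 f}{\partial z^2},
\]
\[
\nabla^2 f = \frac{1}{\rho^2}\frac{\partial}{\partial\rho}\Bigl(\rho^2\frac{\partial f}{\partial\rho}\Bigr) + \frac{1}{\rho^2\sin\phi}\frac{\partial}{\partial\phi}\Bigl(\sin\phi\,\frac{\partial f}{\partial\phi}\Bigr) + \frac{1}{\rho^2\sin^2\phi}\frac{\partial^2 f}{\partial\theta^2}.
\]
Hint: Show $\partial\hat{\mathbf{e}}_\rho/\partial\phi = \hat{\mathbf{e}}_\phi$, $\partial\hat{\mathbf{e}}_\rho/\partial\theta = \sin\phi\,\hat{\mathbf{e}}_\theta$, and similar relations for the partial derivatives of other unit vectors.

Acknowledgments

The author would like to thank his colleagues Dr. David Groisser and Dr. Thomas Walsh for their useful suggestions and comments that were helpful to improve the textbook.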