Difference Equations to Differential Equations

Section 1.1
Calculus: Areas And Tangents


The study of calculus begins with questions about change. What happens to the velocity of
a swinging pendulum as its position changes? What happens to the position of a planet as
time changes? What happens to a population of owls as its rate of reproduction changes?
Mathematically, one is interested in learning to what extent changes in one quantity affect
the value of another related quantity. Through the study of the way in which quantities
change we are able to understand more deeply the relationships between the quantities
themselves. For example, changing the angle of elevation of a projectile affects the distance
it will travel; by considering the effect of a change in angle on distance, we are able to
determine, for example, the angle which will maximize the distance.
    Related to questions of change are problems of approximation. If we desire to approxi-
mate a quantity which cannot be computed directly (for example, the area of some planar
region), we may develop a technique for approximating its value. The accuracy of our tech-
nique will depend on how many computations we are willing to make; calculus may then
be used to answer questions about the relationship between the accuracy of the approxi-
mation and the number of calculations used. If we double the number of computations,
how much do we gain in accuracy? As we increase the number of computations, do the
approximations approach some limiting value? And if so, can we use our approximating
method to arrive at an exact answer? Note that once again we are asking questions about
the effects of change.
    Two fundamental concepts for studying change are sequences and limits of sequences.
For our purposes, a sequence is nothing more than a list of numbers. For example,
\[ 1, \frac{1}{2}, \frac{1}{4}, \frac{1}{8}, \ldots \]
might represent the beginning of a sequence, where the ellipsis indicates that the list is
to continue on indefinitely in some pattern. For example, the 5th term in this sequence
might be
\[ \frac{1}{16} = \frac{1}{2^4}, \]
the 8th term
\[ \frac{1}{128} = \frac{1}{2^7}, \]
and, in general, the nth term
\[ \frac{1}{2^{n-1}}, \]
where n = 1, 2, 3, .... Notice that the sequence is completely specified only when we have
given the general form of a term in the sequence. Also note that this list of numbers is


Copyright © by Dan Sloughter 2000

approaching 0, which we would call the limit of the sequence. In the next section of this
chapter we will consider in some detail the basic question of determining the limit of a
sequence.
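
As a small numerical illustration of the sequence with nth term $\frac{1}{2^{n-1}}$, the following Python sketch lists a few terms as exact fractions and as decimals. The function name `term` is our own choice, not notation from the text. The shrinking values are consistent with the limit 0, although a finite list of terms cannot by itself establish a limit.

```python
from fractions import Fraction

def term(n: int) -> Fraction:
    """Return the nth term 1 / 2**(n - 1) of the sequence, for n = 1, 2, 3, ..."""
    return Fraction(1, 2 ** (n - 1))

# Print a few terms exactly and as decimals.
for n in (1, 2, 3, 4, 5, 8, 20):
    t = term(n)
    print(n, t, float(t))
```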
    The following two examples consider these ideas in the context of the two fundamental
problems of calculus. The first of these is to determine the area of a region in the plane;
the other is to find the line tangent to a curve at a given point on the curve. As the course
progresses, we will find that general methods for solving these two problems are at the
heart of the techniques used in calculus. Moreover, we will see that these two problems
are, surprisingly, closely related, with the area problem actually being the inverse of the
tangent problem. This intimate connection was one of the great discoveries of Isaac Newton
(1642-1727) and Gottfried Leibniz (1646-1716), although anticipated by Newton's teacher
Isaac Barrow (1630-1677).

Example Suppose we wish to find the area inside a circle of radius one centered at the
origin. Of course, we have all learned that the answer is $\pi$. But why? Indeed, what does
it mean to find the area of a disk?
    Area is best defined for polygons, regions in the plane with line segments for sides.
One can start by defining the area of a 1 x 1 square to be one unit. The area of any other
polygonal figure is then determined by how many squares may be fit into it, with suitable
cutting as necessary. For example, it is seen that the area of a rectangle with base of length
b and height a should be ab. Since a parallelogram with base of length b and height a may
be cut and pasted onto a rectangle of length b and height a (see Problem 1), it follows
that the area of such a parallelogram is also ab. As a triangle with height a and a base
of length b is one-half of a parallelogram of height a and base length b (see Problem 2), it
easily follows that the area of such a triangle is $\frac{1}{2}ab$. The area of any other polygon can
be calculated, at least in theory, by decomposing it into a suitable number of triangles.
However, a circle does not have straight sides and so may not be handled so easily. Hence
we resort to approximations.


Figure 1.1.1 A regular octagon inscribed in a unit circle

Figure 1.1.2 Decomposition of a regular octagon into eight isosceles triangles


    Let $P_n$ be a regular n-sided polygon inscribed in the unit circle centered at the origin
and let $A_n$ be the area of $P_n$. For example, Figure 1.1.1 shows $P_8$ inscribed in the unit
circle. We may decompose $P_n$ into n congruent isosceles triangles by drawing line segments
from the center of the circle to the vertices of the polygon, as shown in Figure 1.1.2 for $P_8$.
For each of these triangles, the angle with vertex at the center of the circle has measure
$\frac{360}{n}$ degrees, or $\frac{2\pi}{n}$ radians, where $\pi$ represents the ratio of the
circumference of a circle to its diameter. Hence, since the equal sides of each of the
triangles are of length one, each triangle has a height of
\[ h_n = \cos\left(\frac{\pi}{n}\right) \]
and a base of length
\[ b_n = 2 \sin\left(\frac{\pi}{n}\right) \]
(see Problem 3). Thus the area of a single triangle is given by
\[ \frac{1}{2} b_n h_n = \cos\left(\frac{\pi}{n}\right) \sin\left(\frac{\pi}{n}\right) = \frac{1}{2} \sin\left(\frac{2\pi}{n}\right), \]
where we have used the fact that
\[ \sin(2a) = 2 \sin(a) \cos(a) \]
for any angle a. Multiplying by n, we see that the area of $P_n$ is
\[ A_n = \frac{n}{2} \sin\left(\frac{2\pi}{n}\right). \]

    We now have a sequence of numbers, $A_3, A_4, A_5, \ldots$, each number in the sequence
being an approximation to the area of the circle. Moreover, although not entirely obvious,
each term in the sequence is a better approximation than its predecessor since the
corresponding regular polygon more closely approximates the circle. For example, to five
decimal places we have
\begin{align*}
A_3 &= 1.29904, \\
A_4 &= 2.00000, \\
A_5 &= 2.37764, \\
A_6 &= 2.59808, \\
A_7 &= 2.73641, \\
A_8 &= 2.82843, \\
A_9 &= 2.89254, \\
A_{10} &= 2.93893, \\
A_{11} &= 2.97352,
\end{align*}
and
\[ A_{12} = 3.00000. \]
Continuing in this manner, we find $A_{20} = 3.09017$, $A_{50} = 3.13333$, and $A_{100} = 3.13953$.
As we would expect, the sequence is increasing and appears to be approaching $\pi$. Indeed,
if we take a polygon with 1644 sides, we have $A_{1644} = 3.14159$, which is $\pi$ to five decimal
places.
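
The values quoted above can be reproduced with a few lines of Python. This is only a sketch: the name `polygon_area` is ours, and the computation relies on the library value of `math.pi`, which is circular if the goal is to define $\pi$ by these areas, but it is convenient for checking the table.

```python
import math

def polygon_area(n: int) -> float:
    """Area A_n = (n/2) * sin(2*pi/n) of a regular n-gon inscribed in the unit circle."""
    return (n / 2) * math.sin(2 * math.pi / n)

# Reproduce some of the entries listed in the text.
for n in (3, 4, 5, 6, 12, 20, 50, 100, 1644):
    print(f"A_{n} = {polygon_area(n):.5f}")
```

Printing more digits for large n shows how slowly the areas approach $\pi$ from below.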
    Alternatively, instead of defining $\pi$ to be the ratio of the circumference of a circle to
its diameter, we could define it to be the area of a circle of radius one. That is, we could
define $\pi$ to be the limiting value of the sequence $A_n$. Symbolically, we express this by
writing
\[ \pi = \lim_{n \to \infty} A_n. \]
In that case, let B be the area of a circle of radius r and let $B_n$ be the area of a regular
n-sided polygon $Q_n$ inscribed in the circle. If we decompose $Q_n$ into n isosceles triangles
in the same manner as $P_n$ above, then each triangle in this decomposition is similar to
any one of the triangles in the decomposition of $P_n$. Since the ratios of the lengths of
corresponding sides of similar triangles must all be the same, the sides of a triangle in the
decomposition of $Q_n$ must be r times the length of the corresponding sides of any triangle
in the decomposition of $P_n$. Hence each of the triangles in the decomposition of $Q_n$ must
have a base of length $r b_n$ and a height of $r h_n$, where $h_n$ is the height and $b_n$ is the length
of the base of one of the isosceles triangles in the decomposition of $P_n$. Thus the area of
one of the triangles in the decomposition of $Q_n$ into isosceles triangles will be
\[ \frac{1}{2} (r b_n)(r h_n) = \frac{1}{2} r^2 b_n h_n, \]
from which it follows that
\[ B_n = \frac{n}{2} r^2 b_n h_n = r^2 A_n. \]
Since r is a fixed constant, we would then expect that, in the limit as the number of sides
grows toward infinity,
\[ B = \lim_{n \to \infty} B_n = \lim_{n \to \infty} r^2 A_n = r^2 \lim_{n \to \infty} A_n = \pi r^2. \]
Hence we arrive at the famous formula for the area of a circle of radius r, in which the
constant $\pi$ has been defined to be the area of a circle of radius one.

Figure 1.1.3 Parabola $y = x^2$ with tangent line (blue) and a secant line (red)

Example In this example we wish to find the line tangent to the curve $y = x^2$, a
parabola, at the point (1, 1) . This problem may not at first seem as useful as that of
finding the area of a planar region, but we shall find that the ideas behind the solution
have many applications, and are, ultimately, important in the solution of the area problem
as well.
    First there is the question of exactly what is a tangent line. At the present it will be
sufficient to leave the notion at an intuitive level: a tangent line is a line which just touches
a given curve at a point, giving a close approximation between curve and line. In Chapter
3, we will see that a line $\ell$ is tangent to a curve C at a point P on C if $\ell$ passes through
P and, in a sense that we will make precise at that time, gives a better approximation to
C for points close to P than any other line.
    Now let C be the curve with equation $y = x^2$, let P = (1, 1), and let $\ell$ be the line
tangent to C at P. Since $\ell$ passes through P, in order to find the equation of $\ell$ we need
only find its slope m. Unfortunately, to find m in the standard way we need to know
two points on $\ell$, and we know only one, namely P. Hence we will again have to resort to
approximations. For example, the line through the points (1, 1) and (2, 4) is not $\ell$ (it is a
secant line, rather than a tangent line), but since it intersects C at P and at another point
which is close to P, its slope should approximate m (see Figure 1.1.3). Namely, we have
\[ m \approx \frac{4 - 1}{2 - 1} = 3. \]
Since $\left(\frac{3}{2}, \frac{9}{4}\right)$ is on C and is closer to P than (2, 4), a better approximation is given by the
slope of the line passing through (1, 1) and $\left(\frac{3}{2}, \frac{9}{4}\right)$, that is,
\[ m \approx \frac{\frac{9}{4} - 1}{\frac{3}{2} - 1} = \frac{\frac{5}{4}}{\frac{1}{2}} = \frac{5}{2}. \]

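More generally, let n be a positive integer and let $m_n$ be the slope of the line through the
points
\[ \left(1 + \frac{1}{n}, \left(1 + \frac{1}{n}\right)^2\right) \]
and P. For example, we have just seen that $m_1 = 3$ and $m_2 = \frac{5}{2}$. Now, in general,
\begin{align*}
m_n &= \frac{\left(1 + \frac{1}{n}\right)^2 - 1}{\left(1 + \frac{1}{n}\right) - 1} \\
    &= \frac{1 + \frac{2}{n} + \frac{1}{n^2} - 1}{\frac{1}{n}} \\
    &= n \left(\frac{2}{n} + \frac{1}{n^2}\right) \\
    &= 2 + \frac{1}{n}
\end{align*}
for n = 1, 2, 3, .... Hence
\begin{align*}
m_3 &= 2 + \frac{1}{3} = \frac{7}{3}, \\
m_4 &= 2 + \frac{1}{4} = \frac{9}{4}, \\
m_5 &= 2 + \frac{1}{5} = \frac{11}{5},
\end{align*}
and so on. Moreover, as n increases, $\frac{1}{n}$ decreases toward 0, and so we would expect that
as n increases, $m_n$ decreases toward 2. At the same time, as n increases $m_n$ more closely
approximates m. Thus we should have
\[ m = \lim_{n \to \infty} m_n = \lim_{n \to \infty} \left(2 + \frac{1}{n}\right) = 2. \]
That is, the slope of the line tangent to C at P is 2. Then the tangent line $\ell$ has equation
\[ y - 1 = 2(x - 1), \]
or
\[ y = 2x - 1. \]
Here we have used the fact that the equation of a line with slope m and passing through
the point (a, b) is given by
\[ y - b = m(x - a). \]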
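
The secant slopes in this example are easy to tabulate. The following Python sketch, with a function name `secant_slope` of our own choosing, computes $m_n$ directly from the two points using exact rational arithmetic, so the agreement with $2 + \frac{1}{n}$ can be checked without rounding error.

```python
from fractions import Fraction

def secant_slope(n: int) -> Fraction:
    """Slope m_n of the line through P = (1, 1) and (1 + 1/n, (1 + 1/n)**2) on y = x**2."""
    x = 1 + Fraction(1, n)
    return (x ** 2 - 1) / (x - 1)

# The slopes decrease toward 2 as n grows.
for n in (1, 2, 3, 4, 5, 10, 100, 1000):
    m = secant_slope(n)
    print(n, m, float(m))
```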

    The rest of this chapter will be concerned with the study of sequences and their limits.
The next section will consider the basic definitions and computational techniques, while
the remaining sections will discuss some applications. We will return to the problem of
finding tangent lines in Chapter 3 and the problem of computing areas in Chapter 4.

Problems

1. Use Figure 1.1.4 to verify that a parallelogram with height a and base of length b has
    area ab.


Figure 1.1.4 A parallelogram


2. Explain how any triangle is one-half of a parallelogram, and use this to verify the
   formula for the area of a triangle.

3. Use Figure 1.1.5 to verify the formulas given for the height and base of one of the
   isosceles triangles in the decomposition of $P_n$.


            Figure 1.1.5 An isosceles triangle from the decomposition of Pn

4. Try the procedure of the tangent example to find the equation of the line tangent to
   the following curves at the indicated point.


   (a) y = 2x^2 at (1, 2)
   (b) y = x^2 +     at (1, 2)
   (c) y = x^3 at (1, 1)
   (d) y = x^2 at (2, 4)


5. For the area example, find the number of sides necessary for the area of the inscribed
   polygon to approximate $\pi$ to 6, 7, 8, 9, and 10 digits after the decimal point.

6. For the tangent example, how large would n have to be in order for $|m_n - 2|$ to be less
   than 0.005?

7. For the tangent example, let p be the smallest positive integer such that $|m_p - 2| < 0.01$.
    (a) What is p?
    (b) What can you say about $|m_n - 2|$ for values of n greater than p?

 8. For each of the following sequences $\{a_n\}$, compute $a_{10}$, $a_{20}$, $a_{100}$, $a_{500}$, and $a_{1000}$.

    (a) an = n sin(-
    (b) $a_n = \left(1 + \frac{1}{n}\right)^n$
    (c) $a_n = \frac{10^n}{n!}$, where $n! = n(n - 1)(n - 2) \cdots (2)(1)$
 9. As we saw in the area example, there is more than one way to define the number $\pi$.
    For example, we can define it either as the area of a circle of unit radius or as the
    ratio of the circumference of a circle to its diameter (of course, if the latter approach is
    taken, one has to show that this ratio is the same for every circle). Suppose we define
    $\pi$ as the area of a circle of unit radius. Consider a circle with radius r, diameter d,
    circumference C, and area A. Then we have seen that $A = \pi r^2$. The following steps
    show that we also have $\pi = \frac{C}{d}$.
    (a) Let P be a regular n-sided polygon inscribed in the circle. Let s be the length of
        a side of P. By dividing P into n equal isosceles triangles as we did in the area
        example, argue that
        \[ A \approx \frac{nrs}{2}. \]
    (b) Can you see why as n goes to infinity, ns approaches C?
    (c) Now can you see why
        \[ A = \lim_{n \to \infty} \frac{nrs}{2} = \frac{rC}{2}? \]
    (d) Use the result in part (c) to show that
        \[ \pi = \frac{C}{d}. \]

10. You may find an interesting discussion of techniques for computing areas and volumes
    up to the time of Archimedes (287-212 B.C.) in the first two chapters of The Historical
    Development of Calculus by C. H. Edwards (Springer-Verlag New York Inc., 1979).
    In particular, there is a discussion on pages 31-35 of Archimedes' proof that the two
    definitions of $\pi$ mentioned in the area example yield the same number.


Difference Equations to Differential Equations

Section 1.2
Sequences


Recall that a sequence is a list of numbers, such as

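\[ 1, 2, 3, 4, \ldots, \]
\[ 2, 4, 6, 8, \ldots, \]
\[ 0, \frac{1}{2}, \frac{2}{3}, \frac{3}{4}, \ldots, \]
\[ 1, -\frac{1}{2}, \frac{1}{4}, -\frac{1}{8}, \ldots, \]
or
\[ 1, -1, 1, -1, \ldots. \]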


As we noted in Section 1.1, listing the first few terms of a sequence does not uniquely specify
the remaining terms of the sequence. To fully specify a sequence, we need a formula that
describes an arbitrary term in the sequence. For example, the first example above lists the
first four terms of the sequence $\{a_n\}$ with
\[ a_n = n \]
for n = 1, 2, 3, ...; the second example lists the first four terms of $\{b_n\}$ with
\[ b_n = 2n \]
for n = 1, 2, 3, ...; the third example lists the first four terms of $\{c_n\}$ with
\[ c_n = 1 - \frac{1}{n} \]
for n = 1, 2, 3, ...; the fourth lists the first four terms of $\{d_n\}$ with
\[ d_n = \frac{(-1)^n}{2^n} \]
for n = 0, 1, 2, 3, ...; and the fifth lists the first four terms of $\{e_n\}$ with
\[ e_n = (-1)^n \]
for n = 0, 1, 2, ....
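
To see how a formula for the general term pins down a sequence, the following Python sketch generates the first four terms of each of the five sequences from its formula and starting index. The dictionary `SEQUENCES` and the helper `first_terms` are illustrative names of ours; exact fractions are used so that the output can be compared directly with the lists above.

```python
from fractions import Fraction

# Each entry: (formula for the general term, index of the first term).
SEQUENCES = {
    "a": (lambda n: Fraction(n), 1),
    "b": (lambda n: Fraction(2 * n), 1),
    "c": (lambda n: 1 - Fraction(1, n), 1),
    "d": (lambda n: Fraction((-1) ** n, 2 ** n), 0),
    "e": (lambda n: Fraction((-1) ** n), 0),
}

def first_terms(name: str, count: int = 4) -> list:
    """Return the first `count` terms of the named sequence."""
    formula, start = SEQUENCES[name]
    return [formula(n) for n in range(start, start + count)]

for name in SEQUENCES:
    print(name, [str(t) for t in first_terms(name)])
```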




    As indicated in Section 1.1, we are often interested in the value, if one exists, which
a sequence approaches. For example, the sequences $\{a_n\}$ and $\{b_n\}$ increase beyond any
possible bound as n increases, and hence they have no limiting value. To visualize what
is happening here, you might plot the points of the sequence on the real line. For both
of these sequences, the plotted points will march off to the right without any upper limit.
Although a limit does not exist in these cases, we usually write
\[ \lim_{n \to \infty} a_n = \infty \]
and
\[ \lim_{n \to \infty} b_n = \infty \]

to express the fact that the limits do not exist because the terms in the sequence are
growing without any positive bound. On the other hand, if we plot the points of the
sequence $\{c_n\}$, as in Figure 1.2.1, we see that although they are always increasing (that
is, moving toward the right), nevertheless they never increase beyond 1. Moreover, even
though no term in the sequence is ever equal to 1, we can see that the points become
arbitrarily close to 1. Hence we say that the limit of the sequence is 1 and we write
\[ \lim_{n \to \infty} c_n = 1. \]

Figure 1.2.1 The first five values of $c_n = 1 - \frac{1}{n}$


Even though they oscillate between positive and negative values, the terms in the sequence
$\{d_n\}$ approach closer and closer to 0 as n increases. Since it is possible to make $d_n$ as close
as we like to 0 by taking n suitably large, we may write
\[ \lim_{n \to \infty} d_n = 0. \]

Finally, for the sequence {en} there are only two points to plot, alternating between 1
and -1. Since the terms of this sequence oscillate between two numbers, and so do not
approach any fixed limiting value, we say that the sequence does not have a limit.
    Another approach to visualizing the limiting behavior of a sequence $\{a_n\}$ is to plot the
ordered pairs $(n, a_n)$ in the plane for some range of values of n. For example, Figure 1.2.2
shows a plot of the points $(n, c_n)$, n = 1, 2, 3, ..., 50, for the sequence $\{c_n\}$ given above.
Note how the points approach the horizontal line y = 1, indicating, as mentioned above,
that
\[ \lim_{n \to \infty} c_n = 1. \]

Figure 1.2.2 Plot of $\left(n, 1 - \frac{1}{n}\right)$ for n = 1, 2, 3, ..., 50

Figure 1.2.3 Plot of $\left(n, \frac{(-1)^n}{2^n}\right)$ for n = 0, 1, 2, ..., 10


Similarly, Figure 1.2.3 shows a plot of the points (n, dn), n = 0, 1, 2, ... , 10; here the points
approach the horizontal axis, y = 0, consistent with our claim that

\[ \lim_{n \to \infty} d_n = 0. \]

Figure 1.2.4 shows a plot of (n, en), n = 0,1, 2,... , 20. The fact that this sequence does
not have a limit is manifest in seeing the vertical coordinate of the points oscillate between
1 and -1.
    As the concept of a limit is fundamental to the understanding of calculus, it is im-
portant that we make the notion more concrete than we have so far. That is, we need
to have a formal definition of limit which exactly captures what we have been discussing
intuitively. The idea is that we should say L is the limit of a sequence {an} if for any open
interval I containing L, no matter how small, we can find a point in the sequence beyond
which all values of the sequence lie in I. Graphically, this means that if we start plotting
the points of the sequence, there will come a time when all points from then on will lie
within the interval I. This idea is formalized in the following definition, where the open
interval I is expressed in the form $(L - \epsilon, L + \epsilon)$ and the idea that all values of the sequence
beyond a certain point are in this interval is expressed by requiring that $|a_n - L| < \epsilon$, that
is, the distance between $a_n$ and L is less than $\epsilon$, for all n > N.

Figure 1.2.4 Plot of $(n, (-1)^n)$ for n = 0, 1, 2, ..., 20

Definition We say that the limit of the sequence $\{a_n\}$ is L, written
\[ \lim_{n \to \infty} a_n = L, \]
if for every $\epsilon > 0$ there exists an integer N such that $|a_n - L| < \epsilon$ whenever n > N.

    Hence to show that the limit of a sequence is a number L, one must show that
for any positive number $\epsilon$, it is possible to find an integer N such that the numbers
$a_{N+1}, a_{N+2}, a_{N+3}, \ldots$ are all in the interval $(L - \epsilon, L + \epsilon)$. See Figure 1.2.5.

Figure 1.2.5 $a_n$ in $(L - \epsilon, L + \epsilon)$ for n > N


Example We will show that
\[ \lim_{n \to \infty} \frac{1}{n} = 0. \]
To do so, we must show that for any given $\epsilon > 0$, we can find an integer N such that
\[ \left| \frac{1}{n} - 0 \right| < \epsilon \]
whenever n > N. Now
\[ \left| \frac{1}{n} - 0 \right| = \frac{1}{n}, \]
so we need only determine the values of n for which
\[ \frac{1}{n} < \epsilon. \]
Since
\[ \frac{1}{n} < \epsilon \text{ if and only if } n > \frac{1}{\epsilon}, \]
it follows that we may take N to be the largest integer less than or equal to $\frac{1}{\epsilon}$. Then
whenever n > N, we have
\[ n > \frac{1}{\epsilon}, \]
from which it follows that
\[ \frac{1}{n} < \epsilon. \]
This is exactly what we need in order to conclude, by the definition, that
\[ \lim_{n \to \infty} \frac{1}{n} = 0. \]

    The following definition is useful in situations, such as in the previous example, when
we want the largest integer less than or equal to some given value.

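Definition For any real number x, we may define the floor function, denoted $\lfloor x \rfloor$, by
\[ \lfloor x \rfloor = \text{the largest integer less than or equal to } x, \tag{1.2.1} \]
and the ceiling function, denoted $\lceil x \rceil$, by
\[ \lceil x \rceil = \text{the smallest integer greater than or equal to } x. \tag{1.2.2} \]

    For example, $\lfloor 5.3 \rfloor = 5$, $\lceil \pi \rceil = 4$, $\lfloor 3 \rfloor = 3$, and $\lceil 3 \rceil = 3$. With this notation, we could
define N in the previous example by
\[ N = \left\lfloor \frac{1}{\epsilon} \right\rfloor. \]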
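
Python's standard `math.floor` and `math.ceil` implement these two functions, so the choice $N = \lfloor 1/\epsilon \rfloor$ can be tried out directly. In the sketch below, `cutoff` is a name of our own, and the loop only spot-checks the inequality $\frac{1}{n} < \epsilon$ for a finite range of n > N; it illustrates the definition but is not a proof.

```python
import math

def cutoff(eps: float) -> int:
    """N = floor(1/eps), the choice made in the example for the sequence 1/n."""
    return math.floor(1 / eps)

for eps in (0.3, 0.07, 0.0013):
    N = cutoff(eps)
    # Spot-check |1/n - 0| < eps for the next thousand values of n beyond N.
    ok = all(1 / n < eps for n in range(N + 1, N + 1001))
    print(eps, N, ok)

# The four values quoted in the text.
print(math.floor(5.3), math.ceil(math.pi), math.floor(3), math.ceil(3))
```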


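Example We will show that
\[ \lim_{n \to \infty} \frac{1}{2^n} = 0. \]
This time we must show that for any $\epsilon > 0$, we can find an integer N such that
\[ \left| \frac{1}{2^n} - 0 \right| < \epsilon \]
whenever n > N. Now
\[ \left| \frac{1}{2^n} - 0 \right| = \frac{1}{2^n} = \left(\frac{1}{2}\right)^n, \]
so we need to determine the values of n for which
\[ \left(\frac{1}{2}\right)^n < \epsilon. \]
We need to solve this inequality for n. Since n is in the exponent, we may use logarithms
to simplify the inequality. Although we will not provide a careful treatment of logarithms
until Chapter 6, we will assume for the moment some acquaintance with logarithms using
base 10. Now
\[ \left(\frac{1}{2}\right)^n < \epsilon \]
if and only if
\[ \log_{10}\left(\left(\frac{1}{2}\right)^n\right) < \log_{10}(\epsilon). \]
Since
\[ \log_{10}\left(\left(\frac{1}{2}\right)^n\right) = n \log_{10}\left(\frac{1}{2}\right), \]
we have
\[ \left(\frac{1}{2}\right)^n < \epsilon \]
if and only if
\[ n \log_{10}\left(\frac{1}{2}\right) < \log_{10}(\epsilon). \]
Now $\log_{10}\left(\frac{1}{2}\right) < 0$, so
\[ n \log_{10}\left(\frac{1}{2}\right) < \log_{10}(\epsilon) \]
if and only if
\[ n > \frac{\log_{10}(\epsilon)}{\log_{10}\left(\frac{1}{2}\right)}. \]
Thus if we let
\[ N = \left\lfloor \frac{\log_{10}(\epsilon)}{\log_{10}\left(\frac{1}{2}\right)} \right\rfloor, \]
then
\[ \left| \frac{1}{2^n} - 0 \right| < \epsilon \]
whenever n > N. For example, if we take $\epsilon = 0.001$, then, to two decimal places,
\[ \frac{\log_{10}(\epsilon)}{\log_{10}\left(\frac{1}{2}\right)} = 9.97, \]
and so we would have
\[ N = \lfloor 9.97 \rfloor = 9. \]
This N works because, for n > 9,
\[ \left| \frac{1}{2^n} - 0 \right| = \frac{1}{2^n} \le \frac{1}{2^{10}} = \frac{1}{1024} < 0.001. \]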

    Problem 12 at the end of this section will ask you to generalize the previous example
to show that
\[ \lim_{n \to \infty} r^n = 0 \]
whenever $|r| < 1$. This is an important fact that we will make use of later.
    In this course we will be concerned more with the development of an intuitive un-
derstanding of limits and a computational facility with limits than with the formalism of
verifying a specific limit using the above definition. That is not to say that the definition
is unimportant; rather a good grasp of the concept in the definition is important for a full
understanding of much of what we will do in calculus. In fact, mathematicians of the 19th
century arrived at the definition we have stated in their attempts to clarify confusions that
had developed in mathematics since the time of Newton and Leibniz. However, for the
most part these difficulties are beyond the scope of a text such as this one.
    We will see that a few basic properties of limits, combined with a few simple limits like
the ones in the previous two examples, will enable us to compute easily a large number
of limits. To begin considering these properties, consider the case where we already know
that
\[ \lim_{n \to \infty} a_n = L \tag{1.2.3} \]
and we want to compute
\[ \lim_{n \to \infty} k a_n \]
for some constant $k \neq 0$. Now (1.2.3) tells us that for any $\epsilon > 0$, we may find an integer
N such that for n > N,
\[ |a_n - L| < \frac{\epsilon}{|k|}. \]
It follows that for n > N,
\[ |k a_n - kL| = |k| \, |a_n - L| < |k| \frac{\epsilon}{|k|} = \epsilon. \]
But this is what it means to say that
\[ \lim_{n \to \infty} k a_n = kL. \tag{1.2.4} \]
Note that (1.2.4) is obviously true as well when k = 0. Hence we have the following
proposition.

Proposition If $\{a_n\}$ is a sequence for which
\[ \lim_{n \to \infty} a_n = L, \]
then for any constant k we have
\[ \lim_{n \to \infty} k a_n = k \lim_{n \to \infty} a_n = kL. \tag{1.2.5} \]


Example Since we have already seen that
\[ \lim_{n \to \infty} \frac{1}{n} = 0, \]
it follows that
\[ \lim_{n \to \infty} \frac{350}{n} = 350 \lim_{n \to \infty} \frac{1}{n} = (350)(0) = 0. \]


Now suppose we have two sequences $\{a_n\}$ and $\{b_n\}$ with
\[ \lim_{n \to \infty} a_n = L \tag{1.2.6} \]
and
\[ \lim_{n \to \infty} b_n = M. \tag{1.2.7} \]
Then (1.2.6) and (1.2.7) tell us that for any $\epsilon > 0$, we can find integers $N_1$ and $N_2$ such
that
\[ |a_n - L| < \frac{\epsilon}{2} \]
whenever $n > N_1$ and
\[ |b_n - M| < \frac{\epsilon}{2} \]
whenever $n > N_2$. If we let N be the larger of $N_1$ and $N_2$, then whenever n > N we will
have
\begin{align*}
|(a_n + b_n) - (L + M)| &= |(a_n - L) + (b_n - M)| \\
                        &\le |a_n - L| + |b_n - M| \tag{1.2.8} \\
                        &< \frac{\epsilon}{2} + \frac{\epsilon}{2} = \epsilon.
\end{align*}
Note that in (1.2.8) we have used the fact, known as the triangle inequality, that for any
real numbers x and y,
\[ |x + y| \le |x| + |y|. \tag{1.2.9} \]
Thus we have shown
\[ \lim_{n \to \infty} (a_n + b_n) = L + M. \tag{1.2.10} \]
Hence we have the following proposition.

Proposition If $\{a_n\}$ and $\{b_n\}$ are sequences with
\[ \lim_{n \to \infty} a_n = L \]
and
\[ \lim_{n \to \infty} b_n = M, \]
then
\[ \lim_{n \to \infty} (a_n + b_n) = \lim_{n \to \infty} a_n + \lim_{n \to \infty} b_n = L + M. \tag{1.2.11} \]


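Example We have
\[ \lim_{n \to \infty} \left(4 + \frac{8}{n}\right) = \lim_{n \to \infty} 4 + \lim_{n \to \infty} \frac{8}{n} = 4 + 8 \lim_{n \to \infty} \frac{1}{n} = 4 + (8)(0) = 4. \]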
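
A quick numerical look at this example supports the value 4 obtained from the sum and constant-multiple rules. In the Python sketch below, `sum_term` is a name of our own for the nth term $4 + \frac{8}{n}$; the printed gap from 4 is just $\frac{8}{n}$ up to floating-point rounding, and it shrinks as n grows.

```python
def sum_term(n: int) -> float:
    """Value of 4 + 8/n, the nth term of the sequence in the example."""
    return 4 + 8 / n

# Print each term together with its distance from the claimed limit 4.
for n in (1, 10, 100, 10_000, 1_000_000):
    value = sum_term(n)
    print(n, value, abs(value - 4))
```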


    Note that in the last example we used the fact that if k is a constant and $a_n = k$ for
all n, then
\[ \lim_{n \to \infty} a_n = k. \]
This follows immediately from the definition since
\[ |a_n - k| = 0 \]
for all values of n, and so any integer N will work for any value of $\epsilon$.
    Again suppose we have two sequences $\{a_n\}$ and $\{b_n\}$ with
\[ \lim_{n \to \infty} a_n = L \]
and
\[ \lim_{n \to \infty} b_n = M. \]
Then we have
\[ \lim_{n \to \infty} (a_n - b_n) = \lim_{n \to \infty} a_n + \lim_{n \to \infty} (-b_n) = \lim_{n \to \infty} a_n + (-1) \lim_{n \to \infty} b_n = L - M. \tag{1.2.12} \]


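Proposition If $\{a_n\}$ and $\{b_n\}$ are sequences with

$$\lim_{n \to \infty} a_n = L$$

and

$$\lim_{n \to \infty} b_n = M,$$

then

$$\lim_{n \to \infty} (a_n - b_n) = \lim_{n \to \infty} a_n - \lim_{n \to \infty} b_n = L - M. \tag{1.2.13}$$

Example We have

$$\lim_{n \to \infty} \left(\frac{3}{n} - \frac{8}{5^n}\right) = 3 \lim_{n \to \infty} \frac{1}{n} - 8 \lim_{n \to \infty} \left(\frac{1}{5}\right)^n = (3)(0) - (8)(0) = 0.$$

Note that we have used the result that

$$\lim_{n \to \infty} r^n = 0$$

whenever $|r| < 1$.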

    We will state three more properties of limits without justifications. Although the
reasoning behind these results is similar to the reasoning of the previous three propositions,
they require a little more care and are best left to a more advanced course.
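Proposition If $\{a_n\}$ and $\{b_n\}$ are sequences with

$$\lim_{n \to \infty} a_n = L$$

and

$$\lim_{n \to \infty} b_n = M,$$

then

$$\lim_{n \to \infty} a_n b_n = \left(\lim_{n \to \infty} a_n\right)\left(\lim_{n \to \infty} b_n\right) = LM. \tag{1.2.14}$$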


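Example We have

$$\lim_{n \to \infty} \frac{1}{n^2} = \left(\lim_{n \to \infty} \frac{1}{n}\right)\left(\lim_{n \to \infty} \frac{1}{n}\right) = (0)(0) = 0.$$

Proposition If $\{a_n\}$ and $\{b_n\}$ are sequences with

$$\lim_{n \to \infty} a_n = L$$

and

$$\lim_{n \to \infty} b_n = M,$$

then

$$\lim_{n \to \infty} \frac{a_n}{b_n} = \frac{\lim_{n \to \infty} a_n}{\lim_{n \to \infty} b_n} = \frac{L}{M}, \tag{1.2.15}$$

provided $M \ne 0$ and $b_n \ne 0$ for all $n$.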
Example We have

$$\lim_{n \to \infty} \frac{n - 3}{2n + 4} = \lim_{n \to \infty} \frac{\frac{n - 3}{n}}{\frac{2n + 4}{n}} = \lim_{n \to \infty} \frac{1 - \frac{3}{n}}{2 + \frac{4}{n}} = \frac{\lim_{n \to \infty} \left(1 - \frac{3}{n}\right)}{\lim_{n \to \infty} \left(2 + \frac{4}{n}\right)} = \frac{1}{2}.$$

Note that we can apply the previous proposition only when both numerator and denominator
have a limit. Hence, in this example, we first divided the numerator and denominator
by n to put the problem in a form to which we could apply the proposition.

Proposition Suppose $\{a_n\}$ is a sequence with

$$\lim_{n \to \infty} a_n = L.$$

Moreover, suppose $p$ is a rational number, $a_n^p$ is defined for all $n$, and $L^p$ is defined. Then

$$\lim_{n \to \infty} a_n^p = \left(\lim_{n \to \infty} a_n\right)^p = L^p. \tag{1.2.16}$$


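Example We have

$$\lim_{n \to \infty} \sqrt{4 - \frac{3}{n}} = \sqrt{\lim_{n \to \infty} \left(4 - \frac{3}{n}\right)} = \sqrt{4} = 2.$$

Example For any rational number $p > 0$, we have

$$\lim_{n \to \infty} \frac{1}{n^p} = \left(\lim_{n \to \infty} \frac{1}{n}\right)^p = 0^p = 0.$$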


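Example We have

$$\lim_{n \to \infty} \left(18 - \frac{5}{n} + \frac{23}{n^5}\right) = \lim_{n \to \infty} 18 - 5 \lim_{n \to \infty} \frac{1}{n} + 23 \lim_{n \to \infty} \frac{1}{n^5} = 18 - (5)(0) + (23)(0) = 18.$$

Example We have

$$\lim_{n \to \infty} \frac{4n^5 + 5n^2 - 6}{3n^5 + 4n - 18} = \lim_{n \to \infty} \frac{4 + \frac{5}{n^3} - \frac{6}{n^5}}{3 + \frac{4}{n^4} - \frac{18}{n^5}} = \frac{\lim_{n \to \infty} \left(4 + \frac{5}{n^3} - \frac{6}{n^5}\right)}{\lim_{n \to \infty} \left(3 + \frac{4}{n^4} - \frac{18}{n^5}\right)} = \frac{4}{3}.$$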
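
To see the limit 4/3 emerge numerically, one can evaluate the terms of the last example for increasing n. This is only an illustrative sketch; the function name `term` and the sample values of n are choices made here, and exact fractions are used so the reported gaps are not affected by rounding.

```python
from fractions import Fraction

def term(n):
    """n-th term of (4n^5 + 5n^2 - 6) / (3n^5 + 4n - 18) as an exact fraction."""
    return Fraction(4 * n**5 + 5 * n**2 - 6, 3 * n**5 + 4 * n - 18)

limit = Fraction(4, 3)
samples = (10, 100, 1000, 10000)

# Distance from each sampled term to the claimed limit 4/3.
gaps = [abs(term(n) - limit) for n in samples]
for n, gap in zip(samples, gaps):
    print(n, float(term(n)), float(gap))
```

The gaps shrink roughly like a constant times 1/n^3, which matches the size of the leading correction term 5/n^3 in the numerator after dividing through by n^5.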


   In general, for sequences of the form of the previous example it is useful to divide both
numerator and denominator by the highest power of n which occurs in the denominator.

Example As another illustration of the idea in the previous example, we have

$$\lim_{n \to \infty} \frac{3n^2 + 2n - 1}{2n^3 - 16n} = \lim_{n \to \infty} \frac{\frac{3}{n} + \frac{2}{n^2} - \frac{1}{n^3}}{2 - \frac{16}{n^2}} = \frac{0}{2} = 0.$$


Definition If $\lim_{n \to \infty} a_n$ exists, we say the sequence $\{a_n\}$ converges. If the sequence $\{a_n\}$
does not have a limit, we say the sequence diverges.

    An important class of divergent sequences consists of those for which a limit does not exist
either because the terms grow without an upper bound or because they decrease without
any lower bound, as made precise in the following definition.

Definition A sequence $\{a_n\}$ is said to diverge to infinity if for any real number $M$ there
exists an integer $N$ such that $a_n > M$ whenever $n > N$, in which case we write

$$\lim_{n \to \infty} a_n = \infty.$$

A sequence $\{a_n\}$ is said to diverge to negative infinity if for any real number $M$ there exists
an integer $N$ such that $a_n < M$ whenever $n > N$, in which case we write

$$\lim_{n \to \infty} a_n = -\infty.$$


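Example Clearly

$$\lim_{n \to \infty} n^p = \infty$$

for any value of $p > 0$. Indeed, given any $M > 0$ (the case $M \le 0$ is immediate), we need only take

$$N = \left\lfloor \sqrt[p]{M} \right\rfloor$$

to guarantee that $n^p > M$ whenever $n > N$.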

Example We have

$$\lim_{n \to \infty} 2^n = \infty$$

since, given any $M$, $2^n > M$ for all $n$ if $M \le 0$ and $2^n > M$ provided

$$n > \frac{\log_{10}(M)}{\log_{10}(2)}$$

if $M > 0$.
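
The threshold in this example can be turned into a small computation. The sketch below is an illustration only: the function name `threshold` is hypothetical, and the extra loop is a guard added here because the ratio of floating-point logarithms can round slightly below the true value when M is an exact power of 2.

```python
import math

def threshold(M):
    """Return an integer N >= 0 such that 2**n > M for every integer n > N."""
    if M <= 0:
        return 0                      # 2**n is positive, so any N works
    N = max(0, math.floor(math.log10(M) / math.log10(2)))
    # Guard against floating-point rounding in the logarithm ratio.
    while 2 ** (N + 1) <= M:
        N += 1
    return N

for M in (-5, 0.5, 1000, 1024, 10**6):
    N = threshold(M)
    print(M, N, 2 ** (N + 1) > M)
```

Since 2^n is increasing in n, checking the single value n = N + 1 is enough to confirm the bound for all larger n.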

    Suppose the sequence $\{a_n\}$ diverges and $k \ne 0$ is a constant. Then the sequence $\{ka_n\}$
must also diverge since if $\{ka_n\}$ converged, then the sequence with nth term

$$\frac{1}{k}(ka_n) = a_n$$

would also converge, contradicting our assumption that $\{a_n\}$ diverges.

Proposition If the sequence $\{a_n\}$ diverges and $k \ne 0$ is a constant, then the sequence
$\{ka_n\}$ also diverges.

    If the sequence {an} diverges and the sequence {bn} converges, then the sequence
{an + bn} also diverges since, if it converged, then the sequence with nth term

                                  (an + bn) - bn = an

would also converge, contradicting our assumption that {an} diverges. Similarly, the
sequence {an - bn} diverges.
Proposition If the sequence {an} diverges and the sequence {bn} converges, then the
sequences {an + bn} and {an - bn} both diverge.
    Suppose the sequence $\{a_n\}$ diverges, the sequence $\{b_n\}$ converges, and

$$\lim_{n \to \infty} b_n \ne 0. \tag{1.2.17}$$

Now (1.2.17) implies that we can find an integer $N$ such that $b_n \ne 0$ for all $n > N$. So if
the sequence $\{a_n b_n\}$ converged, then the sequence with nth term, for $n > N$,

$$\frac{1}{b_n}(a_n b_n) = a_n$$

would also converge, contradicting our assumption that $\{a_n\}$ diverges. Hence $\{a_n b_n\}$ must
diverge.
Proposition If the sequence $\{a_n\}$ diverges, the sequence $\{b_n\}$ converges, and

$$\lim_{n \to \infty} b_n \ne 0,$$

then the sequence $\{a_n b_n\}$ diverges.
    Finally, if the sequence $\{a_n\}$ diverges, the sequence $\{b_n\}$ converges, and $b_n \ne 0$ for all
$n$, then the sequence

$$\left\{\frac{a_n}{b_n}\right\}$$

diverges since, if it converged, the sequence with nth term

$$b_n \left(\frac{a_n}{b_n}\right) = a_n$$

would also converge, contradicting our assumption that $\{a_n\}$ diverges.

Proposition If the sequence $\{a_n\}$ diverges, the sequence $\{b_n\}$ converges, and $b_n \ne 0$ for
all $n$, then the sequence

$$\left\{\frac{a_n}{b_n}\right\}$$

diverges.


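Example Consider

$$\lim_{n \to \infty} \frac{4n^3 + n - 2}{5n^2 - 7n} = \lim_{n \to \infty} \frac{4n + \frac{1}{n} - \frac{2}{n^2}}{5 - \frac{7}{n}}. \tag{1.2.18}$$

Now

$$\lim_{n \to \infty} 4n = \infty$$

and

$$\lim_{n \to \infty} \left(\frac{1}{n} - \frac{2}{n^2}\right) = 0,$$

so

$$\lim_{n \to \infty} \left(4n + \frac{1}{n} - \frac{2}{n^2}\right) = \infty.$$

Moreover,

$$\lim_{n \to \infty} \left(5 - \frac{7}{n}\right) = 5.$$

Thus the numerator in (1.2.18) diverges while the denominator converges. Hence the ratio
diverges. In fact, it should be clear that

$$\lim_{n \to \infty} \frac{4n^3 + n - 2}{5n^2 - 7n} = \lim_{n \to \infty} \frac{4n + \frac{1}{n} - \frac{2}{n^2}}{5 - \frac{7}{n}} = \infty.$$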


    Note that in the previous example it was once again useful to divide numerator and
denominator by the highest power of n in the denominator.

Example We have

$$\lim_{n \to \infty} \frac{15 - 26n^5}{13 + n^2} = \lim_{n \to \infty} \frac{\frac{15}{n^2} - 26n^3}{\frac{13}{n^2} + 1} = -\infty.$$

Example The absolute values of the terms of the sequence $\{(-2)^n\}$ grow without bound,
and so the sequence diverges. However, since the terms alternate in sign, the sequence
diverges neither to $\infty$ nor to $-\infty$.

Monotone sequences
It is sometimes possible to determine that a given sequence converges without explicitly
computing the limit. One important case involves monotone sequences.

Definition We say a sequence $\{a_n\}$ is monotone increasing if $a_n \le a_{n+1}$ for all $n$. We
say a sequence $\{a_n\}$ is monotone decreasing if $a_n \ge a_{n+1}$ for all $n$. We say a sequence is
monotone if it is either monotone increasing or monotone decreasing.

    Now suppose $\{a_n\}$ is a monotone increasing sequence. For such a sequence there either
exists a number $P$ such that $a_n \le P$ for all $n$ or there does not exist such a $P$. In the latter
case, given any real number $M$, it is then possible to find an integer $N$ such that $a_N > M$.
Since the sequence is monotone, it follows that $a_n > M$ for all $n > N$, and so the sequence
diverges to infinity. On the other hand, if there does exist a number $P$ such that $a_n \le P$
for all $n$, then there in fact exists a number $B$ such that $a_n \le B$ for all $n$ and $B \le P$
for any number $P$ with the property that $a_n \le P$ for all $n$. The existence of $B$, known
as the least upper bound of the sequence $\{a_n\}$, is not at all obvious; indeed, the subtle
properties of the real numbers that imply the existence of $B$ were not fully understood
until the middle part of the 19th century. However, given the existence of $B$, it is easy
to see that given any $\epsilon > 0$, there exists an integer $N$ for which $a_N > B - \epsilon$ (if not, then
$B - \epsilon$ would be an upper bound for the sequence smaller than $B$). Since the sequence is
monotone increasing and $a_n \le B$ for all $n$, it follows that

$$|a_n - B| < \epsilon$$

for all $n > N$. That is, we have shown that the sequence converges and

$$\lim_{n \to \infty} a_n = B.$$

Similar results hold for sequences which are monotone decreasing.

Monotone sequence theorem Suppose the sequence $\{a_n\}$ is monotone. If the sequence
is monotone increasing and there exists a number $P$ such that $a_n \le P$ for all $n$,
then the sequence converges. If the sequence is monotone increasing and no such number
$P$ exists, then

$$\lim_{n \to \infty} a_n = \infty.$$

If the sequence is monotone decreasing and there exists a number $Q$ such that $a_n \ge Q$ for
all $n$, then the sequence converges. If the sequence is monotone decreasing and no such
number $Q$ exists, then

$$\lim_{n \to \infty} a_n = -\infty.$$


Example As we shall see in Sections 1.4 and 1.5, we often work with sequences without
having an explicit formula for each term in the sequence. For example, suppose all we
know about the sequence $\{a_n\}$ is that $a_1 = 4$ and

$$a_{n+1} = \frac{1}{2} a_n$$

for $n = 1, 2, 3, \ldots$. That is, the first term in the sequence is 4 and then each successive
term is one-half of its predecessor. Thus

$$a_1 = 4, \quad a_2 = 2, \quad a_3 = 1, \quad a_4 = \frac{1}{2},$$

and so on. Hence $\{a_n\}$ is monotone decreasing. Moreover, every term in the sequence is
positive, so $a_n \ge 0$ for all $n$. Thus, by the Monotone Sequence Theorem, $\{a_n\}$ converges.
Moreover, note that

$$a_{n+1} = \frac{1}{2} a_n$$

implies that

$$\lim_{n \to \infty} a_{n+1} = \frac{1}{2} \lim_{n \to \infty} a_n. \tag{1.2.19}$$

If we let

$$L = \lim_{n \to \infty} a_n = \lim_{n \to \infty} a_{n+1},$$

then (1.2.19) becomes

$$L = \frac{1}{2} L.$$

Hence $L = 0$. That is,

$$\lim_{n \to \infty} a_n = 0.$$
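
The recursively defined sequence in this example is easy to generate directly from its rule. The sketch below is illustrative only; the helper name `halving_sequence` and the choice of 30 terms are assumptions made here. It checks, for the terms generated, the two hypotheses of the Monotone Sequence Theorem used above.

```python
from fractions import Fraction

def halving_sequence(first, count):
    """First `count` terms of a_1 = first, a_(n+1) = a_n / 2, as exact fractions."""
    terms = [Fraction(first)]
    for _ in range(count - 1):
        terms.append(terms[-1] / 2)
    return terms

terms = halving_sequence(4, 30)

# Monotone decreasing: each term is at least as large as its successor.
decreasing = all(x >= y for x, y in zip(terms, terms[1:]))
# Bounded below by 0.
bounded_below = all(t >= 0 for t in terms)

print([str(t) for t in terms[:4]], decreasing, bounded_below, float(terms[-1]))
```

A finite list of terms cannot prove convergence, but it does show the terms shrinking toward the limit 0 found from (1.2.19).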


Problems


1. For each of the following, find a general expression for the nth term of a sequence
   which would yield these values as the first four terms.


   (a)    1 1                                        1 1 1
      (a '3' 9'27, .                           b    '2', 3'4'
         3 5 7                                       1 1    1 1
   (c) 1,(d)                                        3'5' 7'9'

2. For each of the following, decide whether the given sequence converges or diverges. If
   the sequence converges, find its limit.


          1
(a) an = 3, n = 0, 1, 2, . . .
          3n2
          3n - 1
(c) b  =        , n = 1, 2, 3,.. ..
         2n+ 6
         3n4 - 6n3 + 1
(e) an =     3           n = 1, 2, 3,.. .
          5n + n2 + 2


(b) an = 7r", n = 0, 1, 2, .. .

(d) c1 = cos(7rn), n = 0, 1, 2, .. .

          2n5 - 3n2 + 23
(f) b1 = =4               , n = 1, 2, 3, .. .
         7n5 + 13n4 - 12


﻿




              45 - 16n2
   (g) cn =              , n = 1, 2, 3, . . .
            13 +5n +6na

   (i) an = (-2)2     n = 1,2, 3, ...

              3n2 + n -6
   (k) an =       2       , n = 1, 2, 3, .. .
               53  + 16
          3/4t + 1
(h) bn = - n2+1 ,n-=1, 2, 3,.. .

         10 - 16n3
 (j) an =       2 , n = 1, 2, 3, . . .

        _(-1)2
 (1) bn =- ____h
    (1) bo =, n   =  0, 1, 2, .. .
           5n

3. Explain why

$$-\frac{1}{n} \le \frac{\sin(n)}{n} \le \frac{1}{n}$$

   for n = 1, 2, 3, .... What can you conclude about $\lim_{n \to \infty} \frac{\sin(n)}{n}$?
4. Let $a_n = \left(1 + \frac{1}{n}\right)^n$, n = 1, 2, 3, ....
   (a) Compute a1, a2, a3, a4, and a5 using a calculator.
   (b) Compute values of an for n = 1, 2, 3, ... , 200.
   (c) Plot the points (n, an) for n = 1, 2, 3, ... , 200, along with the horizontal line y = e.
   (d) Does it seem reasonable that $\lim_{n \to \infty} a_n = e$?
   (e) What is the smallest value of n for which $a_n > 2.7$?
   (f) What is the first value of n for which $|a_n - e| < 0.01$? Recall that e = 2.71828 to
       five decimal places.


5. Let $a_n = n \sin\left(\frac{1}{n}\right)$, n = 1, 2, 3, ....
   (a) Compute a1, a2, a3, a4, and a5 using a calculator.
   (b) Compute values of an for n = 1, 2, 3, ... , 200.
   (c) Plot the points (n, an) for n = 1, 2, 3, ... , 200, along with the horizontal line y = 1.
   (d) Does it seem reasonable that $\lim_{n \to \infty} a_n = 1$?
   (e) What is the smallest value of n for which $a_n > 0.999$?
   (f) What is the first value of n for which $|a_n - 1| < 0.0001$?

6. Let $a_n = 1.01^n$ and $b_n = 0.99^n$ for n = 0, 1, 2, .... On the same graph, plot the points
   (n, an) and (n, bn) for n = 0, 1, 2, ... , 200. How do these two plots compare? Do the
   sequences converge?
7. Let $a_n = \frac{10^n}{n!}$ for n = 1, 2, 3, ....
   (a) Plot the points (n, an) for n = 1, 2, 3, ... , 100.
   (b) From the picture in part (a), can you guess $\lim_{n \to \infty} a_n$?
   (c) What is the maximum value of $a_n$ for n = 1, 2, 3, ... , 100?
   (d) Can you see why

       $$\lim_{n \to \infty} \frac{k^n}{n!} = 0$$

       for any constant k?

 8. Consider the sequence $\{a_n\}$ with $a_1 = 10$ and

    $$a_{n+1} = \frac{1}{3} a_n$$

    for n = 1, 2, 3, .... Plot the points (n, an) for n = 1, 2, 3, ... , 50. Do you think this
    sequence has a limit? Can you verify this?

 9. Consider the sequence $\{a_n\}$ with $a_1 = 2$ and

    $$a_{n+1} = 2a_n$$

    for n = 1, 2, 3, .... Plot the points (n, an) for n = 1, 2, 3, ... , 50. Can you find the
    limit of this sequence using the same method you used in Problem 8? Does this
    sequence have a limit?

10. Consider the sequence {an} with a1= 0.9 and

                                   an+1 = 2an(1 - an)

    for n = 1, 2, 3,.... Plot the points (n, an) for n = 1, 2, 3,... ,100. Do you think this
    sequence has a limit? If so, can you find it?

11. In each of the following, for an arbitrary $\epsilon > 0$, find the smallest integer N for which
    $|a_n - L| < \epsilon$ whenever n > N. Verify that your value for N works in the particular
    case $\epsilon = 0.001$.
                 1
    (a) an=1 - -, L =1                        (b) an =0.98n, L= 0
                 n
              1                                         3n3 -1
    (c) a     - , L =0                        (d) an =      3  , L =3
             n                                            n3
12. Show that for any $-1 < r < 1$, $\lim_{n \to \infty} r^n = 0$.
13. Find sequences {an} and {bn} such that {an} and {bn} both diverge, but {an + bn}
    converges.

14. Find sequences {an} and {bn} such that {an} diverges, {bn} converges, and {anbn}
    converges.


﻿


Section 1.3


       Differential Equations         The Sum     of a Sequence


This section considers the problem of adding together the terms of a sequence. Of course,
this is a problem only if more than a finite number of terms of the sequence are nonzero.
In this case, we must decide what it means to add together an infinite number of nonzero
numbers. The first example shows how a relatively simple question may lead to such
infinite summations.
Example Suppose a game is played in which a fair coin is tossed until the first time
a head appears. What is the probability that a head appears for the first time on an
even-numbered toss? To solve this problem, we first need to determine the probability of
obtaining a head for the first time on any given even numbered toss, and then we need
to add all these probabilities together. Let Pn denote the probability that the first head
appears on the nth toss, $n = 1, 2, 3, \ldots$. Then, since the coin is assumed to be fair,

$$P_1 = \frac{1}{2}.$$

Now in order to get a head for the first time on the second toss, we must toss a tail on the
first toss and then follow that with a head on the second toss. Since one-half of all first
tosses will be tails and then one-half of those tosses will be followed by a second toss of
heads, we should have

$$P_2 = \left(\frac{1}{2}\right)\left(\frac{1}{2}\right) = \frac{1}{4}.$$

Similarly, since one-fourth of all sequences of coin tosses will begin with two tails and then
half of these sequences will have a head for the third toss, we have

$$P_3 = \left(\frac{1}{4}\right)\left(\frac{1}{2}\right) = \frac{1}{8}.$$

Continuing in this fashion, it should seem reasonable that, for any $n = 1, 2, 3, \ldots$,

$$P_n = \frac{1}{2^n}.$$

Hence we have a sequence of probabilities $\{P_n\}$ for $n = 1, 2, 3, \ldots$, and, in order to find the
desired probability, we need to add up the even-numbered terms in this sequence. Namely,
the probability that a head appears for the first time on an even toss is given by

$$P_2 + P_4 + P_6 + \cdots = \frac{1}{4} + \frac{1}{16} + \frac{1}{64} + \cdots. \tag{1.3.1}$$

Copyright © by Dan Sloughter 2000

But this involves adding together an infinite number of nonzero values. Is this possible?
Can we perform the operation of addition an infinite number of times? In this case the
answer is yes, but we will need a few preliminaries before we can finish this particular
example.

    We begin with a definition of the sum of a sequence $\{a_n\}$. The idea is to create a
new sequence by successively adding together the terms of the original sequence. That is,
we define a new sequence $\{s_n\}$ where $s_n$ is the sum of the first $n$ terms of the original
sequence. If

$$\lim_{n \to \infty} s_n$$

exists, then this indicates that, as we add together more and more terms of $\{a_n\}$, the
resulting sums approach a limiting value. It is then reasonable to call this limiting value
the sum of the sequence. For example, if

$$a_n = \frac{1}{2^n}$$

for $n = 1, 2, 3, \ldots$, then we would have

$$\begin{aligned}
s_1 &= \frac{1}{2}, \\
s_2 &= \frac{1}{2} + \frac{1}{4} = \frac{3}{4}, \\
s_3 &= \frac{1}{2} + \frac{1}{4} + \frac{1}{8} = \frac{7}{8}, \\
s_4 &= \frac{1}{2} + \frac{1}{4} + \frac{1}{8} + \frac{1}{16} = \frac{15}{16},
\end{aligned}$$

and so on. If you plot these points on the real line, you may think of starting at $\frac{1}{2}$, moving
$\frac{1}{2}$ the distance to 1 to plot the next point, then $\frac{1}{2}$ the remaining distance to 1 to plot the
next point, and so on. After $n$ points, you would be at

$$s_n = 1 - \frac{1}{2^n}. \tag{1.3.2}$$

Clearly,

$$\lim_{n \to \infty} s_n = \lim_{n \to \infty} \left(1 - \frac{1}{2^n}\right) = 1,$$

and it would be reasonable to say that the sequence adds up to 1. That is,

$$\frac{1}{2} + \frac{1}{4} + \frac{1}{8} + \frac{1}{16} + \cdots + \frac{1}{2^n} + \cdots = 1. \tag{1.3.3}$$
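
The closed form (1.3.2) for the partial sums can be checked term by term. The following is a small sketch under the assumption that twenty terms are enough to illustrate the pattern; the variable names are chosen here for illustration.

```python
from fractions import Fraction
from itertools import accumulate

# Terms a_n = 1/2^n for n = 1, ..., 20, kept as exact fractions.
terms = [Fraction(1, 2**n) for n in range(1, 21)]

# Running totals s_1, s_2, ..., s_20.
partial_sums = list(accumulate(terms))

# Compare each partial sum with 1 - 1/2^n.
matches = all(s == 1 - Fraction(1, 2**n)
              for n, s in enumerate(partial_sums, start=1))

print([str(s) for s in partial_sums[:4]], matches)
```

The first four printed values reproduce the partial sums listed above, and the remaining gap to 1 halves with every additional term.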


This idea is formalized in the following definition.

Definition Given a sequence $\{a_n\}$, $n = 1, 2, 3, \ldots$, we define a new sequence $\{s_n\}$ by
letting

$$s_n = a_1 + a_2 + \cdots + a_n \tag{1.3.4}$$

for $n = 1, 2, 3, \ldots$. If the sequence $\{s_n\}$ converges, then we call

$$s = \lim_{n \to \infty} s_n$$

the sum of the sequence $\{a_n\}$. The sequence $\{s_n\}$ is called an infinite series and an
individual term $s_n$ of this sequence is called a partial sum of the sequence $\{a_n\}$.

    Note that we have assumed that the first term in the sequence $\{a_n\}$ in the definition
is $a_1$. The sequence could just as well start with any other integer index, in which case the
sequence of partial sums $\{s_n\}$ would start with the same index. For example, if the first
term of the sequence is $a_0$, then the first partial sum is $s_0$.
    Since summations involving an infinite number, or even a large finite number, of terms
are cumbersome to write using the standard plus sign of addition, $\Sigma$ (the capital Greek
sigma) is used to denote the process of summation. In particular, we would write

$$s_n = \sum_{j=1}^{n} a_j = a_1 + a_2 + \cdots + a_n \tag{1.3.5}$$

and

$$s = \lim_{n \to \infty} s_n = \lim_{n \to \infty} \sum_{j=1}^{n} a_j = \lim_{n \to \infty} (a_1 + a_2 + \cdots + a_n). \tag{1.3.6}$$

Since (1.3.6) is what we mean by an infinite sum, we will in fact write

$$s = \sum_{j=1}^{\infty} a_j = a_1 + a_2 + \cdots + a_n + \cdots. \tag{1.3.7}$$

For example, in this notation, we may restate our earlier results as

$$\sum_{n=1}^{4} \frac{1}{2^n} = \frac{1}{2} + \frac{1}{4} + \frac{1}{8} + \frac{1}{16} = \frac{15}{16}$$

and

$$\sum_{n=1}^{\infty} \frac{1}{2^n} = \frac{1}{2} + \frac{1}{4} + \frac{1}{8} + \frac{1}{16} + \cdots + \frac{1}{2^n} + \cdots = 1.$$
    We should note that, since the sum of a sequence is the limit of another sequence, and
not all sequences have limits, there are sequences which do not have sums. For example,
the sequence with terms $a_n = 1$ for $n = 1, 2, 3, \ldots$ does not have a sum since

$$s_n = \underbrace{1 + 1 + \cdots + 1}_{n \text{ times}} = n,$$

from which it follows that

$$\lim_{n \to \infty} s_n = \infty.$$

For another example, the sequence $\{(-1)^n\}$, $n = 0, 1, 2, \ldots$, does not have a sum since

$$s_n = \begin{cases} 1, & \text{if } n = 0, 2, 4, \ldots, \\ 0, & \text{if } n = 1, 3, 5, \ldots, \end{cases}$$

a sequence which clearly does not have a limit.
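
Both of these non-summable sequences can be examined by computing their first few partial sums. This is a brief sketch; the choice of ten terms and the variable names are illustrative assumptions.

```python
from itertools import accumulate

ones = [1] * 10                                # a_n = 1 for n = 1, ..., 10
alternating = [(-1) ** n for n in range(10)]   # (-1)^n for n = 0, ..., 9

sums_of_ones = list(accumulate(ones))          # grows without bound: 1, 2, 3, ...
sums_alternating = list(accumulate(alternating))  # oscillates: 1, 0, 1, 0, ...

print(sums_of_ones)
print(sums_alternating)
```

The first list keeps growing and the second keeps switching between 1 and 0, so neither sequence of partial sums settles toward a single value.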
    In general it may be difficult to determine the sum of a sequence; in fact, it may be
difficult to determine even if the sequence has a sum. We will return to this problem
in Chapter 5 when we have more tools at our disposal, as well as more motivation for
studying infinite series. For now we will look at an important class of sequences for which
the sum is determined with relative ease. These are the sequences for which the terms
are in geometric progression; that is, sequences for which successive terms have a common
ratio. We call the infinite series which corresponds to such a sequence a geometric series.

Geometric series
Suppose $\{a_n\}$ is a sequence with $a_n = cr^{n-1}$, where $c \ne 0$ and $r$ are constants and
$n = 1, 2, 3, \ldots$. Then the partial sums are

$$\begin{aligned}
s_n &= a_1 + a_2 + a_3 + \cdots + a_n \\
    &= c + cr + cr^2 + \cdots + cr^{n-1} \\
    &= c(1 + r + r^2 + \cdots + r^{n-1}).
\end{aligned}$$

If $r = 1$, $s_n = nc$ and so $\{s_n\}$ does not converge. If $r \ne 1$, it is easy to see, using long
division (or the derivation outlined in Problem 4), that

$$\frac{1 - r^n}{1 - r} = 1 + r + r^2 + \cdots + r^{n-1}. \tag{1.3.8}$$

Hence, if $r \ne 1$,

$$s_n = \frac{c(1 - r^n)}{1 - r}. \tag{1.3.9}$$

From (1.3.9), it is clear that $\{s_n\}$ does not converge if $|r| \ge 1$. But if $-1 < r < 1$, then

$$\lim_{n \to \infty} r^n = 0,$$

and so the sequence has the sum

$$s = \lim_{n \to \infty} s_n = \lim_{n \to \infty} \frac{c(1 - r^n)}{1 - r} = \frac{c}{1 - r}. \tag{1.3.10}$$

That is, we have now seen that

$$\sum_{n=1}^{\infty} cr^{n-1} = \frac{c}{1 - r} \tag{1.3.11}$$

whenever $-1 < r < 1$.
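
The agreement between the term-by-term partial sums and the closed form (1.3.9) can be confirmed for particular values. The sketch below uses c = 3 and r = 1/2 purely as illustrative choices; the function names are hypothetical and exact fractions keep the comparison free of rounding.

```python
from fractions import Fraction

def geometric_partial_sum(c, r, n):
    """s_n = c + c*r + ... + c*r**(n-1), added term by term."""
    return sum(c * r**k for k in range(n))

def closed_form(c, r, n):
    """Right-hand side of (1.3.9); requires r != 1."""
    return c * (1 - r**n) / (1 - r)

c, r = Fraction(3), Fraction(1, 2)      # illustrative values with -1 < r < 1

agree = all(geometric_partial_sum(c, r, n) == closed_form(c, r, n)
            for n in range(1, 30))

limit = c / (1 - r)                     # the sum c / (1 - r)
print(agree, limit, float(limit - geometric_partial_sum(c, r, 40)))
```

The difference between the limit and the partial sum is c r^n / (1 - r), which is why it shrinks geometrically when |r| < 1.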


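Example We have, using (1.3.11) with $c = 1$ and $r = \frac{1}{2}$,

$$\sum_{n=1}^{\infty} \left(\frac{1}{2}\right)^{n-1} = 1 + \frac{1}{2} + \frac{1}{4} + \frac{1}{8} + \frac{1}{16} + \cdots = \frac{1}{1 - \frac{1}{2}} = 2.$$

Note that this agrees with our previous result that

$$\frac{1}{2} + \frac{1}{4} + \frac{1}{8} + \frac{1}{16} + \cdots = 1.$$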


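Example We have

$$\sum_{n=1}^{\infty} \left(\frac{2}{3}\right)^n = \sum_{n=1}^{\infty} \frac{2}{3}\left(\frac{2}{3}\right)^{n-1} = \frac{\frac{2}{3}}{1 - \frac{2}{3}} = 2.$$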


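Example We have

$$\sum_{n=0}^{\infty} \frac{1}{5^n} = \sum_{n=0}^{\infty} \left(\frac{1}{5}\right)^n = \frac{1}{1 - \frac{1}{5}} = \frac{5}{4}.$$

Note that in this example the sum starts with $n = 0$ instead of $n = 1$ as in (1.3.11).
However, it is the initial power of $r$ in (1.3.11) that is important, not how we write the
index. Hence, the sum in this example could be written equally well as

$$\sum_{n=1}^{\infty} \left(\frac{1}{5}\right)^{n-1}$$

or

$$\sum_{n=2}^{\infty} \left(\frac{1}{5}\right)^{n-2}$$

or

$$\sum_{n=100}^{\infty} \left(\frac{1}{5}\right)^{n-100}$$

as well as many other ways. The key in applying (1.3.11) is that we identify $c$ and $r$ so
that the first term in the sum is $cr^0 = c$.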


Example We have

$$\sum_{n=2}^{\infty} 4(0.34)^n = \sum_{n=2}^{\infty} 4(0.34)^2 (0.34)^{n-2} = \frac{4(0.34)^2}{1 - 0.34} \approx 0.7006,$$

where we have used (1.3.11) with $c = 4(0.34)^2$ and $r = 0.34$.


﻿


   We are now in a position to compute the sum in (1.3.1), and hence complete our first
example.

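Example Let $P$ be the probability that, when tossing a fair coin repeatedly, the first
head appears on an even toss. Then we have seen that

$$P = P_2 + P_4 + P_6 + \cdots = \sum_{n=1}^{\infty} P_{2n},$$

where

$$P_{2n} = \left(\frac{1}{2}\right)^{2n} = \left(\left(\frac{1}{2}\right)^2\right)^n = \left(\frac{1}{4}\right)^n$$

for $n = 1, 2, 3, \ldots$. Thus

$$P = \sum_{n=1}^{\infty} \left(\frac{1}{4}\right)^n = \sum_{n=1}^{\infty} \frac{1}{4}\left(\frac{1}{4}\right)^{n-1} = \frac{\frac{1}{4}}{1 - \frac{1}{4}} = \frac{1}{3}.$$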
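
The value 1/3 can be approached by adding the even-numbered probabilities directly. This is a small sketch; the function name `prob_first_head_on_toss` and the cutoff of 25 even tosses are illustrative assumptions, and no random simulation is involved.

```python
from fractions import Fraction

def prob_first_head_on_toss(n):
    """P_n = 1/2^n: probability that the first head appears on toss n."""
    return Fraction(1, 2**n)

# Sum P_2 + P_4 + ... + P_50 exactly.
partial = sum(prob_first_head_on_toss(2 * k) for k in range(1, 26))

shortfall = Fraction(1, 3) - partial
print(float(partial), float(shortfall))
```

The shortfall from 1/3 after K even tosses is exactly 1/(3 · 4^K), so the partial sums approach 1/3 very quickly.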


Example Economists often talk about the multiplier effect of an infusion of money into
an economy which results in new spending many times greater than the original amount
spent. This is a consequence of the recipients of the money spending a certain percentage
of their new money, the recipients of this spending again spending a certain percentage of
their gain, and so on. For example, suppose the government spends three million dollars,
and suppose that at each stage the recipients spend 90% of the money they receive. Then
the first recipients spend $(3)(0.9) = 2.7$ million dollars, the second recipients spend

$$(3)(0.9)(0.9) = (3)(0.9)^2 = 2.43$$

million dollars (that is, 90% of the 2.7 million spent by the first recipients), the third
recipients spend

$$(3)(0.9)^2(0.9) = (3)(0.9)^3 = 2.187$$

million dollars, and so on. If we denote the total amount of spending after $n$ transactions
by $S_n$, then, in millions of dollars,

$$\begin{aligned}
S_1 &= 3, \\
S_2 &= 3 + 3(0.9), \\
S_3 &= 3 + 3(0.9) + 3(0.9)^2, \\
S_4 &= 3 + 3(0.9) + 3(0.9)^2 + 3(0.9)^3,
\end{aligned}$$

and, in general,

$$S_n = 3 + 3(0.9) + 3(0.9)^2 + \cdots + 3(0.9)^{n-1}$$

for $n = 1, 2, 3, \ldots$. Although in actuality there will be only a finite number of transactions,
we can see that as $n$ increases the total spending will approach the sum

$$S = \sum_{n=1}^{\infty} 3(0.9)^{n-1} = \frac{3}{1 - 0.9} = 30$$
million dollars. Thus the initial governmental expenditure of three million dollars results
in approximately 30 million dollars, 10 times the initial amount, in new spending in the
economy. This partially explains why deficit spending by the government in depressed
times can be far more beneficial to the economy than the actual amount spent, and why
such spending during other times can be highly inflationary.
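
The multiplier computation can be reproduced by accumulating the rounds of spending. The sketch below is illustrative: the function name `total_spending` and the use of exact fractions (to avoid floating-point error in 1 - 0.9) are choices made here, not part of the original example.

```python
from fractions import Fraction

def total_spending(initial, rate, n):
    """S_n = initial + initial*rate + ... + initial*rate**(n-1)."""
    return sum(initial * rate**k for k in range(n))

initial = Fraction(3)          # three million dollars
rate = Fraction(9, 10)         # recipients spend 90% of what they receive

limit = initial / (1 - rate)   # geometric series sum with c = 3, r = 0.9
for n in (1, 2, 4, 10, 100):
    print(n, float(total_spending(initial, rate, n)), float(limit))
```

Even after 100 rounds the total is still slightly below 30 million, which is consistent with the text's remark that the spending approaches, rather than reaches, that figure.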
Example This example involves slightly more complicated probabilistic reasoning, as
well as some additional algebraic simplification, before the problem is reduced to the sum-
mation of a geometric series. Suppose that a certain female animal has a 10% chance of
dying during any given year of her life. Moreover, suppose the animal does not repro-
duce during her first year of life, but every year after has a 20% chance of successfully
reproducing. What is the probability that this animal has offspring before dying?
    Let P be the probability that the animal has offspring before dying and let Pn be the
probability that the animal successfully reproduces for the first time in its nth year. Then

$$P = \sum_{n=1}^{\infty} P_n.$$

Note that our sum extends to infinity even though in reality it is highly unlikely that any
such animal would live even to an age of 100 years. We do this because the model we
are using, as with all mathematical models, is an idealization of the real situation. In
this case, by assuming that a given animal of this species has a constant 10% chance of
dying in any given year, we have implicitly assumed that there is no fixed upper bound
to its life-span. Put another way, we have assigned a positive probability to an animal's
living for, as an example, 1000 years, although this probability is so small (namely,
$0.9^{1000} \approx 1.748 \times 10^{-46}$) that such a life-span is, in practice, never going to occur.
    Since we have assumed that these animals cannot reproduce in their first year of life,
we have $P_1 = 0$. To compute $P_2$, we note that 90% of all such females will live through
their first year and that 20% of these will then have offspring successfully. Hence the
proportion of females that successfully reproduce for the first time in their second year is

                                    P2 = (0.9)(0.2).

To compute P3 , first we note that the proportion of females living until their third year
will be (0.9)(0.9) (that is, 90% of the 90% who lived through their first year). Now 80% of
these will not have produced offspring successfully in their second year, so the proportion
of females who reach their third year without having reproduced is (0.9)2(0.8). Finally,
20% of these will have success in reproducing in their third year. Thus


P3 = (0.9)2(0.8)(0.2).


﻿


8


The Sum of a Sequence


Section 1.3


Similar reasoning yields

\[
P_4 = (0.9)^3(0.8)^2(0.2)
\]

(that is, this represents a female who has lived through three years, did not reproduce in
either her second or third year, but did have offspring in her fourth year) and, in general,

\[
P_n = (0.9)^{n-1}(0.8)^{n-2}(0.2)
\]

for n = 2, 3, 4, .... Hence

\[
\begin{aligned}
P &= \sum_{n=1}^{\infty} P_n \\
  &= \sum_{n=2}^{\infty} (0.9)^{n-1}(0.8)^{n-2}(0.2) \\
  &= \sum_{n=2}^{\infty} (0.2)(0.9)(0.9)^{n-2}(0.8)^{n-2} \\
  &= \sum_{n=2}^{\infty} (0.18)(0.72)^{n-2} \\
  &= \frac{0.18}{1 - 0.72} \\
  &\approx 0.6429,
\end{aligned}
\]

where the answer has been rounded to four decimal places. Thus we conclude that a given
female of this species has just over a 64% chance of reproducing during her lifetime.
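
The value above can be checked numerically. The following Python sketch is not part of the original text; the function names and the default parameters (a 0.9 yearly survival probability and a 0.2 yearly reproduction probability) are illustrative assumptions taken from the example. It adds up the terms $P_n$ directly and compares the partial sums with the closed form given by the geometric series.

```python
def first_reproduction_prob(n, survive=0.9, reproduce=0.2):
    """P_n: probability of reproducing for the first time in year n."""
    if n < 2:
        return 0.0  # the model allows no reproduction in the first year
    return survive ** (n - 1) * (1 - reproduce) ** (n - 2) * reproduce


def reproduction_prob_partial(terms, survive=0.9, reproduce=0.2):
    """Partial sum P_1 + P_2 + ... + P_terms."""
    return sum(first_reproduction_prob(n, survive, reproduce)
               for n in range(1, terms + 1))


def reproduction_prob_exact(survive=0.9, reproduce=0.2):
    """Sum of the geometric series: first term divided by (1 - ratio)."""
    first_term = survive * reproduce
    ratio = survive * (1 - reproduce)
    return first_term / (1 - ratio)


if __name__ == "__main__":
    for terms in (5, 10, 50, 200):
        print(terms, reproduction_prob_partial(terms))
    print("closed form:", reproduction_prob_exact())
```

Because the ratio 0.72 is well below 1, the partial sums settle down quickly, which is one way to see why truncating the idealized infinite sum at a realistic life-span changes the answer very little.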

The harmonic series
It may happen that a sequence {an} does not have a sum even though

\[
\lim_{n \to \infty} a_n = 0.
\]

One important example of this behavior is provided by the sequence {an} with

\[
a_n = \frac{1}{n}
\]

for n = 1, 2, 3, .... The resulting infinite series, with nth partial sum given by

\[
s_n = 1 + \frac{1}{2} + \frac{1}{3} + \cdots + \frac{1}{n}, \tag{1.3.12}
\]

is called the harmonic series. Since

\[
s_{n+1} = 1 + \frac{1}{2} + \frac{1}{3} + \cdots + \frac{1}{n} + \frac{1}{n+1} = s_n + \frac{1}{n+1} > s_n, \tag{1.3.13}
\]

the sequence $\{s_n\}$ is monotone increasing. Hence, by the Monotone Sequence Theorem,
$\{s_n\}$ either converges or diverges to infinity. Now

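\[
\begin{aligned}
s_1 &= 1, \\
s_2 &= 1 + \frac{1}{2}, \\
s_4 &= 1 + \frac{1}{2} + \frac{1}{3} + \frac{1}{4} > 1 + \frac{1}{2} + \frac{1}{4} + \frac{1}{4}
     = 1 + \frac{1}{2} + \frac{1}{2} = 1 + 2\left(\frac{1}{2}\right), \\
s_8 &= 1 + \frac{1}{2} + \frac{1}{3} + \frac{1}{4} + \frac{1}{5} + \frac{1}{6} + \frac{1}{7} + \frac{1}{8}
     > 1 + \frac{1}{2} + \frac{1}{4} + \frac{1}{4} + \frac{1}{8} + \frac{1}{8} + \frac{1}{8} + \frac{1}{8} \\
    &= 1 + \frac{1}{2} + \frac{1}{2} + \frac{1}{2} = 1 + 3\left(\frac{1}{2}\right),
\end{aligned}
\]

and

\[
s_{16} > s_8 + 8\left(\frac{1}{16}\right) > 1 + 3\left(\frac{1}{2}\right) + \frac{8}{16} = 1 + 4\left(\frac{1}{2}\right).
\]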

Continuing in this pattern, we can see that, in general,

\[
s_{2^m} \geq 1 + \frac{m}{2} \tag{1.3.14}
\]

for any m = 0, 1, 2, .... Thus, since $\frac{m}{2}$ may be made arbitrarily large, the sequence $\{s_n\}$
does not have an upper bound. Hence we must have

\[
\lim_{n \to \infty} s_n = \infty, \tag{1.3.15}
\]


and so the harmonic series does not have a sum.
    Although the partial sums of the harmonic series diverge to infinity, they grow very
slowly. For example, if n = 500,000,000, then $s_n$ is between 20 and 21. That is,

\[
20 < 1 + \frac{1}{2} + \frac{1}{3} + \cdots + \frac{1}{500{,}000{,}000} < 21. \tag{1.3.16}
\]
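
The slow growth of the partial sums is easy to observe on a computer. The Python sketch below is an illustration only; the function name `harmonic_partial_sum` is our own. It computes $s_n$ for n = 1, 2, 4, 8, ... and checks the bound $s_{2^m} \geq 1 + m/2$, which holds with equality for m = 0 and m = 1 and strictly thereafter. We stop well short of n = 500,000,000, since a plain loop of that length would take a noticeable amount of time.

```python
import math


def harmonic_partial_sum(n):
    """Return s_n = 1 + 1/2 + 1/3 + ... + 1/n."""
    # math.fsum keeps the accumulated floating-point rounding error small.
    return math.fsum(1.0 / k for k in range(1, n + 1))


if __name__ == "__main__":
    for m in range(0, 21):
        n = 2 ** m
        s = harmonic_partial_sum(n)
        bound = 1 + m / 2
        print(m, n, s, bound, s >= bound)
```

Each doubling of n adds only a little more than one-half to the partial sum, which is exactly the pattern used in the argument above.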


Problems

1. Find the sum of each of the following infinite series which has a sum.


    co   1n-1
(a)Z () 1
    n-1
      02

    n=1
     o)n  n-1
(e)n_  7 3/ 5


(b) 4(0.21) --1
    n=1
       °2
(d)Z7
    n=0
    o0      n

  (f)


﻿




     00      n-1

     n-1

   (i) 1.00001"
   n=1
        00 91n-1
 (k) Zsin(\)


(m)     sin(7rn)
    n=1i


(h)    0.99999n
    n=1
         °°3n
 (j)    5
    n-30   7  -
    00

 (1)21

    n=1
    00
(n)    cos(7rn)
    n=1


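2. Consider the infinite series with nth partial sum

\[
s_n = 1 + 1 + \frac{1}{2} + \frac{1}{6} + \cdots + \frac{1}{n!} = \sum_{j=0}^{n} \frac{1}{j!}.
\]

Note that, by definition, 0! = 1.

(a) Show that $s_n < 3$ for all values of n. Hint: Note that

\[
n! = (1)(2)(3) \cdots (n) \geq (1)(2)(2) \cdots (2) = 2^{n-1}
\]

    for n = 1, 2, 3, ....
(b) Combine (a) with the fact that $s_{n+1} > s_n$ for all n to conclude that

\[
\lim_{n \to \infty} s_n
\]

    exists and is less than 3.
(c) In fact,

\[
\sum_{n=0}^{\infty} \frac{1}{n!} = 1 + 1 + \frac{1}{2} + \frac{1}{6} + \frac{1}{24} + \cdots
\]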


    is a well known irrational number. Add up a sufficient number of terms to enable
    you to guess the value of the sum. How many terms did it take?
(d) How many terms are necessary to obtain a partial sum that is within 0.000001 of
    the sum?


3. The sum

\[
4 \sum_{n=0}^{\infty} \frac{(-1)^n}{2n+1} = 4\left(1 - \frac{1}{3} + \frac{1}{5} - \frac{1}{7} + \cdots\right)
\]


is a well known irrational number.

(a) Add up a sufficient number of terms to enable you to guess the value of the sum.
    How many terms did it take?


﻿




   (b) How many terms are necessary to obtain a partial sum that is within 0.01 of the
       sum?
4. This problem outlines an alternative method for deriving the result of (1.3.8). Suppose
   $r \neq 1$. For n = 1, 2, 3, ..., let

\[
s_n = 1 + r + r^2 + \cdots + r^{n-1}.
\]

   Show that $s_n - r s_n = 1 - r^n$ and conclude that

\[
s_n = \frac{1 - r^n}{1 - r}.
\]

5. Using the model we used for the multiplier effect, find the total amount of new spending
   resulting from each of the following.
   (a) The government spends 2 billion dollars; each recipient spends 80% of what he or
       she receives.
   (b) The government spends 250 million dollars; each recipient spends 95% of what he
       or she receives.
   (c) The government spends A dollars; each recipient spends 100r%, 0 < r < 1, of what
       he or she receives.
6. Government regulations specify that a bank may not loan 100% of its deposits; the
   bank must keep a certain percentage of its deposits in reserve. For example, if a bank
   must keep 15% of its deposits in reserve, then it may loan out $850 from a $1000
   deposit. Typically, this $850 will again be deposited in a bank, and that bank may
   loan out 85% of it. Again, this money will be deposited and 85% of it given out in
   loans. As this will continue indefinitely, the multiplier effect comes into play and the
   total amount of money in all the deposits resulting from the initial $1000 deposit can
   be computed in the same manner as in our example.
   (a) Compute the total amount of the deposits resulting from the initial $1000 deposit.
   (b) How would the answer in (a) change if the reserve rate was changed from 15% to
       20%?
   (c) How would the answer in (a) change if the reserve rate was changed from 15% to
       10%?
7. A ball is dropped from a height of 10 meters. Suppose that every time it strikes
   the ground, it bounces back to a height which is 75% of the height of the previous
   bounce. Assuming an infinite number of bounces (again, an idealized mathematical
   model), how far does the ball travel before it comes to rest? What would happen if it
   rebounded to only 25% of its initial height?
8. Suppose the animal in our final example above could not produce offspring for its
   first 3 years of life. How would this change the probability of a female's reproducing
   before dying?


﻿




9. Suppose the animal in our final example above has only an 80% chance of living through
    a given year. How does this change the probability of a female's reproducing before
    dying?
10. Suppose a female animal of the type discussed in the final example above has a 100r%,
    0 < r < 1, chance of reproducing each year after its first year.
    (a) Find the probability P of a female's reproducing before dying.
    (b) Plot P as a function of r for 0 < r < 1.
    (c) Find the value of r for which P = 0.5.
11. How many terms of the harmonic series are needed to obtain a partial sum larger than
    5? How many terms are needed to obtain a partial sum larger than 10?
12. Plot the points (n, sn), where sn is the nth partial sum of the harmonic series, for
    n = 1, 2, 3, ... ,1000. What does this show you about the rate of growth of the partial
    sums?
13. The first example of this section is a particular case of the more general problem of
    computing probabilities associated with the waiting time for some event to occur. As
    another example, suppose that an electronic switch works with probability p and fails
    with probability q = 1 - p. Then, using reasoning analogous to that used in the coin
    tossing example, the probability that the first failure will occur on the nth use of the
    switch is $p^{n-1}q$, n = 1, 2, 3, ....
    (a) Can you justify this probability?
    (b) The reliability of the switch is given by the function

\[
R(n) = \text{probability that the switch does not fail until after the nth use}
     = \sum_{j=n+1}^{\infty} p^{j-1} q.
\]

        Show that $R(n) = p^n$, n = 1, 2, 3, ....
    (c) Find a way to show that $R(n) = p^n$ directly without using an infinite series.


﻿


Section 1.4


       Differential Equations         Difference Equations


At this point almost all of our sequences have had explicit formulas for their terms. That
is, we have looked mainly at sequences for which we could write the nth term as an = f(n)
for some known function f. For example, if

                                    an = n2 + 3'

then it is an easy matter to compute explicitly, say, ao = 10 or a100 = 113. In such
cases we are able to compute any given term in the sequence without reference to any
other terms in the sequence. However, it is often the case in applications that we do not
begin with an explicit formula for the terms of a sequence; rather, we may know only
some relationship between the various terms. An equation which expresses a value of a
sequence as a function of the other terms in the sequence is called a difference equation.
In particular, an equation which expresses the value an of a sequence {an} as a function of
the term $a_{n-1}$ is called a first-order difference equation. If we can find a function f such
that an = f (n), n = 1, 2, 3, ..., then we will have solved the difference equation. In this
section we will consider a class of difference equations that are solvable in this sense; in
the next section we will discuss an example where an explicit solution is not possible.
Example Suppose a certain population of owls is growing at the rate of 2% per year. If
we let $x_0$ represent the size of the initial population of owls and $x_n$ the number of owls n
years later, then

\[
x_{n+1} = x_n + 0.02x_n = 1.02x_n \tag{1.4.1}
\]

for n = 0, 1, 2,.... That is, the number of owls in any given year is equal to the number
of owls in the previous year plus 2% of the number of owls in the previous year. Equation
(1.4.1) is an example of a first-order difference equation; it relates the number of owls in
a given year with the number of owls in the previous year. Hence we know the value of a
specific $x_n$ once we know the value of $x_{n-1}$. To get the sequence started we have to know
the value of $x_0$. For example, if initially we have a population of $x_0 = 100$ owls and we
want to know what the population will be after 4 years, we may compute

\[
\begin{aligned}
x_1 &= 1.02x_0 = (1.02)(100) = 102, \\
x_2 &= 1.02x_1 = (1.02)(102) = 104.04, \\
x_3 &= 1.02x_2 = (1.02)(104.04) = 106.1208,
\end{aligned}
\]

and

\[
x_4 = 1.02x_3 = (1.02)(106.1208) = 108.243216.
\]




Copyright @ by Dan Sloughter 2000


﻿


     Figure 1.4.1 Plot of the points $(n, x_n)$, n = 0, 1, 2, ..., 100, where $x_0 = 100$ and $x_{n+1} = 1.02x_n$

Thus we would expect about 108 owls in the population after 4 years. Note that although
it is not possible to have a fractional part of an owl, it is nevertheless important to keep
the fractional part in intermediary calculations.
    We may work backwards to find $x_4$ explicitly in terms of $x_0$:

\[
\begin{aligned}
x_4 &= 1.02x_3 \\
    &= (1.02)(1.02)x_2 \\
    &= (1.02)(1.02)(1.02)x_1 \\
    &= (1.02)(1.02)(1.02)(1.02)x_0 \\
    &= (1.02)^4 x_0.
\end{aligned}
\]

This is interesting because it indicates that we can compute $x_4$ without reference to the
values of $x_1$, $x_2$, and $x_3$, provided, of course, that we know the value of $x_0$. If we do this
in general, then we have solved the difference equation $x_{n+1} = 1.02x_n$. Namely, we have,
for any n = 1, 2, 3, ...,

\[
x_n = 1.02x_{n-1} = (1.02)^2 x_{n-2} = (1.02)^3 x_{n-3} = \cdots = (1.02)^n x_0. \tag{1.4.2}
\]

For example, if $x_0 = 100$ as above, then we can compute

\[
x_{20} = (1.02)^{20}(100) \approx 149,
\]

or even

\[
x_{150} = (1.02)^{150}(100) \approx 1{,}950,
\]

without having to compute any intermediate values.
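
The two ways of computing a term, stepping through the difference equation or evaluating the explicit solution, can be compared directly. The Python sketch below is an illustration under our own naming (`iterate_growth` and `closed_form_growth` are hypothetical helper names, not from the text); it uses the owl example's values $x_0 = 100$ and a growth rate of 0.02.

```python
def iterate_growth(x0, rate, steps):
    """Apply x_{n+1} = (1 + rate) * x_n repeatedly; return [x_0, ..., x_steps]."""
    values = [x0]
    for _ in range(steps):
        values.append((1 + rate) * values[-1])
    return values


def closed_form_growth(x0, rate, n):
    """Evaluate the explicit solution x_n = (1 + rate)**n * x_0."""
    return (1 + rate) ** n * x0


if __name__ == "__main__":
    owls = iterate_growth(100, 0.02, 150)
    for n in (4, 20, 150):
        # Both columns should agree up to floating-point rounding.
        print(n, owls[n], closed_form_growth(100, 0.02, n))
```

The iteration needs every earlier term, while the closed form reaches any single term in one step; that is the practical meaning of having solved the difference equation.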


﻿




    For a geometric feeling of how the population is changing with time, Figure 1.4.1 shows
a plot of the points $(n, x_n)$ for n = 0, 1, 2, ..., 100. Of course, whether or not our model
will provide an accurate prediction of the owl population 100 or 200 years into the future
is an entirely different question. Frequently, a simple population model like this will be
valid only for a short span of time during which the rate of growth of population remains
stable.
    By replacing 1.02 with an arbitrary constant a in (1.4.2), we arrive at the general
result that the solution of the difference equation

\[
x_{n+1} = a x_n, \tag{1.4.3}
\]

n = 0, 1, 2, ..., is given by

\[
x_n = a^n x_0, \tag{1.4.4}
\]

n = 0, 1, 2, .... Note that this difference equation, and its solution, are useful whenever we
are interested in a sequence of numbers where the (n + 1)st term is a constant proportion
of the nth term. Our first example, where a population was assumed to grow at a constant
rate, is a common example of this type of behavior. Another common example is when
a quantity decreases at a constant rate over time. This behavior is discussed in the next
example in the context of radioactive decay.
Example Radium is a radioactive element which decays at a rate of 1% every 25 years.
This means that the amount left at the beginning of any given 25 year period is equal
to the amount at the beginning of the previous 25 year period minus 1% of that amount.
That is, if $x_0$ is the initial amount of radium and $x_n$ is the amount of radium still remaining
after 25n years, then

\[
x_{n+1} = x_n - 0.01x_n = 0.99x_n \tag{1.4.5}
\]

for n = 0, 1, 2, .... Since this is a difference equation of the form of (1.4.3) with a = 0.99,
we know that the solution is of the form (1.4.4). Namely,

\[
x_n = (0.99)^n x_0
\]

for n = 0, 1, 2, .... For example, the amount left after 100 years is given by

\[
x_4 = (0.99)^4 x_0 \approx 0.9606x_0,
\]

where we have rounded the answer to four decimal places. That is, approximately 96% of
the initial amount of radium will be left after 100 years. A plot of the amount of radium
left versus number of years, assuming an initial amount of 500 grams, is given in Figure
1.4.2.
             Figure 1.4.2 Plot of amount of radium versus number of years

    The half-life of a radioactive element is the number of years required for one-half of
an initial amount to decay. Suppose that, for this example, N is the smallest integer for
which $x_N$ is less than one-half of the initial amount of radium. This would mean that

\[
(0.99)^N x_0 < \frac{1}{2} x_0,
\]

which implies that

\[
\frac{1}{2} > (0.99)^N.
\]

Taking logarithms, we have

\[
\log_{10}\left(\frac{1}{2}\right) > \log_{10}\left((0.99)^N\right),
\]

which implies that

\[
\log_{10}\left(\frac{1}{2}\right) > N \log_{10}(0.99).
\]

Solving for N, and remembering that $\log_{10}(0.99) < 0$, we have

\[
N > \frac{\log_{10}\left(\frac{1}{2}\right)}{\log_{10}(0.99)} \approx 68.97,
\]

rounding to two decimal places. Hence, since N must be an integer, we have N = 69.
Recalling that we are working with 25 year units of time, this shows that the half-life of
radium is approximately (25) (69) =1725 years. For example, this means that if we started
with an initial amount of 100 grams of radium, after 1725 years we would still have 50
grams left. It would then take an additional 1725 years until the remaining amount would
be reduced to 25 grams.
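
The half-life computation can be reproduced in a few lines of Python. This is a sketch only; the function name `smallest_period_count` and its parameters are our own illustrative choices. It finds the smallest integer N for which the decay factor raised to the Nth power falls below a given fraction, and then checks the defining inequality on either side of N.

```python
import math


def smallest_period_count(decay_factor, fraction=0.5):
    """Smallest integer N with decay_factor**N < fraction (0 < decay_factor < 1)."""
    # Dividing by log10(decay_factor), which is negative, reverses the inequality,
    # so N must exceed this bound.
    bound = math.log10(fraction) / math.log10(decay_factor)
    return math.floor(bound) + 1


if __name__ == "__main__":
    periods = smallest_period_count(0.99)
    print("periods:", periods, "years:", 25 * periods)
    # One period earlier more than half remains; at N less than half remains.
    print(0.99 ** (periods - 1), 0.99 ** periods)
```

With a decay factor of 0.99 per 25 year period, this returns 69 periods, that is, 1725 years, matching the value found above.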


﻿




    Although we have stated the results of the preceding example in discrete time units,
namely, units of 25 years each, later we will see that the results hold for continuous time
as well. In other words, although the difference equation (1.4.5) has been set up for
nonnegative integer values of n, the solution $x_n = (0.99)^n x_0$ is valid for arbitrary nonnegative values
of n. We will hold off discussion of these ideas until we consider differential equations, the
continuous time versions of difference equations, in Chapter 6.
    It is interesting to compare the plots in Figures 1.4.1 and 1.4.2. The first is an example
of exponential growth, whereas the second is an example of exponential decay. In the first,
the steepness of the graph increases with time; in the second, the graph flattens out over
time. The difference equation (1.4.3) will always lead to the first behavior when a > 1 and
to the second when 0 < a < 1.

First-order linear difference equations
Given constants a and $\beta$, a difference equation of the form

\[
x_{n+1} = a x_n + \beta, \tag{1.4.6}
\]

n = 0, 1, 2, ..., is called a first-order linear difference equation. Note that the difference
equation (1.4.3) is of this form with $\beta = 0$. A procedure analogous to the method we used
to solve (1.4.3) will enable us to solve this equation as well. Namely,

\[
\begin{aligned}
x_n &= a x_{n-1} + \beta \\
    &= a(a x_{n-2} + \beta) + \beta \\
    &= a^2 x_{n-2} + \beta(a + 1) \\
    &= a^2(a x_{n-3} + \beta) + \beta(a + 1) \\
    &= a^3 x_{n-3} + \beta(a^2 + a + 1) \\
    &\;\;\vdots \\
    &= a^n x_0 + \beta(a^{n-1} + a^{n-2} + \cdots + a^2 + a + 1).
\end{aligned}
\]

Note that if a = 1, this gives us

\[
x_n = x_0 + n\beta, \tag{1.4.7}
\]

n = 0, 1, 2, ..., as the solution of the difference equation $x_{n+1} = x_n + \beta$. If $a \neq 1$, we know
from Section 1.3 that

\[
a^{n-1} + a^{n-2} + \cdots + a^2 + a + 1 = \frac{1 - a^n}{1 - a}.
\]

Hence

\[
x_n = a^n x_0 + \beta\left(\frac{1 - a^n}{1 - a}\right), \tag{1.4.8}
\]

n = 0, 1, 2, ..., is the solution of the first-order linear difference equation $x_{n+1} = a x_n + \beta$
when $a \neq 1$.
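
As a check on the derivation, the following Python sketch compares direct iteration of a first-order linear difference equation with the explicit solution, treating the cases a = 1 and a not equal to 1 separately. The function names `iterate_linear` and `solve_linear`, and the sample values in the usage lines, are illustrative assumptions rather than anything from the text.

```python
def iterate_linear(x0, a, beta, steps):
    """Apply x_{n+1} = a * x_n + beta repeatedly and return x_steps."""
    x = x0
    for _ in range(steps):
        x = a * x + beta
    return x


def solve_linear(x0, a, beta, n):
    """Explicit solution x_n of x_{n+1} = a * x_n + beta."""
    if a == 1:
        # The geometric sum degenerates to n copies of 1.
        return x0 + n * beta
    return a ** n * x0 + beta * (1 - a ** n) / (1 - a)


if __name__ == "__main__":
    # Hypothetical sample parameters: (x0, a, beta, n).
    for x0, a, beta, n in [(10, 0.5, 3, 12), (5, 1.5, -2, 8), (1, 1, 2.5, 10)]:
        print(a, beta, iterate_linear(x0, a, beta, n), solve_linear(x0, a, beta, n))
```

The two columns of output should agree up to floating-point rounding for each choice of parameters.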


﻿




    We have seen examples of first-order linear equations in the population growth and
radioactive decay examples above. Another interesting example arises in modeling the
change in temperature of an object placed in an environment held at some constant tem-
perature, such as a cup of tea cooling to room temperature or a glass of lemonade warming
to room temperature. If $T_0$ represents the initial temperature of the object, S the constant
temperature of the surrounding environment, and $T_n$ the temperature of the object after
n units of time, then the change in temperature over one unit of time is given by

                               Tn+1 - Tn = k(Tn - S),                          (1.4.9)

n = 0, 1, 2, ..., where k is a constant which depends upon the object. This difference equa-
tion is known as Newton's law of cooling. The equation says that the change in temperature
over a fixed unit of time is proportional to the difference between the temperature of the
object and the temperature of the surrounding environment. That is, large temperature
differences result in a faster rate of cooling (or warming) than do small temperature differ-
ences. If S is known and enough information is given to determine k, then this equation
may be rewritten in the form of a first-order linear difference equation and, hence, solved
explicitly. The next example shows how this may be done.
Example Suppose a cup of tea, initially at a temperature of 180°F, is placed in a room
which is held at a constant temperature of 80°F. Moreover, suppose that after one minute
the tea has cooled to 175°F. What will the temperature be after 20 minutes?
    If we let $T_n$ be the temperature of the tea after n minutes and we let S be the temperature
of the room, then we have $T_0 = 180$, $T_1 = 175$, and S = 80. Newton's law of cooling
states that

\[
T_{n+1} - T_n = k(T_n - 80), \tag{1.4.10}
\]

n = 0, 1, 2, ..., where k is a constant which we will have to determine. To do so, we make
use of the information given about the change in the temperature of the tea during the
first minute. Namely, applying (1.4.10) with n = 0, we must have

\[
T_1 - T_0 = k(T_0 - 80).
\]

That is,

\[
175 - 180 = k(180 - 80).
\]

Hence

\[
-5 = 100k,
\]

and so

\[
k = -\frac{5}{100} = -0.05.
\]

Thus (1.4.10) becomes

\[
T_{n+1} - T_n = -0.05(T_n - 80) = -0.05T_n + 4.
\]

Hence


\[
T_{n+1} = T_n - 0.05T_n + 4 = 0.95T_n + 4 \tag{1.4.11}
\]

for n = 0, 1, 2, .... Now (1.4.11) is in the standard form of a first-order linear difference
equation, so from (1.4.8) we know that the solution is

\[
\begin{aligned}
T_n &= (0.95)^n(180) + 4\left(\frac{1 - (0.95)^n}{1 - 0.95}\right) \\
    &= 180(0.95)^n + 80(1 - (0.95)^n) \\
    &= 80 + 100(0.95)^n
\end{aligned}
\]

for n = 0, 1, 2, .... In particular,

\[
T_{20} = 80 + 100(0.95)^{20} \approx 115.85,
\]

where we have rounded the answer to two decimal places. Hence after 20 minutes the tea
has cooled to just under 116°F. Also, since

\[
\lim_{n \to \infty} (0.95)^n = 0,
\]

we see that

\[
\lim_{n \to \infty} T_n = \lim_{n \to \infty} (80 + 100(0.95)^n) = 80. \tag{1.4.12}
\]

That is, as we would expect, the temperature of the tea will approach an equilibrium
temperature of 80°F, the room temperature. In Figure 1.4.3 we have plotted temperature
$T_n$ versus time n for n = 0, 1, 2, ..., 60, along with the horizontal line T = 80. As indicated
by (1.4.12), we can see that $T_n$ decreases asymptotically toward 80°F as n increases.

    Figure 1.4.3 Tea temperature decreases asymptotically toward room temperature
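
The steps of the tea example can be sketched in Python as follows. The function names are our own, and the general formula used in `temperature`, namely $T_n = S + (T_0 - S)(1 + k)^n$, is what the same algebra gives when 80, 180 and -0.05 are replaced by S, $T_0$ and k; it is offered here as an assumption to be checked against the iteration, not as a formula quoted from the text.

```python
def cooling_constant(t0, t1, room):
    """Solve T_1 - T_0 = k * (T_0 - room) for k."""
    return (t1 - t0) / (t0 - room)


def simulate_cooling(t0, k, room, steps):
    """Iterate T_{n+1} = T_n + k * (T_n - room); return [T_0, ..., T_steps]."""
    temps = [t0]
    for _ in range(steps):
        temps.append(temps[-1] + k * (temps[-1] - room))
    return temps


def temperature(n, t0, k, room):
    """Explicit form T_n = room + (T_0 - room) * (1 + k)**n."""
    return room + (t0 - room) * (1 + k) ** n


if __name__ == "__main__":
    k = cooling_constant(180, 175, 80)   # tea example: one minute of cooling
    temps = simulate_cooling(180, k, 80, 60)
    for n in (0, 1, 20, 60):
        print(n, temps[n], temperature(n, 180, k, 80))
```

For the tea example both approaches give a temperature of about 115.85°F after 20 minutes, and the printed values drift toward 80 as n grows.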


﻿




Problems

1. Compute the next five terms of each of the following sequences from the given infor-
    mation.

    (a) xo = 10, xn+1 = xn + 4                (b) Yo = -1,Yn+1

    (c)xo = 40, xn+1 =2xn - 20           (d) zo = 2, zn+1 = zn - zn
                                                                 1
    (e) zo = 2, x 1 = 3, on+2 = on+1 + on  (f) xo = 15, on = Sn-1 + 2
                                                                 3
 2. Solve the following difference equations with the given initial condition. Use your
    solution to find x10.
                                                          3
    (a) xn+1 = 2xn, xo = 5                    (b) xn+1 = = gn, x0 = 100

    (c) xn+1 = 1.8xn + 10, xo = 20        (d) 4xn+1 - 2xn = 12, xo = 6
    (e) on+1 - on = 3x + 4, zo = 2         (f) 5xn+1 - 3x = 2xn+1 - on, zo = 100

 3. A population of weasels is growing at a rate of 3% per year. Let wn be the number of
    weasels n years from now and suppose that there are currently 350 weasels.
    (a) Write a difference equation which describes how the population changes from year
       to year.
    (b) Solve the difference equation of part (a). If the population growth continues at the
       rate of 3%, how many weasels will there be 15 years from now?
    (c) Plot wn versus n for n = 0, 1, 2, ..., 100.
    (d) How many years will it take for the population to double?
    (e) Find $\lim_{n \to \infty} w_n$. What does this say about the long-term size of the population?
       Will this really happen?

 4. If the rate of growth of the weasel population in Problem 3 was 5% instead of 3%, how
    many years would it take for the population to double?

 5. Suppose that the weasel population of Problem 3 would grow at a rate of 3% a year if
    left to itself, but poachers kill 6 weasels every year for their fur.

    (a) Write a difference equation which describes how the population changes from year
       to year.
    (b) Solve the difference equation of part (a). How many weasels will there be in 15
       years?
    (c) Find $\lim_{n \to \infty} w_n$. What does this say about the long-term size of the population?

    (d) Will the population eventually double? If so, how long will this take?
    (e) Plot wn versus n for n =0,1, 2,. . .,100.


﻿




6. Suppose that the weasel population of Problem 3 would grow at a rate of 3% a year if
    left to itself, but poachers kill 15 weasels every year for their fur.
    (a) Write a difference equation which describes how the population changes from year
        to year.
    (b) Solve the difference equation of part (a). How many weasels will there be in 15
        years?
    (c) Find $\lim_{n \to \infty} w_n$. What does this say about the long-term size of the population?
    (d) Will the population eventually double? If so, how long will this take?
    (e) Will the population eventually die out? If so, how long will this take?
    (f) Plot wn versus n for n = 0, 1, 2, ... , 100.
 7. A radioactive element is known to decay at the rate of 2% every 20 years.
    (a) If initially you had 165 grams of this element, how much would you have in 60
        years?
    (b) What is the half-life of this element?
    (c) Suppose that the bones of a certain animal maintain a constant level of this element
        while the animal is living, but the element begins to decay as soon as the animal
        dies. If a bone of this animal is found and is determined to have only 10% of its
        original level of this element, how old is the bone?
 8. Repeat Problem 7 if the element decays at the rate of 3% every 10 years.
 9. A cup of coffee has an initial temperature of 165°F, but cools to 155°F in one minute
    when placed in a room with a temperature of 70°F. Let Tn be the temperature of the
    coffee after n minutes.
    (a) Write a difference equation, in standard first order linear form, which describes the
        change in temperature of the coffee from minute to minute.
    (b) Solve the difference equation from part (a).
    (c) Find the temperature of the coffee after 25 minutes.
    (d) Find $\lim_{n \to \infty} T_n$.
    (e) Plot Tn versus n for n = 0, 1, 2, ..., 120.
    (f) Does the temperature ever reach 70°F?
10. A glass of lemonade, initially at a temperature of 42°F, is placed in a room with
    a temperature of 78°F. If the lemonade warms to 45°F in 30 seconds, what will its
    temperature be in 10 minutes?
11. An iron ingot, heated to a temperature of 300°C, is placed in a liquid bath held at a
    constant temperature of 90°C. If the ingot cools to 250°C in two minutes, what will
    its temperature be in 20 minutes?
12. A glass of ginger ale is left in a room. Initially, the ginger ale has a temperature
    of 45°F, but after one minute the temperature has increased to 50°F and after two
    minutes it has increased to 54°F. What is the temperature of the room?


﻿




13. In his book Liber Abaci (Book of the Abacus), Leonardo of Pisa, also known as Fibonacci,
    posed the following question: How many pairs of rabbits will be produced in a year,
    beginning with a single pair, if in every month each pair bears a new pair which
    becomes productive from the second month on? (See A History of Mathematics by
    Carl B. Boyer, Princeton University Press, 1985, page 281).
    (a) Let $f_n$ be the number of pairs of rabbits in the nth month. Explain why $f_1 = 1$
        and $f_2 = 1$.
    (b) Explain why $f_{n+2} = f_{n+1} + f_n$ for n = 1, 2, 3, ....
    (c) Compute $f_n$ for n = 3, 4, 5, 6, 7, 8 by hand.
    (d) Compute $f_n$ for n = 1, 2, 3, ..., 100.
    (e) What is $\lim_{n \to \infty} f_n$?
    (f) Compute

\[
r_n = \frac{f_n}{f_{n+1}}
\]

        for n = 1, 2, 3, ..., 100. Do you think $\lim_{n \to \infty} r_n$ exists? If so, what is a good
        approximation for this limit to five decimal places?
    (g) Show that

\[
r_{n+1} = \frac{1}{1 + r_n}.
\]

    (h) Using (g) and assuming that $\lim_{n \to \infty} r_n$ exists, show that

\[
\lim_{n \to \infty} r_n = \frac{\sqrt{5} - 1}{2},
\]

       the golden section ratio.
14. Given $x_0 = 0$ and $x_{10} = 20$, show that $x_n = 2n$ satisfies the difference equation

\[
x_n = \frac{x_{n-1} + x_{n+1}}{2}
\]

    for n = 1, 2, 3, ..., 9. This difference equation is a discrete model for the equilibrium
    heat distribution along a straight piece of wire running from 0 to 10 with the
    temperature at 0 held at 0° and the temperature at 10 held at 20°.
15. How would the solution to Problem 14 change if we changed the boundary conditions
    to $x_0 = 10$ and $x_{10} = 50$?
16. An approximate solution of a two-dimensional version of the model in Problem 14 may
    be found using a spreadsheet. For example, you might set cells A1-A20 and H1-H20
    equal to 10 and cells B1-G1 and B20-G20 equal to 0. This would represent a flat
    rectangular piece of metal with the temperature along the vertical sides held fixed at
    10° and the temperature along the horizontal sides held fixed at 0°. Now set the value
    of every cell inside the rectangle to be equal to the average of the values of its four
    neighboring cells. For example, you would put the formula (A2+C2+B1+B3)/4 in cell
    B2 and then copy this cell to all the cells in the block from B2 to G19. Now have
    the spreadsheet repeatedly compute the values of the cells until they stabilize (that
    is, until they no longer change values when you recompute). If you format the cell
    values so that they are all integers, this should not take too long. What you have now
    is the equilibrium heat distribution for the metal plate. Now try different boundary
    conditions to obtain different equilibrium heat distributions.
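
The spreadsheet procedure described in Problem 16 can also be sketched in code. The following is a minimal Python sketch, not part of the original problem set. The function name `relax_plate`, the tolerance `tol`, and the sweep limit `max_sweeps` are illustrative assumptions. It updates every interior cell from the previous sweep's values, which may differ from the order in which a spreadsheet recalculates, although both should settle toward the same equilibrium.

```python
def relax_plate(rows=20, cols=8, side_temp=10.0, end_temp=0.0,
                tol=1e-9, max_sweeps=100_000):
    """Relax a rows-by-cols grid until each interior cell is (nearly)
    the average of its four neighbors."""
    # Start every cell at end_temp; this fixes rows 1 and 20 and also
    # serves as the initial guess for the interior.
    grid = [[end_temp] * cols for _ in range(rows)]
    # Columns A and H (first and last) are held at side_temp.
    for r in range(rows):
        grid[r][0] = side_temp
        grid[r][cols - 1] = side_temp

    for _ in range(max_sweeps):
        new = [row[:] for row in grid]
        biggest_change = 0.0
        for r in range(1, rows - 1):
            for c in range(1, cols - 1):
                new[r][c] = (grid[r][c - 1] + grid[r][c + 1]
                             + grid[r - 1][c] + grid[r + 1][c]) / 4
                biggest_change = max(biggest_change,
                                     abs(new[r][c] - grid[r][c]))
        grid = new
        if biggest_change < tol:
            break  # values have stabilized
    return grid


plate = relax_plate()
for row in plate:
    print(" ".join(f"{value:5.2f}" for value in row))
```

Changing `side_temp` and `end_temp` gives a quick way to try other boundary conditions of the same shape.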


﻿


Section 1.5


       Differential Equations         Nonlinear Difference Equations


In Section 1.4 we discussed the difference equation

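    $$x_{n+1} = a x_n, \tag{1.5.1}$$

n = 0, 1, 2, ..., as a model for either growth or decay and we saw that its solution is given
by
    $$x_n = a^n x_0,$$
n = 0, 1, 2, .... Now
    $$\lim_{n \to \infty} a^n = \begin{cases} 0, & 0 < a < 1, \\ 1, & a = 1, \\ \infty, & a > 1, \end{cases} \tag{1.5.2}$$
from which it follows that if $\{x_n\}$ is a solution of (1.5.1) with $x_0 > 0$, then

    $$\lim_{n \to \infty} x_n = x_0 \lim_{n \to \infty} a^n = \begin{cases} 0, & 0 < a < 1, \\ x_0, & a = 1, \\ \infty, & a > 1. \end{cases} \tag{1.5.3}$$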

These limiting values are consistent with our radioactive decay example since, in that case,
0 < a < 1 and we would expect the amount of a radioactive element to decline toward
0 over time. The case 0 < a < 1 also may make sense for a population model if the
population is declining and heading toward extinction. However, the unbounded growth
indefinitely into the future implied by the case a > 1 is very unlikely for a population
model: eventually ecological or even sociological problems come to the forefront, such as
when the population begins to overreach the resources available to it, and the rate of
growth of the population changes. Even for bacteria growing in a Petri dish, diminishing
food and space eventually cause a change in the rate of growth. Hence the equation

    $$x_{n+1} = a x_n, \tag{1.5.4}$$

for n = 0, 1, 2, ... and a > 1, called the uninhibited, or natural, growth model, although
often accurate as a model of population growth over short periods of time, is usually too
simplistic for predictions over long time spans.

The inhibited growth model
Suppose we wish to model the growth of a certain population which, without ecological
constraints, would grow at a rate of $100\beta\%$ per unit of time. That is, if $x_n$ represents the

Copyright © by Dan Sloughter 2000

size of the population after n units of time and there are no constraints on the size of the
population, then
    $$x_{n+1} - x_n = \beta x_n \tag{1.5.5}$$
for n = 0, 1, 2,.... However, suppose that, because of the limitation of resources, the
population will begin to decline if it ever has more than M individuals. We call M the
carrying capacity of the available resources, the maximum population which is sustainable
over time. Then it would be reasonable to modify our model by forcing the amount of
increase over a unit of time to decrease as the size of the population approaches M and
to become negative if the size of the population ever exceeds M. One way to accomplish
this is to multiply the term $\beta x_n$ in (1.5.5) by

    $$\frac{M - x_n}{M},$$

a ratio which is close to 1 when $x_n$ is small, close to 0 when $x_n$ is close to $M$, and negative
when $x_n$ exceeds $M$. This leads us to the difference equation

    $$x_{n+1} - x_n = \beta x_n \left( \frac{M - x_n}{M} \right),$$

n = 0, 1, 2, ..., or, equivalently,

    $$x_{n+1} = x_n + \frac{\beta}{M} x_n (M - x_n), \tag{1.5.6}$$


n = 0, 1, 2, ..., which we call the inhibited growth model, also known as the discrete logistic
equation. This is an example of a nonlinear difference equation because if we multiply
out the right-hand side of the equation we have a quadratic term, namely, $-\frac{\beta}{M} x_n^2$. Such
equations are, in general, far more difficult to solve than the linear difference equations
we considered in Section 1.4; in fact, many nonlinear difference equations are not solvable
in terms of the elementary functions of calculus. Hence we will not consider any methods
for solving such equations, relying instead on computing specific solutions by iterating the
equation using a calculator or, preferably, a computer.
Example Suppose a population of owls, currently numbering 100, has a natural growth
rate of 4%, but, because of the limited resources of their natural habitat, can sustain a
population of no more than 500. If we let $x_n$ represent the size of the population $n$ years
from now, then, using the inhibited growth model, we should have

    $$x_{n+1} = x_n + \frac{0.04}{500} x_n (500 - x_n) = x_n + 0.00008 x_n (500 - x_n)$$

for n = 0, 1, 2, .... Using this equation we are able to compute, for example, the predicted
size of the population for the next 10 years:

Year            0     1     2     3     4     5     6     7     8     9    10
Population  100.0 103.2 106.5 109.8 113.3 116.8 120.3 124.0 127.7 131.5 135.4
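
Iterating the inhibited growth model is straightforward to do by computer. The following is a minimal Python sketch of that iteration for the owl example. The function name `inhibited_growth` and its parameter names are illustrative assumptions, not notation from the text, and the printed values should agree with the table to within rounding.

```python
def inhibited_growth(x0, beta, capacity, steps):
    """Iterate x_{n+1} = x_n + (beta / capacity) * x_n * (capacity - x_n)
    and return the list [x_0, x_1, ..., x_steps]."""
    xs = [x0]
    for _ in range(steps):
        x = xs[-1]
        xs.append(x + (beta / capacity) * x * (capacity - x))
    return xs


# Owl example: x_0 = 100, beta = 0.04, M = 500, ten years.
owls = inhibited_growth(x0=100.0, beta=0.04, capacity=500.0, steps=10)
for year, population in enumerate(owls):
    print(f"{year:2d} {population:6.1f}")
```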


﻿


                Figure 1.5.1 Inhibited population growth with $\beta = 0.04$


Here, and in subsequent tables, we have rounded our results to the first decimal place. It is
interesting to compare these results to the corresponding results for the uninhibited growth
model. If we let $y_n$ be the predicted population $n$ years from now using the uninhibited
growth model, then we would have

    $$y_{n+1} = y_n + 0.04 y_n = 1.04 y_n,$$

n = 0, 1, 2, ..., which has the exact solution

    $$y_n = 100(1.04)^n$$

for n = 0, 1, 2, .... From this model we obtain the following predicted population sizes:

Year           0      1     2      3      4     5      6      7     8      9     10
Population 100.0 104.0 108.2 112.5 117.0 121.7 126.5 131.6 136.9 142.3 148.0

As we would expect, the population is growing more slowly under the inhibited population
growth model than under the uninhibited model. Moreover, this difference will become
more pronounced over time. For example, after 150 years we would have $x_{150} = 495.4$ and
$y_{150} = 35{,}892$, illustrating how the inhibited growth model is constrained by the carrying
capacity of 500 while the uninhibited growth model will have unbounded growth. Figures
1.5.1 and 1.5.2 provide a graphical comparison of the two models for n = 0, 1, 2, ..., 150.
Note that it appears that
    $$\lim_{n \to \infty} x_n = 500,$$
while
    $$\lim_{n \to \infty} y_n = \infty.$$
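
The long-run contrast between the two models can be checked numerically. The sketch below is a minimal Python illustration under the owl example's values ($x_0 = y_0 = 100$, growth rate 0.04, carrying capacity 500). The function names `inhibited` and `uninhibited` are assumptions made for this example, and the output is numerical evidence for the limits rather than a proof.

```python
def inhibited(x0, beta, capacity, steps):
    """Return x_steps for x_{n+1} = x_n + (beta/capacity) x_n (capacity - x_n)."""
    x = x0
    for _ in range(steps):
        x = x + (beta / capacity) * x * (capacity - x)
    return x


def uninhibited(y0, beta, steps):
    """Return y_steps using the exact solution y_n = y_0 (1 + beta)^n."""
    return y0 * (1 + beta) ** steps


for n in (10, 50, 100, 150):
    x_n = inhibited(100.0, 0.04, 500.0, n)
    y_n = uninhibited(100.0, 0.04, n)
    print(f"n = {n:3d}   inhibited = {x_n:7.1f}   uninhibited = {y_n:10.1f}")
```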

                Figure 1.5.2 Uninhibited population growth with $\beta = 0.04$

    With the inhibited growth model, if $0 < \beta < 1$ and $x_n < M$, then

    $$\frac{\beta x_n}{M} < 1,$$

so

    $$x_{n+1} = x_n + \frac{\beta x_n}{M} (M - x_n) < x_n + (M - x_n) = M \tag{1.5.7}$$

for n = 0, 1, 2, .... Thus if $0 < \beta < 1$, and we start with $x_0 < M$, then $x_n < M$ for all $n$.
Moreover, since
    $$\frac{\beta x_n}{M} > 0,$$
we have
    $$x_{n+1} = x_n + \frac{\beta x_n}{M} (M - x_n) > x_n$$
for all $n$. Hence the sequence $\{x_n\}$ is monotone increasing and bounded, and so must have
a limit. In Problem 8 you will be asked to verify that this limit is in fact $M$, as appeared
to be the case in the previous example.
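
The monotone and bounded behavior argued above can be spot-checked numerically. The following minimal Python sketch tests a few sample growth rates with $x_0 = 100$ and $M = 500$. The function name `is_monotone_and_bounded` and the chosen sample values are assumptions for illustration, and the comparisons are non-strict so that floating-point rounding near the limit does not cause a spurious failure. A numerical check of this kind supports the argument but does not replace it.

```python
def is_monotone_and_bounded(x0, beta, capacity, steps=200):
    """Check that the iterates never decrease and never exceed capacity."""
    x = x0
    for _ in range(steps):
        nxt = x + (beta / capacity) * x * (capacity - x)
        if nxt < x or nxt > capacity:
            return False
        x = nxt
    return True


for beta in (0.04, 0.25, 0.5, 0.9):
    print(beta, is_monotone_and_bounded(100.0, beta, 500.0))

# With beta = 1.5 the population overshoots the capacity, so the check fails.
print(1.5, is_monotone_and_bounded(100.0, 1.5, 500.0))
```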
    If $\beta > 1$, it may be the case that there are values of $n$ for which $x_n > M$, in which case

    $$\frac{\beta x_n}{M} (M - x_n) < 0$$

and, as a consequence, $x_{n+1} < x_n$.


Example Suppose $x_0 = 100$ and $M = 500$ as in the previous example, but now let
$\beta = 1.5$. That is,

    $$x_{n+1} = x_n + \frac{1.5}{500} x_n (M - x_n) = x_n + 0.003 x_n (500 - x_n)$$

for n = 0, 1, 2, .... This equation generates the following values:

Year            0     1     2     3     4     5     6     7     8     9    10
Population  100.0 220.0 404.8 520.4 488.5 505.3 497.2 501.4 499.3 500.3 499.8

                Figure 1.5.3 Inhibited population growth with $\beta = 1.5$


Note how the values increase rapidly (as we should expect with such a large value for $\beta$)
to above the carrying capacity of 500, but then oscillate about 500, with the oscillations
diminishing in size. In fact it may be shown that it is also true in this case that

    $$\lim_{n \to \infty} x_n = 500.$$

See Figure 1.5.3.

    It is possible to show that, for the inhibited growth model of (1.5.6),

    $$\lim_{n \to \infty} x_n = M$$

whenever $0 < \beta < 2$. However, there are other possible behaviors when $\beta > 2$.
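
The claim about convergence for growth rates between 0 and 2 can be illustrated, though not proved, by iterating the model for a few sample rates. The following is a minimal Python sketch with $x_0 = 100$ and $M = 500$. The function name `long_run_value`, the number of steps, and the sample rates are assumptions chosen for this example.

```python
def long_run_value(x0, beta, capacity, steps=2000):
    """Return x_steps for x_{n+1} = x_n + (beta/capacity) x_n (capacity - x_n)."""
    x = x0
    for _ in range(steps):
        x = x + (beta / capacity) * x * (capacity - x)
    return x


for beta in (0.5, 1.0, 1.5, 1.9):
    print(beta, round(long_run_value(100.0, beta, 500.0), 4))
```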


Example Suppose $x_0 = 100$ and $M = 500$, as in the previous examples, but now let
$\beta = 2.3$. That is,

    $$x_{n+1} = x_n + \frac{2.3}{500} x_n (M - x_n) = x_n + 0.0046 x_n (500 - x_n)$$

for n = 0, 1, 2, .... This equation generates the following values:

Year            0     1     2     3     4     5     6     7     8     9    10
Population  100.0 284.0 566.2 393.8 586.2 353.8 591.7 342.0 590.6 344.5 590.9

Year           11    12    13    14    15    16    17    18    19    20
Population  343.8 590.8 344.0 590.9 343.9 590.8 343.9 590.8 343.9 590.8



Notice that instead of approaching a single limiting value, the population is settling down
to an oscillation between 344 and 591. We say that the sequence $\{x_n\}$ is approaching a
limiting cycle of period 2, as shown in Figure 1.5.4.

                Figure 1.5.4 Inhibited population growth with $\beta = 2.3$

                Figure 1.5.5 Inhibited population growth with $\beta = 2.48$


    It is possible to obtain limiting cycles of longer periods by increasing $\beta$. For example,
Figure 1.5.5 shows the effect of letting $\beta = 2.48$. Note that $\{x_n\}$ appears to be approaching
a limiting cycle of period 4.
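
One way to estimate the period of a limiting cycle numerically is to discard a long initial stretch of the sequence and then look for the smallest shift under which the remaining values repeat. The following minimal Python sketch does this for $x_0 = 100$ and $M = 500$. The function name `limiting_period` and the parameters `burn_in`, `max_period`, and `tol` are assumptions made for this illustration, and the result is only as reliable as those choices.

```python
def limiting_period(x0, beta, capacity, burn_in=5000, max_period=64, tol=1e-6):
    """Estimate the period of the limiting cycle, or return None if no
    period up to max_period is detected."""
    x = x0
    for _ in range(burn_in):  # let transient behavior die out
        x = x + (beta / capacity) * x * (capacity - x)

    tail = [x]
    for _ in range(2 * max_period):
        x = x + (beta / capacity) * x * (capacity - x)
        tail.append(x)

    for period in range(1, max_period + 1):
        if all(abs(tail[i + period] - tail[i]) < tol for i in range(max_period)):
            return period
    return None


for beta in (1.5, 2.3, 2.48):
    print(beta, limiting_period(100.0, beta, 500.0))
```

A period of 1 here simply means the sequence appears to approach a single limiting value.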
    With appropriate choices for $\beta$ and $x_0$, it is in fact possible for the inhibited growth
model to exhibit limiting cycles of any given period. This is related to the fact that it is
possible for this model to behave chaotically. Intuitively, a sequence is chaotic if it displays
erratic behavior which, although in theory completely determined by a difference equation
such as (1.5.6), is in practice unpredictable because small changes in the initial value
$x_0$ yield strikingly different sequences. For example, Figures 1.5.6 and 1.5.7 illustrate
the differing behavior of the inhibited growth model with $\beta = 2.95$, first for an initial
population of 100 and then for an initial population of 101.
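
The sensitivity to the initial value can also be seen by computing both sequences side by side. The following minimal Python sketch uses a growth rate of 2.95 and $M = 500$ with initial populations 100 and 101. The function name `orbit_values` and the choice of 50 steps are assumptions for illustration.

```python
def orbit_values(x0, beta, capacity, steps):
    """Return [x_0, ..., x_steps] for the inhibited growth model."""
    xs = [x0]
    for _ in range(steps):
        x = xs[-1]
        xs.append(x + (beta / capacity) * x * (capacity - x))
    return xs


first = orbit_values(100.0, 2.95, 500.0, 50)
second = orbit_values(101.0, 2.95, 500.0, 50)
gaps = [abs(p - q) for p, q in zip(first, second)]

for n in range(0, 51, 5):
    print(f"n = {n:2d}   {first[n]:7.1f}   {second[n]:7.1f}   gap = {gaps[n]:6.1f}")
print("largest gap:", round(max(gaps), 1))
```

The two sequences start only one individual apart, yet the gap between them grows to a size comparable to the populations themselves within a few dozen steps.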


﻿


          Figure 1.5.6 Inhibited population growth with $\beta = 2.95$ and $x_0 = 100$

          Figure 1.5.7 Inhibited population growth with $\beta = 2.95$ and $x_0 = 101$


Problems

1. A population of weasels has a natural growth rate of 3% per year. Let $w_n$ be the
    number of weasels $n$ years from now and suppose there are currently 300 weasels.

    (a) Suppose the carrying capacity of the weasels' habitat is 1000. Using an inhibited
       growth model, write a difference equation which describes how the population
       changes from year to year.
    (b) Using the difference equation from part (a), compute $w_n$ for n = 1, 2, ..., 150.
    (c) How many years will it take for the population to double? To triple?
    (d) Plot $w_n$ versus $n$ for n = 0, 1, 2, ..., 150. From the plot, guess $\lim_{n \to \infty} w_n$.

    (e) Compare your answers with those to Problem 3 in Section 1.4.
 2. Suppose a population of northern pike in a lake in Montana has a natural growth rate
    of 4.5% per year, but the lake can support no more than 10,000 pike. Let $p_n$ be the
    number of pike $n$ years from now and suppose $p_0 = 1000$.

    (a) Use the inhibited growth model to write a difference equation which describes how
        the population changes from year to year.
    (b) Using the difference equation from part (a), compute $p_n$ for n = 1, 2, 3, ..., 50.
    (c) How many years will it take for the population to double? To triple?
    (d) Plot $p_n$ versus $n$ for n = 0, 1, 2, ..., 200. From the plot, guess $\lim_{n \to \infty} p_n$.

    (e) How many years will it take for the population to reach 9500?
3. Do Problem 2 assuming an uninhibited growth model and no restrictions on the number
   of pike that the lake can support.
4. Suppose $r_n$ represents the number of snowshoe rabbits in a certain National Forest in
   Alaska after $n$ years with an initial value of $r_0 = 5000$. Moreover, suppose the forest
   can support no more than 10,000 rabbits and $\{r_n\}$ satisfies the inhibited growth model

       $$r_{n+1} = r_n + \frac{\beta}{10{,}000} r_n (10{,}000 - r_n)$$

   for n = 0, 1, 2, .... For each of the following values for $\beta$, plot $r_n$ versus $n$ for
   n = 0, 1, 2, ..., 100 and comment on the behavior of the sequence, in particular noting any
   limiting values or limiting cycles.

   (a) $\beta = 0.5$                            (b) $\beta = 1.5$
   (c) $\beta = 2.4$                            (d) $\beta = 2.5$
   (e) $\beta = 2.56$                           (f) $\beta = 2.9$

5. Using an initial value of $x_0 = 0.5$, let $\{x_n\}$ be the sequence which satisfies the difference
   equation
       $$x_{n+1} = \mu x_n (1 - x_n),$$
   n = 0, 1, 2, .... Plot $x_n$ versus $n$ for the following values of $\mu$ and comment on the
   behavior of the sequence, in particular noting any limiting values or limiting cycles.
   (a) $\mu = 0.9$                              (b) $\mu = 1.0$
   (c) $\mu = 1.5$                              (d) $\mu = 2.0$
   (e) $\mu = 2.5$                              (f) $\mu = 3.0$
   (g) $\mu = 3.1$                              (h) $\mu = 3.5$
   (i) $\mu = 3.57$                             (j) $\mu = 1 + \sqrt{8}$
   (k) $\mu = 3.99$                             (l) $\mu = 4.0$
6. Repeat Problem 5 starting with an initial value of $x_0 = 0.6$.
7. If $f$ is any function defined for real numbers, then the difference equation

       $$x_{n+1} = f(x_n),$$

   n = 0, 1, 2, ..., is called a discrete dynamical system. For any given initial condition
   $x_0$, the sequence $\{x_n\}$ which satisfies this equation is called an orbit of $f$. Note that
   an orbit of $f$ is simply the sequence of points

       $$x_0, f(x_0), f(f(x_0)), f(f(f(x_0))), \ldots.$$

   For example, the difference equation in Problem 5 is an example of a discrete dynamical
   system with $f(x) = \mu x(1 - x)$. For each of the following, compute 50 terms of the
   given orbit and discuss its behavior.


   (a) $x_0 = 10$, $f(x) = 2x$                  (b) $x_0 = 100$, $f(x) = 0.8x$
   (c) $x_0 = 2$, $f(x) = \cos(x)$              (d) $x_0 = 2$, $f(x) = \sin(x)$
                     2cc2
   (f) co = 1, f (X) =  2x5
                   3x2- 5
   (g) $x_0 = 0$, $f(x) = x^2 + 1.0$            (h) $x_0 = 0$, $f(x) = x^2 - 0.5$
   (i) $x_0 = 0$, $f(x) = x^2 - 0.8$            (j) $x_0 = 0$, $f(x) = x^2 - 1.0$
   (k) $x_0 = 0$, $f(x) = x^2 - 1.9$            (l) $x_0 = 0$, $f(x) = x^2 - 2.0$


8. Assuming that the sequence $\{x_n\}$ satisfying the inhibited growth model equation

       $$x_{n+1} = x_n + \frac{\beta}{M} x_n (M - x_n)$$

   has a limit, show that $\lim_{n \to \infty} x_n = M$.
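
The definition of an orbit in Problem 7 translates directly into a short loop. The following minimal Python sketch shows one way to compute the terms of an orbit. The function name `orbit` and the sample function $f(x) = x/2 + 1$ are assumptions chosen for illustration and are not among the functions listed in the problem.

```python
def orbit(f, x0, terms):
    """Return the first `terms` points x_0, f(x_0), f(f(x_0)), ... of an orbit."""
    points = [x0]
    while len(points) < terms:
        points.append(f(points[-1]))
    return points


# Illustrative function only: f(x) = x/2 + 1, whose orbits approach 2.
sample = orbit(lambda x: x / 2 + 1, x0=10.0, terms=50)
print(sample[:5])
print(sample[-1])
```

Passing a different function and initial condition to `orbit` gives the corresponding sequence for any of the parts of the problem.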


﻿


Section 2.1


       Differential Equations         Functions And Their Graphs


Since functions are the basic building blocks out of which mathematicians construct models
of the physical world, it is essential that any student of mathematics have a firm grasp of
the concept. In particular, one must be careful to distinguish between a given function and
a notational or graphical representation for it. A function is a type of relationship, a mental
concept that cannot be seen or touched. Although pictures and symbolic representations
of a function are extremely important in understanding its behavior, the student must
always keep in mind the distinction between the function itself and its representations.
    Modern methods for giving a formal definition of a function, developed in the latter
part of the 19th century, are based on set-theoretic ideas. We will not go into the details
necessary to make such a precise definition, but rather aim at an intuitive understanding
of the basic concept. For us, a function is a special type of relationship between two
quantities. We often think of this relationship as one of dependence. That is, if the
value of one quantity, say y, is determined by the value of another quantity, say x, then we
say that y is a function of x. For example, if x represents the height from which a certain
rock is dropped and y represents the velocity with which the rock strikes the ground, then
the value of y will depend on the value of x and we say that velocity y is a function of height
x. Note here that if y is the terminal velocity of the object, then there are many different
values of x which yield the same value of y, namely, any value of x which gives the object
sufficient time to reach its terminal velocity before striking the ground. On the other hand,
for a given value of x, there is only one related value of y. It is this latter property that
makes the relationship between height and impact velocity a function. For any quantities
represented by y and x, in order to say that y is a function of x we require that every
value of x be related to exactly one value of y. Such a relationship often arises through
some physical dependency, a cause creating a deterministic effect, but the definition does
not require such a link between the quantities in question. A number of examples should
help clarify this concept.
Example     Sequences are examples of functions. That is, if $\{x_n\}$ is a sequence with
n = 1, 2, 3, ..., then every value of $n$ determines exactly one value $x_n$. For example, the area
of a regular polygon inscribed in a unit circle is a function of the number of sides. Also, a
difference equation, such as
    $$x_{n+1} = 1.02 x_n,$$
n = 0, 1, 2, ..., makes $x_n$ a function of $n$. For example, the size of a certain population of
owls will be a function of the number of years from some starting date.
Example The area of a circle is a function of the radius of the circle.
Example The distance of the earth from the sun is a function of the time of year.
Example The temperature at a certain fixed point in space is a function of time.
    In mathematical terminology, if y is a function of x, then we call x the independent
variable and y the dependent variable. Also, the domain of this function is the set of
permissible values for x and the range is the set of all values of y which correspond to some
value of x.
Example Recall that the $n$th term of the sequence which gives the area of a regular
$n$-sided polygon inscribed in a unit circle is

    $$A_n = \frac{n}{2} \sin\left( \frac{2\pi}{n} \right),$$

n = 3, 4, 5, .... The domain of this function is the set of integers {3, 4, 5, ...}. The range
can be specified only by saying that it is the set of numbers

    $$\left\{ \frac{n}{2} \sin\left( \frac{2\pi}{n} \right) \;\middle|\; n = 3, 4, 5, \ldots \right\}.$$


    Before proceeding further, we should recall the notation for intervals of real numbers.
Given any real numbers a and b, we have

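    $$(a, b) = \{x \mid a < x < b\}, \tag{2.1.1}$$
    $$(a, b] = \{x \mid a < x \le b\}, \tag{2.1.2}$$
    $$[a, b) = \{x \mid a \le x < b\}, \tag{2.1.3}$$
    $$[a, b] = \{x \mid a \le x \le b\}, \tag{2.1.4}$$
    $$(a, \infty) = \{x \mid x > a\}, \tag{2.1.5}$$
    $$[a, \infty) = \{x \mid x \ge a\}, \tag{2.1.6}$$
    $$(-\infty, b) = \{x \mid x < b\}, \tag{2.1.7}$$
and
    $$(-\infty, b] = \{x \mid x \le b\}. \tag{2.1.8}$$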
Moreover, we call intervals of the form (2.1.1), (2.1.5), and (2.1.7) open intervals and
intervals of the form (2.1.4), (2.1.6), and (2.1.8) closed intervals.
Example If we let d specify the distance from the sun to the earth and t specify the
time of year, then the function that relates d and t has domain

    $$\{t \mid 0 \le t \le 8760\} = [0, 8760],$$

where $t$ is specified in hours, and range

    $$\{d \mid 91.4 \le d \le 94.6\} = [91.4, 94.6],$$


where d is specified in millions of miles.


    Before learning much about a specific function, a mathematician must represent the
function in some concrete form. This can be done in many ways. For example, we might
construct a table of values for the function. Such a table might have two rows, one for
values of the independent variable and one for the corresponding values of the dependent
variable. For example, if T is the temperature, in degrees Fahrenheit, at the Kalispell
airport weather station at time t, measured in hours past midnight, on August 3, 1999,
then our table might look like the following:

Time (t)             0      1     2      3     4      5      6     7      8      9
Temperature (T)     68     66    64     62     61    59     60     64    68     70
Time (t)            10     11     12    13     14     15    16     17    18     19
Temperature (T)     74     76    78     80     84    84     82     81    79     77
Time (t)            20     21    22     23     24
Temperature (T)     74     70    69     67     66

In other words, this table provides a complete listing of the values of the dependent variable
T which correspond to each value of the independent variable t.
    Of course, if the domain of the function contains a large number of points, it might not
be practical to represent the function using a table. Indeed, most of the functions which
we will consider in this course have an infinite number of points in their domain, rendering
complete representations using tables impossible. Moreover, even with a limited amount
of data, it is hard to understand much about the underlying function by looking at a table.
One alternative to a tabular representation of a function is a graphical representation. If y
is a function of x, the graph of this function is the set of all points in the Cartesian plane
with coordinates (x, y). If the domain of the function has only a finite number of points,
as in the preceding example, then its graph is just a set of points in the plane, as we see
in Figure 2.1.1. However, if we were able to plot this function for all values of t between
0 and 24, then its graph would become a curve passing through the points given by the
table. With the given data, we could approximate this curve by plotting the given points
and then connecting successive points by straight lines, as in Figure 2.1.2. In either form,
the graph gives a good pictorial representation of the function. From this picture, we can
easily identify such things as the high and low temperatures for the day, as well as the
time at which they occurred, or the time of day when the temperature was changing most
rapidly.
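
The same questions can be answered directly from the table. The following minimal Python sketch stores the Kalispell readings as a mapping from time $t$ to temperature $T$ and picks out the high, the low, and the hours with the largest hour-to-hour change. The variable names are assumptions made for this illustration, and "changing most rapidly" is interpreted here as the largest change between consecutive hourly readings.

```python
# Hourly temperatures (degrees Fahrenheit) for t = 0, 1, ..., 24, from the table.
temps = [68, 66, 64, 62, 61, 59, 60, 64, 68, 70, 74, 76, 78,
         80, 84, 84, 82, 81, 79, 77, 74, 70, 69, 67, 66]
T = dict(enumerate(temps))  # T[t] is the temperature at hour t

high = max(T.values())
low = min(T.values())
high_times = [t for t, temp in T.items() if temp == high]
low_times = [t for t, temp in T.items() if temp == low]

# Change over each hour, from t to t + 1.
changes = {t: T[t + 1] - T[t] for t in range(24)}
largest = max(abs(change) for change in changes.values())
fastest_hours = [t for t, change in changes.items() if abs(change) == largest]

print("high:", high, "at t =", high_times)
print("low:", low, "at t =", low_times)
print("largest hourly change:", largest, "starting at t =", fastest_hours)
```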
    It would be hard to overestimate the importance of graphs in studying functions; we will
in fact spend much time in this course considering graphs. However, the most concise, and
at the same time most complete, representation for a function is a formula which expresses
the values of the dependent variable in terms of values of the independent variable. For a
given function it may not be possible to find such a formula. For example, the function
which gives the temperature at the Kalispell airport for any given time during the day of
August 3, 1999, is not expressible by a formula; the only way we can compute values for
this function is to record the temperatures as they occur. On the other hand, for a circle
of radius $r$ and area $A$, the formula $A = \pi r^2$ gives us an explicit means for computing
values of the dependent variable $A$ for any given value of the independent variable $r$. The
existence of a formula for a function enables us to perform mathematical computations
which, at best, could only be approximated otherwise. At the same time, it is important
to remember that a function is an abstract object; it is not itself a formula or a number or
a graph, but a relationship which exists between quantities specified by numbers. We need
to keep this in mind, even as we proceed to work more and more with functions through
their representations using formulas and graphs.

                   Figure 2.1.1 Plot of temperature data for Kalispell

  Figure 2.1.2 Plot of temperature data for Kalispell with lines connecting data points

Example     If $V$ represents the volume and $r$ the radius of a sphere, then $V$ is a function
of $r$ and the formula
    $$V = \frac{4}{3} \pi r^3$$
expresses this relationship. Note that the domain of this function is the open interval
$(0, \infty)$, even though negative values of $r$ can be substituted into the formula without any
problems. This emphasizes that the function is determined by the underlying relationship
between $V$ and $r$. Here we also have the range equal to $(0, \infty)$.


Example Suppose the quantity $y$ is related to the quantity $x$ by the formula

    $$y = \frac{1}{\sqrt{1 - x^2}}.$$

Since, by convention, the square root notation refers to the positive square root of a given
number, this relationship makes $y$ a function of $x$. If we are given no further information
about this function, then we should ascribe to it the largest possible domain and range.
In this case, the domain is the interval $(-1, 1)$ (that is, values of $x$ for which $1 - x^2$ is
positive) and the range is $[1, \infty)$ (that is, the possible results from dividing 1 by numbers
in the interval $(0, 1]$).
    At this point, we have used notation for the dependent and independent variables of
a function, but not for the function itself. As with variables, it is common to use letters
to designate functions. For example, we frequently use f to denote a function, in which
case f stands for the function itself, a relationship, while expressions like f (x), f(2), and
f(s) denote particular values of the function. That is, f(x), f(2), and f(s) represent values
of the dependent variable which correspond to the values x, 2, and s, respectively, of the
independent variable.
Example The expression
    $$f(x) = \frac{1}{x^2}$$

tells us that $f$ represents a function which associates the value

    $$\frac{1}{x^2}$$

to a given value $x$ of the independent variable. Hence, for example,

    $$f(2) = \frac{1}{4},$$

    $$f(-1) = 1,$$

    $$f(s) = \frac{1}{s^2},$$

and
    $$f(z + 1) = \frac{1}{(z + 1)^2}.$$
Note that the domain of $f$ is $\{x \mid x \neq 0\}$. That is, $f$ is defined for every real number
except 0. The range of $f$ is $(0, \infty)$.
Example Suppose $S$ is the function which gives the temperature at the Kalispell airport
on August 3, 1999. If we measure time in hours since midnight, then we know, for
example, that $S(2) = 64$ and $S(19) = 77$. However, if we let $t$ represent the independent
variable for this function, namely, the number of hours since midnight, then we do not
have a general formula to express $S(t)$. For example, we cannot compute $S(7.5)$ or $S(3.2)$,
let alone even consider what $S(\pi)$ might be.
    It often happens that the output from one function is used as input for another function.
For example, suppose a pebble is dropped in a pond and the resulting circular wave has a
radius of 20t centimeters after t seconds. Then if r is the radius of the wave and A is the
area inside the wave, we have $r = 20t$ and $A = \pi r^2$. But the area inside the wave is also a
function of time, which may be expressed as

    $$A = \pi (20t)^2 = 400 \pi t^2.$$

That is, the area of the circle is a function of the radius, which in turn is a function of time.
The function that we arrive at, namely, A as a function of t, is called the composition of
the two original functions. In the notation which uses letters to denote functions, we have
the following definition.
Definition If $f$ and $g$ are two functions, then the composition of $f$ and $g$ is the function
$f \circ g$ whose value at $x$ is given by

    $$f \circ g(x) = f(g(x)). \tag{2.1.9}$$

Example     If $f(x) = \sqrt{x}$ and $g(x) = x^2 + 1$, then

    $$f \circ g(x) = f(g(x)) = f(x^2 + 1) = \sqrt{x^2 + 1}$$

and
    $$g \circ f(x) = g(f(x)) = g(\sqrt{x}) = x + 1.$$
Note that $f \circ g$ and $g \circ f$, as in this example, are not usually the same function.
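
Composition can be mirrored in code by a function that takes two functions and returns a third. The following is a minimal Python sketch, assuming $f(x) = \sqrt{x}$ and $g(x) = x^2 + 1$, which is the reading of the example consistent with the result $g(f(x)) = x + 1$. The helper name `compose` is an assumption made for this illustration.

```python
import math


def compose(f, g):
    """Return the composition of f and g, the function x -> f(g(x))."""
    return lambda x: f(g(x))


def f(x):
    return math.sqrt(x)


def g(x):
    return x ** 2 + 1


f_of_g = compose(f, g)  # x -> sqrt(x^2 + 1)
g_of_f = compose(g, f)  # x -> (sqrt(x))^2 + 1 = x + 1 for x >= 0

print(f_of_g(4.0))  # sqrt(17)
print(g_of_f(4.0))  # 5.0
```

Evaluating both compositions at the same point shows concretely that the order of composition matters.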

Classes of functions
The simplest type of functions are those which involve only multiplication and addition.
In particular, functions of the form

    $$p(x) = a_n x^n + a_{n-1} x^{n-1} + \cdots + a_1 x + a_0, \tag{2.1.10}$$

where $a_0, a_1, \ldots, a_n$ are constants and $n$ is a nonnegative integer, are called polynomials.
If $a_n \neq 0$, the degree of the polynomial is $n$. For example,

    $$q(x) = 3x^2 - 13x + 3,$$

    $$f(t) = 21t^{34} + 18t^2 - [\ldots],$$
and
    $$g(s) = \frac{1}{2} s^3 + s^5 + 12 - 4s$$

are polynomials, of degrees 2, 34, and 5, respectively, whereas

    $$h(x) = \frac{3}{x}$$

is not a polynomial. In a sense polynomials are the building blocks for a large family of
important functions in calculus. In this regard, one of the major goals of this text is to
show how polynomials may be used to approximate more complicated functions.
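
Because a polynomial involves only multiplication and addition, it can be evaluated with a short loop. The sketch below uses Horner's rule, a standard evaluation scheme that is not discussed in the text, and applies it to $q(x) = 3x^2 - 13x + 3$, assuming that is the intended form of the first example above. The function name `evaluate_polynomial` is an assumption made for this illustration.

```python
def evaluate_polynomial(coefficients, x):
    """Evaluate a_n x^n + ... + a_1 x + a_0 by Horner's rule.

    coefficients lists a_n, a_(n-1), ..., a_0 in that order.
    """
    result = 0
    for a in coefficients:
        # Each step uses one multiplication and one addition.
        result = result * x + a
    return result


# q(x) = 3x^2 - 13x + 3 evaluated at x = 2: 12 - 26 + 3 = -11
print(evaluate_polynomial([3, -13, 3], 2))
```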
    Functions which may be written in the form of a polynomial divided by a polynomial
are called rational functions. For example,

                                 \frac{3x^2 - 4x + 1}{x^4 + 1}

and
                                       12        1
                               g(s) =s-+s2
                                       s    s2-3s+1
are both rational functions, the latter because it may be rewritten as a polynomial divided
by a polynomial if all the terms are put over the common denominator s(s2 - 3s + 1).
    The function $f(x) = \sqrt{x}$ is neither a polynomial nor a rational function because x is
raised to a power which is not an integer. Functions which permit addition, multiplication,
division, and rational numbers for powers are called algebraic functions. Thus, for example,

                                  g(t) =t + 2t2 - 3

is neither a polynomial nor a rational function, but is an algebraic function. Similarly,

                                 h(s) =ys2 + 3s + 2

is an algebraic function. We should note that every polynomial is also a rational function
and every rational function is also an algebraic function.
    Functions which are not algebraic are called transcendental. The trigonometric func-
tions are examples of transcendental functions. We shall discuss them in detail in the next
section.

Graphs of functions
We are now in a position to say more about the graphs of functions. With the notation we
have now, the graph of a function f is the set of all points (x, f(x)) in the plane, where x
is in the domain of f. For example, you should recall from previous work that the graph
of $y = x^2$ is a parabola opening upward about the y-axis with its vertex at (0, 0). Also, you
should recall the shapes of the graphs of such functions as $y = x^2$, $y = x^3$, $y = x^4$, and
$y = x^5$. See Figures 2.1.3 and 2.1.4.

Figure 2.1.3 Graphs of y = x^2 and y

Figure 2.1.4 Graphs of y = x^3 and y

    Moreover, given a function f and a constant c, you may recall that the graph of
y = f(x) + c is the graph of f shifted |c| units vertically (upward if c > 0 and downward if
c < 0), the graph of y = f(x - c) is the graph of f shifted |c| units horizontally (to the right
if c > 0 and to the left if c < 0), and the graph of y = -f(x) is the graph of f reflected
about the x-axis. Hence, for example, the graph of


                                      y = x^2 - 3

is a parabola, opening upward about the y-axis, with its vertex at (0, -3); the graph of

                                   y = (x + 2)^2 - 3

is a parabola, opening upward about the line x = -2, with vertex at (-2, -3); and the
graph of

                                   y = -(x + 2)^2 + 3

is a parabola, opening downward about the line x = -2, with vertex at (-2, 3). See Figure
2.1.5.
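
The shifting and reflecting rules can be checked numerically. The following Python sketch, in which the helper shift_and_reflect and its parameter names h and k are assumptions made for illustration, builds the three parabolas from the basic function x^2 and evaluates each at its vertex.

```python
def shift_and_reflect(f, h=0.0, k=0.0, reflect=False):
    """Return the function x -> s * f(x - h) + k, where s is -1 if reflect else 1.

    h is the horizontal shift (right if h > 0), k the vertical shift (up if k > 0),
    and reflect=True reflects the graph of f about the x-axis before shifting up.
    """
    sign = -1.0 if reflect else 1.0
    return lambda x: sign * f(x - h) + k

def square(x):
    return x * x

p1 = shift_and_reflect(square, k=-3)                      # y = x^2 - 3
p2 = shift_and_reflect(square, h=-2, k=-3)                # y = (x + 2)^2 - 3
p3 = shift_and_reflect(square, h=-2, k=3, reflect=True)   # y = -(x + 2)^2 + 3

print(p1(0), p2(-2), p3(-2))   # values at the vertices: -3, -3, 3
```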


       Figure 2.1.5 Graphs of y = x^2 - 3, y = (x + 2)^2 - 3, and y = -(x + 2)^2 + 3

    Drawing graphs of functions whose basic shapes are not already known to us can be
a difficult problem. If the domain of a function f is finite, then drawing its graph is
only a matter of plotting some points in the plane. However, most of the functions we
will encounter in this course will have domains containing an infinite number of points;
drawing the graphs of such functions requires much more than plotting a few points. The
problem is that no matter how many points we plot, we still do not know how the function
is behaving at the other points. For example, if we want to graph a function f on the
interval [0, 1], we might first plot the points

                   (0, f (0)), (0.1, f (0.1)), (0.2, f (0.2)), .. . , (1.0, f (1.0)).

Next, to guess at the behavior of the function between the plotted points, we might join
successive points by straight lines. Of course, this will only give us an approximation to the
true curve, the accuracy of which will depend on the actual behavior of the curve between
the plotted points, something about which we frequently have very little information. This
is similar to the problem we had with plotting the graph of a temperature function earlier.
However, here we can get help if we have a formula for f; for in that case we can try
plotting more points, say

               (0, f (0)), (0.05, f (0.05)), (0.10, f (0.10)),... , (1.00, f (1.00)),

or
               (0, f (0)), (0.01, f(0.01)), (0.02, f(0.02)),... , (1.00, f(1.00)).
If the graph of f is a reasonably smooth curve, we will be able to approximate it as well as
we like by plotting a sufficient number of points. This raises two questions: How do we know
that we have plotted a sufficient number of points? And, given that a sufficient number
of points will most likely be a large number, how do we actually plot them all? Of course,
the latter question is answered by using a computer. In fact, this approach to graphing a
function is unreasonable without access to a computer, or at least a calculator. Computers
also provide help in answering the first question. We start by plotting a reasonable number
of points, say 100 or so. If we have reason to doubt the accuracy of the resulting graph,
perhaps because the curve is not as smooth as we expected it to be, we can double the
number of points and plot it again. Because any computer, and in fact many calculators,
do this type of work rapidly, it is reasonable to plot the same function several times until
we are comfortable with the picture. In Section 3.9 we will learn how to use some of the
techniques of calculus to better understand the geometry of the graph of a function. This
will help us identify whether or not the output from a computer is an accurate depiction
of the graph.

Example Figure 2.1.6 compares the results of plotting the points

                    (0, f(0)), (0.1, f(0.1)), (0.2, f(0.2)), ..., (1.0, f(1.0))

with plotting the points

                 (0, f(0)), (0.01, f(0.01)), (0.02, f(0.02)), ..., (1.00, f(1.00))

(joining successive points with straight lines) for the function f(x) = sin(30x). Clearly,
the second plot is a dramatic improvement over the first.

       Figure 2.1.6 Plot of y = sin(30x), using first 11 points and then 101 points
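
One way to quantify the improvement, without drawing anything, is to measure how far the straight-line segments stray from the true curve. The Python sketch below does this for f(x) = sin(30x) with 11 and with 101 equally spaced points. The function names and the choice of 20 test locations per segment are assumptions made for this illustration, and the reported gap is only an estimate of the true maximum error.

```python
import math

def sample(f, a, b, n):
    """Return the n + 1 equally spaced points (x, f(x)) on the interval [a, b]."""
    xs = [a + (b - a) * i / n for i in range(n + 1)]
    return [(x, f(x)) for x in xs]

def max_interp_error(f, points, checks_per_segment=20):
    """Estimate the largest gap between f and the segments joining successive points."""
    worst = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        for j in range(checks_per_segment + 1):
            s = j / checks_per_segment
            x = x0 + s * (x1 - x0)
            y_line = y0 + s * (y1 - y0)     # height of the straight segment at x
            worst = max(worst, abs(f(x) - y_line))
    return worst

def f(x):
    return math.sin(30 * x)

coarse = sample(f, 0.0, 1.0, 10)     # 11 points
fine = sample(f, 0.0, 1.0, 100)      # 101 points

print(max_interp_error(f, coarse))   # large: the segments miss most of the oscillation
print(max_interp_error(f, fine))     # small: the segments follow the curve closely
```

Doubling the number of points and recomputing this gap is one hedged way to decide whether a plot can be trusted.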
    When using a computer software package or a calculator to graph a function, there
are a couple of issues you should keep in mind. First, as we have just noted, plotting
a sufficient number of points is crucial to obtaining a good approximation of the graph.
Some programs will ask you to specify the number of points to plot, others will plot a
predetermined number of points, and still others will determine the number of points to
plot based on an estimate of the number of points necessary to provide an accurate picture
for the particular function. In the latter two cases it should still be possible to override
the program's decision and specify your own choice for the number of points to plot. If
plotting more points significantly changes the look of the graph, then you should be wary
of the original plot and consider plotting even more points. Second, the computer will plot
the function in a rectangle, called a window. The horizontal scale for this window will be
the interval over which you want to graph the function. The vertical scale may be chosen
by you or by the computer program. If possible, it is usually best first to let the program
choose the vertical scale for the window and then to adjust it as necessary to provide a
good picture of the graph. If the vertical scale is too small, you may miss part of the
graph; if the vertical scale is too large, the interesting features of the graph may be too
small to be visible. For example, Figure 2.1.7 shows the graph of y = sin(x) on the interval
[-4, 4], first with the vertical window scale being the interval [-0.5, 0.5] and next with
the vertical scale changed to [-20, 20]. Certainly, a vertical window scale on the order of
[-1.5, 1.5], as shown in Figure 2.1.8, is a more appropriate choice for this graph.


﻿


Figure 2.1.7 Graph of y = sin(x), first with vertical window [-0.5, 0.5] and then [-20, 20]

Figure 2.1.8 Graph of y = sin(x) with vertical window [-1.5, 1.5]


Problems


1. In each of the following, x and y denote certain variable quantities. Discuss whether or
    not y is a function of x. If y is a function of x, can you write a formula that describes
    the relationship? Also, find the domain and range of each function.


    (a) x = speed of a train; y = distance the train travels in two hours
    (b) x = height above sea level; y = atmospheric pressure
    (c) x = time of the year; y = distance from the earth to the moon
    (d) x = temperature at the Great Falls airport; y = time of the day
    (e) x = length of the side of a square; y = area of the square
    (f) x = area of a circle; y = circumference of the circle
    (g) x = weight of a letter; y = first class postage for the letter
    (h) x = OPEC price for a barrel of oil; y = Dow Jones Industrial Average


﻿


2. A projectile was shot vertically into the air. The height h of the projectile was measured
   at 20 different times t. The following table gives the results, where t is in seconds and
   h is in meters.

     Time (t)     0.00  0.25  0.50  0.75   1.00  1.25  1.50  1.75  2.00  2.25   2.50
     Height (h) 0.00  11.6  22.1  31.2  39.2  46.0  51.5  55.7  58.8  60.6  61.3
     Time (t)     2.75  3.00  3.25  3.50   3.75  4.00  4.25  4.50  4.75  5.00
     Height (h) 60.6  58.8  55.7  51.5  46.0  39.2  31.2  22.1  11.6  0.00

   (a) Graph this data.
   (b) Graph this data with lines connecting successive points. Do you think this is a
       reasonable approximation to the graph of h as a function of t?
3. Identify the domain of each of the following functions.
   (a) f(x)=x2 - 6x                           (b) g(x)     c2-9
                                                               3
   (c) f (t) =ft2 + t-6                       (d) h(c) =
                           + t - 6            (d) h    x   2 +6x+ 8
                41                                          3
   (e) f (s) = s249                           (f) yJ(t) = 33t
              s-9                                        /3-t

   (g) z(s)         1(h) f (x) =
                s2+6s+1                           f      x(x2 + 1)
4. For each of the following pairs of functions, find $f \circ g(x)$, $g \circ f(x)$, $f \circ g(3)$, and $g \circ f(3)$,
   when they are defined.
   (a) f(x) = 4x + 12, g(x) = 5x - 2
   (b) f (x) =x2 - 12,g(X) =   cc
                                1
   (c) f(x) = 6x - x2, g(cc)=

5. (a) If the graph of f is a straight line with slope 3 and the graph of g is a straight line
       with slope 4, show that the graph of $f \circ g$ is a straight line with slope 12.
   (b) If the graph of f is a straight line with slope m and the graph of g is a straight
       line with slope n, show that the graph of $f \circ g$ is a straight line with slope mn.
6. Graph each of the following functions on the given interval.
   (a) f (x) = x2 + 6x + 1 on [-10, 5]
   (b) f(cc)=cc x cc4+c3 +cc2+cc+1lon [-5,5]
   (c) $f(x) = x^5 + 8x^4 + x^3 + x^2 + x + 1$ on [-5, 5]
   (d) $f(x) = x^5 + 8x^4 + x^3 + x^2 + x + 1$ on [-10, 10]
                1
   (e) g(t) =     2on [-10, 101
                t
   (f) f (t) =    2on [-10, 101
              1+ t


﻿


    (g) g(t) =    2 on [-10, 10]
               xc2 _-1
    (h) g(x) =X +ion [-5, 5]
               X4 +1
 7. Another problem in graphing using a computer or a calculator arises when the function
    in question is not defined at some of the points in the interval of interest. For example,
    try graphing
                                                 1
                                         f(x)=-

    on the interval [-5, 5]. There are several problems which may arise. First, if the
    computer program tries to evaluate f at x = 0, you may get an error message. In
    this case you may have to change the number of points being plotted so that x = 0 is
    missed. Second, if the program evaluates f for values of x very close to 0, the output
    from the function will be very large. The result might be that the vertical scale of your
    graphing window is much too large. Hence, you may wish to change the scale of the
    vertical axis. Another problem that may occur arises because the graph of f has two
    pieces; that is, the part of the graph to the left of the y-axis is not connected to the
    part of the graph to the right of the y-axis. If your graphing program simply connects
    points as it moves from left to right, it will connect points on opposite sides of the
    y-axis which should not be connected. This may be hard to avoid with some software,
    but you should be aware of the problem and, consequently, interpret your results with
    care.
 8. In light of the remarks in Problem 7, try graphing the following functions.
                 1
    (a) h(s) =   -182 on [-4, 4]
                 cc3
    (b) g(x) =      2 on [-4,4]

 9. Recall that $\lfloor x \rfloor$ denotes the largest integer less than or equal to x and $\lceil x \rceil$ denotes the
    smallest integer greater than or equal to x. Let $f(x) = \lfloor x \rfloor$ and $g(x) = \lceil x \rceil$.
    (a) What is the domain of f? What is the range of f?
    (b) What is the domain of g? What is the range of g?
    (c) Graph f and g on the interval [-5, 5].
    (d) Graph h(x) = L2] on the interval [-2, 2].
10. We say a function f is periodic if there exists a constant T such that f(t + T) = f(t)
    for every value of t in the domain of f.

    (a) Is it possible for a polynomial to be periodic?
    (b) Are any of the functions in Problem 1 periodic?
    (c) Suppose x represents the number of days since January 1, 1950, and y represents
        the amount of rainfall in Spokane on day x. Do you think y is a periodic function
        of x? If not, might it in some way be close to periodic?


﻿


Section 2.2


       Differential Equations         Trigonometric Functions


Many processes in nature are cyclic. A pendulum oscillates back and forth, repeating its
motion over and over; a weight hanging at the end of a spring bobs up and down; the Earth
repeats its orbit about the Sun every 365 days; a population of arctic wolves has periods
of growth followed by periods of decrease, following the fluctuations in the population of
their prey; the monthly rainfall at an agricultural research station varies cyclically over
the years and over the decades. To model such natural behavior, a mathematician needs
functions which repeat their values over intervals of fixed length. These functions are the
periodic functions. Precisely, a function f is periodic if there is a fixed constant T such
that f(t + T) = f(t) for every value of t in the domain of f. The smallest such positive T
for which this property holds is called the period of f.


                                        c          b


                                          a
                             Figure 2.2.1 A right triangle


    The class of periodic functions that we will consider in this section are the trigonometric
functions. Although these functions were originally invented to work with problems of
measurement, their importance in modern mathematics stems more from their periodic
behavior. We will begin with a definition in terms of measuring the sides of a right
triangle. Consider a right triangle with legs of lengths a and b and hypotenuse of length c.
Moreover, suppose, as in Figure 2.2.1, the angle opposite the leg of length b has measure
$\theta$. Then we define the sine of $\theta$, which we write as $\sin(\theta)$, by

                                      \sin(\theta) = \frac{b}{c}               (2.2.1)

and the cosine of $\theta$, which we write as $\cos(\theta)$, by

                                      \cos(\theta) = \frac{a}{c}.              (2.2.2)


1


Copyright © by Dan Sloughter 2000


﻿


               Figure 2.2.2 A right triangle with a vertex on the unit circle


The properties of similar triangles, known even by the ancient Egyptians and Babylonians,
show that these ratios depend only on the value of $\theta$, not on the size of the particular right
triangle being measured. Hence if we know the value of $\theta$ and the length of just one side
of the triangle, and we have access to a table of values for the sine and cosine functions,
then it is possible to compute the lengths of the other two sides of the triangle. The
ancient Greek mathematicians exploited these facts in order to compute distances which
are inaccessible to direct measurement, such as the distance from the earth to the moon
and from the earth to the sun.
    Since the values of the sine and cosine functions do not depend on the size of any
particular right triangle, for the purpose of definitions we may restrict our attention to
right triangles with hypotenuses of length one. In particular, if we have a right triangle
with legs of lengths a and b and hypotenuse of length 1 (so that $a^2 + b^2 = 1$), then we may
draw it in the Cartesian plane with one leg running from (0, 0) to (a, 0) and the other from
(a, 0) to (a, b). If $\theta$ is the measure of the angle opposite the side of length b, then we have

                                        \sin(\theta) = b                           (2.2.3)

and
                                        \cos(\theta) = a.                          (2.2.4)

In that case, the vertex (a, b) lies on the unit circle $x^2 + y^2 = 1$. In particular, $(\cos(\theta), \sin(\theta))$
is a point on the unit circle centered at the origin. This also gives us a method for measuring
angles. We will say that the measure of the angle opposite the side of length b is $\theta$ radians
if the length of the arc of the unit circle from (1, 0) to (a, b) is $\theta$. See Figure 2.2.2.
    So far our definitions of sine and cosine include only angles that are between 0 and $\pi/2$
radians. However, the considerations of the previous paragraph show us how to generalize
our definitions. Let t be any real number and let C be the unit circle centered at (0, 0).
If $t \geq 0$, let (a, b) be the point reached by traversing C a distance of t units in the
counterclockwise direction starting from (1, 0). If t < 0, let (a, b) be the point reached by
traversing C a distance of |t| units in the clockwise direction starting from (1, 0). Note
that if $t > 2\pi$ or $t < -2\pi$, then we will have to travel around C one or more times. We
now define the sine and cosine of t by

                                        sin(t) = b                                 (2.2.5)

and
                                        cos(t) = a.                                (2.2.6)
In this way we have sine and cosine defined as functions on the entire real line. That is,
both sine and cosine now have domain $(-\infty, \infty)$.
    Our final definitions of the sine and cosine functions have several immediate
consequences. Most importantly, since the circumference of the unit circle is $2\pi$, both functions
are periodic with period $2\pi$. Hence

                                   \sin(t + 2\pi) = \sin(t)                        (2.2.7)

and
                                   \cos(t + 2\pi) = \cos(t)                        (2.2.8)
for any value of t. Also, since (cos(t), sin(t)) is a point on the unit circle, we have

                                   \sin^2(t) + \cos^2(t) = 1                       (2.2.9)

for all values of t. Recall that, in this notation,

                                    \sin^2(t) = (\sin(t))^2

and
                                   \cos^2(t) = (\cos(t))^2.
We will consider other interesting and useful identities involving sine and cosine in the
problems at the end of this section and later on as the need for them arises.
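
These consequences of the unit circle definition are easy to spot-check numerically. The Python sketch below uses the standard math module. The helper name unit_circle_point and the sample values of t are assumptions made for illustration, and the comparisons hold only up to floating-point rounding.

```python
import math

def unit_circle_point(t):
    """Point reached by travelling a signed arc length t from (1, 0) along the unit circle."""
    return (math.cos(t), math.sin(t))

for t in (-7.5, 0.0, 1.0, 4.0, 20.0):
    a, b = unit_circle_point(t)
    a2, b2 = unit_circle_point(t + 2 * math.pi)
    # The point lies on the unit circle: sin^2(t) + cos^2(t) = 1.
    on_circle = math.isclose(a * a + b * b, 1.0)
    # Going once more around the circle returns to the same point.
    periodic = (math.isclose(a, a2, abs_tol=1e-12)
                and math.isclose(b, b2, abs_tol=1e-12))
    print(t, on_circle, periodic)
```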
    Although numerical approximations of sin(t) and cos(t) are easily available from a
calculator for any value of t, it is useful to know some exact numerical values for these
functions. First of all, directly from the definition we have

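  \sin(0) = 0,   \sin(\pi/2) = 1,   \sin(\pi) = 0,   \sin(3\pi/2) = -1,   \sin(2\pi) = 0

and

  \cos(0) = 1,   \cos(\pi/2) = 0,   \cos(\pi) = -1,   \cos(3\pi/2) = 0,   \cos(2\pi) = 1.

Second, with a little more work, it can be shown that

       \sin(\pi/6) = \frac{1}{2},   \sin(\pi/4) = \frac{\sqrt{2}}{2},   \sin(\pi/3) = \frac{\sqrt{3}}{2}

and

       \cos(\pi/6) = \frac{\sqrt{3}}{2},   \cos(\pi/4) = \frac{\sqrt{2}}{2},   \cos(\pi/3) = \frac{1}{2}.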


﻿


                        Figure 2.2.3 Finding $\sin(5\pi/6)$ and $\cos(5\pi/6)$


    Moreover, combining these values with basic knowledge of the geometry of the unit
circle, it is possible to find exact values of sin(t) and cos(t) for

       t = 2\pi/3, 3\pi/4, 5\pi/6, 7\pi/6, 5\pi/4, 4\pi/3, 5\pi/3, 7\pi/4, 11\pi/6.

Example      Let (a, b) be the point on the unit circle corresponding to the angle $5\pi/6$. Since
(a, b) is a distance $\pi/6$ along the unit circle before (-1, 0), the point (-a, b) is a distance $\pi/6$
along the unit circle after (1, 0). Hence

                 -a = \cos(\pi/6) = \frac{\sqrt{3}}{2}

and

                 b = \sin(\pi/6) = \frac{1}{2}.

Thus

                 \cos(5\pi/6) = a = -\frac{\sqrt{3}}{2}

and

                 \sin(5\pi/6) = b = \frac{1}{2}.


This is all best seen using a picture such as Figure 2.2.3. Note that the triangle with vertices
at (0, 0), (a, b), and (a, 0) is congruent to the triangle with vertices at (0, 0), (-a, b), and
(-a, 0).
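
The reflection argument in this example can be checked with a few lines of Python. In the sketch below, the helper name reflect_across_y_axis is an assumption made for illustration. The point for the angle pi/6 is reflected across the vertical axis and compared, up to rounding, with the point for 5 pi/6.

```python
import math

def reflect_across_y_axis(point):
    """Send (x, y) to (-x, y)."""
    x, y = point
    return (-x, y)

# Point on the unit circle for the angle pi/6: (sqrt(3)/2, 1/2).
first_quadrant = (math.cos(math.pi / 6), math.sin(math.pi / 6))

# Its mirror image should be the point for the angle 5*pi/6.
a, b = reflect_across_y_axis(first_quadrant)

print(math.isclose(a, math.cos(5 * math.pi / 6)))   # compares the cosines
print(math.isclose(b, math.sin(5 * math.pi / 6)))   # compares the sines
print(math.isclose(a, -math.sqrt(3) / 2), math.isclose(b, 0.5))
```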

    Of course, because both sine and cosine have period $2\pi$, it is also easy to find exact
values for sin(t) and cos(t) if t differs from one of the above values by a multiple of $2\pi$.


﻿


Figure 2.2.4 Graphs of y = sin(t) and y = cos(t)


Graphs of sine and cosine
The graphs of y = sin(t) and y = cos(t) are shown in Figure 2.2.4. Since both functions
have period $2\pi$, the graphs will continue this behavior as t goes to $-\infty$ or $\infty$, completing
one oscillation over every interval of length $2\pi$.
    The only difference between the graph of y = a sin(t), where a > 0, and the graph of
y = sin(t) is that the former oscillates between -a and a instead of between -1 and 1. In
general, for any constant $a \neq 0$, the graph of y = a sin(t) oscillates between -|a| and |a|.
We call |a| the amplitude of the function y = a sin(t). Of course, if a < 0, then the graph
of y = a sin(t) is the graph of y = |a| sin(t) reflected about the t-axis.


Example The graph of y = 2 sin(t) is shown in Figure 2.2.5.

                           Figure 2.2.5 Graph of y = 2 sin(t)


    Now consider the graph of the function y = sin(bt). Since the sine function has period
$2\pi$, this function goes through one complete oscillation as t goes from 0 to $2\pi/b$. That is,
y = sin(bt) has period $2\pi/b$. Hence, if b > 0 the only difference between the graphs of
y = sin(t) and y = sin(bt) is the length of the period of oscillation. If b < 0, we may use
the fact that
                            \sin(bt) = \sin(-|b|t) = -\sin(|b|t),

which follows from Problem 10.

Example The graph of y = sin(2t) is shown in Figure 2.2.6.

                           Figure 2.2.6 Graph of y = sin(2t)

    Finally, consider the graph of y = sin(t - c). As mentioned in Section 2.1, the effect of
the c is to shift the graph y = sin(t) horizontally by |c| units, to the right if c > 0 and to
the left if c < 0. We call c the phase angle.

Example The graph of $y = \sin(t - \pi)$ is shown in Figure 2.2.7.

                     Figure 2.2.7 Graph of $y = \sin(t - \pi)$


   Summarizing the previous comments, the function y = a sin(b(t - c)) has amplitude
|a|, period $2\pi/|b|$, and phase angle c.

Example Consider the function $f(t) = -3\sin(2t - \pi)$. If we write f(t) in the form

                     f(t) = -3 \sin\left(2\left(t - \frac{\pi}{2}\right)\right),

then we see that f has amplitude 3, period $\pi$, and phase angle $\pi/2$. The graph of f is shown
in Figure 2.2.8.

                     Figure 2.2.8 Graph of $y = -3\sin(2t - \pi)$
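
The bookkeeping in this example can be sketched in Python. The function name sinusoid_parameters is an assumption made for illustration; it takes the constants of y = a sin(b(t - c)), with b nonzero, and returns the amplitude |a|, the period 2 pi/|b|, and the phase angle c.

```python
import math

def sinusoid_parameters(a, b, c):
    """Amplitude, period, and phase angle of y = a*sin(b*(t - c)), assuming b != 0."""
    if b == 0:
        raise ValueError("b must be nonzero for the function to oscillate")
    return abs(a), 2 * math.pi / abs(b), c

# f(t) = -3 sin(2t - pi), rewritten as -3 sin(2(t - pi/2)).
amplitude, period, phase = sinusoid_parameters(-3, 2, math.pi / 2)
print(amplitude, period, phase)   # 3, pi, and pi/2 (as floating-point values)

def f(t):
    return -3 * math.sin(2 * t - math.pi)

# The values of f repeat after one period, up to rounding.
print(math.isclose(f(0.3 + period), f(0.3), abs_tol=1e-12))
```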
    Similar remarks hold for the graph of the function y = a cos(b(t-c)), the only difference
being that, since cos(-t) =cos(t) for all t (see Problem 10), we have

                                   \cos(bt) = \cos(|b|t)

even when b < 0.

Related functions
The other four trigonometric functions are defined in terms of the sine and cosine functions.
The tangent function is defined by

                                    \tan(t) = \frac{\sin(t)}{\cos(t)}.            (2.2.10)

Note that tan(t) is the slope of the line from (0, 0) to (cos(t), sin(t)). The graph of
y = tan(t) has vertical asymptotes at every value of t for which cos(t) = 0, as can be seen in
Figure 2.2.9.

                           Figure 2.2.9 Graph of y = tan(t)

    The cotangent function is the reciprocal of the tangent function; namely,

                          \cot(t) = \frac{1}{\tan(t)} = \frac{\cos(t)}{\sin(t)}.   (2.2.11)

Finally, the secant and cosecant functions are the reciprocals of the cosine and sine
functions, respectively. Hence

                                    \sec(t) = \frac{1}{\cos(t)}                    (2.2.12)

and

                                    \csc(t) = \frac{1}{\sin(t)}.                   (2.2.13)

As with the tangent function, the graph of y = sec(t) has vertical asymptotes at all points
t where cos(t) = 0, as seen in Figure 2.2.10.
    Clearly both the secant and cosecant functions have period $2\pi$. However, the tangent
and cotangent functions both have period $\pi$. We will leave that fact, along with the graphs
of the cotangent and cosecant functions, to the problems at the end of this section.
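
The four related functions can be written directly in terms of sine and cosine. The Python sketch below does so and spot-checks the stated periods at one sample value of t. The sample value 0.7 is an arbitrary assumption, and a numerical check at one point is of course not a proof.

```python
import math

def tan(t):
    return math.sin(t) / math.cos(t)

def cot(t):
    return math.cos(t) / math.sin(t)

def sec(t):
    return 1.0 / math.cos(t)

def csc(t):
    return 1.0 / math.sin(t)

t = 0.7  # arbitrary sample point where sin(t) and cos(t) are both nonzero

print(math.isclose(tan(t + math.pi), tan(t)))        # tangent repeats after pi
print(math.isclose(cot(t + math.pi), cot(t)))        # so does cotangent
print(math.isclose(sec(t + math.pi), -sec(t)))       # secant only changes sign after pi
print(math.isclose(sec(t + 2 * math.pi), sec(t)))    # and repeats after 2*pi
```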


                           Figure 2.2.10 Graph of y =sec(t)


Periodic motion
As mentioned earlier, many natural phenomena change in a periodic fashion. For example,
suppose we have a pendulum and for a given time t we let x(t) represent the angle between
the current position of the pendulum and its rest position, taking x to be positive if the
pendulum is to the right of its rest position and negative otherwise (see Figure 2.2.11). If
initially the pendulum is held at a small angle a > 0 and then released, that is, x(0) = a,
then, if we ignore friction, it can be shown that

                     x(t) = a \cos\left(\sqrt{\frac{g}{b}}\, t\right),             (2.2.14)

where g is the acceleration due to gravity (32 feet per second per second or 9.8 meters per
second per second) and b is the length of the pendulum. Actually, this is an approximation
which holds very well for small values of a. Note that the period of x, namely,

                     \frac{2\pi}{\sqrt{g/b}} = 2\pi \sqrt{\frac{b}{g}},

does not depend upon the amplitude a. This is an important fact, supposedly first noticed
by Galileo, which is crucial in the operation of pendulum clocks. We will consider this
problem more closely in Chapter 8, where we will derive (2.2.14) and see exactly how the
approximation enters the picture.

                              Figure 2.2.11 A pendulum
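
As a small numerical illustration of (2.2.14), the Python sketch below computes the angle and the period of a pendulum. The function names are assumptions made for this sketch, the default g = 9.8 assumes the length b is measured in meters, and the formula itself is only the small-angle approximation described above.

```python
import math

def pendulum_angle(t, a, b, g=9.8):
    """Small-angle approximation x(t) = a*cos(sqrt(g/b)*t) for a pendulum of length b."""
    return a * math.cos(math.sqrt(g / b) * t)

def pendulum_period(b, g=9.8):
    """Period 2*pi*sqrt(b/g); note that the starting angle a does not appear."""
    return 2 * math.pi * math.sqrt(b / g)

T = pendulum_period(1.0)                  # a pendulum one meter long
print(T)                                  # roughly two seconds
print(pendulum_angle(T, a=0.1, b=1.0))    # back at the starting angle, about 0.1
print(pendulum_angle(T, a=0.05, b=1.0))   # same period for a different amplitude
```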
    Periodic motions need not always be as simple as the motion of a pendulum. Consider,
for example, the motion of a molecule of air as a sound wave passes. The action of the
sound wave causes a particular molecule of air to oscillate back and forth about some
equilibrium position. If we let x(t) represent the position of the air molecule at time t,
with x = 0 corresponding to the equilibrium position and x considered to be positive in one
direction from the equilibrium position and negative in the other, then for many sounds
x will be a periodic function of t. In general, this will be true for musical sounds, but
not true for sounds we would normally classify as noise. Moreover, even if x is a periodic
function, it need not be simply a sine or cosine function. The graph of x for a musical
sound, although periodic, may be very complicated. However, many simple sounds, such
as the sound of a tuning fork, are represented by sine curves. For example, if x is the
displacement of an air molecule for a tuning fork which vibrates at 440 cycles per second
with a maximum displacement from equilibrium of 0.002 centimeters, then


                                 x(t) = 0.002 \sin(880 \pi t).

Notice that this function has period $\frac{2\pi}{880\pi} = \frac{1}{440}$, and hence has a frequency of 440 cycles
per second.

          Figure 2.2.12 Graph of the air displacement due to the organ note C3
    In the early part of the 19th century, Joseph Fourier (1768-1830) showed that the story
does not end here. Fourier demonstrated that any "nice" periodic curve (for example, one
which is connected) can be approximated as closely as desired by a sum of sine and cosine
functions. In particular, this means that for any musical sound the function x may be
approximated well by a sum of sine and cosine functions. For example, in his book The
Science of Musical Sounds (Macmillan, New York, 1926), Dayton Miller shows that, with
an appropriate choice of units,

    x(t) = 22.4 sin(t) + 94.1 cos(t) + 49.8 sin(2t) - 43.6 cos(2t) + 33.7 sin(3t)
             - 14.2 cos(3t) + 19.0 sin(4t) - 1.9 cos(4t) + 8.9 sin(5t) - 5.22 cos(5t)
             - 8.18 sin(6t) - 1.77 cos(6t) + 6.40 sin(7t) - 0.54 cos(7t) + 3.11 sin(8t)
             - 8.34 cos(8t) - 1.28 sin(9t) - 4.10 cos(9t) - 0.71 sin(10t) - 2.17 cos(10t)

gives a very good approximation to the displacement curve of a sound wave generated by
the tone C3 of an organ pipe. From the graph of x, shown in Figure 2.2.12, we can see its
complexity as well as its periodicity. Notice that the terms in this expression for x(t) are
written in pairs with frequencies which are always integer multiples of the frequency of the
first pair. This is a general fact which is part of Fourier's theory; if we added more terms to
obtain more accuracy, the next terms would be of the form a sin(11t) + b cos(11t) for some
constants a and b. Notice also that the amplitudes of the sine and cosine curves tend to
decrease as the frequencies are increasing. As a consequence, the higher frequencies have
less impact on the total curve. Put another way, Fourier's theorem says that every musical
sound is the sum of simple tones which could be generated by tuning forks. Hence in theory,
although certainly not in practice, the instruments of any orchestra could all be replaced
by tuning forks. On a more practical level, Fourier's analysis of periodic functions has
been fundamental for the development of such modern conveniences as radios, televisions,
stereos, and compact disc players. Unfortunately, this is a story which will have to be told
elsewhere.
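
    As a rough numerical check of the remarks above, the following Python sketch evaluates Miller's ten-pair approximation for the organ tone C3. The coefficients are copied from the expression for x(t); the names COEFFS, organ_c3, and amplitude are our own and are only illustrative. The sketch compares x(t) with x(t + 2\pi) and lists the amplitude \sqrt{a^2 + b^2} of each pair a sin(kt) + b cos(kt).

```python
import math

# Coefficient pairs (a_k, b_k) of a_k*sin(k t) + b_k*cos(k t) for k = 1..10,
# copied from the expression for x(t) in the text.
COEFFS = [
    (22.4, 94.1), (49.8, -43.6), (33.7, -14.2), (19.0, -1.9), (8.9, -5.22),
    (-8.18, -1.77), (6.40, -0.54), (3.11, -8.34), (-1.28, -4.10), (-0.71, -2.17),
]


def organ_c3(t, terms=len(COEFFS)):
    """Sum of the first `terms` sine/cosine pairs evaluated at t."""
    return sum(a * math.sin(k * t) + b * math.cos(k * t)
               for k, (a, b) in enumerate(COEFFS[:terms], start=1))


def amplitude(a, b):
    """Amplitude of a*sin(kt) + b*cos(kt), namely sqrt(a^2 + b^2)."""
    return math.hypot(a, b)


if __name__ == "__main__":
    # Every term has period dividing 2*pi, so the two columns should agree.
    for t in (0.0, 1.0, 2.5):
        print(t, organ_c3(t), organ_c3(t + 2 * math.pi))
    # Amplitudes of the ten pairs, lowest frequency first.
    print([round(amplitude(a, b), 2) for a, b in COEFFS])
```

The amplitudes do not fall at every step, but the trend is downward, which is consistent with the observation that the higher frequencies have less impact on the total curve.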


﻿




Problems

1. Find the exact values of sin(t), cos(t), tan(t), and sec(t) for the following values of t.
       4wr                                  7wr
   (a) 34(b) 6
                 36
       277r
   (c) 3                                 (d) --
        2w                                  21w
   (e)  3                                    4(f)
       11w                                    11w
   (g)  6                                (h)   6
        66
 2. Sketch the graph of each of the following functions over an interval that contains at
   least one period of the function both to the right and to the left of the vertical axis.
   Also, identify the amplitude, period, and phase angle of each curve.
   (a) y = sin(3t)                       (b) y = 3 cos(2t)
   (c) y = cos(t - \pi)                  (d) x = sin(2t) + 1
   (e) x = 4sin(7t)                      (f) y = -2 cos(2t -w7)
   (g) x = 5 sin(2t + \pi)               (h) y = -3 sin(2\pi t)
 3. Starting with the identity sin^2(x) + cos^2(x) = 1, explain why

                              1 + tan^2(x) = sec^2(x).

 4. The addition formulas for sine and cosine are

                      sin(x + y) = sin(x) cos(y) + cos(x) sin(y)

   and
                      cos (x + y) = cos(x) cos(y) - sin(x) sin(y).
   Use these to derive the double-angle formulas:
   (a) sin(2x) = 2 sin(x) cos(x)         (b) cos(2x) = cos^2(x) - sin^2(x)
 5. Use the double-angle formulas of Problem 4 to derive the half-angle formulas:

   (a) cos^2(x) = \frac{1 + cos(2x)}{2}  (b) sin^2(x) = \frac{1 - cos(2x)}{2}
 6. Use the addition formulas of Problem 4 to derive the shift formulas:

   (a) sin(x - \pi/2) = -cos(x)          (b) cos(x - \pi/2) = sin(x)

   (c) sin(x + \pi/2) = cos(x)           (d) cos(x + \pi/2) = -sin(x)

 7. Can you picture the identities of Problem 6 in terms of the definitions of sine and cosine
   using the unit circle? What do these identities say about the relationship between the
   graphs of sine and cosine?


﻿


12


Trigonometric Functions


Section 2.2


 8. Using the addition formulas of Problem 4, show that the tangent and cotangent
    functions have period \pi. That is, show that

                                   tan(t + \pi) = tan(t)

    and
                                   cot(t + \pi) = cot(t)

    for all values of t.

 9. Graph each of the following functions.

    (a) y = tan(2t)                           (b) y = cot(t)

    (c) y = tan(2)                            (d) y = csc(t)

    (e) x = sec(2t)                           (f) y = tan(4t) + 3
10. Using the definitions of sine and cosine, convince yourself that

                                   sin(-x) = -sin(x)

    and
                                    cos(-x) = cos(x)

    for all values of x. Now sketch the graphs of y = sin(-3x) and y = cos(\pi - x).
11. According to Dayton Miller in The Science of Musical Sounds, the function

         x(t) = 151 sin(t) - 67 cos(t) + 24 sin(2t) + 55 cos(2t) + 27 sin(3t) + 5cos(3t)

    gives a good approximation to the shape of the displacement curve for the tone B4
    played on the E string of a violin.
    (a) Graph each of the individual terms of x on the interval [-15, 15]. Use a common
       scale for the vertical axis.
    (b) Graph x on [-15, 15].
    (c) Graph x and its individual terms (a total of 7 graphs) together on the interval
        [-15,15].
12. Suppose we define a function f by saying that it is periodic with period 1 and that
    f(x) = 1 - 2x for 0 \le x < 1.
    (a) Sketch the graph of f over the interval [-3, 3].
    (b) Let

          g_n(x) = \frac{2}{\pi} sin(2\pi x) + \frac{1}{\pi} sin(4\pi x) + \frac{2}{3\pi} sin(6\pi x) + \cdots + \frac{2}{n\pi} sin(2n\pi x)

        for n = 1, 2, 3, .... For example,

          g_1(x) = \frac{2}{\pi} sin(2\pi x),

          g_2(x) = \frac{2}{\pi} sin(2\pi x) + \frac{1}{\pi} sin(4\pi x),

        and

          g_3(x) = \frac{2}{\pi} sin(2\pi x) + \frac{1}{\pi} sin(4\pi x) + \frac{2}{3\pi} sin(6\pi x).

        What is the period of g_n? Graph g_1, g_2, g_3, g_4, g_5, and g_10 over the interval [-3, 3].
    (c) What do you think happens to g_n as n gets large?
13. Graph f(x) L=[sin(x)] on the interval [-rr].
14. For an interesting account of sound waves, Fourier's theorem, and related ideas in
    electromagnetism, read Chapters 19 ("The Sine of G Major") and 20 ("Mastery of the
    Ether Waves") in Morris Kline's Mathematics in Western Culture (Oxford University
    Press, 1953).


﻿


Section 2.3


       Limits And The Notion Of Continuity


Of particular interest in mathematics and its applications to the physical world are func-
tional relationships in which the dependent variable changes continuously with changes in
the independent variable. Intuitively, changing continuously means that small changes in
the independent variable do not produce abrupt changes in the dependent variable. For
example, a small change in the radius of a circle does not produce an abrupt change in
the area of the circle; we would say that the area of the circle changes continuously with
the radius of the circle. Similarly, a small change in the height from which some object
is dropped will result in a related small change in the object's terminal velocity; hence
terminal velocity is a continuous function of height. On the other hand, when an electri-
cal switch is closed, there is an abrupt change in the current flowing through the circuit;
the current flow through the circuit is not a continuous function of time. The purpose
of this section is to introduce the terminology and concepts that will give us a proper
mathematical basis for discussing continuity in the next section.
    To begin our study of continuity, we will first look at two examples of functions which
are not continuous. In this way we will discover what properties to exclude when forming
our definition of a continuous function.
Example Consider the function H defined by

        H(t) = \begin{cases} 0, & \text{if } t < 0, \\ 1, & \text{if } t \ge 0. \end{cases}        (2.3.1)

This function, known as the Heaviside function, might be used in connection with modeling
the current passing through a switch which is open until time t = 0 and then closed. The
graph of this function consists of two horizontal half-lines with a vertical gap of unit length
at the origin, as shown in Figure 2.3.1. Since this function has a break in its graph at 0,
its output changes abruptly as t passes from negative values to positive values. In fact, if
t < 0, H(t) = 0 no matter how close t is to 0, whereas if t > 0, H(t) = 1 no matter how
close t is to 0. Hence, near 0, small changes in t may result in sudden changes in H(t).
We say that H has a discontinuity at t = 0.
    In this section we will develop the language and notation necessary to describe this
situation mathematically. In particular, note that for any sequence {t_n} with t_n < 0 for
all n and \lim_{n \to \infty} t_n = 0, we have

                                    \lim_{n \to \infty} H(t_n) = 0

since H(t_n) = 0 for all n. We say that the limit of H(t) as t approaches 0 from the left is
0, which we denote by

                                    \lim_{t \to 0^-} H(t) = 0.




Copyright © by Dan Sloughter 2000


﻿


                      Figure 2.3.1 Graph of the Heaviside function


However, for any sequence {t_n} with t_n > 0 for all n and \lim_{n \to \infty} t_n = 0, we have

                                      \lim_{n \to \infty} H(t_n) = 1

since H(t_n) = 1 for all n. We say that the limit of H(t) as t approaches 0 from the right
is 1, which we denote by

                                      \lim_{t \to 0^+} H(t) = 1.

Since these two limiting values do not agree, we say that H(t) does not have a limit as
t approaches 0. Hence, in this case, the discontinuity of H at 0 is characterized by the
absence of a limiting value for H(t) at 0. In the next section we will make the existence
of a limiting value one of the criteria for a function to be continuous at a point.
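
    The two one-sided computations for H can be imitated numerically. The sketch below is only an illustration: the sequences t_n = -1/n and t_n = 1/n are our own choices, standing in for the arbitrary sequences of the argument, and the helper name values_along is hypothetical.

```python
def heaviside(t):
    """H(t) = 0 for t < 0 and H(t) = 1 for t >= 0, as in (2.3.1)."""
    return 0 if t < 0 else 1


def values_along(f, seq):
    """Apply f to each term of a finite piece of a sequence."""
    return [f(t) for t in seq]


# Two particular sequences converging to 0, one from each side.
left = [-1 / n for n in range(1, 8)]   # t_n < 0 for all n
right = [1 / n for n in range(1, 8)]   # t_n > 0 for all n

print(values_along(heaviside, left))   # [0, 0, 0, 0, 0, 0, 0]
print(values_along(heaviside, right))  # [1, 1, 1, 1, 1, 1, 1]
```

A finite table of this kind only suggests the one-sided limits; the argument in the text, which applies to every sequence approaching 0 from one side, is what establishes them.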
Example Now consider the function

        g(x) = \begin{cases} -x, & \text{if } x < 0, \\ 1, & \text{if } x = 0, \\ x, & \text{if } x > 0. \end{cases}


As with the previous example, this function does not change continuously as x passes from
negative to positive values. However, the discontinuity arises in a different manner. Note
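that if {x_n} is a sequence with x_n < 0 for all n and \lim_{n \to \infty} x_n = 0, then

        \lim_{n \to \infty} g(x_n) = \lim_{n \to \infty} (-x_n) = -\lim_{n \to \infty} x_n = 0.

Thus
                                      \lim_{x \to 0^-} g(x) = 0.

Similarly, if {x_n} is a sequence with x_n > 0 for all n and \lim_{n \to \infty} x_n = 0, then

        \lim_{n \to \infty} g(x_n) = \lim_{n \to \infty} x_n = 0.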


﻿


                            Figure 2.3.2 Graph of y = g(x)


Thus
                                     \lim_{x \to 0^+} g(x) = 0.

Hence in this case g(x) does have a limiting value as x approaches 0 and we can write

                                     \lim_{x \to 0} g(x) = 0.

However, there is still a sudden change in the value of the function at 0 because g(0) = 1,
not 0. Graphically, this shows up as a hole in the graph of g at the origin, as shown
in Figure 2.3.2. Thus the abrupt change in values of g(x) results not from the lack of a
limiting value as x approaches 0, but rather from the fact that

                               g(0) = 1 \neq 0 = \lim_{x \to 0} g(x).

This illustrates another type of behavior that we will have to exclude in our definition of
continuity.
    These examples illustrate two ways in which a function may fail to be continuous. The
definition which we will discuss in Section 2.4 essentially says that a function is continuous
if it does not have either of the problems that we have seen with H and g. However,
before pursuing this question further, we must first introduce the notion of a limit for a
function defined on an interval of real numbers. We have already seen the pattern for this
definition in the previous examples. Namely, in order to define, for some function f, the
limit of f(x) as x approaches some number c, we consider sequences {xn} that converge
to c and ask if the sequence {f(xn)} has a limit. Hence we reduce our new question to the
old problem of limits of sequences that we considered back in Section 1.2. However, we
must be careful about two points. First, there will always be more than one sequence {xn}
which converges to a given point c. As we saw in the examples, in order to understand
the behavior of a function near c, we must take into account how the function behaves on
all possible sequences that converge to c. Second, we want the limit to describe what is


﻿




happening to the function for values of x close to c, but not equal to c. Thus we must
restrict the sequences {x_n} to those for which x_n \neq c for all values of n. With these ideas
in mind, we now have the following definition.
Definition Let I be an open interval and let c be a point in I. Let J be the set consisting
of all points of I except the point c; that is, J = {x | x is in I, x \neq c}. Suppose J is in the
domain of the function f. We say the limit of f(x) as x approaches c is L, denoted

                                     \lim_{x \to c} f(x) = L,                      (2.3.2)

if for every sequence {x_n} of points in J we have

                                     \lim_{n \to \infty} f(x_n) = L                (2.3.3)

whenever
                                     \lim_{n \to \infty} x_n = c.                  (2.3.4)

    In other words, to determine the value of \lim_{x \to c} f(x), we ask for the limit of the sequence
{f(x_n)}, where {x_n} is any sequence in J which is approaching c. If {f(x_n)} approaches
L for all such sequences, then L is the limit of f(x) as x approaches c.
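
    The definition can be explored numerically, with the caveat that a computer can only sample a few sequences, whereas the definition quantifies over all of them. The sketch below is an illustration under that caveat: the three sequences c - 1/n, c + 1/n, and c + (-1)^n/n^2 are our own choices, and the helper name tail_values is hypothetical. It uses the function g of the second example, for which the limit at 0 is 0 although g(0) = 1.

```python
def g(x):
    """The function of the second example: |x| away from 0, with g(0) = 1."""
    return 1.0 if x == 0 else abs(x)


def tail_values(f, c, n=10**6):
    """Values of f at the n-th terms of three sequences converging to c.

    No term of these sequences equals c, matching the restriction
    x_n != c in the definition.
    """
    points = [c - 1 / n, c + 1 / n, c + (-1) ** n / n ** 2]
    return [f(x) for x in points]


print(tail_values(g, 0.0))  # three values close to 0
print(g(0.0))               # 1.0, which differs from the limiting value
```

Agreement along a handful of sequences is evidence for a limit, not a proof of one; a disagreement between two such sequences, on the other hand, does show that the limit fails to exist.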
    We define one-sided limits in a similar fashion. Namely, if J is an open interval of the
form (c, b) in the domain of f, then we say the limit of f(x) as x approaches c from the
right is L, denoted
                                     \lim_{x \to c^+} f(x) = L,                    (2.3.5)

if for every sequence {x_n} of points in J we have

                                     \lim_{n \to \infty} f(x_n) = L                (2.3.6)

whenever
                                     \lim_{n \to \infty} x_n = c.                  (2.3.7)

Similarly, if J is an open interval of the form (a, c) in the domain of f, then we say the
limit of f(x) as x approaches c from the left is L, denoted

                                     \lim_{x \to c^-} f(x) = L,                    (2.3.8)

if for every sequence {x_n} of points in J we have

                                     \lim_{n \to \infty} f(x_n) = L                (2.3.9)

whenever
                                     \lim_{n \to \infty} x_n = c.                  (2.3.10)


﻿




Note that the existence of a one-sided limit only requires that the limiting value of {f(x_n)}
be the same for all sequences {x_n} which approach c from the same side, whereas the
existence of a limit requires that the limiting value of {f(x_n)} be the same for all sequences
{x_n} which approach c. In particular, this means that if

                                      \lim_{x \to c} f(x) = L,

then we must have both

                                      \lim_{x \to c^+} f(x) = L

and

                                      \lim_{x \to c^-} f(x) = L.

Not surprisingly, this also works in the other direction; in general, we have

  \lim_{x \to c} f(x) = L if and only if both \lim_{x \to c^+} f(x) = L and \lim_{x \to c^-} f(x) = L.   (2.3.11)


    Since the above definitions are all in terms of limits of sequences, we may use all the
properties of limits of sequences developed in Section 1.2 when discussing the limit of a
function defined on an interval of real numbers.


Example Consider the constant function f(x) = 2 for all x. To compute, for example,
\lim_{x \to 3} f(x), we need to compute \lim_{n \to \infty} f(x_n) for an arbitrary sequence {x_n} with
\lim_{n \to \infty} x_n = 3. For such a sequence, we have

                        \lim_{n \to \infty} f(x_n) = \lim_{n \to \infty} 2 = 2.

Hence
                                      \lim_{x \to 3} f(x) = 2.

In fact, it should be easy to see that for any value of c

                        \lim_{x \to c} f(x) = \lim_{x \to c} 2 = 2,

and, more generally, for any constant k,

                                      \lim_{x \to c} k = k.


Example Suppose f(x) = x. To find \lim_{x \to 5} f(x), first let {x_n} be any sequence with
\lim_{n \to \infty} x_n = 5. Then

                        \lim_{n \to \infty} f(x_n) = \lim_{n \to \infty} x_n = 5,

so
                                     \lim_{x \to 5} f(x) = 5.

In fact, we could replace 5 by an arbitrary c in this computation and obtain the general
result that
                                     \lim_{x \to c} x = c.                         (2.3.12)

Example To find \lim_{x \to 6} x^2, let {x_n} be any sequence with \lim_{n \to \infty} x_n = 6. Then

        \lim_{n \to \infty} x_n^2 = (\lim_{n \to \infty} x_n)^2 = 6^2 = 36.

Thus
                                    \lim_{x \to 6} x^2 = 36.

Again we can generalize this statement by replacing 6 by an arbitrary c, in which case we
have
                                    \lim_{x \to c} x^2 = c^2.

Moreover, we may replace the power 2 by any rational number p for which x^p and c^p are
defined and have
                                    \lim_{x \to c} x^p = c^p.                      (2.3.13)

Example Let f(x) = 4x^3 - 6x^2 + x - 7. To find \lim_{x \to 2} f(x), let {x_n} be any sequence
with \lim_{n \to \infty} x_n = 2. Then

        \lim_{n \to \infty} f(x_n) = \lim_{n \to \infty} (4x_n^3 - 6x_n^2 + x_n - 7)
                                   = 4(\lim_{n \to \infty} x_n)^3 - 6(\lim_{n \to \infty} x_n)^2 + \lim_{n \to \infty} x_n - 7
                                   = (4)(2^3) - (6)(2^2) + 2 - 7 = 3.

Hence
                                    \lim_{x \to 2} f(x) = 3,

which is just f(2).
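
    The computation above can be checked along one concrete sequence. The Python sketch below is an illustration only: the sequence x_n = 2 + 1/n is our own choice, and exact rational arithmetic from the standard fractions module is used so that rounding plays no part.

```python
from fractions import Fraction


def f(x):
    """The polynomial of the example, f(x) = 4x^3 - 6x^2 + x - 7."""
    return 4 * x**3 - 6 * x**2 + x - 7


for n in (1, 10, 100, 1000):
    x_n = 2 + Fraction(1, n)   # x_n -> 2 with x_n != 2 for every n
    print(n, float(f(x_n)))    # values approach f(2) = 3
```

A single sequence does not replace the general argument, but it shows the values f(x_n) settling toward f(2) = 3 as the example predicts.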
Example Now let f be an arbitrary polynomial, say,

        f(x) = a_m x^m + a_{m-1} x^{m-1} + \cdots + a_2 x^2 + a_1 x + a_0

for some constants a_0, a_1, a_2, ..., a_m. If {x_n} is any sequence with \lim_{n \to \infty} x_n = c, then

        \lim_{n \to \infty} f(x_n) = \lim_{n \to \infty} (a_m x_n^m + a_{m-1} x_n^{m-1} + \cdots + a_2 x_n^2 + a_1 x_n + a_0)
                  = a_m (\lim_{n \to \infty} x_n)^m + a_{m-1} (\lim_{n \to \infty} x_n)^{m-1} + \cdots + a_2 (\lim_{n \to \infty} x_n)^2
                        + a_1 (\lim_{n \to \infty} x_n) + a_0
                  = a_m c^m + a_{m-1} c^{m-1} + \cdots + a_2 c^2 + a_1 c + a_0
                  = f(c).

Hence, for any polynomial f and any real number c,

                                   \lim_{x \to c} f(x) = f(c).

    The previous example is important enough to state as a proposition.
Proposition If f is a polynomial and c is any real number, then

                                   \lim_{x \to c} f(x) = f(c).                     (2.3.14)

    If we combine this result with our result about the limits of quotients in Section 1.2,
we have the following proposition.
Proposition If f and g are both polynomials and c is any real number for which g(c) \neq 0,
then
                                   \lim_{x \to c} \frac{f(x)}{g(x)} = \frac{f(c)}{g(c)}.   (2.3.15)

    In short, if h is a rational function and h is defined at c, then the value of the limit of
h(x) as x approaches c is simply the value of h at c. That is,

                                   \lim_{x \to c} h(x) = h(c)                      (2.3.16)

for any rational function h which is defined at c.
Example Using our result about polynomials, we have

        \lim_{x \to 2} (3x^4 - 6x + 12) = (3)(2^4) - (6)(2) + 12 = 48.

Example Using our result about rational functions, we have

        \lim_{x \to 3} \frac{3x + 4}{2x^2 + 2x - 1} = \frac{(3)(3) + 4}{(2)(3^2) + (2)(3) - 1} = \frac{13}{23}.

Example Now consider
                                     \lim_{x \to 2} \frac{x^2 - 4}{x - 2}.

Note that our result about the limits of rational functions does not apply here since the
denominator is 0 at x = 2. However, since the numerator is also 0 at x = 2, the numerator
and the denominator must have a common factor of x - 2. Canceling this common factor
will simplify the problem and enable us to evaluate the limit. That is,

        \lim_{x \to 2} \frac{x^2 - 4}{x - 2} = \lim_{x \to 2} \frac{(x - 2)(x + 2)}{x - 2} = \lim_{x \to 2} (x + 2) = 4.


﻿


                  Figure 2.3.3 Graph of f(x) = \frac{x^2 - 4}{x - 2}


Although technical, it is worth noting that the functions

                                     f(x) = \frac{x^2 - 4}{x - 2}

and
                                     g(x) = x + 2

are different functions. In particular, f is not defined at x = 2, whereas g is. However, for
every point x \neq 2, f(x) = g(x). As a result,

                                   \lim_{x \to 2} f(x) = \lim_{x \to 2} g(x),

since the limits depend only on the values of f and g for points close to, but not equal to,
2. See the graph of f in Figure 2.3.3.
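
    The distinction between f(x) = (x^2 - 4)/(x - 2) and g(x) = x + 2 can be seen directly in a short Python sketch. This is an illustration only; exact rational arithmetic from the standard fractions module is assumed so that f and g can be compared for equality.

```python
from fractions import Fraction


def f(x):
    """(x^2 - 4)/(x - 2); not defined at x = 2."""
    return (x * x - 4) / (x - 2)


def g(x):
    """x + 2; defined everywhere, including x = 2."""
    return x + 2


for n in (1, 10, 100, 1000):
    x = 2 + Fraction(1, n)   # points near 2, never equal to 2
    print(x, f(x), g(x))     # the last two columns agree

print(g(Fraction(2)))        # 4; evaluating f(Fraction(2)) would raise ZeroDivisionError
```

Away from x = 2 the two functions agree exactly, which is why they share the limiting value 4 at 2 even though only g has a value there.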
Example As another example of the technique used in the previous example, we have

        \lim_{t \to -1} \frac{t^2 - 1}{t^3 + 1} = \lim_{t \to -1} \frac{(t + 1)(t - 1)}{(t + 1)(t^2 - t + 1)}
                                                = \lim_{t \to -1} \frac{t - 1}{t^2 - t + 1}
                                                = \frac{-1 - 1}{1 + 1 + 1} = -\frac{2}{3}.


    In the last two examples, we have used the algebraic fact that if c is a root of a
polynomial f(x), then x - c is a factor of f(x). In particular, this means that if both
the numerator and the denominator of a rational function are 0 at x = c, then they have
a common factor of x - c. However, if the numerator is not 0 at c, but the limit of the
denominator is 0, then the limit will not exist. For example, if

                                     f(x) = \frac{1}{x^2},

then \lim_{x \to 0} f(x) does not exist, since dividing 1 by x_n^2, where {x_n} is a sequence with
\lim_{n \to \infty} x_n = 0, will always result in a sequence of positive numbers which are growing
without bound. Borrowing from the notation we developed in Section 1.2, we may write

                                     \lim_{x \to 0} \frac{1}{x^2} = \infty.

As before, we must be careful to remember that this notation means that, although the
function does not have a limit as x approaches 0, the value of the function grows without
any bound as x approaches 0. Similarly, since

                                     \frac{1}{x} > 0

when x > 0, we have

                                     \lim_{x \to 0^+} \frac{1}{x} = \infty,

and, since

                                     \frac{1}{x} < 0

when x < 0,

                                     \lim_{x \to 0^-} \frac{1}{x} = -\infty.

However, since

                                     f(x) = \frac{1}{x}

behaves differently as x approaches 0 from the right than it does when x approaches 0
from the left, all we can say about the limit of f(x) as x approaches 0 is that it does not
exist.
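
    The unbounded behavior of 1/x^2 and 1/x near 0 is easy to tabulate. In the sketch below, which is an illustration only, the sample points 0 ± 10^{-k} are our own choice and the helper name one_sided_values is hypothetical.

```python
def one_sided_values(f, c, side, exponents=range(1, 7)):
    """f evaluated at c + side * 10**-k for each k; side is +1 (right) or -1 (left)."""
    return [f(c + side * 10.0 ** -k) for k in exponents]


def recip(x):
    return 1 / x


def recip_sq(x):
    return 1 / x ** 2


print(one_sided_values(recip_sq, 0.0, +1))  # large positive values
print(one_sided_values(recip_sq, 0.0, -1))  # large positive values again
print(one_sided_values(recip, 0.0, +1))     # large positive values
print(one_sided_values(recip, 0.0, -1))     # large negative values
```

For 1/x^2 both columns grow without bound in the positive direction, while for 1/x the two sides run off in opposite directions, matching the discussion above.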
    Graphically, for a given function f,

                                     \lim_{x \to c^-} f(x) = \infty

or

                                     \lim_{x \to c^-} f(x) = -\infty

tells us that the graph of f will approach the vertical line x = c asymptotically as x
approaches c from the left. The graph will go off along x = c in the positive direction in
the first case and in the negative direction in the second case. Similar remarks hold for x
approaching c from the right when

                                     \lim_{x \to c^+} f(x) = \infty

or

                                     \lim_{x \to c^+} f(x) = -\infty.


﻿


                      Figure 2.3.4 Graph of y = \frac{x}{2 - x}


Example Since 2 - x > 0 when x < 2 and 2 - x < 0 when x > 2, it follows that

                                     \lim_{x \to 2^-} \frac{x}{2 - x} = \infty

and

                                     \lim_{x \to 2^+} \frac{x}{2 - x} = -\infty.

It follows that the line x = 2 is a vertical asymptote for the graph of

                                     y = \frac{x}{2 - x},

with the curve going off in the positive direction from the left and in the negative direction
from the right. See Figure 2.3.4.
    The next examples illustrate the use of one-sided limits, using (2.3.8), in determining
the existence of certain limits.


Example Suppose

        g(z) = \begin{cases} z^2 + 1, & \text{if } z < 1, \\ 3z + 4, & \text{if } z > 1. \end{cases}

Then, since g(z) = z^2 + 1 when z < 1,

        \lim_{z \to 1^-} g(z) = \lim_{z \to 1^-} (z^2 + 1) = 2

and, since g(z) = 3z + 4 when z > 1,

        \lim_{z \to 1^+} g(z) = \lim_{z \to 1^+} (3z + 4) = 7.



Since these limits are not the same, we know from (2.3.11) that g(z) does not have a
limiting value as z approaches 1. Graphically, we see this as a break in the graph of g at


﻿


                            Figure 2.3.5 Graph of y = g(z)


z = 1, as shown in Figure 2.3.5. Note, however, that for any c \neq 1, \lim_{z \to c} g(z) = g(c). For
example,
                             \lim_{z \to -4} g(z) = \lim_{z \to -4} (z^2 + 1) = 17.

Example Now consider

        h(t) = \begin{cases} 2t + 3, & \text{if } t \le 2, \\ 2t^2 - 1, & \text{if } t > 2. \end{cases}

Then
                              \lim_{t \to 2^-} h(t) = \lim_{t \to 2^-} (2t + 3) = 7

and
                              \lim_{t \to 2^+} h(t) = \lim_{t \to 2^+} (2t^2 - 1) = 7.

In this case both one-sided limits are equal to 7, so we have, using (2.3.11),

                                     \lim_{t \to 2} h(t) = 7.


Graphically, the graph of h does not have a break at t = 2, even though the formula for
computing h(t) changes at this point. See Figure 2.3.6.
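
    The one-sided limits in the last two examples can be estimated by evaluating each piecewise function just to the left and just to the right of the point where its formula changes. The sketch below is only an illustration: the offset 10^{-9} is an arbitrary choice, the helper name one_sided is hypothetical, and the Python version of g assigns the second formula at z = 1 merely for convenience, since the value there plays no role in the limits.

```python
def g(z):
    """z^2 + 1 for z < 1 and 3z + 4 for z > 1 (the second branch is used at z = 1)."""
    return z ** 2 + 1 if z < 1 else 3 * z + 4


def h(t):
    """2t + 3 for t <= 2 and 2t^2 - 1 for t > 2."""
    return 2 * t + 3 if t <= 2 else 2 * t ** 2 - 1


def one_sided(f, c, side, k=9):
    """f at a point 10**-k to the right (side=+1) or left (side=-1) of c."""
    return f(c + side * 10.0 ** -k)


print(one_sided(g, 1, -1), one_sided(g, 1, +1))  # near 2 and near 7: no limit at 1
print(one_sided(h, 2, -1), one_sided(h, 2, +1))  # both near 7: limit is 7
```

By (2.3.11), the disagreement for g reflects the absence of a limit at 1, while the agreement for h is consistent with the limit 7 at 2.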
    We may also use limits to inquire into the behavior of the values of a function f as x
increases, or decreases, without bound. This leads to the following definition.
Definition Suppose f is a function defined on an interval J of the form (a, \infty). We say
that the limit of f(x) as x approaches \infty is L, denoted

                                     \lim_{x \to \infty} f(x) = L,                  (2.3.17)

if for every sequence {x_n} in J we have

                                     \lim_{n \to \infty} f(x_n) = L                 (2.3.18)

whenever
                                     \lim_{n \to \infty} x_n = \infty.              (2.3.19)

Similarly, suppose f is a function defined on an interval J of the form (-\infty, b). We say
that the limit of f(x) as x approaches -\infty is L, denoted

                                     \lim_{x \to -\infty} f(x) = L,                 (2.3.20)

if for every sequence {x_n} in J we have

                                     \lim_{n \to \infty} f(x_n) = L                 (2.3.21)

whenever
                                     \lim_{n \to \infty} x_n = -\infty.             (2.3.22)

                            Figure 2.3.6 Graph of y = h(t)


Example Suppose {x_n} is a sequence and \lim_{n \to \infty} x_n = \infty. Given \epsilon > 0, there must exist
an integer N such that

                                     x_n > \frac{1}{\epsilon}

whenever n > N. Hence

                                     0 < \frac{1}{x_n} < \epsilon

whenever n > N. That is,

                                     \lim_{n \to \infty} \frac{1}{x_n} = 0.

Since this is true for any such sequence {x_n}, it follows that

                                     \lim_{x \to \infty} \frac{1}{x} = 0.


﻿


    In a similar fashion, we may show that

                                     \lim_{x \to -\infty} \frac{1}{x} = 0.

With these two basic limits, it is possible to compute limits of these types for any rational
function using the same techniques we used in Section 1.2. Namely, given a rational
function, dividing numerator and denominator by the highest power appearing in the
denominator simplifies the expression to a form where the limit may be evaluated easily.


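Example We have

        \lim_{x \to \infty} \frac{3x^2 + 4x - 6}{2x^2 - 6x + 2} = \lim_{x \to \infty} \frac{3 + \frac{4}{x} - \frac{6}{x^2}}{2 - \frac{6}{x} + \frac{2}{x^2}} = \frac{3}{2}.

Example We have

        \lim_{x \to \infty} \frac{3x^2 - 6x}{4x^3 + 2x} = \lim_{x \to \infty} \frac{\frac{3}{x} - \frac{6}{x^2}}{4 + \frac{2}{x^2}} = \frac{0}{4} = 0.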

Example We have

        \lim_{x \to -\infty} \frac{4x^3 - 3}{2x^2 + 6} = \lim_{x \to -\infty} \frac{4x - \frac{3}{x^2}}{2 + \frac{6}{x^2}} = -\infty,

since the denominator is approaching 2 while the numerator decreases without bound as
x goes to -\infty. Note that, as usual, although the limit does not exist, we make use of this
notation to indicate the manner in which the limit fails to exist.
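
    The outcome of dividing numerator and denominator by the highest power in the denominator depends only on the degrees and leading coefficients of the two polynomials, so the three preceding examples can be summarized in a few lines of Python. The function limit_at_infinity and its coefficient-list convention are our own illustrative choices, not part of the text; the coefficients are assumed to be integers listed from the highest power down, with nonzero leading entries.

```python
import math
from fractions import Fraction


def limit_at_infinity(num, den, toward=+1):
    """Limit of p(x)/q(x) as x -> +infinity (toward=+1) or -infinity (toward=-1).

    num and den hold the integer coefficients of p and q from the highest
    power down; their leading coefficients are assumed to be nonzero.
    """
    deg_gap = (len(num) - 1) - (len(den) - 1)
    lead = Fraction(num[0], den[0])
    if deg_gap < 0:
        return Fraction(0)       # denominator dominates
    if deg_gap == 0:
        return lead              # ratio of leading coefficients
    # Numerator dominates: the quotient behaves like lead * x**deg_gap.
    sign = (1 if lead > 0 else -1) * toward ** deg_gap
    return sign * math.inf


print(limit_at_infinity([3, 4, -6], [2, -6, 2]))             # 3/2
print(limit_at_infinity([3, -6, 0], [4, 0, 2, 0]))           # 0
print(limit_at_infinity([4, 0, 0, -3], [2, 0, 6], toward=-1))  # -inf
```

The third call returns negative infinity only as a label for the way the limit fails to exist, in the same spirit as the notation used in the text.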

    Graphically, \lim_{x \to \infty} f(x) = L tells us that the graph of y = f(x) approaches the
horizontal line y = L asymptotically as x increases without bound. Similarly,
\lim_{x \to -\infty} f(x) = L tells us that the graph of y = f(x) approaches the horizontal line
y = L asymptotically as x decreases without bound.
Example Since

        \lim_{x \to \infty} \frac{x}{x^2 + 1} = \lim_{x \to \infty} \frac{\frac{1}{x}}{1 + \frac{1}{x^2}} = 0

and

        \lim_{x \to -\infty} \frac{x}{x^2 + 1} = \lim_{x \to -\infty} \frac{\frac{1}{x}}{1 + \frac{1}{x^2}} = 0,

we know that the line y = 0, that is, the x-axis, is a horizontal asymptote for the graph of

                                     y = \frac{x}{x^2 + 1}.

Moreover, since

                                     \frac{x}{x^2 + 1} > 0

for x > 0 and

                                     \frac{x}{x^2 + 1} < 0

for x < 0, we know that the approach to the x-axis is from above as x increases and from
below as x decreases. See Figure 2.3.7.

                      Figure 2.3.7 Graph of y = \frac{x}{x^2 + 1}

    The following proposition summarizes the basic properties of limits. These are essen-
tially restatements of the properties of limits of sequences that we discussed in Section 1.2.
The fact that they hold here follows from the way we have defined limits in this section in
terms of limits of sequences. Moreover, the properties listed in this proposition also hold
for one-sided limits.
Proposition Suppose \lim_{x \to c} f(x) = L and \lim_{x \to c} g(x) = M, where L and M are real numbers
and c is a real number, \infty, or -\infty. Then

        \lim_{x \to c} kf(x) = kL for any constant k,                               (2.3.23)

        \lim_{x \to c} (f(x) + g(x)) = L + M,                                       (2.3.24)

        \lim_{x \to c} (f(x) - g(x)) = L - M,                                       (2.3.25)

        \lim_{x \to c} (f(x)g(x)) = LM,                                             (2.3.26)

        \lim_{x \to c} \frac{f(x)}{g(x)} = \frac{L}{M}, provided M \neq 0,          (2.3.27)

and, provided p is a rational number for which (f(x))^p and L^p are defined,

        \lim_{x \to c} (f(x))^p = L^p.                                              (2.3.28)


﻿


Example Using (2.3.28), we have

        \lim_{x \to 4} \sqrt{x^2 + 3} = \sqrt{\lim_{x \to 4} (x^2 + 3)} = \sqrt{19}.


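Example Using the fact that

        \sqrt{x^2} = |x| = \begin{cases} -x, & \text{if } x < 0, \\ x, & \text{if } x \ge 0, \end{cases}

we have

        \lim_{x \to \infty} \frac{4x}{\sqrt{x^2 + 1}} = \lim_{x \to \infty} \frac{4}{\frac{\sqrt{x^2 + 1}}{x}}
                                                      = \lim_{x \to \infty} \frac{4}{\frac{\sqrt{x^2 + 1}}{\sqrt{x^2}}}
                                                      = \lim_{x \to \infty} \frac{4}{\sqrt{\frac{x^2 + 1}{x^2}}}
                                                      = \lim_{x \to \infty} \frac{4}{\sqrt{1 + \frac{1}{x^2}}}
                                                      = 4

and

        \lim_{x \to -\infty} \frac{4x}{\sqrt{x^2 + 1}} = \lim_{x \to -\infty} \frac{4}{\frac{\sqrt{x^2 + 1}}{x}}
                                                       = \lim_{x \to -\infty} \frac{4}{\frac{\sqrt{x^2 + 1}}{-\sqrt{x^2}}}
                                                       = \lim_{x \to -\infty} \frac{4}{-\sqrt{\frac{x^2 + 1}{x^2}}}
                                                       = \lim_{x \to -\infty} \frac{4}{-\sqrt{1 + \frac{1}{x^2}}}
                                                       = -4.

Hence the lines y = 4 and y = -4 are both horizontal asymptotes for the graph of

                                     y = \frac{4x}{\sqrt{x^2 + 1}}.

See Figure 2.3.8.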
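
    The two horizontal asymptotes can be checked numerically by evaluating y = 4x/\sqrt{x^2 + 1} at inputs of large magnitude. The sketch below is an illustration only; the sample points ±10^k are our own choice, and math.hypot(x, 1.0) is used as a standard-library way to compute \sqrt{x^2 + 1}.

```python
import math


def y(x):
    """4x / sqrt(x^2 + 1), with sqrt(x^2 + 1) computed as hypot(x, 1)."""
    return 4 * x / math.hypot(x, 1.0)


for k in (1, 2, 4, 8):
    x = 10.0 ** k
    print(x, y(x), y(-x))   # second column nears 4, third nears -4
```

The sign change between the two columns comes from the factor x in the numerator, which is exactly where the identity \sqrt{x^2} = -x for x < 0 enters the computation in the text.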


﻿


Figure 2.3.8 Graph of $y = \frac{4x}{\sqrt{x^2 + 1}}$


Problems

1. Evaluate the following limits.
    (a) lim (4x2 - 3x)
        x--- 2

    (c) lim   -
        t-1 t+5
 2. Evaluate the following limits.

    (a) limx     8
        x- 2 x + 2

    (c) lim  s+1
        s--1 s4 - 1

    (e) lim   - 8
        t- 2 t -2
    (g) lim       )
          -4(u - 4)2
 3. Evaluate the following limits.

    (a) lim (3x2+4)
        x-1+
               1
    (c) lim
        x-3+ X - 3

    (e)  lim    t
        t--2-  t + 2
 4. Evaluate the following limits.
    (a) lim Lx]
        x-(c  2+
    (c) lim Fwl


(b) lim (x3
    x---3


2x+3)


(d) lim   z+2
    z--2 z2 +3


(b) lim w-x6
    x-3  x -3

(d) lim
           -1
(f) lim 2-
    x-1x2 _1x
(h) lim
    x--2wX+2


(b) lim  1
    x-3- w - 3
(d) lim
    t--2+ t + 2

(f) limxw9x10
    x--1- x2 - 8x - 20


(b)

(d)


lim Lw]

lim FIw
x- 3+


﻿




   (e)  lim Lcos(x)i

5. Suppose


(f   lim rsin(x)1
    x-o+


         3z
J z  - 7-


-1, ifz<
z,   if z>


2,
2.


   (a) Sketch the graph of g.
   (b) Find lim g(z).
            z -2 -
   (c) Find lim g(z).
            z- 2 +
   (d) Does lim g(z) exist? If so, what is its value?
            z->2
6. Suppose


h(t) =   2t +1,
         3t - 1,


ift< 1,
ift> 1.


   (a) Sketch the graph of h.
   (b) Find lim h(t).
            t-1--
   (c) Find lim h(t).
            t- 1+
   (d) Does lim h(t) exist? If so, what is its value?
            t-1
7. Evaluate the following limits.


   (a) lim (3x + 4)
       x- 00
     (c)    u4+3u-6
   (c) lim
       U-o00 3u2 + 1
       .e     x5 -6x+ 13
   (e)  lim
       x--ox2+18x- 25

   (g) lim    2v+1
       v-o     v-2

   (i)  lim  3c+1
       x-o0 v4x2 + 5
8. Let


         (b) lim   c3+3c-1
             x-o2x3 -c2 + 21

         (d) lim 4z2 - 3z + 10
             z-Oo 2z) + 14z + 9
                .2 +3
         (f) lim 2 cc+3
             X-o0   cc+2

         (h) lm
             t-o t+3

         (j) lim     3cc+1
             x-4-o   4c2 + 5

         2cc
f(cc) =c-.


Find lim f(x), lim
the graph of f.


f(x), lim   f(x), and lim f(x). Use this information to sketch
      x -co 0x-~0


9. Discuss   lim  tan(x), lim   tan(x), and lim tan(x).

10. Do $\lim_{x \to \infty} \sin(\pi x)$ and $\lim_{n \to \infty} \sin(\pi n)$ denote the same thing? Discuss.


﻿


11. (a) Explain why
\[ -\frac{1}{x} \le \frac{\sin(x)}{x} \le \frac{1}{x} \]
       for all x > 0.
    (b) Use part (a) to find $\lim_{x \to \infty} \frac{\sin(x)}{x}$.


﻿


Section 2.4


       Differential Equations          Continuous Functions


Given the work of the previous section, we are now in a position to state a clear definition
of the notion of continuity. We will have several related definitions, but the fundamental
definition is that of continuity at a point. Intuitively, continuity at a point c for a function
f means that the values of f for points near c do not change abruptly from the value of f
at c. Section 2.3 has shown that, mathematically, this means that as x approaches c, the
value of f(x) must be approaching f(c). Hence we have the following basic definition.
Definition We say that a function f is continuous at a point c if

                                   lim f(x) = f(c).                             (2.4.1)
                                   x-->C

    It is important to note that this definition places three conditions on the behavior of
the function f near the point c. Namely, f is continuous at the point c if (1) f is defined
at c, (2) lim f(x) exists, and (3) lim f(x) = f(c).
         x->c                   x->C
    Corresponding to one-sided limits, we have the notions of continuity from the left and
from the right.
Definition We say that a function f is continuous from the left at a point c if

                                   lim  f (x) = f (c).                          (2.4.2)
                                   x-->c-
We say that a function f is continuous from the right at a point c if

                                   lim  f (x) = f (c).                          (2.4.3)
                                   x->c+
    Simply to say that a function f is continuous, without specifying some particular point,
means that the function is continuous, in the proper sense, at all points where it is defined.
Here "in the proper sense" means, for example, that if f is defined only on a closed interval
[a, b], then we cannot ask for continuity at a or b, since it is possible to discuss only one-
sided limits at these points, but it is possible to inquire about continuity from the right at
a and continuity from the left at b.
Definition We say a function f is continuous on the open interval (a, b) if f is continuous
at every point in (a, b). We say f is continuous on the closed interval [a, b] if f is continuous
on (a, b), continuous from the right at a, and continuous from the left at b.
    In the previous section we saw that if f and g are polynomials and c is a point with
$g(c) \neq 0$, then
\[ \lim_{x \to c} \frac{f(x)}{g(x)} = \frac{f(c)}{g(c)}. \]




Copyright @ by Dan Sloughter 2000


﻿




The following proposition restates this fact in terms of our new definitions.
Proposition If h is a rational function and h is defined at the point c, then h is
continuous at c. In particular, if h is a polynomial, then h is continuous on the entire
real line $(-\infty, \infty)$.
    This theorem gives us a very large class of functions which we know to be continuous.
As we progress, we will add many more types of functions to this class.
Example     Consider the function $f(x) = 3x^3 - 6x + 3$. Since f is a polynomial, it is
continuous on $(-\infty, \infty)$. That is, for any real number c, f is continuous at c.
Example Consider
\[ g(t) = \frac{8t - 13t^2}{3t - 4}. \]
Then g is a rational function, and so is continuous at all points in its domain. That is, g
is continuous for all real numbers c except $c = \frac{4}{3}$. Put another way, g is continuous on the
intervals $(-\infty, \frac{4}{3})$ and $(\frac{4}{3}, \infty)$.
Example Suppose
\[ h(z) = \begin{cases} z^2 - 2, & \text{if } z \le 1, \\ 4z - 2, & \text{if } z > 1. \end{cases} \]
On the interval $(-\infty, 1]$, h is a polynomial; thus h is continuous on $(-\infty, 1]$. Note that
this does not necessarily mean that h is continuous at 1, only that h is continuous from
the left at 1. Similarly, on the interval $(1, \infty)$, h is a polynomial and hence is continuous
on $(1, \infty)$. To check for continuity at 1, we note that
\[ \lim_{z \to 1^-} h(z) = \lim_{z \to 1^-} (z^2 - 2) = -1, \]
while
\[ \lim_{z \to 1^+} h(z) = \lim_{z \to 1^+} (4z - 2) = 2. \]
Since these limits are different, we know that $\lim_{z \to 1} h(z)$ does not exist. Thus h is not
continuous at 1. As we saw in Section 2.3, this behavior results in a break in the graph of
h at z = 1. See Figure 2.4.1.
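
The disagreement between the two one-sided limits can also be seen numerically. The sketch below is an illustration only; the helper `one_sided_values` and the choice of the sequence 1 ± 10^(-k) are assumptions made here, not something taken from the text. It evaluates h along sequences approaching 1 from each side.

```python
def h(z: float) -> float:
    """Piecewise function from the example: z^2 - 2 for z <= 1, 4z - 2 for z > 1."""
    return z * z - 2.0 if z <= 1.0 else 4.0 * z - 2.0


def one_sided_values(func, c, side, steps=6):
    """Evaluate func along c + side * 10**(-k), k = 1..steps (side is -1 or +1)."""
    return [func(c + side * 10.0 ** (-k)) for k in range(1, steps + 1)]


left = one_sided_values(h, 1.0, -1)   # approaching 1 from the left
right = one_sided_values(h, 1.0, +1)  # approaching 1 from the right
print("from the left: ", left)
print("from the right:", right)
```

The left-hand values settle near -1 = h(1) while the right-hand values settle near 2, so h is continuous from the left at 1 but not continuous at 1.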
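Example Suppose
\[ f(s) = \begin{cases} s + 1, & \text{if } s < 0, \\ s^2 + 1, & \text{if } s \ge 0. \end{cases} \]
Similar to the situation in the previous example, f is continuous on the intervals $(-\infty, 0)$
and $[0, \infty)$ since it is a polynomial on both of these intervals. Now
\[ \lim_{s \to 0^-} f(s) = \lim_{s \to 0^-} (s + 1) = 1 \]
and
\[ \lim_{s \to 0^+} f(s) = \lim_{s \to 0^+} (s^2 + 1) = 1. \]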


﻿


Figure 2.4.1 Graph of y = h(z)


Figure 2.4.2 Graph of y = f(s)


Thus $\lim_{s \to 0} f(s) = 1$; since f(0) = 1, we have
\[ \lim_{s \to 0} f(s) = 1 = f(0). \]
Thus f is continuous at 0. Altogether this shows that f is continuous on the entire interval
$(-\infty, \infty)$. As we see in Figure 2.4.2, the graph of f does not have a break at s = 0.


Example Suppose
\[ g(x) = \begin{cases} \dfrac{x^2 - 4}{x - 2}, & \text{if } x \neq 2, \\ 6, & \text{if } x = 2. \end{cases} \]
Then, since g is a rational function on the intervals $(-\infty, 2)$ and $(2, \infty)$, and is defined
throughout these intervals, g is continuous on the intervals $(-\infty, 2)$ and $(2, \infty)$. To check
for continuity at 2, we notice that
\[ \lim_{x \to 2} g(x) = \lim_{x \to 2} \frac{x^2 - 4}{x - 2} = \lim_{x \to 2} \frac{(x - 2)(x + 2)}{x - 2} = \lim_{x \to 2} (x + 2) = 4, \]
while g(2) = 6. Hence $\lim_{x \to 2} g(x) \neq g(2)$, and so g is not continuous at 2. See Figure 2.4.3.

Figure 2.4.3 Graph of y = g(x)
    It is interesting to note that in the last example the function g could be made continuous
if its value at 2 were changed from 6 to 4. In general, if, for a function f and a point c,
$\lim_{x \to c} f(x) = L$, but f is not continuous at c because either f is not defined at c or $f(c) \neq L$,
we can define a new function h such that h(x) = f(x) for all $x \neq c$ and h is continuous at
c. Namely, if we let
\[ h(x) = \begin{cases} f(x), & \text{if } x \neq c, \\ L, & \text{if } x = c, \end{cases} \]
then h(x) = f(x) for all $x \neq c$ and
\[ \lim_{x \to c} h(x) = \lim_{x \to c} f(x) = L = h(c). \]


Thus h is a function which is identical to f everywhere except at c, but, unlike f, is
continuous at c. In this case we say that f has a removable discontinuity at c. Note that
the existence of a limit at c is essential in order for a discontinuity at c to be removable.
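
As an illustration of this construction, here is a small Python sketch applied to the function g of the preceding example. The helper name `remove_discontinuity` is hypothetical and introduced here only for this sketch; it simply builds the function h described above from a function f, a point c, and the known limit L.

```python
def g(x: float) -> float:
    """g from the example: (x^2 - 4)/(x - 2) away from 2, and 6 at x = 2."""
    return 6.0 if x == 2.0 else (x * x - 4.0) / (x - 2.0)


def remove_discontinuity(f, c, L):
    """Return h with h(x) = f(x) for x != c and h(c) = L.

    L must be the limit of f at c; this helper does not compute it.
    """
    def h(x):
        return L if x == c else f(x)
    return h


h = remove_discontinuity(g, 2.0, 4.0)
print(g(2.0), h(2.0), h(2.000001), h(1.999999))
```

Here h agrees with g everywhere except at 2, where its value has been changed from 6 to the limit 4. The limit itself has to be found first, as in the example; if no limit exists, no choice of L makes the patched function continuous.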
    The following proposition lists some properties of continuous functions, all of which
are consequences of our results about limits in Section 2.3.

Proposition Suppose the functions f and g are both continuous at a point c and k is a
constant. Then the functions which take on the following values for a variable x are also
continuous at c:
\[ k f(x), \tag{2.4.4} \]
\[ f(x) + g(x), \tag{2.4.5} \]
\[ f(x) - g(x), \tag{2.4.6} \]
\[ f(x) g(x), \tag{2.4.7} \]
\[ \frac{f(x)}{g(x)}, \tag{2.4.8} \]
provided $g(c) \neq 0$, and
\[ (f(x))^p, \tag{2.4.9} \]
provided p is a rational number and $(f(x))^p$ is defined on an open interval containing c.
Example     It follows from (2.4.9) that functions of the form $f(x) = x^p$, where p is a
rational number, are continuous throughout their domain. For example, $f(x) = \sqrt{x}$ is
continuous on $[0, \infty)$.
Example Using (2.4.8) and (2.4.9),
\[ g(t) = \frac{\sqrt{3t + 2}}{2t} \]
is continuous for all points t where $3t + 2 \ge 0$ and $t \neq 0$. Thus g is continuous on the
intervals $[-\frac{2}{3}, 0)$ and $(0, \infty)$.
    At this point we have the tools necessary to determine questions of continuity for
algebraic functions. We will now show that the sine and cosine functions are continuous
on $(-\infty, \infty)$. For $0 < x < \frac{\pi}{2}$, consider the point C = (cos(x), sin(x)) on the unit circle
centered at the origin. If we let A = (0, 0) and B = (1, 0), as in Figure 2.4.4, then the area
of $\triangle ABC$ is
\[ \frac{1}{2} \sin(x). \]
The area of the sector of the circle cut off by the arc from B to C is the fraction $\frac{x}{2\pi}$ of the
area of the entire circle; hence, this area is
\[ \frac{x}{2\pi} \pi = \frac{x}{2}. \]
Since this sector contains $\triangle ABC$, we have
\[ 0 < \frac{1}{2} \sin(x) < \frac{x}{2}, \]
from which it follows that
\[ 0 < \sin(x) < x. \]

Figure 2.4.4

Since
\[ \lim_{x \to 0^+} x = 0, \]
it follows that
\[ \lim_{x \to 0^+} \sin(x) = 0. \]
Moreover, we also have
\[ \lim_{x \to 0^-} \sin(x) = \lim_{x \to 0^+} \sin(-x) = -\lim_{x \to 0^+} \sin(x) = 0, \]
so
\[ \lim_{x \to 0} \sin(x) = 0. \tag{2.4.10} \]
Since sin(0) = 0, this shows that sine is continuous at 0. Now for $-\frac{\pi}{2} < x < \frac{\pi}{2}$,
\[ \cos(x) = \sqrt{1 - \sin^2(x)}. \]
Hence
\[ \lim_{x \to 0} \cos(x) = \lim_{x \to 0} \sqrt{1 - \sin^2(x)} = \sqrt{1 - \lim_{x \to 0} \sin^2(x)} = 1. \tag{2.4.11} \]
Since cos(0) = 1, this shows that cosine is continuous at 0. For an arbitrary c, we have,
using the angle addition formulas for sine and cosine,
\begin{align*}
\lim_{x \to c} \sin(x) &= \lim_{h \to 0} \sin(c + h) \\
 &= \lim_{h \to 0} (\sin(c) \cos(h) + \cos(c) \sin(h)) \\
 &= \sin(c) \lim_{h \to 0} \cos(h) + \cos(c) \lim_{h \to 0} \sin(h) \\
 &= \sin(c)(1) + \cos(c)(0) \\
 &= \sin(c)
\end{align*}
and
\begin{align*}
\lim_{x \to c} \cos(x) &= \lim_{h \to 0} \cos(c + h) \\
 &= \lim_{h \to 0} (\cos(c) \cos(h) - \sin(c) \sin(h)) \\
 &= \cos(c) \lim_{h \to 0} \cos(h) - \sin(c) \lim_{h \to 0} \sin(h) \\
 &= \cos(c)(1) - \sin(c)(0) \\
 &= \cos(c).
\end{align*}

Thus we have the following proposition.

Proposition The sine and cosine functions are continuous on $(-\infty, \infty)$.

    The next proposition is then an immediate consequence of (2.4.8).

Proposition The tangent, cotangent, secant and cosecant functions are continuous at
all points in their respective domains.

    We have not yet considered the composition of continuous functions. Suppose g is
continuous at c and f is continuous at g(c). If {xn} is a sequence converging to c, then
we know, since g is continuous at c, that the sequence {g(xn)} will converge to g(c). But
then, since f is continuous at g(c), the sequence {f(g(xn))} will converge to f(g(c)). That
is,
                     lim f o g(x) = lim f(g(x)) = f(g(c)) = f o g(c).      (2.4.12)
                     x-c           x-c

Hence f o g is continuous at c.

Proposition If g is continuous at c and f is continuous at g(c), then f o g is continuous
at c.

Example The function h(t) = cos(3t + 4) is continuous on $(-\infty, \infty)$ since it is the
composition of the functions g(t) = 3t + 4 and f(t) = cos(t), both of which are continuous
on $(-\infty, \infty)$.
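
The sequence argument behind the composition proposition can be watched in action on this example. The following Python sketch is only an illustration; the helper `compose`, the point c = 1, and the sequence c + 1/n are choices made here, not taken from the text. It checks that h(x_n) approaches h(c) along a sequence x_n converging to c.

```python
import math


def g(t: float) -> float:
    """Inner function g(t) = 3t + 4."""
    return 3.0 * t + 4.0


def f(u: float) -> float:
    """Outer function f(u) = cos(u)."""
    return math.cos(u)


def compose(outer, inner):
    """Return the composition t -> outer(inner(t))."""
    return lambda t: outer(inner(t))


h = compose(f, g)  # h(t) = cos(3t + 4)
c = 1.0            # an arbitrary point chosen for illustration
xs = [c + 1.0 / n for n in (10, 100, 1000, 10000)]
gaps = [abs(h(x) - h(c)) for x in xs]
print(gaps)
```

The gaps |h(x_n) - h(c)| shrink as x_n approaches c, as the proposition predicts. One sequence is of course not a proof; the proposition covers every sequence converging to c.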

Example Consider the function
\[ g(t) = \frac{\sin(t^2 + 1)}{t}. \]
Now the numerator of g is continuous on $(-\infty, \infty)$ since it is the composition of $h(t) = t^2 + 1$
and f(t) = sin(t), both of which are continuous on $(-\infty, \infty)$. It follows that, since the
denominator of g is continuous on $(-\infty, \infty)$, g is continuous at all points for which the
denominator is not equal to zero, that is, for all $t \neq 0$. Thus g is continuous on the intervals
$(-\infty, 0)$ and $(0, \infty)$.

    In Section 2.5 we will consider two properties of continuous functions which partially
explain the important role they play in calculus.


﻿



Problems

1. Discuss the continuity of the given function at the specified point.




(a) f (t) = 3t2 - 6 at t = 2


(b) f (x) = ;2x  6atx
            x+1

(d) h(s) = s      at s =  


= 17

1


           2x + 5
           x-16
           s2_1
(e) h(s) =s+1 at s =


= 16


1


2. Discuss the continuity of the function


       ( 4t -1,
g(t) = t + 5,


if t <2,
if t>2,


   at t = 2.

3. Discuss the continuity of the following functions.


(a) g(x) = 4x23 - w18 + 16x - 3

                 8
(c) g(t) = 32t - -
                 t


            2- t-6
(b) f (t) = t

              8
(d) f (u) =24

(f) g(w)=  1
             9-x2


4. Discuss the continuity of the function


                                    (3x + 2,
                                        3x + 1,

5. Discuss the continuity of the function


if x <
if x>


1,
1.


h(z) {z2
         z -


-1, ifz<
1,   ifz>


1,
1.


6. The function
\[ f(t) = \frac{t^2 - 7t + 12}{t - 4} \]
   is not continuous at t = 4. Is this discontinuity removable? If it is, define a new
   function g which agrees with f whenever $t \neq 4$, but is continuous at 4.

7. The function
\[ f(t) = \frac{t^2 - 7t + 12}{t - 5} \]
   is not continuous at t = 5. Is this discontinuity removable? If it is, define a new
   function g which agrees with f whenever $t \neq 5$, but is continuous at 5.


﻿


8. Explain why $g(x) = x^2 \sin(x^2 + 1)$ is continuous on $(-\infty, \infty)$.
9. Recall that the Heaviside function is defined by
\[ H(t) = \begin{cases} 0, & \text{if } t < 0, \\ 1, & \text{if } t \ge 0. \end{cases} \]
    (a) Discuss the continuity of $f(t) = H(t^2 + 1)$. Graph f on the interval [-5, 5].
    (b) Discuss the continuity of $g(t) = H(t^2 - 1)$. Graph g on the interval [-5, 5].
    (c) Discuss the continuity of $h(t) = H(\sin(7t))$. Graph h on the interval [-5, 5].
10. Discuss the continuity of $f(x) = \lfloor x \rfloor$ and $g(x) = \lceil x \rceil$.
11. Discuss the continuity of $f(x) = \lfloor \sin(x) \rfloor$ and $g(x) = \lceil \sin(x) \rceil$.


﻿


Section 2.5


                to                    Some Consequences
       Differential Equations         Of Continuity


In this section we consider two properties of functions which are very closely connected
to the notion of continuity. The first of these, the Intermediate Value Theorem, says that
the graph of a continuous function is a connected continuum in the sense of our normal
intuition. That is, the theorem states that as a continuous function changes from one value
to another, it must take on every intermediate value. The second theorem, the Extreme
Value Theorem, says that a continuous function on a closed interval attains a maximum
and a minimum value on that interval. This is related to our intuitive notion that if we
draw a continuous curve with definite beginning and ending points, then the curve has
a point where it is higher than at any other point and a point where it is lower than
at any other point. We shall not attempt formal justifications of these theorems; such
justifications require inquiries into the subtleties of real numbers which are best left to
more advanced courses.
    We will begin with a statement of the Intermediate Value Theorem, followed by a
consideration of its application to solving equations.
Intermediate Value Theorem If f is a continuous function on a closed interval [a, b]
and m is any number between f(a) and f(b), then there is a number c in the interval [a, b]
such that f(c) = m.
Example     Since f(t) = sin(t) is continuous on $[0, \frac{\pi}{2}]$ with f(0) = 0 and $f(\frac{\pi}{2}) = 1$, the
fact that $0 < \frac{5}{2\pi} < 1$ guarantees that there is a number c in $[0, \frac{\pi}{2}]$ such that
\[ f(c) = \frac{5}{2\pi}. \]
Graphically, the situation is as in Figure 2.5.1. Of course, the theorem tells us neither
the value of c nor how we might find it. The Intermediate Value Theorem is an existence
theorem; it guarantees the existence of a certain value, but does not directly provide any
method for calculating the value.
Example     Suppose f(t) is the height, in inches, of a certain plant t days after it first
emerges from the soil. From our knowledge of how plants grow, it would be reasonable
to assume that f is a continuous function. Also, we have f(0) = 0. Now if f(10) = 12,
then we know, for example, that there is some time c, 0 < c < 10, such that f(c) = 5.
Of course, this is not surprising, and, in fact, we did not have to bring the subject of
continuous functions into the problem in order to realize that between the time when the
plant was 0 inches tall and the time when it was 12 inches there was a time when it
was 5 inches tall. However, the point of an example like this is to emphasize that the
Intermediate Value Theorem simply states a property that we should expect continuous
functions to have if they are to be used as mathematical models of real world processes
that undergo continuous change.

Figure 2.5.1 Intermediate Value Theorem: $\sin(c) = \frac{5}{2\pi}$

    As a special case, the Intermediate Value Theorem tells us that if f is a continuous
function on a closed interval [a, b] with f(a) and f(b) having opposite signs (that is, one
is negative and the other positive), then there is a point c in the open interval (a, b)
where f(c) = 0. In other words, under these conditions, the Intermediate Value Theorem
guarantees that the equation f(x) = 0 has at least one solution in [a, b]. Although the
theorem does not provide a method for solving the equation, it does provide a basis for
constructing an algorithm for approximating a solution to any desired accuracy.
Bisection Algorithm    Suppose f is continuous on $[a_1, b_1]$ and $f(a_1) f(b_1) < 0$ (an easy
way to check that $f(a_1)$ and $f(b_1)$ have opposite signs). Then, as above, the equation
\[ f(x) = 0 \tag{2.5.1} \]
has at least one solution in $[a_1, b_1]$. Let
\[ m_1 = \frac{a_1 + b_1}{2}. \]
If $f(m_1) = 0$, then we have found a solution to (2.5.1). If $f(m_1) \neq 0$, then either
\[ f(a_1) f(m_1) < 0, \]
in which case (2.5.1) has a solution in $[a_1, m_1]$, or
\[ f(m_1) f(b_1) < 0, \]
in which case (2.5.1) has a solution in $[m_1, b_1]$. In the first case, let $a_2 = a_1$ and $b_2 = m_1$;
in the second case, let $a_2 = m_1$ and $b_2 = b_1$. Then
\[ m_2 = \frac{a_2 + b_2}{2} \tag{2.5.2} \]
will approximate a solution to (2.5.1) with an error less than
\[ \frac{b_2 - a_2}{2}. \tag{2.5.3} \]
Proceed in the same manner to define $a_n$, $b_n$, and $m_n$ for n = 3, 4, 5, .... That is, if we
have found $a_{n-1}$, $b_{n-1}$, and $m_{n-1}$, and $f(m_{n-1}) \neq 0$, let
\[ a_n = a_{n-1} \text{ and } b_n = m_{n-1} \text{ if } f(a_{n-1}) f(m_{n-1}) < 0 \tag{2.5.4} \]
and
\[ a_n = m_{n-1} \text{ and } b_n = b_{n-1} \text{ if } f(m_{n-1}) f(b_{n-1}) < 0. \tag{2.5.5} \]
Then
\[ m_n = \frac{a_n + b_n}{2} \tag{2.5.6} \]
will approximate a solution of (2.5.1) with an error less than
\[ \frac{b_n - a_n}{2}. \tag{2.5.7} \]



Repeat the procedure as many times as necessary to obtain the desired level of accuracy.
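
The steps of the bisection algorithm translate directly into a short program. The following Python sketch is one possible rendering under stated assumptions: the function name `bisect`, the stopping rule (stop once the error bound (b - a)/2 is at most a tolerance `tol`), and the safety cap `max_iter` are all choices made here for illustration. The test function x^5 + x - 1 and the interval [0, 1] are used only as sample input.

```python
from typing import Callable, Tuple


def bisect(f: Callable[[float], float], a: float, b: float,
           tol: float = 1e-3, max_iter: int = 200) -> Tuple[float, float]:
    """Approximate a solution of f(x) = 0 in [a, b] by bisection.

    Returns (m, bound), where m is the last midpoint and bound is the
    error bound (b - a) / 2 for the interval that produced it.
    Assumes f is continuous on [a, b] and f(a) * f(b) < 0.
    """
    fa, fb = f(a), f(b)
    if fa * fb >= 0:
        raise ValueError("f(a) and f(b) must have opposite signs")
    for _ in range(max_iter):
        m = (a + b) / 2.0
        bound = (b - a) / 2.0
        fm = f(m)
        if fm == 0.0:
            return m, 0.0          # landed exactly on a solution
        if bound <= tol:
            return m, bound        # desired accuracy reached
        if fa * fm < 0:
            b, fb = m, fm          # a solution lies in [a, m]
        else:
            a, fa = m, fm          # a solution lies in [m, b]
    return (a + b) / 2.0, (b - a) / 2.0


root, bound = bisect(lambda x: x**5 + x - 1.0, 0.0, 1.0, tol=1e-3)
print(root, bound)
```

With these inputs the loop stops at the tenth midpoint, since the error bound after n steps on [0, 1] is 1/2^n and 1/2^10 is the first such value not exceeding 0.001. Function values at the endpoints are cached so that f is evaluated only once per step.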


Example Suppose we wish to find a root to the equation
\[ x^5 + x = 1. \tag{2.5.8} \]
First note that solving (2.5.8) is equivalent to solving
\[ x^5 + x - 1 = 0. \tag{2.5.9} \]
Letting
\[ f(x) = x^5 + x - 1, \]
we may write (2.5.9) as f(x) = 0. To find an initial interval $[a_1, b_1]$, we graph f as in
Figure 2.5.2. Noting that f(0) = -1 and f(1) = 1, we may start with $a_1 = 0$ and $b_1 = 1$.
That is, (2.5.8) has a solution in the interval [0, 1]. Then
\[ m_1 = \frac{0 + 1}{2} = 0.5. \]
Now f(0.5) = -0.468750, so f(0.5)f(1) < 0. Hence $a_2 = 0.5$, $b_2 = 1$, and
\[ m_2 = \frac{0.5 + 1.0}{2} = 0.75. \]
Now f(0.75) = -0.012695, so f(0.75)f(1) < 0. Hence $a_3 = 0.75$, $b_3 = 1$, and
\[ m_3 = \frac{0.75 + 1.00}{2} = 0.875. \]


﻿


Figure 2.5.2 Graph of $f(x) = x^5 + x - 1$

At this stage we know that 0.875 is an approximation for a solution to (2.5.8) with an
error of no more than
\[ \frac{1.00 - 0.75}{2} = 0.125. \]
We may continue in this manner until we attain any desired level of accuracy. The following
table gives the values of $a_n$ and $b_n$ for n = 1, 2, 3, ..., 10.

 n    a_n           b_n           m_n           f(a_n)         f(b_n)        f(m_n)
 1    0.000000000   1.000000000   0.500000000   -1.000000000   1.000000000   -0.468750000
 2    0.500000000   1.000000000   0.750000000   -0.468750000   1.000000000   -0.012695300
 3    0.750000000   1.000000000   0.875000000   -0.012695300   1.000000000    0.387909000
 4    0.750000000   0.875000000   0.812500000   -0.012695300   0.387909000    0.166593000
 5    0.750000000   0.812500000   0.781250000   -0.012695300   0.166593000    0.072288300
 6    0.750000000   0.781250000   0.765625000   -0.012695300   0.072288300    0.028700600
 7    0.750000000   0.765625000   0.757812500   -0.012695300   0.028700600    0.007736990
 8    0.750000000   0.757812500   0.753906250   -0.012695300   0.007736990   -0.002544540
 9    0.753906250   0.757812500   0.755859380   -0.002544540   0.007736990    0.002579770
10    0.753906250   0.755859380   0.754882815   -0.002544540   0.002579770


Rounding to three decimal places, we see that x = 0.755 approximates a solution of (2.5.8)
with an error of no more than, to three decimal places,

                               0.756 - 0.754
                                            = 0.001.
                                    2
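
Because each step halves the interval, the error bound (2.5.7) after n steps equals $(b_1 - a_1)/2^n$, provided no midpoint happens to be an exact solution. This makes it possible to predict how many steps a given accuracy requires. The Python sketch below is an illustration of that count; the function name `iterations_needed` is introduced here and is not part of the text.

```python
import math


def iterations_needed(a1: float, b1: float, tol: float) -> int:
    """Smallest n >= 1 with (b1 - a1) / 2**n <= tol."""
    if tol <= 0 or b1 <= a1:
        raise ValueError("need b1 > a1 and tol > 0")
    return max(1, math.ceil(math.log2((b1 - a1) / tol)))


print(iterations_needed(0.0, 1.0, 0.125))   # bound reached at m_3 in the example
print(iterations_needed(0.0, 1.0, 0.001))   # accuracy of the table above
```

For the interval [0, 1] this gives 3 steps for an error bound of 0.125 and 10 steps for a bound of 0.001, in agreement with the worked example and the ten rows of the table.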


    In Section 3.6 we will discuss another method, called Newton's method, for approx-
imating a solution to an equation of the form f(x) = 0. At that time we will see that
Newton's method is faster than the bisection algorithm. However, we will also see that
there are conditions under which Newton's method will fail, whereas the bisection algo-
rithm will always work.


﻿


Figure 2.5.3 Graph of $f(x) = x^2$ on [-1, 3]


    We now turn to the Extreme Value Theorem and some of its consequences.

Extreme Value Theorem If f is a continuous function on a closed interval [a, b], then
there exists a point c in [a, b] such that $f(c) \ge f(x)$ for all values of x in [a, b]. Similarly,
there exists a point d in [a, b] such that $f(d) \le f(x)$ for all values of x in [a, b].

    In other words, using the notation of the statement of the theorem, f(c) is the maxi-
mum value attained by f on [a, b] and f(d) is the minimum value attained by f on [a, b].
As with the Intermediate Value Theorem, this is an existence theorem which does not
indicate any method for finding the points c and d. The importance of the theorem lies in
the fact that it gives conditions under which maximum and minimum values of a function
are guaranteed to exist. Optimization problems, that is, problems concerned with finding
the maximum and minimum values of functions, occur frequently in mathematics and in
the applications of mathematics. As we shall see in Section 3.8, conditions which guarantee
the existence of a solution to an optimization problem, such as those given in the Extreme
Value Theorem, are often an important first step in solving such problems.

Example Consider f (x) = x2 on the interval [-1, 3]. Since f is a continuous function on
this closed interval, the Extreme Value Theorem guarantees the existence of a maximum
value and a minimum value for f. In fact, from our knowledge of the behavior of this
function, in particular that f(0) = 0, f(x) > 0 if $x \neq 0$, and f(x) > f(y) if |x| > |y|, it is
easy to see that f(x) attains its maximum value when x = 3 and its minimum value when
x = 0 (see Figure 2.5.3). Hence the maximum value of f on [-1, 3] is 9 when x = 3 and
the minimum value is 0 when x = 0.

Example Let A, B, and C be constants with A > 0. Suppose we wish to find the
minimum value of the quadratic polynomial
\[ f(x) = Ax^2 + Bx + C \tag{2.5.10} \]
on an interval [a, b]. Completing the square, we may rewrite f as
\begin{align*}
f(x) &= Ax^2 + Bx + C \\
 &= A\left(x^2 + \frac{B}{A}x + \frac{C}{A}\right) \\
 &= A\left(\left(x + \frac{B}{2A}\right)^2 - \frac{B^2}{4A^2} + \frac{C}{A}\right) \\
 &= A\left(x + \frac{B}{2A}\right)^2 + C - \frac{B^2}{4A}.
\end{align*}
Since $C - \frac{B^2}{4A}$ is a constant, f(x) is minimized when $A\left(x + \frac{B}{2A}\right)^2$ is minimized. This latter
term is never negative and is minimized when it is 0, that is, when
\[ x + \frac{B}{2A} = 0. \]
Hence the minimum value of f(x) on [a, b] will occur when
\[ x = -\frac{B}{2A}, \tag{2.5.11} \]

unless this point is not in the interval, in which case the minimum value occurs at one of
the endpoints, x = a or x = b. Note that, geometrically, (2.5.11) is the location of the
vertex of the parabola which is the graph of f. Note that if A < 0, then the maximum
value of f (x) would occur at (2.5.11) if it is in the interval [a, b], and at one of the endpoints
otherwise.
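
The rule just described (use the vertex if it lies in the interval, otherwise compare the endpoints) can be written as a short function. The following Python sketch handles only the case A > 0 treated in the example; the function name `quadratic_min_on_interval` and the sample inputs are introduced here for illustration.

```python
def quadratic_min_on_interval(A: float, B: float, C: float,
                              a: float, b: float):
    """Minimize f(x) = A x^2 + B x + C on [a, b], assuming A > 0 and a <= b.

    Returns (x_min, f(x_min)).
    """
    if A <= 0:
        raise ValueError("this sketch assumes A > 0")

    def f(x):
        return A * x * x + B * x + C

    vertex = -B / (2.0 * A)               # location (2.5.11)
    if a <= vertex <= b:
        x_min = vertex                    # vertex lies in the interval
    else:
        x_min = a if f(a) <= f(b) else b  # otherwise an endpoint
    return x_min, f(x_min)


print(quadratic_min_on_interval(1.0, 0.0, 0.0, -1.0, 3.0))   # f(x) = x^2 on [-1, 3]
print(quadratic_min_on_interval(1.0, -4.0, 5.0, 3.0, 5.0))   # vertex x = 2 is outside [3, 5]
```

In the first call the vertex x = 0 lies in the interval and gives the minimum value 0; in the second the vertex lies to the left of the interval, so the minimum occurs at the left endpoint x = 3. For A < 0 the same comparison, with the inequalities reversed, locates the maximum.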


Figure 2.5.4 A field of length x and width y

Example Suppose we wish to fence in a rectangular field with 500 yards of fencing in
such a way that we maximize the area of the resulting field. If, as in Figure 2.5.4, we let
x denote the length of the field, y its width, and A its area, then
\[ A = xy. \]
Moreover, since we only have 500 yards of fencing to work with, we know that
\[ 2x + 2y = 500. \]
Hence
\[ y = 250 - x, \tag{2.5.12} \]
from which it follows that
\[ A = xy = x(250 - x) = 250x - x^2. \]
From (2.5.12), and the fact that we must have both $x \ge 0$ and $y \ge 0$, it follows that
$0 \le x \le 250$. Thus our problem becomes one of finding the maximum value of
\[ A = -x^2 + 250x \]
on the closed interval [0, 250]. From our previous example, the maximum value of A will
occur when
\[ x = -\frac{250}{(2)(-1)} = \frac{250}{2} = 125. \]
From (2.5.12), we have y = 125 when x = 125. Hence the area of the field is maximized
when its dimensions are 125 yards by 125 yards. For these dimensions, the area of the field
is
\[ A = (125)(125) = 15{,}625 \text{ square yards.} \]

Figure 2.5.5 Graph of $A = 250x - x^2$ on [0, 250]


See Figure 2.5.5 for the graph of A.
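
As a sanity check on this answer, the area can be evaluated over a grid of candidate lengths. The Python sketch below is a brute-force illustration only; the grid spacing of 0.1 yard and the names `area`, `candidates`, and `best_x` are choices made here. It does not replace the argument above, which shows that the maximum occurs at the vertex.

```python
def area(x: float) -> float:
    """Area A = x(250 - x) of a field of length x, using y = 250 - x."""
    return x * (250.0 - x)


# Candidate lengths 0.0, 0.1, ..., 250.0 on the closed interval [0, 250].
candidates = [k / 10.0 for k in range(0, 2501)]
best_x = max(candidates, key=area)
print(best_x, 250.0 - best_x, area(best_x))
```

The search returns a length of 125 yards, a width of 125 yards, and an area of 15,625 square yards, matching the values found from (2.5.11). A grid search can only find the best point on the grid, so agreement here depends on 125 being one of the candidates.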
Example Consider the function $f(x) = x^2 + 1$ on the open interval (0, 1). Then
\[ \lim_{x \to 0^+} f(x) = 1 \]
and
\[ \lim_{x \to 1^-} f(x) = 2, \]
but 1 < f(x) < 2 for all values of x in (0, 1). Hence, as x approaches 0 from the right,
f(x) approaches, but never reaches, 1; similarly, as x approaches 1 from the left, f(x)
approaches, but never reaches, 2. Thus f is an example of a continuous function on an
open interval which attains neither a maximum nor a minimum value on the interval.
Hence we see why the interval in the statement of the Extreme Value Theorem must be a
closed interval.

Problems

 1. Use the bisection algorithm to approximate a solution to each of the following equations
    on the given interval. Your answer should have an error of no more than 0.005.
    (a) $x^2 - 2 = 0$ on $[0, 4]$                   (b) $x^5 - 6x^3 + 2x = 2$ on $[-1, 1]$
    (c) $\cos(x) = x$ on $[0, \pi]$                   (d) 2 sin(x) x=/ + 1 on [0, 2]
 2. (a) Plot the graph of $g(t) = t^2 - \cos^2(t)$ on $[-\pi, \pi]$.
    (b) How many solutions are there to the equation $t^2 = \cos^2(t)$?
    (c) Use the bisection algorithm to estimate the solutions to the equation $t^2 = \cos^2(t)$.
        State your answers with an error less than 0.005.
 3. Suppose if the market price for a certain product is p dollars, then the demand for
    that product will be
                                       50000p + 10000).
                                D~p) =2               units.
    At the same time, suppose that at a price of p dollars producers will be willing to
    supply
                                  S(p) = -p2 + 2p units.
                                          3
    (a) Plot the graphs of D and S on the same graph.
    (b) Use the bisection algorithm to estimate the solution to the equation

                                         D(p) = S(p).

        This point is called the equilibrium price because it is the price for which the
        consumers' demand for the product is exactly equal to the manufacturers' supply.
    (c) How many units of the product will be manufactured at the equilibrium price?
    (d) What would happen if the producers raised the price above the equilibrium price?
       What would happen if they lowered the price below the equilibrium price?
    (e) What would happen if the producers increased production? What would happen
        if they lowered production?
 4. A farmer wishes to fence in a rectangular field, using a straight river for one side, with
    500 yards of fencing. What should the dimensions of the field be in order to maximize
    the area of the field?
 5. When a potter sells his pots for $p$ dollars apiece, he can sell $D(p) = 750 - 50p$ of them.
    Suppose the pots cost him \$5.00 apiece to make. What price should the potter charge
    in order to maximize his profit?
 6. Let $h(t) = t^4 - 1$.
    (a) Does $h$ have a maximum value on $[-1, 2)$?
    (b) Does $h$ have a minimum value on $[-1, 2)$?
    (c) Are the results of (a) and (b) consistent with the Extreme Value Theorem? Explain.
 7. Recall that the Heaviside function is defined by
    $$H(t) = \begin{cases} 0, & \text{if } t < 0, \\ 1, & \text{if } t \ge 0. \end{cases}$$
    (a) Note that $H(-1) = 0$ and $H(1) = 1$. Is there a point $c$ in $(-1, 1)$ such that
        $H(c) = 0.5$?
    (b) Is the result of (a) consistent with the Intermediate Value Theorem?
    (c) Does $H$ attain a maximum value on $[-1, 1]$? Does $H$ attain a minimum value on
        $[-1, 1]$?
    (d) Are the results of (c) consistent with the Extreme Value Theorem?
 8. Suppose g is defined on [-1, 1] by

                                     {| 1t|    ft, if t #0,


    (a) Does g attain a maximum value on [-1, 1]? If so, at what points?
    (b) Does g attain a minimum value on [-1, 1]? If so, at what points?
    (c) Are the results of (a) and (b) consistent with the Extreme Value Theorem?
 9. Suppose f and g are continuous on [0, 1], f(0) < g(0) and f(1) > g(1). Show that
    there exists a point c in the open interval (0, 1) such that f(c) = g(c).
10. Suppose f is continuous on [0, 1] and 0 < f(x) < 1 for all x in [0, 1]. Show that there
    exists a point c in [0, 1] such that f(c) = c.


﻿


Section 3.1


       Differential Equations         Best Affine Approximations


We are now in a position to discuss the two central problems of calculus as mentioned in
Section 1.1. In this chapter we will take up the problem of finding tangent lines; in Chapter
4 we will consider the problem of finding areas. We choose this order only because the
work we do in solving the tangent line problem in this chapter will be of use, through the
Fundamental Theorem of Calculus, in solving area problems in the next.
    We begin with some preliminary notation and terminology. If $f$ is a function with
domain contained in the set $A$ and range contained in the set $B$, then we may denote this
fact by writing $f : A \to B$. For example, if $g(t) = 1 - t^2$ and $\mathbb{R}$ denotes the set of real
numbers, then the statements $g : \mathbb{R} \to \mathbb{R}$, $g : [-1, 1] \to \mathbb{R}$, and $g : [-1, 1] \to [0, 1]$ are all
correct. We will work exclusively with functions of the form $f : \mathbb{R} \to \mathbb{R}$ until Chapter 7,
where we will introduce functions of the form $f : \mathbb{R} \to \mathbb{C}$ and $f : \mathbb{C} \to \mathbb{C}$, where $\mathbb{C}$ denotes
the set of complex numbers.
    We call a function $f : \mathbb{R} \to \mathbb{R}$ linear if there is a constant $m$ such that $f(x) = mx$ for
all values of $x$. Graphically, linear functions are functions whose graphs are straight lines
passing through the origin. We call a function $f : \mathbb{R} \to \mathbb{R}$ affine if there are constants
$m$ and $b$ such that $f(x) = mx + b$ for all values of $x$. Graphically, affine functions are
functions whose graphs are straight lines, not necessarily passing through the origin. Put
another way, an affine function is a polynomial of degree at most one. Thus $f(x) = 3x$ is both linear
and affine, whereas $g(t) = 4t - 6$ is affine but not linear.
    The problem of finding the tangent line for the graph of a given function $f$ at a point
$(x_0, y_0)$ is really the problem of finding the affine function $T$ which best approximates
$f$ for points close to $x_0$. In this section we will discuss how to solve this problem. In
the remaining sections of this chapter we will consider techniques for finding best affine
approximations and discuss some applications. In Chapter 5 we will see how to improve
upon affine approximations by using higher degree polynomials.
    The following example should help to make these ideas more concrete.
Example     Consider the problem of approximating the function $f(x) = \sqrt{x}$ for values of
$x$ close to 1. For a first approximation, we might say that
$$\sqrt{x} \approx 1$$
for $x$ close to 1. In other words, if we let
$$T(x) = 1$$
for all $x$, then we are saying that the affine function $T$ is a good approximation for $f$ when
$x$ is close to 1. Two facts characterize this statement. First, $T$ and $f$ agree at 1; that is,
$$T(1) = 1 = f(1). \tag{3.1.1}$$


Copyright © by Dan Sloughter 2000


﻿


Figure 3.1.1 Graph of $f(x) = \sqrt{x}$ and an approximating affine function


Second, the error committed by approximating $f$ by $T$ goes to 0 as $x$ approaches 1. That
is, if we let
$$r(x) = f(x) - T(x),$$
then $r(x)$ is the error made in approximating $f$ by $T$ at the point $x$, and
$$\lim_{x \to 1} r(x) = \lim_{x \to 1} (f(x) - T(x)) = \lim_{x \to 1} (\sqrt{x} - 1) = 1 - 1 = 0. \tag{3.1.2}$$
Hence we have found an affine function which approximates our function f about x = 1
according to some reasonable criterion.
    However, it is easy to see that any affine function T whose graph passes through (1, 1)
will satisfy (3.1.1) and (3.1.2). First note that if the graph of T is a straight line passing
through (1, 1) with slope m, then, using the point-slope form for the equation of a line,

                                 T(x) = m(x - 1) + 1.

It then follows that
                                    T (1) = 1 = f (1)
and, if we again let $r(x) = f(x) - T(x)$,
$$\lim_{x \to 1} r(x) = \lim_{x \to 1} (f(x) - T(x)) = \lim_{x \to 1} \left( \sqrt{x} - (m(x - 1) + 1) \right) = 1 - 1 = 0.$$
See Figure 3.1.1 for the geometrical interpretation. So now we must ask if there is a value
of $m$ which makes $T$, in some sense, better than any other affine function for approximating
$f$ for $x$ near 1. In answering this question, it is convenient to let $h = x - 1$ and to define
$$R(h) = r(1 + h) = f(1 + h) - T(1 + h),$$

the amount of error committed when $f$ is approximated by $T$ at a point a distance $h$ from
1. Since $h$ approaches 0 as $x$ approaches 1, (3.1.1) and (3.1.2) become, in terms of $R$,
$$R(0) = 0 \tag{3.1.3}$$
and
$$\lim_{h \to 0} R(h) = 0. \tag{3.1.4}$$
In this case, we have
$$R(h) = \sqrt{1 + h} - (m((1 + h) - 1) + 1) = \sqrt{1 + h} - (mh + 1).$$

Figure 3.1.2 Graphs of $|R(h)|$ for different values of $m$


Figure 3.1.2 shows the graphs of |R(h)| on the interval [-0.2,0.2] for m = 0.1, 0.3, 0.4,
0.5, 0.6, 0.7, and 0.9. Note that although all these functions approach 0 as h approaches 0,
one of the graphs clearly stands out from the others. Namely, when m = 0.5, the absolute
value of the approximation error appears to approach 0 at a significantly faster rate than
does the error for other values of m. To see why this is so, consider that

$$\begin{aligned}
R(h) &= \sqrt{1 + h} - (mh + 1) \\
     &= \left( \sqrt{1 + h} - (mh + 1) \right) \frac{\sqrt{1 + h} + (mh + 1)}{\sqrt{1 + h} + (mh + 1)} \\
     &= \frac{1 + h - (mh + 1)^2}{\sqrt{1 + h} + mh + 1} \\
     &= \frac{1 + h - (m^2 h^2 + 2mh + 1)}{\sqrt{1 + h} + mh + 1} \\
     &= \frac{h(1 - 2m) - m^2 h^2}{\sqrt{1 + h} + mh + 1}.
\end{aligned}$$

Note that when $m = 0.5$, the numerator reduces to $-m^2 h^2$, whereas for other values of $m$
there is also the term $h(1 - 2m)$. This explains why in Figure 3.1.2 the graph for $m = 0.5$
looks parabolic while the other graphs appear more as straight lines. Moreover, since, for
small values of $h$, $h^2$ is significantly smaller than $h$ (for example, $(0.001)^2 = 0.000001$ is
much smaller than 0.001), we see why the approximation errors when $m = 0.5$ are so much
smaller than they are for other values of $m$.


﻿



    Intuitively, we should think that for small values of $h$, $R(h)$ behaves like a multiple of
$h$ when $m \ne 0.5$ and like a multiple of $h^2$ when $m = 0.5$. To see this algebraically, it is
useful to consider the quotient
$$\frac{R(h)}{h} = \frac{1 - 2m - m^2 h}{\sqrt{1 + h} + mh + 1}.$$
Notice that
$$\lim_{h \to 0} \frac{R(h)}{h} = \frac{1 - 2m}{2},$$
which is 0 only when $m = 0.5$. We interpret this as an indication that $R(h)$ behaves like
a multiple of $h$ for small values of $h$ when $m \ne 0.5$, but approaches 0 more rapidly than $h$
when $m = 0.5$.
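
The limit of the quotient can be checked numerically. The sketch below is an illustration added here, not part of the original development; the helper names `R` and `quotient_limit` and the sample value $h = 10^{-4}$ are assumptions chosen for the demonstration.

```python
import math

def R(h, m):
    """Error sqrt(1 + h) - (m*h + 1) of the affine approximation with slope m."""
    return math.sqrt(1 + h) - (m * h + 1)

def quotient_limit(m):
    """Limit of R(h)/h as h -> 0 derived in the text: (1 - 2m)/2."""
    return (1 - 2 * m) / 2

h = 1e-4  # a small but nonzero step
for m in (0.1, 0.3, 0.5, 0.7, 0.9):
    # The second and third columns should nearly agree for every slope m.
    print(m, R(h, m) / h, quotient_limit(m))
```

Only for $m = 0.5$ should the quotient itself be close to 0; for every other slope it settles near the nonzero value $(1 - 2m)/2$.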
    In our example, we saw that
$$\lim_{h \to 0} \frac{R(h)}{h} = 0$$
when $m = 0.5$, but
$$\lim_{h \to 0} \frac{R(h)}{h} \ne 0$$
for all other values of $m$. We distinguish the two cases by saying that in the first case $R(h)$
is $o(h)$, whereas in the second case $R(h)$ is only $O(h)$.
Definition     A function $f$ is said to be $o(h)$ if
$$\lim_{h \to 0} \frac{f(h)}{h} = 0. \tag{3.1.5}$$

Definition     A function $f$ is said to be $O(h)$ if there exist constants $M$ and $\epsilon > 0$ such
that
$$\left| \frac{f(h)}{h} \right| \le M \tag{3.1.6}$$
whenever $-\epsilon < h < \epsilon$.
    Note that if
$$\lim_{h \to 0} \frac{f(h)}{h} = L,$$
then we may find an $\epsilon > 0$ such that
$$\left| \frac{f(h)}{h} - L \right| < 1$$
whenever $|h| < \epsilon$. Hence
$$L - 1 < \frac{f(h)}{h} < L + 1$$
whenever $|h| < \epsilon$. If we let $M$ be the larger of $|L - 1|$ and $|L + 1|$, then this shows that
$$\left| \frac{f(h)}{h} \right| \le M$$
whenever $-\epsilon < h < \epsilon$. Hence we have the following proposition.
Proposition     If $\displaystyle \lim_{h \to 0} \frac{f(h)}{h}$ exists, then $f$ is $O(h)$.

Figure 3.1.3 Rates of convergence to 0 of $f(x) = x^2$, $g(x) = x$, and $k(x) = x^{1/3}$

    Note that a function which is o(h) is also O(h). Intuitively, we think of a function
which is o(h) as approaching 0 faster than h as h goes to 0, and a function which is O(h)
as approaching 0 at a rate which is at least as fast as that of h.
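Example     Let $f(x) = x^2$, $g(x) = x$, and $k(x) = x^{1/3}$. Then
$$\lim_{h \to 0} f(h) = \lim_{h \to 0} h^2 = 0,$$
$$\lim_{h \to 0} g(h) = \lim_{h \to 0} h = 0,$$
and
$$\lim_{h \to 0} k(h) = \lim_{h \to 0} h^{1/3} = 0.$$
However,
$$\lim_{h \to 0} \frac{f(h)}{h} = \lim_{h \to 0} \frac{h^2}{h} = \lim_{h \to 0} h = 0,$$
so $f$ is $o(h)$;
$$\lim_{h \to 0} \frac{g(h)}{h} = \lim_{h \to 0} \frac{h}{h} = \lim_{h \to 0} 1 = 1,$$
so $g$ is $O(h)$, but not $o(h)$; and
$$\lim_{h \to 0} \frac{k(h)}{h} = \lim_{h \to 0} \frac{h^{1/3}}{h} = \lim_{h \to 0} \frac{1}{h^{2/3}} = \infty,$$
so $k$ is neither $o(h)$ nor $O(h)$. Note in Figure 3.1.3 the difference in the way in which these
functions approach 0.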
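
The three behaviors can be seen by tabulating the quotients for shrinking $h$. This is a minimal sketch added for illustration; it reads $k$ as the cube root (which is what the conclusion that $k$ is neither $o(h)$ nor $O(h)$ requires) and, as a simplifying assumption, samples only positive values of $h$.

```python
def f(x):
    return x ** 2

def g(x):
    return x

def k(x):
    # Cube root; valid as written only for x > 0, which is all we sample here.
    return x ** (1 / 3)

for h in (1e-1, 1e-2, 1e-3, 1e-4, 1e-5, 1e-6):
    # Columns: h, f(h)/h (shrinks to 0), g(h)/h (stays at 1), k(h)/h (grows).
    print(h, f(h) / h, g(h) / h, k(h) / h)
```

A table of this kind is only suggestive, since no finite list of values proves a limit, but it matches the three cases computed above.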


﻿


Example     Let $f(x) = x - x^2$. Then
$$\lim_{h \to 0} \frac{f(h)}{h} = \lim_{h \to 0} \frac{h - h^2}{h} = \lim_{h \to 0} (1 - h) = 1,$$
so $f$ is $O(h)$, but not $o(h)$.


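Example     Let $g(x) = 2\sqrt{1 + x} - x - 2$. Then
$$\begin{aligned}
\lim_{h \to 0} \frac{g(h)}{h} &= \lim_{h \to 0} \frac{2\sqrt{1 + h} - h - 2}{h} \\
&= \lim_{h \to 0} \left( \frac{2\sqrt{1 + h} - (h + 2)}{h} \right) \left( \frac{2\sqrt{1 + h} + (h + 2)}{2\sqrt{1 + h} + (h + 2)} \right) \\
&= \lim_{h \to 0} \frac{4(1 + h) - (h + 2)^2}{h \left( 2\sqrt{1 + h} + (h + 2) \right)} \\
&= \lim_{h \to 0} \frac{4 + 4h - (h^2 + 4h + 4)}{h \left( 2\sqrt{1 + h} + h + 2 \right)} \\
&= \lim_{h \to 0} \frac{-h^2}{h \left( 2\sqrt{1 + h} + h + 2 \right)} \\
&= \lim_{h \to 0} \frac{-h}{2\sqrt{1 + h} + h + 2} \\
&= \frac{0}{4} = 0.
\end{aligned}$$
Thus $g$ is $o(h)$.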


Example     Returning to the problem of approximating $f(x) = \sqrt{x}$ for $x$ close to 1, let
$$T(x) = m(x - 1) + 1$$
and
$$R(h) = f(1 + h) - T(1 + h).$$
We saw above that
$$\lim_{h \to 0} \frac{R(h)}{h} = \frac{1 - 2m}{2}.$$
Thus $R(h)$ is $o(h)$ if and only if $m = 0.5$. In other words, the error in approximating
$f(x) = \sqrt{x}$ by the affine function
$$T(x) = \frac{1}{2}(x - 1) + 1 \tag{3.1.7}$$
goes to 0 faster as $x$ approaches 1 than the error for any other affine function approximation.
Because of this, we will call (3.1.7) the best affine approximation to $f$ at 1. Moreover, we
will call the graph of $T$, which is a straight line through $(1, 1)$ with slope 0.5, the tangent
line to the graph of $f$ at $(1, 1)$. See Figure 3.1.4.

Figure 3.1.4 Graph of $f(x) = \sqrt{x}$ and its tangent line at $(1, 1)$

    As an example of using $T$ to approximate $f$, note that, to 4 decimal places,
$$\sqrt{1.1} = 1.0488,$$
while
$$T(1.1) = \frac{1}{2}(1.1 - 1) + 1 = 1.05,$$
giving a remainder of only
$$R(0.1) = 1.0488 - 1.0500 = -0.0012.$$
This approximation is remarkably accurate considering the simplicity of the calculations
used to obtain it. Of course, we expect the accuracy of the approximation to increase as
$h$ decreases. For example, to 4 decimal places,
$$\sqrt{1.05} = 1.0247,$$
while
$$T(1.05) = \frac{1}{2}(1.05 - 1) + 1 = 1.025,$$
giving a remainder of only
$$R(0.05) = 1.0247 - 1.0250 = -0.0003.$$
Note that when we decreased $h$ from 0.1 to 0.05, a factor of $\frac{1}{2}$, the remainder went from $-0.0012$
to $-0.0003$, a factor of $\frac{1}{4}$. This is evidence of the quadratic nature of the error, the fact
that $R(h)$ is approaching 0 like $h^2$, not like $h$.
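
The same halving experiment can be continued for smaller values of $h$. The following sketch is an added illustration under the assumption that floating-point square roots are accurate enough for our purposes; the names `T` and `R` simply mirror the notation of the text.

```python
import math

def T(x):
    """Best affine approximation to sqrt(x) at 1, equation (3.1.7)."""
    return 0.5 * (x - 1) + 1

def R(h):
    """Remainder f(1 + h) - T(1 + h) for f(x) = sqrt(x)."""
    return math.sqrt(1 + h) - T(1 + h)

h = 0.1
for _ in range(5):
    # If R(h) behaves like a multiple of h^2, halving h should
    # multiply the remainder by roughly 1/4.
    print(h, R(h), R(h / 2) / R(h))
    h /= 2
```

The last column should hover near 0.25, in contrast with the value near 0.5 one would expect from an error that is only proportional to $h$.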

    Using the ideas of this example, we may now make the following definition.


﻿




Definition     Let $f$ be a function defined in an open interval about a point $c$. If $T$ is an
affine function such that $T(c) = f(c)$ and
$$R(h) = f(c + h) - T(c + h)$$
is $o(h)$, then we call $T$ the best affine approximation to $f$ at $c$. Moreover, the graph of $T$
is called the tangent line to the graph of $f$ at $(c, f(c))$.
    Using the point-slope form for the equation of a line, the equation of the tangent line
at $(c, f(c))$ may be written in the form
$$y - f(c) = m(x - c)$$
for some slope $m$. That is,
$$y = m(x - c) + f(c),$$
or, in other words, the best affine approximation has the form
$$T(x) = m(x - c) + f(c).$$
Thus, to determine $T$, we need only find the value of $m$. Since this number $m$ is of such
importance, we give it a formal definition.
Definition     If
$$T(x) = m(x - c) + f(c)$$
is the best affine approximation to $f$ at $c$, then we call $m$, the slope of the graph of $T$, the
derivative of $f$ at $c$. This value is denoted by $f'(c)$.
    With this notation, the best affine approximation has the form
$$T(x) = f'(c)(x - c) + f(c). \tag{3.1.8}$$

Example     As a consequence of our previous example, if $f(x) = \sqrt{x}$, then
$$f'(1) = \frac{1}{2}.$$

Example     Let $f(x) = x^2$ and suppose we wish to find the best affine approximation to
$f$ at 3. Then $f(3) = 9$, so we will let
$$T(x) = m(x - 3) + 9$$
and
$$R(h) = f(3 + h) - T(3 + h) = (3 + h)^2 - (mh + 9).$$
Hence
$$R(h) = 9 + 6h + h^2 - mh - 9 = h(6 + h - m),$$
and so
$$\lim_{h \to 0} \frac{R(h)}{h} = \lim_{h \to 0} \frac{h(6 + h - m)}{h} = \lim_{h \to 0} (6 - m + h) = 6 - m.$$
Thus $R(h)$ is $o(h)$ if and only if $m = 6$. It follows then that $f'(3) = 6$ and the best affine
approximation to $f$ at 3 is
$$T(x) = 6(x - 3) + 9.$$
The equation of the tangent line at $(3, 9)$ is
$$y = 6(x - 3) + 9,$$
or, equivalently,
$$y = 6x - 9.$$
See Figure 3.1.5.

Figure 3.1.5 Graph of $f(x) = x^2$ and its tangent line at $(3, 9)$

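
The role of the slope $m = 6$ can be illustrated numerically. This is a small sketch added here, not part of the original text; the candidate slopes 5, 6, and 7 and the sample values of $h$ are arbitrary choices.

```python
def R(h, m):
    """Remainder (3 + h)^2 - (m*h + 9) for f(x) = x^2 and candidate slope m."""
    return (3 + h) ** 2 - (m * h + 9)

for m in (5, 6, 7):
    for h in (0.1, 0.01, 0.001):
        # Algebraically R(h)/h = 6 - m + h, so the quotient
        # tends to 0 only when m = 6.
        print(m, h, R(h, m) / h)
```

For $m = 6$ the printed quotient is essentially $h$ itself, while for the other slopes it stays near $6 - m$.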
    In Sections 3.2 through 3.5 we will explore techniques which will simplify greatly the
process of finding derivatives.

Problems


1. Consider the problem of finding an affine approximation for $f(x) = \sin(x)$ near 0.
   Since $f(0) = 0$, we let $T(x) = mx$ and
   $$R(h) = f(h) - T(h) = \sin(h) - mh.$$


   (a) Plot |R(h)| on the interval [-0.2, 0.2] for m = 0.0, 0.2, 0.4, ... , 2.0.
   (b) Which value of m gives the smallest errors?
2. For each of the following, decide if the given function is O(h), o(h), or neither.

   (a) f (x) = x3                              (b) f (x) = x2 + 3x


﻿




   (c) g(t) = 4t3 - 3t2                     (d) g(x) = v/4 + x'- 4- 2
                                                                4
   (e) f (t) = t 3                          (f) g (t) = t- t 5
                         1 1
3. Let f (x) =V2, T(x) = -(x - 9) + 3, and S(x) = 6(x - 9) + 3.

   (a) Graph f, T, and S together. Note that the graphs of T and S are straight lines
      passing through the point (9, 3) on the graph of f.
   (b) Let $R_T(h) = f(9 + h) - T(9 + h)$. Is $R_T(h)$ $o(h)$? Is it $O(h)$?
   (c) Let $R_S(h) = f(9 + h) - S(9 + h)$. Is $R_S(h)$ $o(h)$? Is it $O(h)$?
   (d) Which of $T$ or $S$ is the best affine approximation to $f$ at 9?
   (e) Use the best affine approximation to $f$ at 9 to approximate $\sqrt{10}$, $\sqrt{8.9}$, and $\sqrt{9.3}$.
      Compare these approximations with values from your calculator.
4. Let $g(z) = z^2$, $T(z) = 2(z - 1) + 1$, and $S(z) = 3(z - 1) + 1$.
   (a) Graph $g$, $T$, and $S$ together. Note that the graphs of $T$ and $S$ are straight lines
      passing through the point $(1, 1)$ on the graph of $g$.
   (b) Let $R_T(h) = g(1 + h) - T(1 + h)$. Is $R_T(h)$ $o(h)$? Is it $O(h)$?
   (c) Let $R_S(h) = g(1 + h) - S(1 + h)$. Is $R_S(h)$ $o(h)$? Is it $O(h)$?
   (d) Which of $T$ or $S$ is the best affine approximation to $g$ at 1?
   (e) Use the best affine approximation to $g$ at 1 to approximate $(1.1)^2$ and $(0.999)^2$.
      Compare these approximations with values from your calculator.
5. Find the best affine approximation to $f(x) = 2x^2$ at 1. What is $f'(1)$?
                                              1
6. Find the best affine approximation to g(t) = - at 1. What is g'(1)?

7. Find the best affine approximation to $f(t) = t^2 + t - 1$ at 0. What is $f'(0)$?


﻿


Section 3.2


       Differential Equations         Best Affine Approximations, Derivatives, and Rates of Change


In this section we will take up the general question of how to find best affine approximations
and also discuss an interpretation of the derivative of a function as an instantaneous rate
of change. We will consider specific computational procedures for finding derivatives in
Sections 3.3 through 3.5.
    To begin, suppose f is a function defined on an open interval containing the point c
and let T be an affine function with T(c) = f(c). As in Section 3.1, we may write T in the
form
                               T(x) = m(x - c) + f (c)                        (3.2.1)

for some constant m. Let

                 R(h) = f (c + h) - T(c + h) = f (c + h) - mh - f (c).    (3.2.2)

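Then
$$\begin{aligned}
\lim_{h \to 0} \frac{R(h)}{h} &= \lim_{h \to 0} \frac{f(c + h) - T(c + h)}{h} \\
&= \lim_{h \to 0} \frac{f(c + h) - mh - f(c)}{h} \\
&= \lim_{h \to 0} \left( \frac{f(c + h) - f(c)}{h} - m \right).
\end{aligned} \tag{3.2.3}$$
Hence $R(h)$ is $o(h)$, and $T$ is the best affine approximation to $f$ at $c$, if and only if
$$\lim_{h \to 0} \left( \frac{f(c + h) - f(c)}{h} - m \right) = 0, \tag{3.2.4}$$
which is true if and only if
$$\lim_{h \to 0} \frac{f(c + h) - f(c)}{h} = m. \tag{3.2.5}$$
In particular, if
$$\lim_{h \to 0} \frac{f(c + h) - f(c)}{h}$$
exists, then $f$ has a best affine approximation at $c$ and
$$f'(c) = \lim_{h \to 0} \frac{f(c + h) - f(c)}{h}. \tag{3.2.6}$$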




Conversely, if $T(x) = m(x - c) + f(c)$ is the best affine approximation to $f$ at $c$, then it
follows that
$$m = \lim_{h \to 0} \frac{f(c + h) - f(c)}{h}. \tag{3.2.7}$$
Definition     We say a function $f$ is differentiable at a point $c$ if
$$\lim_{h \to 0} \frac{f(c + h) - f(c)}{h} \tag{3.2.8}$$
exists.
    In summary, if we are given a function f which is differentiable at c, then the best
affine approximation to f at c exists and is given by

                               T(x) = f'(c)(x - c) + f(c),                       (3.2.9)

where
$$f'(c) = \lim_{h \to 0} \frac{f(c + h) - f(c)}{h}. \tag{3.2.10}$$
Conversely, if f has a best affine approximation at a point c, then f is differentiable at c
and the best affine approximation is given by (3.2.9).
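
Formulas (3.2.9) and (3.2.10) suggest a simple numerical recipe: replace the limit by a difference quotient with a small $h$. The sketch below is an added illustration under that assumption; the helper names `difference_quotient` and `best_affine` and the default step $h = 10^{-6}$ are hypothetical choices, and the result is only an approximation to the true best affine approximation.

```python
def difference_quotient(f, c, h):
    """The quotient (f(c + h) - f(c)) / h whose limit defines f'(c)."""
    return (f(c + h) - f(c)) / h

def best_affine(f, c, h=1e-6):
    """Approximate T(x) = f'(c)(x - c) + f(c), using a small-h quotient for f'(c)."""
    m = difference_quotient(f, c, h)
    fc = f(c)
    return lambda x: m * (x - c) + fc

# Usage: f(x) = x^2 at c = 3, where Section 3.1 found T(x) = 6(x - 3) + 9.
T = best_affine(lambda x: x ** 2, 3.0)
print(T(3.1), 6 * (3.1 - 3) + 9)
```

Choosing $h$ too small invites round-off error in the subtraction $f(c + h) - f(c)$, which is one reason the exact techniques of the following sections are preferable when they apply.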
Example     Consider the problem of finding the best affine approximation to $f(x) = x^2$
at $x = 1$, a problem we first looked at in Section 1.1. We first need to find the derivative
$f'(1)$. Using (3.2.10), we have
$$\begin{aligned}
f'(1) &= \lim_{h \to 0} \frac{f(1 + h) - f(1)}{h} \\
&= \lim_{h \to 0} \frac{(1 + h)^2 - 1}{h} \\
&= \lim_{h \to 0} \frac{1 + 2h + h^2 - 1}{h} \\
&= \lim_{h \to 0} \frac{h(2 + h)}{h} \\
&= \lim_{h \to 0} (2 + h) \\
&= 2.
\end{aligned}$$
From (3.2.9), it now follows that the best affine approximation to $f$ at 1 is
$$T(x) = 2(x - 1) + 1 = 2x - 1.$$
Furthermore, from our discussion in Section 3.1, the equation of the line tangent to the
graph of $y = x^2$ at the point $(1, 1)$ is then
$$y = 2x - 1,$$


as shown in Figure 3.2.1.


﻿


Figure 3.2.1 Graphs of $y = x^2$ and $y = 2x - 1$


    Frequently we will be interested in the derivative of a function not just at a single
point, but at many different points. Instead of performing the above calculation at each
point separately, we try to compute the derivative at an arbitrary point, after which we
can substitute in any desired point for evaluation. In fact, for any given function f, we
may define a new function f' by setting


$$f'(x) = \lim_{h \to 0} \frac{f(x + h) - f(x)}{h} \tag{3.2.11}$$


for all points x at which the limit exists. This new function, f', is called the derivative of
f. Note that the domain of f' may be smaller than the domain of f. If the open interval
(a, b) is in the domain of f', we say f is differentiable on (a, b).


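Example     Let $f(x) = \sqrt{x}$. From our work in Section 3.1 we know that
$$f'(1) = \frac{1}{2}.$$
Now we will find a general expression for $f'(x)$ at an arbitrary point $x$ in $(0, \infty)$. Using
(3.2.11), we have
$$\begin{aligned}
f'(x) &= \lim_{h \to 0} \frac{f(x + h) - f(x)}{h} \\
&= \lim_{h \to 0} \frac{\sqrt{x + h} - \sqrt{x}}{h} \\
&= \lim_{h \to 0} \left( \frac{\sqrt{x + h} - \sqrt{x}}{h} \right) \left( \frac{\sqrt{x + h} + \sqrt{x}}{\sqrt{x + h} + \sqrt{x}} \right) \\
&= \lim_{h \to 0} \frac{x + h - x}{h \left( \sqrt{x + h} + \sqrt{x} \right)} \\
&= \lim_{h \to 0} \frac{h}{h \left( \sqrt{x + h} + \sqrt{x} \right)} \\
&= \lim_{h \to 0} \frac{1}{\sqrt{x + h} + \sqrt{x}} \\
&= \frac{1}{\sqrt{x} + \sqrt{x}} \\
&= \frac{1}{2\sqrt{x}}.
\end{aligned}$$
Hence $f$ is differentiable on $(0, \infty)$. In particular, we once again have
$$f'(1) = \frac{1}{2}.$$
Moreover, it is now straightforward to find the best affine approximation to $f$ at any point
$c > 0$. For example,
$$f'(16) = \frac{1}{8},$$
so the best affine approximation to $f(x) = \sqrt{x}$ at $x = 16$ is
$$T(x) = \frac{1}{8}(x - 16) + 4 = \frac{1}{8}x + 2.$$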
See Figure 3.2.2 for the graphs of f and T.
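
To get a feeling for how well this tangent line tracks the square root, we can tabulate both near 16. This is a minimal sketch added for illustration; the sample points are arbitrary.

```python
import math

def T(x):
    """Best affine approximation to sqrt(x) at 16: T(x) = x/8 + 2."""
    return x / 8 + 2

for x in (15, 16, 17, 20, 25):
    # Columns: x, sqrt(x), T(x), and the remainder sqrt(x) - T(x).
    print(x, math.sqrt(x), T(x), math.sqrt(x) - T(x))
```

The remainder vanishes at 16 and is negative elsewhere, which reflects the fact that the graph of the square root lies below its tangent line.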

Figure 3.2.2 Graphs of $f(x) = \sqrt{x}$ and $T(x) = \frac{1}{8}x + 2$

Example     Now consider $g(t) = t^3$. Then
$$\begin{aligned}
g'(t) &= \lim_{h \to 0} \frac{g(t + h) - g(t)}{h} \\
&= \lim_{h \to 0} \frac{(t + h)^3 - t^3}{h} \\
&= \lim_{h \to 0} \frac{h(3t^2 + 3th + h^2)}{h} \\
&= \lim_{h \to 0} (3t^2 + 3th + h^2) \\
&= 3t^2.
\end{aligned}$$
Hence, for example, $g'(-2) = 12$, and the best affine approximation to $g(t) = t^3$ at $t = -2$
is
$$T(t) = 12(t + 2) - 8 = 12t + 16.$$
See Figure 3.2.3 for the graphs of $g$ and $T$.

Figure 3.2.3 Graphs of $g(t) = t^3$ and $T(t) = 12t + 16$



Example     Suppose we wish to find the best affine approximation to $f(x) = |x|$ at $x = 0$.
To find the derivative of $f$ at 0, we need to consider the quotient
$$\frac{f(0 + h) - f(0)}{h} = \frac{|h|}{h} = \begin{cases} \dfrac{-h}{h} = -1, & \text{if } h < 0, \\[1ex] \dfrac{h}{h} = 1, & \text{if } h > 0. \end{cases}$$
Thus
$$\lim_{h \to 0^-} \frac{f(0 + h) - f(0)}{h} = -1$$
and
$$\lim_{h \to 0^+} \frac{f(0 + h) - f(0)}{h} = 1,$$
from which it follows that
$$\lim_{h \to 0} \frac{f(0 + h) - f(0)}{h}$$
does not exist. In other words, $f$ is not differentiable at 0. Thus $f$ does not have a best
affine approximation at 0. However, for $x < 0$, the graph of $f$ is a straight line with slope
$-1$ and for $x > 0$, the graph of $f$ is a straight line with slope 1. Thus
$$f'(x) = \begin{cases} -1, & \text{if } x < 0, \\ 1, & \text{if } x > 0. \end{cases}$$
Hence the domain of $f'$ is $\{x \mid x \ne 0\}$, whereas the domain of $f$ is the interval $(-\infty, \infty)$.
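
The disagreement between the two one-sided limits is easy to see numerically. The following sketch is an added illustration; the function name `quotient` is a hypothetical helper for the difference quotient of $|x|$ at 0.

```python
def quotient(h):
    """Difference quotient (|0 + h| - |0|) / h for f(x) = |x| at 0."""
    return (abs(0 + h) - abs(0)) / h

for h in (0.1, 0.01, 0.001):
    # The quotient from the right and the quotient from the left
    # stay at different constant values however small h becomes.
    print(h, quotient(h), quotient(-h))
```

Since the two columns never approach a common value, no single slope $m$ can make the remainder $o(h)$.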

    The previous example illustrates the fact that a function may be continuous at a point,
as $f(x) = |x|$ is continuous at $x = 0$, without being differentiable at that point. However,
it turns out that if a function is differentiable at a point, then it must be continuous at
that point. To see this, note that if $T$ is the best affine approximation to a function $f$ at
$c$ and $r(x) = f(x) - T(x)$ is the remainder function, then
$$f(x) = T(x) + r(x). \tag{3.2.12}$$
Since $T$ is a continuous function, $\lim_{x \to c} r(x) = 0$, and $T(c) = f(c)$, we have
$$\lim_{x \to c} f(x) = \lim_{x \to c} (T(x) + r(x)) = \lim_{x \to c} T(x) + \lim_{x \to c} r(x) = T(c) + 0 = f(c), \tag{3.2.13}$$

which is what it means for f to be continuous at c.
Proposition If f is differentiable at a point c, then f is continuous at c.

Leibniz notation and rates of change
    If y = f(x) with f(x) = mx+ b, then one unit change in x results in m units of change
in y. That is, for a straight line, the slope of the line is the rate of change of y with respect
to x. Moreover, since f is its own best affine approximation (and a straight line is its own
tangent line), we have f'(x) = m for all values of x. Hence, in this case, the derivative of
f gives the rate of change of y with respect to x. What distinguishes this type of function
from other functions, and what makes the slope easily computable, is that this rate of
change is a constant. We will now use derivatives to give meaning to the rate of change of
an arbitrary function at a point where it is differentiable.
    If y = f(x), it is common to write $\Delta x$ for an increment in x and $\Delta y$ for the change in
y corresponding to a change in x of $\Delta x$. In our notation above, we would write

$$\Delta x = h \tag{3.2.14}$$


﻿




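and
$$\Delta y = f(x + \Delta x) - f(x). \tag{3.2.15}$$

Thus we can write

$$\frac{f(x + h) - f(x)}{h} = \frac{f(x + \Delta x) - f(x)}{\Delta x} = \frac{\Delta y}{\Delta x}, \tag{3.2.16}$$

from which we have
$$f'(x) = \lim_{\Delta x \to 0} \frac{\Delta y}{\Delta x}. \tag{3.2.17}$$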

This type of notation, although not this type of reasoning, motivated Leibniz to denote
f'(x) by
$$\frac{dy}{dx} = \lim_{\Delta x \to 0} \frac{\Delta y}{\Delta x}. \tag{3.2.18}$$
If the derivative is to be evaluated at a point c, then we would write

$$f'(c) = \left. \frac{dy}{dx} \right|_{x=c}. \tag{3.2.19}$$


Example     If $y = \sqrt{x}$, then from our result above we may write

$$\frac{dy}{dx} = \frac{1}{2\sqrt{x}}$$

and, for example,
$$\left. \frac{dy}{dx} \right|_{x=9} = \frac{1}{2\sqrt{9}} = \frac{1}{6}.$$

    Now $\frac{\Delta y}{\Delta x}$ represents the average rate of change of y over the interval from x to $x + \Delta x$.
That is, this ratio tells us how much y changes per unit change in x over the interval. As
we let $\Delta x$ go to 0, this ratio will approach a limiting value, namely, the derivative $\frac{dy}{dx}$, which
we may interpret as the instantaneous rate of change of y with respect to x. If the rate of
change of y with respect to x were not to change over an interval of length 1, then y would
change by an amount equal to $\frac{dy}{dx}$ over that interval.
    As an example, if s = f(t) gives the position of an object moving in a straight line, then
$\frac{\Delta s}{\Delta t}$ is the average rate of change of position of the object with respect to time, which we
call its average velocity. Then $\frac{ds}{dt}$, the derivative of s with respect to t, is the instantaneous
rate of change of position with respect to time; that is, $\frac{ds}{dt}$ is the instantaneous velocity, or,
simply, velocity of the object. The difference between $\frac{\Delta s}{\Delta t}$ and $\frac{ds}{dt}$ is the difference between
finding the average speed for a trip in a car by dividing the total miles traveled by the
total time elapsed and finding the instantaneous speed at any one time during the trip by
looking at the car's speedometer.


﻿




Example Galileo discovered that if an object is dropped from an initial height of 100
feet, then, ignoring the effects of air resistance, its height, in feet, above the ground after
t seconds would be
$$s = 100 - 16t^2.$$

For example, at time t = 1 the object would be at a height of

$$s|_{t=1} = 100 - 16 = 84 \text{ feet}$$

and at time t = 2 it would be at a height of

$$s|_{t=2} = 100 - 64 = 36 \text{ feet}.$$

Hence the average velocity of the object over the time interval from t = 1 to t = 2 would be

$$\frac{\Delta s}{\Delta t} = \frac{36 - 84}{2 - 1} = -48 \text{ feet/second}.$$

Note that the average velocity over this time interval is negative because we have taken
the positive direction to be up. The average speed of the object, which is the absolute
value of the velocity, would be 48 feet per second. To find the instantaneous velocity at
time t, we compute

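$$\begin{aligned}
\frac{ds}{dt} &= \lim_{\Delta t \to 0} \frac{\Delta s}{\Delta t} \\
&= \lim_{\Delta t \to 0} \frac{(100 - 16(t + \Delta t)^2) - (100 - 16t^2)}{\Delta t} \\
&= \lim_{\Delta t \to 0} \frac{100 - 16(t^2 + 2t\,\Delta t + (\Delta t)^2) - 100 + 16t^2}{\Delta t} \\
&= \lim_{\Delta t \to 0} \frac{-32t\,\Delta t - 16(\Delta t)^2}{\Delta t} \\
&= \lim_{\Delta t \to 0} (-32t - 16\,\Delta t) \\
&= -32t.
\end{aligned}$$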

Hence the instantaneous velocity of the object at time t = 1 is

$$\left. \frac{ds}{dt} \right|_{t=1} = -32 \text{ feet/second}$$

and the instantaneous velocity at time t = 2 is

$$\left. \frac{ds}{dt} \right|_{t=2} = -64 \text{ feet/second}.$$
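The passage from average to instantaneous velocity in this example can be checked numerically. The sketch below is only an illustration under our own naming (`height` and `average_velocity` are hypothetical helpers, not from the text): it computes the average velocity over [1, 2] and then over shorter and shorter intervals starting at t = 1.

```python
def height(t):
    """Height in feet after t seconds: s = 100 - 16 t^2."""
    return 100.0 - 16.0 * t**2


def average_velocity(s, t0, t1):
    """Average rate of change of s over the interval [t0, t1]."""
    return (s(t1) - s(t0)) / (t1 - t0)


# Average velocity over the whole interval from t = 1 to t = 2.
avg = average_velocity(height, 1.0, 2.0)
print("average velocity on [1, 2]:", avg)

# Shrinking the interval: the averages approach the derivative -32t at t = 1.
for dt in (0.1, 0.01, 0.001):
    print(dt, average_velocity(height, 1.0, 1.0 + dt))
```

Algebraically, the average velocity over [1, 1 + Δt] is -32 - 16Δt, so the printed values should move toward -32 feet per second as Δt shrinks, up to floating-point rounding.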


    Although Leibniz seems to have thought of the expression $\frac{dy}{dx}$ as a ratio, we should think
of $\frac{d}{dx}$ as the operation of differentiation, which, when applied to y, yields the derivative of
y with respect to x. In other words, we should not think of $\frac{dy}{dx}$ as a ratio, but as $\frac{d}{dx}(y)$.
For example, if $y = x^3$, then, using an earlier result from this section, we might write

$$\frac{dy}{dx} = \frac{d}{dx}(x^3) = 3x^2.$$

    The "prime" notation for a derivative is due, not to Newton, but to Joseph Louis
Lagrange (1736-1813). Newton's notation, a dot above the dependent variable, represents
a derivative with respect to time, denoted by t. For example, in the previous example we
may write
$$\dot{s} = -32t$$

using Newton's notation. Because of its simplicity and the frequency with which derivatives
with respect to time occur, this is often a useful notation and we will make extensive use
of it when we study differential equations in Chapter 8.

Problems

1. Using (3.2.10), find the derivative of each of the following functions at the indicated
    point.
                                                          1
    (a) f (x) = x2 + 1 at x = 2                (b) f (t) = - at t=1
                                                          t
                1
    (c) g(x) =X2 at x = 2                      (d) h(t) = ft +I1 at t = 3
                1
    (e) f (s)=     at s = 1                    (f) g(z)   (z + 1)2 at z = -1

 2. For each of the functions in Problem 1, find the best affine approximation for the
    function at the indicated point. Also, find the equation of the tangent line at that
    point and graph the function and its tangent line together.
 3. Using (3.2.11), find the derivative of each of the following functions. Note any points
    where the given function is not differentiable.
                                                          1
    (a) f(x) = 2x2                             (b) g(x)=-
                1                                          1
    (c) f (t)                                  (d) h(z) =

    (e) y (t) = t2 +4t                         (f) g(s) = 2s3 _-3
 4. Using your result from part (c) of Problem 3, find the best affine approximation to

                                                1
                                             f   t)

                                      1
    at t =4. Use it to approximate    .
                                     3.98


﻿




5. Let $f(x) = ax^2 + bx + c$, where a, b, and c are constants. Show that f'(x) = 2ax + b.
   Does this result agree with your results in parts (a) and (e) of Problem 3?
6. Use your result from Problem 5 to find the best affine approximation to

$$f(x) = 3x^2 - 2x + 5$$

   at x = -2.
7. Use your result from Problem 5 to find the best affine approximation to

$$g(t) = -2t^2 + 3t - 6$$

   at t = 3.


8. Suppose f is a function with the properties that f(0) = 0 and

$$\lim_{t \to 0} \frac{f(t)}{t} = 1.$$

   Show that f'(0) = 1.


9. Suppose g is a function with the properties that g(0) = 0 and g is o(h). Show that
   g'(0) = 0.


10. Suppose f is a function with the properties that

$$f(s + t) = f(s)f(t)$$

    for all numbers s and t and

$$\lim_{t \to 0} \frac{f(t) - 1}{t} = 1.$$

    Show that f'(t) = f(t).


11. Suppose f(x)


12. Suppose  g(t)


13. Suppose g(w)

    g'(1)-


=3x2,
  x ,


S5t,
:x2 -
  4x -


if x <0 . Is f differentiable at x = 0? If it is, find f'(0).
if w > 0

if t < 0
if t > 0 . Is g differentiable at t = 0? If it is, find g'(0).


2x + 2,
3,


if x < 1   Is g differentiable at x
if x> 1


1? If it is, find


14. For each of the following, find the derivative of the dependent variable with respect to
    the independent variable. Denote the derivative using Leibniz's notation.


(a) s = 2t3


(c) q


    2
s- -
    s


(b) z = 2vt

(d) t   X4


﻿




15. FindJ(4x2) andJ(3vu--).

16. An object is thrown vertically into the air from an initial height of 100 meters above
    the ground with an initial velocity of 10 meters per second. If s represents the height,
    in meters, of the object above the ground after t seconds and we ignore the effects of
    air resistance, then
$$s = 100 + 10t - 4.9t^2.$$

    (a) What is the average velocity over the time interval [0, 2]? Over [0, 1]? Over [1, 2]?
    (b) Find the velocity v of the object at time t. You may use Problem 5.
    (c) What is the velocity after 1 second? After 2 seconds?
    (d) When is v= 0? Is v positive or negative before this time? Is v positive or negative
        after this time?
    (e) What is the height of the object when v= 0? What is the significance of this
        height?
    (f) The rate of change of velocity is called acceleration. Find the acceleration of the
        object; that is, find $\frac{dv}{dt}$.
    (g) What is the significance of the fact that $\frac{dv}{dt}$ is a constant?
17. Find the rate of change of the area A of a circle with respect to its radius r.
18. Find the rate of change of the volume V of a sphere with respect to its radius r.
19. Find the rate of change of the area A of a square with respect to the length x of one
    of its sides.
20. Find the rate of change of the volume V of a cube with respect to the length x of one
    of its sides.


﻿


                                      Section 3.3
        Difference Equations
                to                    Differentiation of Polynomials
       Differential Equations         and Rational Functions


In this section we begin the task of discovering rules for differentiating various classes
of functions. By the end of Section 3.5 we will be able to differentiate any algebraic or
trigonometric function as a matter of routine without reference to the limits used in Section
3.2.


Differentiation of polynomials
We first note that if f is a first degree polynomial, say, f(x) = ax + b for some constants
a and b, then f is an affine function and hence its own best affine approximation. Thus
f'(x) = a for all x. In particular, if f is a constant function, say, f(x) = b for all x, then
f'(x) = 0 for all x.
    Next we consider the case of a monomial $f(x) = x^n$, where n is a positive integer
greater than 1. Then

$$f'(x) = \lim_{h \to 0} \frac{f(x + h) - f(x)}{h} = \lim_{h \to 0} \frac{(x + h)^n - x^n}{h}. \tag{3.3.1}$$


Now
$$(x + h)^n = x^n + nx^{n-1}h + R(h), \tag{3.3.2}$$
where R(h) represents the remaining terms in the expansion. Since every term in R(h)
has a factor of h raised to a power greater than or equal to 2, it follows that R(h) is o(h).
Hence we have
$$\begin{aligned}
f'(x) &= \lim_{h \to 0} \frac{x^n + nx^{n-1}h + R(h) - x^n}{h} \\
&= \lim_{h \to 0} \frac{nx^{n-1}h + R(h)}{h} \\
&= \lim_{h \to 0} \left( nx^{n-1} + \frac{R(h)}{h} \right) \\
&= nx^{n-1} + \lim_{h \to 0} \frac{R(h)}{h} \\
&= nx^{n-1}.
\end{aligned}$$
Since from our previous result f'(x) = 1 when f(x) = x, this formula also works in the
case n = 1. Hence we have the following proposition.

Proposition For any positive integer n,

$$\frac{d}{dx} x^n = nx^{n-1}. \tag{3.3.3}$$




Copyright @ by Dan Sloughter 2000


﻿




Example     If $f(x) = x^3$, then $f'(x) = 3x^2$, as we saw in an example in Section 3.2.
Example Similarly,
$$\frac{d}{dt} t^5 = 5t^4.$$

Hence, for example, the equation of the line tangent to the curve $x = t^5$ at (-1, -1) is
$$x = 5(t + 1) - 1,$$
or
$$x = 5t + 4.$$

    Once we establish results for the derivative of a constant times a function and for the
derivative of the sum of two functions, similar to the results we have for limits, we will be
able to easily differentiate any polynomial. So suppose f is a differentiable function and
let k(x) = cf(x), where c is any constant. Then

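$$\begin{aligned}
k'(x) &= \lim_{h \to 0} \frac{k(x + h) - k(x)}{h} \\
&= \lim_{h \to 0} \frac{cf(x + h) - cf(x)}{h} \\
&= \lim_{h \to 0} \frac{c(f(x + h) - f(x))}{h} \\
&= c \lim_{h \to 0} \frac{f(x + h) - f(x)}{h} \\
&= cf'(x).
\end{aligned}$$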
That is, the derivative of a constant times a function is the constant times the derivative
of the function.
Proposition If f is differentiable and c is any constant, then

$$\frac{d}{dx}(cf(x)) = c\,\frac{d}{dx} f(x). \tag{3.3.4}$$

Example If $f(x) = 14x^3$, then
$$f'(x) = (14)(3x^2) = 42x^2.$$

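    Now suppose f and g are both differentiable functions and let k(x) = f(x) + g(x).
Then
$$\begin{aligned}
k'(x) &= \lim_{h \to 0} \frac{k(x + h) - k(x)}{h} \\
&= \lim_{h \to 0} \frac{(f(x + h) + g(x + h)) - (f(x) + g(x))}{h} \\
&= \lim_{h \to 0} \left( \frac{f(x + h) - f(x)}{h} + \frac{g(x + h) - g(x)}{h} \right) \\
&= \lim_{h \to 0} \frac{f(x + h) - f(x)}{h} + \lim_{h \to 0} \frac{g(x + h) - g(x)}{h} \\
&= f'(x) + g'(x).
\end{aligned}$$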


﻿




Hence the derivative of the sum of two functions is the sum of their derivatives. A similar
argument would show that the derivative of the difference of two functions is the difference
of their derivatives.

Proposition If f and g are both differentiable, then

$$\frac{d}{dx}(f(x) + g(x)) = \frac{d}{dx} f(x) + \frac{d}{dx} g(x) \tag{3.3.5}$$

and
$$\frac{d}{dx}(f(x) - g(x)) = \frac{d}{dx} f(x) - \frac{d}{dx} g(x). \tag{3.3.6}$$

    Putting the preceding results together, we are now in a position to easily differentiate
any polynomial, as the next examples will illustrate.

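Example Suppose $f(x) = 3x^5 - 6x^2 + 2x - 16$. Then

$$\begin{aligned}
f'(x) &= \frac{d}{dx}(3x^5 - 6x^2 + 2x - 16) \\
&= \frac{d}{dx}(3x^5) - \frac{d}{dx}(6x^2) + \frac{d}{dx}(2x) - \frac{d}{dx}(16) \\
&= 3\frac{d}{dx} x^5 - 6\frac{d}{dx} x^2 + 2\frac{d}{dx} x - 0 \\
&= (3)(5x^4) - (6)(2x) + (2)(1) \\
&= 15x^4 - 12x + 2.
\end{aligned}$$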


Example Of course, it is not necessary to write out in detail all the steps in differentiating
a polynomial as we did in the preceding example. For example, if $g(t) = 3t^{12} - 6t^2 + t$,
then
$$g'(t) = (3)(12t^{11}) - (6)(2t) + 1 = 36t^{11} - 12t + 1.$$

In particular, since g(1) = -2 and g'(1) = 25, the best affine approximation to g at t = 1
is
$$T(t) = 25(t - 1) - 2 = 25t - 27.$$
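Because the rules above reduce differentiating a polynomial to a term-by-term computation, the process is easy to mechanize. The following is a minimal sketch under our own conventions, not a method given in the text: a polynomial is stored as a list `coeffs` with `coeffs[k]` the coefficient of the k-th power, and `differentiate` and `evaluate` are hypothetical helper names.

```python
def differentiate(coeffs):
    """Derivative of a polynomial; coeffs[k] is the coefficient of x**k.

    Each term a*x**k becomes k*a*x**(k-1), and the constant term drops out.
    """
    return [k * a for k, a in enumerate(coeffs)][1:]


def evaluate(coeffs, x):
    """Evaluate the polynomial at x using Horner's method."""
    result = 0
    for a in reversed(coeffs):
        result = result * x + a
    return result


# f(x) = 3x^5 - 6x^2 + 2x - 16, lowest power first.
f = [-16, 2, -6, 0, 0, 3]
print(differentiate(f))  # coefficients of 15x^4 - 12x + 2

# g(t) = 3t^12 - 6t^2 + t, and the best affine approximation at t = 1.
g = [0, 1, -6] + [0] * 9 + [3]
value, slope = evaluate(g, 1), evaluate(differentiate(g), 1)
print("T(t) = {}(t - 1) + ({})".format(slope, value))
```

With integer coefficients the arithmetic is exact, so the values g(1) = -2 and g'(1) = 25 from the example are reproduced without rounding.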


Differentiation of rational functions
We next consider the problem of differentiating the quotient of two functions whose deriva-
tives are already known. In particular, combining this result with our result for polynomials
will enable us to easily differentiate any rational function. We might hope that, analogous
to the last two results and the related results for limits, the derivative of the quotient of
two functions would be equal to the quotient of their derivatives. This turns out not to be
true; nevertheless, there is a nice rule for differentiating quotients.


﻿




    Suppose f and g are both differentiable functions and let $k(x) = \frac{f(x)}{g(x)}$. Then, at all
points where g(x) ≠ 0,

$$\begin{aligned}
k'(x) &= \lim_{h \to 0} \frac{k(x + h) - k(x)}{h} \\
&= \lim_{h \to 0} \frac{\dfrac{f(x + h)}{g(x + h)} - \dfrac{f(x)}{g(x)}}{h} \\
&= \lim_{h \to 0} \frac{\dfrac{g(x)f(x + h) - g(x + h)f(x)}{g(x + h)g(x)}}{h} \\
&= \lim_{h \to 0} \frac{g(x)f(x + h) - g(x + h)f(x)}{hg(x)g(x + h)}.
\end{aligned}$$
It turns out that by adding and subtracting the term g(x)f(x) (a standard mathematical
trick of adding 0) in the numerator, we can simplify this limit into a form that we can
evaluate. That is,

$$\begin{aligned}
k'(x) &= \lim_{h \to 0} \frac{g(x)f(x + h) - g(x + h)f(x)}{hg(x)g(x + h)} \\
&= \lim_{h \to 0} \frac{g(x)f(x + h) - g(x)f(x) + g(x)f(x) - g(x + h)f(x)}{hg(x)g(x + h)} \\
&= \lim_{h \to 0} \frac{g(x)(f(x + h) - f(x)) - f(x)(g(x + h) - g(x))}{hg(x)g(x + h)} \\
&= \lim_{h \to 0} \frac{g(x)\dfrac{f(x + h) - f(x)}{h} - f(x)\dfrac{g(x + h) - g(x)}{h}}{g(x)g(x + h)}.
\end{aligned}$$
Now

$$\lim_{h \to 0} g(x)\frac{f(x + h) - f(x)}{h} = g(x)\lim_{h \to 0} \frac{f(x + h) - f(x)}{h} = g(x)f'(x), \tag{3.3.7}$$

$$\lim_{h \to 0} f(x)\frac{g(x + h) - g(x)}{h} = f(x)\lim_{h \to 0} \frac{g(x + h) - g(x)}{h} = f(x)g'(x), \tag{3.3.8}$$
and
$$\lim_{h \to 0} g(x)g(x + h) = g(x)\lim_{h \to 0} g(x + h) = g(x)g(x) = (g(x))^2, \tag{3.3.9}$$

where the limits in (3.3.7) and (3.3.8) follow from the differentiability of f and g, while
the limit in (3.3.9) follows from the continuity of g (which is a consequence of the
differentiability of g). Putting everything together, we have

$$k'(x) = \frac{g(x)f'(x) - f(x)g'(x)}{(g(x))^2}, \tag{3.3.10}$$

a result known as the quotient rule.


﻿




Quotient Rule If f and g are both differentiable, then

$$\frac{d}{dx}\left(\frac{f(x)}{g(x)}\right) = \frac{g(x)\frac{d}{dx}f(x) - f(x)\frac{d}{dx}g(x)}{(g(x))^2} \tag{3.3.11}$$

at all points where g(x) ≠ 0.


Example Suppose $f(x) = \frac{2x + 1}{x - 2}$. Then

$$\begin{aligned}
f'(x) &= \frac{(x - 2)\frac{d}{dx}(2x + 1) - (2x + 1)\frac{d}{dx}(x - 2)}{(x - 2)^2} \\
&= \frac{(x - 2)(2) - (2x + 1)(1)}{(x - 2)^2} \\
&= \frac{2x - 4 - 2x - 1}{(x - 2)^2} \\
&= -\frac{5}{(x - 2)^2}.
\end{aligned}$$

Hence, for example, f(3) = 7 and f'(3) = -5, so the equation of the line tangent to the
graph of f at (3, 7) is

$$y = -5(x - 3) + 7,$$

or

$$y = -5x + 22.$$
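As a sanity check on this example, the quotient rule can be applied pointwise and compared with a symmetric difference quotient. This is a sketch only: the function names `quotient_rule` and `central_difference` and the step size `h` are our own assumptions, not notation from the text.

```python
def quotient_rule(f, df, g, dg, x):
    """Value of (f/g)'(x) = (g f' - f g') / g^2, assuming g(x) != 0."""
    return (g(x) * df(x) - f(x) * dg(x)) / g(x) ** 2


def central_difference(k, x, h=1e-5):
    """Numerical estimate of k'(x) from a symmetric difference quotient."""
    return (k(x + h) - k(x - h)) / (2 * h)


def numerator(x):
    return 2 * x + 1


def denominator(x):
    return x - 2


# Numerator 2x + 1 has derivative 2; denominator x - 2 has derivative 1.
slope = quotient_rule(numerator, lambda x: 2, denominator, lambda x: 1, 3.0)
estimate = central_difference(lambda x: numerator(x) / denominator(x), 3.0)
print("quotient rule:", slope)
print("numerical estimate:", estimate)


def tangent(x):
    """Tangent line at (3, 7): y = slope * (x - 3) + 7."""
    return slope * (x - 3) + 7
```

The two slopes should agree to several decimal places, and evaluating `tangent(0)` recovers the intercept 22 of the line y = -5x + 22.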


Example Suppose $g(z) = \frac{1}{z^2}$. Then

$$g'(z) = \frac{z^2 \frac{d}{dz}(1) - (1)\frac{d}{dz}(z^2)}{z^4} = \frac{(z^2)(0) - 2z}{z^4} = -\frac{2}{z^3}.$$

Note that we may write this result in the form

$$\frac{d}{dz} z^{-2} = -2z^{-3},$$

which is consistent with our previous result

$$\frac{d}{dz} z^n = nz^{n-1}.$$

However, we derived the latter under the assumption that n was a positive integer. We
will now show that we can extend this result to the case of negative integer exponents.


﻿



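    Suppose $f(x) = x^n$, where n is a negative integer. Then, using the quotient rule and
the fact that -n > 0,
$$\begin{aligned}
f'(x) &= \frac{d}{dx} x^n \\
&= \frac{d}{dx}\left(\frac{1}{x^{-n}}\right) \\
&= \frac{x^{-n}\frac{d}{dx}(1) - (1)\frac{d}{dx} x^{-n}}{x^{-2n}} \\
&= \frac{(x^{-n})(0) - (-nx^{-n-1})}{x^{-2n}} \\
&= \frac{nx^{-n-1}}{x^{-2n}} \\
&= nx^{-n-1+2n} \\
&= nx^{n-1}.
\end{aligned}$$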

We can now state the more general result.

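Proposition For any integer n ≠ 0,

$$\frac{d}{dx} x^n = nx^{n-1}. \tag{3.3.12}$$

Example If $f(x) = \frac{1}{x}$, then

$$f'(x) = \frac{d}{dx} x^{-1} = -x^{-2} = -\frac{1}{x^2}.$$

Example     Similarly,

$$\frac{d}{dx}\left(\frac{5}{x^3}\right) = \frac{d}{dx}(5x^{-3}) = -15x^{-4} = -\frac{15}{x^4}.$$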

    We will eventually see that (3.3.12) holds for rational and irrational exponents as well.
We will consider the rational case in Section 3.4, but we will not have the tools for handling
the irrational case until we discuss exponential and logarithm functions in Chapter 6.

Differentiation of products
We will close this section with a discussion of a rule for differentiating the product of two
functions. Since the product of two rational functions is again a rational function, this will
not extend the class of functions that we know how to differentiate routinely. However,
this rule will be very useful in the future and, even at the present point, can help simplify
some problems.


﻿




    Suppose f and g are both differentiable and k(x) = f(x)g(x). Then

$$k'(x) = \lim_{h \to 0} \frac{k(x + h) - k(x)}{h} = \lim_{h \to 0} \frac{f(x + h)g(x + h) - f(x)g(x)}{h}. \tag{3.3.13}$$

Adding and subtracting f(x + h)g(x) in the numerator (again, the mathematical trick of
adding 0 in a useful manner) will help simplify this limit. Namely,

$$\begin{aligned}
k'(x) &= \lim_{h \to 0} \frac{f(x + h)g(x + h) - f(x + h)g(x) + f(x + h)g(x) - f(x)g(x)}{h} \\
&= \lim_{h \to 0} \frac{f(x + h)(g(x + h) - g(x)) + g(x)(f(x + h) - f(x))}{h} \\
&= \lim_{h \to 0} \left( f(x + h)\frac{g(x + h) - g(x)}{h} + g(x)\frac{f(x + h) - f(x)}{h} \right).
\end{aligned}$$

Now

$$\lim_{h \to 0} f(x + h)\frac{g(x + h) - g(x)}{h} = \lim_{h \to 0} f(x + h) \lim_{h \to 0} \frac{g(x + h) - g(x)}{h} = f(x)g'(x) \tag{3.3.14}$$

and

$$\lim_{h \to 0} g(x)\frac{f(x + h) - f(x)}{h} = g(x)\lim_{h \to 0} \frac{f(x + h) - f(x)}{h} = g(x)f'(x), \tag{3.3.15}$$

where, as with the derivation of the quotient rule, we have used the differentiability of f
and g as well as the continuity of f in evaluating the limits. Putting everything together,
we have

$$k'(x) = f(x)g'(x) + g(x)f'(x), \tag{3.3.16}$$

a result known as the product rule.

Product Rule If f and g are both differentiable, then

$$\frac{d}{dx}(f(x)g(x)) = f(x)\frac{d}{dx}g(x) + g(x)\frac{d}{dx}f(x). \tag{3.3.17}$$


Example If

$$f(x) = (x^4 - 3x^2 + 6x - 3)(6x^3 + 2x + 5),$$

then

$$\begin{aligned}
f'(x) &= (x^4 - 3x^2 + 6x - 3)\frac{d}{dx}(6x^3 + 2x + 5) + (6x^3 + 2x + 5)\frac{d}{dx}(x^4 - 3x^2 + 6x - 3) \\
&= (x^4 - 3x^2 + 6x - 3)(18x^2 + 2) + (6x^3 + 2x + 5)(4x^3 - 6x + 6).
\end{aligned}$$


Of course, in this example, f is just a polynomial so we could also find f' by multiplying out
the two factors of f and differentiating the polynomial term by term as usual. However,


﻿




the product rule gives us a quicker route to the derivative. Although the result is not
simplified into the standard form of a polynomial, for most applications this form is just
as useful as any other.
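The claim that the product rule and term-by-term differentiation of the expanded polynomial give the same derivative can be verified exactly with integer arithmetic. The sketch below is illustrative only: polynomials are stored as coefficient lists with the lowest power first, and `poly_mul`, `poly_add`, and `poly_diff` are helper names of our own.

```python
def poly_mul(p, q):
    """Product of two polynomials given as coefficient lists (lowest power first)."""
    result = [0] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            result[i + j] += a * b
    return result


def poly_add(p, q):
    """Sum of two polynomials, padding the shorter list with zeros."""
    n = max(len(p), len(q))
    p = p + [0] * (n - len(p))
    q = q + [0] * (n - len(q))
    return [a + b for a, b in zip(p, q)]


def poly_diff(p):
    """Term-by-term derivative: a*x**k becomes k*a*x**(k-1)."""
    return [k * a for k, a in enumerate(p)][1:]


u = [-3, 6, -3, 0, 1]  # x^4 - 3x^2 + 6x - 3
v = [5, 2, 0, 6]       # 6x^3 + 2x + 5

# Product rule: u v' + v u'.
via_product_rule = poly_add(poly_mul(u, poly_diff(v)), poly_mul(v, poly_diff(u)))
# Multiply out first, then differentiate term by term.
via_expansion = poly_diff(poly_mul(u, v))

print(via_product_rule == via_expansion)  # True
```

Both routes produce the same list of coefficients; the product rule simply delays the multiplication, which is why the unsimplified form in the example is usually good enough.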

    It is worth noting that although we can now differentiate any rational function in
theory, in practice our methods may not be the most useful. For example, the function


$$f(x) = (x^2 + 1)^{567}$$

is a polynomial, and so we know how to differentiate it. However, at this point the only way
we could perform the differentiation would be to expand f(x) into standard polynomial
form and then differentiate term by term. In Section 3.4 we will learn how to handle this
problem more directly. At the same time we will extend the class of functions that we can
differentiate routinely to include all algebraic functions.

Problems

1. Find the derivative of each of the following functions.


   (a) $f(x) = x^3 + 6x$                        (b) $g(x) = 13x^5 - 6x^2 + 13$
   (c) $g(t) = 3t - 6t^2$                       (d) $y(t) = 4t^3 - 18t + 3$
   (e) $f(t) = (3t - 6)^2$                      (f) $f(x) = (4x + 5)(6x^2 - 1)$


2. Find the derivative of each of the following functions.


(a) f (x) =(2x +1)2
            w-3
(c) g(w) =  +
   (C) gW 2x + 5

(e) f (t)3t4 - 8t + 1
             2t3 + 6
           3
(g) h(t) = -


(b) g(t) - (t2 - 3)3
           2s-s2
(d) h(s) =   2 + 1

(f) x(t)    3 - 16t2

            41
(h) f (x) 3w7
           31     1
   (J fs)         -s2


(i) h(z) = 8z3


1
2z


16s


3. For each of the following, make use of the product rule in finding the derivative of the
   dependent variable with respect to the independent variable.

   (a) $s = (t^2 - 6t + 3)(8t^4 + 6t^2 - 7)$
   (b) $q = (13t^4 + 5t)(3t^5 + 4t^3 + 16t - 31)$
   (c) $y = (x^2 - 2x + 3)(2x^2 + 13x - 6)(3x^2 - 4x + 1)$
   (d) $z = \frac{(x^2 - 3x + 6)(8x^2 + 3x - 2)}{x^2 - 6}$


﻿




4. Suppose f (2) = -2, f'(2) = 6, g(2) = 3, and g'(2) = -4. Find k'(2) for each of the
   following.
                                                         f (c)
   (a) k(x) =f(x)g(x)                         (b) k(x)   g( )
                                                         f(cc) -fc~~
   (c) k c(x) = f(X)(g(X))2                   (d) k c(x)       gx) - f(x)g(x)
                                                              g(x)
                                                                                   t3
5. Suppose an object moves along the x-axis so that its position at time t is x = -t + .

   (a) Find the velocity, $v(t) = \dot{x}(t)$, of the object.
   (b) What is v(0)? What does this say about the direction of motion of the object at
       time t = 0?
   (c) When is the object at the origin? What is the velocity of the object when it is at
       the origin?
   (d) For what values of t is the object moving toward the right?
   (e) For what values of t is the object moving toward the left?
   (f) What is happening at the points where v(t) = 0?
   (g) Find the acceleration of the object, $a(t) = \dot{v}(t)$.
   (h) When is the acceleration positive? When is it negative?
   (i) Notice that v(1) <0 and a(1) > 0. What does this say about the motion at time
       t = 1?
6. (a) Using only the product rule and the fact that $\frac{d}{dx} x = 1$, show that $\frac{d}{dx} x^2 = 2x$.
   (b) Now use the product rule to show that $\frac{d}{dx} x^3 = 3x^2$.
   (c) Let n > 1 and suppose we know that

$$\frac{d}{dx} x^m = mx^{m-1}$$

       for all m < n. Use the product rule to show that

$$\frac{d}{dx} x^n = nx^{n-1}.$$


﻿


Section 3.4


               to                    Differentiation of Compositions
       Differential Equations        of Functions


In this section we will consider the relationship between the derivative of the composition
of two functions and the derivatives of the individual functions being composed. We shall
see that the resulting differentiation rule, known as the chain rule, will be useful in a
variety of situations in our later work. The following example will set the stage.


Example Consider a spherical balloon which is being inflated so that its radius is increasing
at a rate of 2 centimeters per second. If we let r denote the radius of the balloon
in centimeters, t denote time in seconds, and V denote the volume of the balloon in cubic
centimeters, then we know that r = 2t and

$$V = \frac{4}{3}\pi r^3.$$

Moreover, we can see that, as a function of t,

$$V = \frac{4}{3}\pi(2t)^3 = \frac{32}{3}\pi t^3.$$

At time t = 5, the rate of change of the radius with respect to time is

$$\left. \frac{dr}{dt} \right|_{t=5} = 2 \text{ centimeters per second},$$

the rate of change of the volume with respect to the radius is

$$\left. \frac{dV}{dr} \right|_{r=10} = 4\pi r^2 \Big|_{r=10} = 400\pi \text{ cubic centimeters per centimeter},$$

and the rate of change of the volume with respect to time is

$$\left. \frac{dV}{dt} \right|_{t=5} = 32\pi t^2 \Big|_{t=5} = 800\pi \text{ cubic centimeters per second},$$

where $\frac{dV}{dr}$ is evaluated at r = 10 since this is the value of r when t = 5. It follows that

$$\left. \frac{dV}{dt} \right|_{t=5} = \left. \frac{dV}{dr} \right|_{r=10} \left. \frac{dr}{dt} \right|_{t=5}.$$

That is, the overall rate of change of V with respect to t is the product of the rate of change
of V with respect to r and the rate of change of r with respect to t. This is an example
of the chain rule. Viewed in this manner, the chain rule is saying that if V changes 400π
times as fast as r and r changes 2 times as fast as t, then V changes (400π)(2) = 800π
times as fast as t.
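The three rates in the balloon example can be estimated numerically to confirm that they multiply as described. This is a rough sketch rather than part of the original text: `radius`, `volume_from_radius`, `central_difference`, and the step size `h` are names and values we have assumed for illustration.

```python
import math


def radius(t):
    """Radius in centimeters after t seconds: r = 2t."""
    return 2.0 * t


def volume_from_radius(r):
    """Volume of a sphere of radius r: V = (4/3) pi r^3."""
    return 4.0 / 3.0 * math.pi * r**3


def central_difference(f, x, h=1e-5):
    """Numerical estimate of f'(x) from a symmetric difference quotient."""
    return (f(x + h) - f(x - h)) / (2 * h)


t0 = 5.0
dr_dt = central_difference(radius, t0)                      # about 2
dV_dr = central_difference(volume_from_radius, radius(t0))  # about 400 pi
dV_dt = central_difference(lambda t: volume_from_radius(radius(t)), t0)

print("dV/dt estimated directly:", dV_dt)
print("(dV/dr)(dr/dt):          ", dV_dr * dr_dt)
print("800 pi:                  ", 800 * math.pi)
```

Note that dV/dr is estimated at r = radius(5) = 10, not at r = 5, which mirrors the evaluation point used in the example.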
    Another interesting special case of the chain rule arises with the composition of two
affine functions. Specifically, if f(x) = ax + b and g(x) = cx + d, where a, b, c, and d are
all constants, then

$$f \circ g(x) = f(g(x)) = f(cx + d) = a(cx + d) + b = acx + ad + b.$$

Thus the slope of the graph of $f \circ g$ is ac, the product of the slopes of the graphs of f and g.
In terms of derivatives, this says that

$$(f \circ g)'(x) = ac = f'(g(x))g'(x).$$

The chain rule says this relationship holds for all differentiable functions.
    For the general case, suppose g is differentiable at a point c and f is differentiable at
g(c). We wish to compute the value of the derivative of $f \circ g$ at c. We have

$$(f \circ g)'(c) = \lim_{h \to 0} \frac{f \circ g(c + h) - f \circ g(c)}{h} = \lim_{h \to 0} \frac{f(g(c + h)) - f(g(c))}{h}. \tag{3.4.1}$$

As with our demonstrations of the quotient and product rules, we need to manipulate
(3.4.1) into a form which allows us to evaluate the limit in terms of what we already know.
The trick that works this time is to multiply and divide by g(c + h) - g(c). However, we
must be aware of one possible complication: In order to divide by g(c + h) - g(c) we must
be assured that g(c + h) - g(c) ≠ 0 for all nonzero h in some interval about 0. We will assume that
this is the case. If in fact this were not the case, then one can show that both $(f \circ g)'(c) = 0$
and g'(c) = 0, giving us the desired result that

$$(f \circ g)'(c) = f'(g(c))g'(c).$$

With our assumption, we have

$$(f \circ g)'(c) = \lim_{h \to 0} \left( \frac{f(g(c + h)) - f(g(c))}{g(c + h) - g(c)} \right) \left( \frac{g(c + h) - g(c)}{h} \right). \tag{3.4.2}$$

Since g is differentiable at c, we have

$$\lim_{h \to 0} \frac{g(c + h) - g(c)}{h} = g'(c). \tag{3.4.3}$$

Since f is differentiable at g(c), if we let s = g(c + h) - g(c), then

$$\lim_{h \to 0} \frac{f(g(c + h)) - f(g(c))}{g(c + h) - g(c)} = \lim_{s \to 0} \frac{f(g(c) + s) - f(g(c))}{s} = f'(g(c)), \tag{3.4.4}$$


﻿




where we have used the continuity of g at c to ascertain that s goes to 0 as h goes to 0.
Putting (3.4.2), (3.4.3) and (3.4.4) together, we now have

$$(f \circ g)'(c) = f'(g(c))g'(c), \tag{3.4.5}$$

which is our desired result.
Chain Rule If f and g are differentiable, then

$$(f \circ g)'(x) = f'(g(x))g'(x). \tag{3.4.6}$$


Example Suppose $h(x) = (1 + x^2)^{10}$. Then $h(x) = f \circ g(x)$ where $g(x) = 1 + x^2$ and
$f(x) = x^{10}$. Now

$$g'(x) = 2x$$

and

$$f'(x) = 10x^9,$$

so

$$h'(x) = (f \circ g)'(x) = f'(g(x))g'(x) = f'(1 + x^2)(2x) = 10(1 + x^2)^9(2x) = 20x(1 + x^2)^9.$$
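
As a quick sanity check on this example, the following Python sketch compares the chain-rule formula with a symmetric difference quotient. The helper name `central_difference`, the step size, and the sample points are illustrative choices, not part of the text.

```python
def central_difference(func, x, step=1e-6):
    """Approximate func'(x) with a symmetric difference quotient."""
    return (func(x + step) - func(x - step)) / (2 * step)


def h(x):
    return (1 + x**2) ** 10


def h_prime(x):
    # Chain rule: f'(g(x)) g'(x) with f(u) = u^10 and g(x) = 1 + x^2
    return 10 * (1 + x**2) ** 9 * (2 * x)


for x in (0.0, 0.5, 1.0):
    # The two columns should agree to several decimal places.
    print(x, h_prime(x), central_difference(h, x))
```

The numerical column is only an approximation, so small discrepancies in the last digits are expected.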

    Note that the preceding example is a particular case of the following general example.
If g is a differentiable function, $n \neq 0$ is an integer, and $h(x) = (g(x))^n$, then $h(x) = f \circ g(x)$
where $f(x) = x^n$. Then we have

$$f'(x) = nx^{n-1},$$

and so

$$h'(x) = (f \circ g)'(x) = f'(g(x))g'(x) = n(g(x))^{n-1}g'(x).$$

That is,

$$\frac{d}{dx}(g(x))^n = n(g(x))^{n-1}g'(x). \tag{3.4.7}$$


Example To illustrate the previous comments,


$$\frac{d}{dx}(3x - 2)^6 = 6(3x - 2)^5 \frac{d}{dx}(3x - 2) = 6(3x - 2)^5(3) = 18(3x - 2)^5.$$

Example For another illustration, if

$$f(x) = \frac{3}{(x^3 + 4)^5},$$

then

$$f'(x) = (-5)(3)(x^3 + 4)^{-6}\frac{d}{dx}(x^3 + 4) = -15(x^3 + 4)^{-6}(3x^2) = -\frac{45x^2}{(x^3 + 4)^6}.$$


    If we translate the chain rule into the notation of Leibniz, we obtain a formulation like
that of the first example. Specifically, if we let y = f(x) and x = g(t), then

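$$\left.\frac{dy}{dt}\right|_{t=c} = (f \circ g)'(c) = f'(g(c))g'(c) = \left.\frac{dy}{dx}\right|_{x=g(c)} \left.\frac{dx}{dt}\right|_{t=c}. \tag{3.4.8}$$

For short, we write

$$\frac{dy}{dt} = \frac{dy}{dx}\frac{dx}{dt}. \tag{3.4.9}$$

This formula is easy to remember, but at the same time care must be taken to remember
that if we want to evaluate $\frac{dy}{dt}$ at t = c, then we must evaluate $\frac{dy}{dx}$ at x = g(c).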
Example Suppose that for a certain city, when the population of the city is p, the total
amount of waste deposited in the city landfill every day is given by $W = 5\sqrt{p}$ pounds per
day. Moreover, suppose that the population of the city is growing so that t years from now
the population will be

$$p = 100{,}000(1 + 0.04t + 0.008t^2).$$

To find the rate of change of W with respect to t five years from now, we note that
p = 140,000 when t = 5 and then compute

$$\left.\frac{dW}{dp}\right|_{p=140{,}000} = \left.\frac{5}{2\sqrt{p}}\right|_{p=140{,}000} = \frac{5}{2\sqrt{140{,}000}}$$

and

$$\left.\frac{dp}{dt}\right|_{t=5} = 100{,}000(0.04 + 0.016t)\Big|_{t=5} = 12{,}000.$$

Hence the rate of increase of the number of pounds of waste in the landfill after five years
is, in pounds per day per year,

$$\left.\frac{dW}{dt}\right|_{t=5} = \left.\frac{dW}{dp}\right|_{p=140{,}000} \left.\frac{dp}{dt}\right|_{t=5} = \left(\frac{5}{2\sqrt{140{,}000}}\right)(12{,}000) = \frac{30{,}000}{\sqrt{140{,}000}} \approx 80.18,$$

where the final answer is rounded to 2 decimal places.
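
The arithmetic in this example can be reproduced with a few lines of Python. This is only a sketch: the function names are made up for illustration, and it assumes the waste model is W = 5 sqrt(p), consistent with the derivative 5/(2 sqrt(p)) used in the computation.

```python
import math


def population(t):
    """Population t years from now, as given in the example."""
    return 100_000 * (1 + 0.04 * t + 0.008 * t**2)


def dW_dp(p):
    # Assumes W = 5 sqrt(p), so dW/dp = 5 / (2 sqrt(p))
    return 5 / (2 * math.sqrt(p))


def dp_dt(t):
    return 100_000 * (0.04 + 0.016 * t)


t = 5
p = population(t)
# Chain rule: dW/dt = (dW/dp evaluated at p(t)) * (dp/dt evaluated at t)
rate = dW_dp(p) * dp_dt(t)
print(round(p), round(rate, 2))
```

Note that dW/dp is evaluated at p = p(5), not at p = 5, which is exactly the caution attached to the Leibniz form of the chain rule.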

Differentiation of algebraic functions
At this point the only thing keeping us from routinely differentiating any algebraic function
is that we do not have a rule for handling exponents which are rational numbers, but not
integers. We now consider this problem. Suppose $y = x^n$, where $n = \frac{p}{q}$ for nonzero integers
p and q. Then

$$y^q = \left(x^{p/q}\right)^q = x^p. \tag{3.4.10}$$

Differentiating the left-hand side of (3.4.10) with respect to x gives us

$$\frac{d}{dx}y^q = qy^{q-1}\frac{dy}{dx}, \tag{3.4.11}$$


﻿



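where the factor $\frac{dy}{dx}$ is a consequence of the special case of the chain rule in (3.4.7). Of
course,

$$\frac{d}{dx}x^p = px^{p-1}. \tag{3.4.12}$$

We may equate (3.4.11) and (3.4.12) (by (3.4.10) they are the derivatives of equal functions)
to obtain

$$qy^{q-1}\frac{dy}{dx} = px^{p-1}. \tag{3.4.13}$$

Solving for $\frac{dy}{dx}$, we have

$$\frac{dy}{dx} = \frac{px^{p-1}}{qy^{q-1}} = \frac{p}{q}x^{p-1}y^{1-q}. \tag{3.4.14}$$

Recalling that $y = x^{p/q}$ and $n = \frac{p}{q}$, (3.4.14) becomes

$$\frac{dy}{dx} = \frac{p}{q}x^{p-1}\left(x^{p/q}\right)^{1-q} = \frac{p}{q}x^{p-1}x^{\frac{p}{q}-p} = \frac{p}{q}x^{\frac{p}{q}-1} = nx^{n-1}. \tag{3.4.15}$$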

Hence we may now state the following proposition as an extension of our previous results.

Proposition    If $n \neq 0$ is a rational number, then

$$\frac{d}{dx}x^n = nx^{n-1}. \tag{3.4.16}$$


Example We have
$$\frac{d}{dx}\sqrt{x} = \frac{d}{dx}x^{1/2} = \frac{1}{2}x^{-1/2} = \frac{1}{2\sqrt{x}},$$
in agreement with our result in Section 3.2.

Example If

$$f(x) = \frac{3}{\sqrt{x^2 + 1}},$$

then

$$f'(x) = \frac{d}{dx}3(x^2 + 1)^{-1/2} = -\frac{3}{2}(x^2 + 1)^{-3/2}(2x) = -\frac{3x}{(x^2 + 1)^{3/2}}.$$
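
The power rule for rational exponents can be spot-checked numerically. The sketch below compares n x^(n-1) with a symmetric difference quotient for a few rational exponents; the exponents, the evaluation point, and the helper names are arbitrary illustrative choices.

```python
from fractions import Fraction


def central_difference(func, x, step=1e-6):
    """Approximate func'(x) with a symmetric difference quotient."""
    return (func(x + step) - func(x - step)) / (2 * step)


def power_rule(n, x):
    """The value n x^(n-1) predicted by the proposition."""
    return float(n) * x ** (float(n) - 1)


exponents = [Fraction(1, 2), Fraction(2, 3), Fraction(-3, 2), Fraction(7, 4)]
x = 2.0
for n in exponents:
    numeric = central_difference(lambda u: u ** float(n), x)
    print(n, power_rule(n, x), numeric)
```

Agreement at a handful of points is of course not a proof; it only guards against algebra slips when applying the rule.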


Implicit differentiation
The technique used in the demonstration of the last proposition is of general use. Any
equation involving two variables, such as f(x, y) =0, determines a curve in the plane
consisting of the set of all ordered pairs (x, y) which satisfy the equation. Such a curve
need not be the graph of a function. For example, the curve associated with $x^2 + y^2 - 25 = 0$,
or, more simply, $x^2 + y^2 = 25$, is a circle of radius 5 centered at the origin, which is not
the graph of any function. However, for a specified point on the curve, it may be the case
that a segment of the curve containing that point is the graph of some function; hence the
curve may have a tangent line at this point. For example, (3, 4) is a point on the curve
$x^2 + y^2 = 25$ which lies on the half of the circle lying above the x-axis and, considered by
itself, this piece of the circle is the graph of a function, namely, the function $y = \sqrt{25 - x^2}$.

[Figure 3.4.1: Tangent line to the circle $x^2 + y^2 = 25$ at (3, 4)]

To find the slope of the tangent line at such a point on the curve, we may borrow the
technique we used in demonstrating the previous proposition. That is, we differentiate
both sides of the equation, treating one variable as a function of the other. If we treat y as
a function of x, then, differentiating with respect to x and using the chain rule, we obtain
an equation involving $\frac{dy}{dx}$ which we can then solve for $\frac{dy}{dx}$. For the equation $x^2 + y^2 = 25$,
we have

$$\frac{d}{dx}(x^2 + y^2) = \frac{d}{dx}25.$$

Since

$$\frac{d}{dx}(x^2 + y^2) = 2x + 2y\frac{dy}{dx}$$

and

$$\frac{d}{dx}25 = 0,$$

we have

$$2x + 2y\frac{dy}{dx} = 0.$$

Solving for $\frac{dy}{dx}$, we have

$$\frac{dy}{dx} = -\frac{2x}{2y} = -\frac{x}{y}$$

at all points (x, y) for which $y \neq 0$. Now we have

$$\left.\frac{dy}{dx}\right|_{(x,y)=(3,4)} = -\frac{3}{4},$$

and so the equation of the tangent line at (3, 4) is

$$y = -\frac{3}{4}(x - 3) + 4.$$

The circle with equation $x^2 + y^2 = 25$ and the tangent line at (3, 4) are shown in Figure
3.4.1. Note that our procedure would not work to find the tangent lines to the circle at
(-5, 0) and (5, 0). However, the tangent lines at these points are vertical, and, hence, do
not have a slope. Although it is beyond the scope of this book to provide a justification,
it is in fact the case that the technique outlined in this example will work to find the slope
of the tangent line at all points on the curve that have a tangent line with a slope.
    This technique for finding derivatives is called implicit differentiation because we did
not use an explicit formula for y in terms of x. In this case we could have obtained the same
result by first solving for y in terms of x for values close to (3, 4), giving us $y = \sqrt{25 - x^2}$,
and then evaluating the derivative of this function at x = 3. However, this is not always
possible or desirable; in many cases implicit differentiation is significantly simpler even if
an explicit solution is possible.
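
To see the two routes agree for the circle, the following Python sketch evaluates the implicit-differentiation formula dy/dx = -x/y at (3, 4) and compares it with a numerical derivative of the explicit upper-half-circle function. The function names and step size are illustrative assumptions.

```python
import math


def slope_implicit(x, y):
    """dy/dx = -x/y, from differentiating x^2 + y^2 = 25 implicitly."""
    return -x / y


def upper_half(x):
    """Explicit formula for the upper half of the circle."""
    return math.sqrt(25 - x**2)


def central_difference(func, x, step=1e-6):
    """Approximate func'(x) with a symmetric difference quotient."""
    return (func(x + step) - func(x - step)) / (2 * step)


print(slope_implicit(3, 4))               # -0.75
print(central_difference(upper_half, 3))  # numerically close to -0.75
```

The implicit formula needs only the point (x, y), whereas the explicit route requires first choosing the correct branch of the curve.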
Example Consider the problem of finding the best affine approximation to the curve
with equation
$$y^3 + 3xy^2 - xy + x = 7$$

near the point (2, 1). To find $\frac{dy}{dx}$, we compute

$$\frac{d}{dx}(y^3 + 3xy^2 - xy + x) = \frac{d}{dx}7,$$

which gives us

$$\frac{d}{dx}y^3 + 3x\frac{d}{dx}y^2 + y^2\frac{d}{dx}(3x) - \left(x\frac{dy}{dx} + y\frac{d}{dx}x\right) + \frac{d}{dx}x = 0.$$

Computing the derivatives on the left-hand side gives us

$$3y^2\frac{dy}{dx} + 3x\left(2y\frac{dy}{dx}\right) + 3y^2(1) - x\frac{dy}{dx} - y(1) + 1 = 0.$$

Hence

$$3y^2\frac{dy}{dx} + 6xy\frac{dy}{dx} + 3y^2 - x\frac{dy}{dx} - y + 1 = 0,$$

from which it follows that

$$(3y^2 + 6xy - x)\frac{dy}{dx} = y - 3y^2 - 1.$$

Solving for $\frac{dy}{dx}$, we have

$$\frac{dy}{dx} = \frac{y - 3y^2 - 1}{3y^2 + 6xy - x},$$

which holds at all points for which the denominator is not 0. Thus

$$\left.\frac{dy}{dx}\right|_{(x,y)=(2,1)} = \frac{1 - 3 - 1}{3 + 12 - 2} = -\frac{3}{13}.$$

So the best affine approximation at (2, 1) is given by

$$T(x) = -\frac{3}{13}(x - 2) + 1.$$

[Figure 3.4.2: Curve with equation $y^3 + 3xy^2 - xy + x = 7$ and tangent line at (2, 1)]


The equation in this example does not specify y as a function of x (in fact, in Figure 3.4.2
we can see that there are at least two other values of y that correspond to x = 2), but
there is a segment of the curve through (2, 1) which is the graph of some function. For this
function, which we have not explicitly found, T is the best affine approximation at x = 2.
For example, if we denote this unknown function by h, we know that

$$h(2.05) \approx T(2.05) = -\frac{3}{13}(0.05) + 1 = 0.9885,$$

where we have rounded the result to four decimal places. Put another way, the point
(2.05, 0.9885) is an approximate solution to the equation


$$y^3 + 3xy^2 - xy + x = 7.$$
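
One way to see how good this approximate solution is: solve the equation numerically for y at x = 2.05 and compare with the tangent-line value. The sketch below uses bisection (the method of Section 2.5) in the variable y; the bracketing interval [0.9, 1.1], the tolerance, and the function names are assumptions chosen for illustration.

```python
def curve(x, y):
    """Left side minus right side of y^3 + 3xy^2 - xy + x = 7."""
    return y**3 + 3 * x * y**2 - x * y + x - 7


def tangent(x):
    """Best affine approximation at (2, 1), with slope -3/13."""
    return -3 / 13 * (x - 2) + 1


def solve_for_y(x, low=0.9, high=1.1, tolerance=1e-12):
    """Bisection in y; assumes curve(x, low) and curve(x, high) differ in sign."""
    while high - low > tolerance:
        mid = (low + high) / 2
        if curve(x, low) * curve(x, mid) <= 0:
            high = mid
        else:
            low = mid
    return (low + high) / 2


x = 2.05
# Tangent-line estimate, numerically solved y, and the residual of the estimate
print(tangent(x), solve_for_y(x), curve(x, tangent(x)))
```

The residual of the tangent-line estimate is small but not zero, as expected from an affine approximation.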


﻿




    At this point we can routinely find the derivative of any algebraic function. In the
next section we will consider the derivatives of the trigonometric functions.

Problems

1. Find the derivative of each of the following functions.


(a) f(x)=(4x+5)4


(c) h(t):


    3
2(6t - 2)2


(b) g(x) = 13x(x2 + 2)5
             3s-4
(d) f(s)     3 + 2

(f) f() _(3x + 4)3(8x - 13)4
                 (2x + 3)


(e) g(z) - (3z + 4)3(2z2 + z)2


2. For each of the following, find the derivative of the dependent variable with respect to
   the independent variable.


(a) s = 4t2(t2 _ 1)2

(c) q = 83t3 - 4t2

(e) x = 8t (4t + 5)--2
(g) y = (3x - 1) 5


          -s2(4s -3)2
(b) z=-
            s2+ 1
            3x
(d) y = 3x+4

(f) u = 3(v2 +4)--I
(h) v = 9/32 + (3-u- 2)2


3. Find the best affine approximation to the function

                                               3x
                                    f(w)W 3
                                            (w2 + 1)2


at x =2.


4. (a) Find the best affine approximation to $f(x) = (1 + x)^h$ at x = 0, where $h \neq 0$ is a
       constant.
   (b) Use your result from (a) to approximate $\sqrt{1.06}$ and compare with the value
       obtained from a calculator.
   (c) Use your result from (a) to approximate $\sqrt[3]{1.06}$ and compare with the value
       obtained from a calculator.
   (d) Use your result from (a) to approximate $\sqrt[5]{1.06}$ and compare with the value
       obtained from a calculator.
5. Find the equation of the line tangent to each of the following curves at the indicated
   point.


   (a) $x^2 + 3y^2 = 21$ at (3, 2)
   (b) $x^2 - 3y^2 = 4$ at (4, 2)
   (c) $x^2 + 3xy + y^2 = 11$ at (2, 1)
   (d) $y^5 + 2x^2y^2 - x^2 = 10$ at (3, 1)
   (e) $x^5 + xy + y^5 = 3$ at (1, 1)
   (f) $4x^2 - 3xy - 2xy^2 = 26$ at (-2, 1)


﻿




6. Suppose values for f(x), f'(x), g(x), and g'(x) are as given in the following table.
             x           f(x)         f'(x)       g(x)         g'(x)
             0             1           2            2           3
             1             2          -1            0           2
             2             0           3            1          -2
    Find k' (0) for each of the following.
    (a) k(x) = f o g(x)                         (b) k(x) = g o f(x)
    (c) k(x) = g o g(x)                         (d) k(x) = f o f (x)
    (e) k(x) = (f o g) o f (x)                  (f) k(x) = g(f (x))f (x)
 7. Show that if g'(c) = 0, then (g o g)'(c) = 0.
 8. Suppose the sides of a cube are increasing at a rate of 3 centimeters per minute. At
    what rate is the volume of the cube increasing when the length of one of the sides is
    10 centimeters?
 9. A pebble is dropped in a pond of water. Suppose that the resulting circular wave has
    a radius given by $r = 200\sqrt{t}$ centimeters after t seconds. Find the rate of change of the
    area of the wave with respect to time after 5 seconds.
10. The volume of a balloon is increasing at a rate of 50 cubic centimeters per second. At
    what rate is the radius increasing when the radius is 10 centimeters?
11. The kinetic energy of an object moving in a straight line is given by

$$K = \frac{1}{2}mv^2,$$

    where m is the mass of the object and v is its velocity. If the acceleration of the object,
    $a = \frac{dv}{dt}$, is a constant 9.8 meters per second per second, find $\frac{dK}{dt}$ when v = 10 meters
    per second.
12. Ship A passes a buoy at 10:00 a.m. and heads north at 20 miles per hour. Ship B
    passes the same buoy at 11:00 a.m. and heads east at 25 miles per hour. If s is the
    distance between the ships, what is $\frac{ds}{dt}$ at noon?
13. Suppose the height of a rectangle is growing at a rate of 0.1 inches per second while its
    length is growing at a rate of 0.2 inches per second. When the height of the rectangle is
    4 inches and its length is 8 inches, at what rate is the area of the rectangle increasing?
14. The work force of a certain factory is growing at a rate of 2 per month while the average
    productivity of a worker is growing at a rate of 4 units per month. If the work force
    is currently 100 and the average productivity per month is 200 units, at what rate is
    the total productivity per month of the factory increasing?

15. (a) What happens in Problem 14 if the work force is declining by 2 per month?
    (b) What happens in Problem 14 if the average productivity is decreasing by 5 per
        month?


﻿




16. A circular oil slick is 0.03 feet thick and has a radius which is increasing at a rate of 2
    feet per hour. When the radius is 100 feet, at what rate is the volume of the oil slick
    increasing?
17. Oil is being added to a circular oil slick at the rate of 100 cubic feet per minute. If the
    oil slick is 0.05 feet thick, at what rate is the radius of the oil slick increasing when the
    radius is 400 feet?
18. In Section 2.2 we mentioned that the period of a pendulum of length b centimeters
    undergoing small oscillations is given by

$$T = 2\pi\sqrt{\frac{b}{g}} \text{ seconds},$$

    where g = 980 centimeters per second per second. Suppose the length of the pendulum
    changes as a function of temperature $\tau$ so that

$$\frac{db}{d\tau} = 0.08 \text{ centimeters per degree Celsius}.$$

    (a) Find $\frac{dT}{d\tau}$ when b = 20 centimeters.
    (b) Use (a) to approximate the effect on T of a 1°C increase in temperature. Do the
        same for a 2°C increase and a 2°C decrease.


﻿


Section 3.5

Difference Equations to Differential Equations
Differentiation of Trigonometric Functions


We now take up the question of differentiating the trigonometric functions. We will start
with the sine function. From Section 3.2, we know that


$$\frac{d}{dx}\sin(x) = \lim_{h \to 0}\frac{\sin(x + h) - \sin(x)}{h}. \tag{3.5.1}$$

From the addition formula for sine we have

$$\sin(x + h) = \sin(x)\cos(h) + \sin(h)\cos(x), \tag{3.5.2}$$

and so (3.5.1) becomes

$$\frac{d}{dx}\sin(x) = \lim_{h \to 0}\frac{\sin(x)\cos(h) + \sin(h)\cos(x) - \sin(x)}{h}. \tag{3.5.3}$$

Now

$$\frac{\sin(x)\cos(h) + \sin(h)\cos(x) - \sin(x)}{h} = \frac{\sin(x)(\cos(h) - 1) + \cos(x)\sin(h)}{h} = \sin(x)\left(\frac{\cos(h) - 1}{h}\right) + \cos(x)\left(\frac{\sin(h)}{h}\right).$$

Thus

$$\frac{d}{dx}\sin(x) = \sin(x)\lim_{h \to 0}\frac{\cos(h) - 1}{h} + \cos(x)\lim_{h \to 0}\frac{\sin(h)}{h}. \tag{3.5.4}$$


Our problem then comes down to evaluating the two limits in (3.5.4). The second of these
turns out to be the key, so we will begin with it.


For $0 < h < \frac{\pi}{2}$, consider the point C = (cos(h), sin(h)) on the unit circle centered at
the origin. We first repeat an argument from Section 2.4 to show that sin(h) < h: If we
let A = (0, 0) and B = (1, 0), as in Figure 3.5.1, then the area of $\triangle ABC$ is

$$\frac{1}{2}\sin(h).$$

The area of the sector of the circle cut off by the arc from B to C is the fraction $\frac{h}{2\pi}$ of the
area of the entire circle; hence, this area is

$$\frac{h}{2\pi}\pi = \frac{h}{2}.$$


Copyright © by Dan Sloughter 2000


﻿


[Figure 3.5.1]


Since $\triangle ABC$ is contained in this sector, we have

$$\frac{1}{2}\sin(h) < \frac{h}{2}, \tag{3.5.5}$$

or simply

$$\sin(h) < h. \tag{3.5.6}$$

    Now let D = (1, tan(h)), the point where the line passing through A and C intersects
the line perpendicular to the x-axis passing through B. Then $\triangle ABD$ has area

$$\frac{1}{2}\tan(h) = \frac{\sin(h)}{2\cos(h)}.$$

Since $\triangle ABD$ contains the sector of the circle considered above, we have

$$\frac{h}{2} < \frac{\sin(h)}{2\cos(h)}, \tag{3.5.7}$$

or

$$h < \frac{\sin(h)}{\cos(h)}. \tag{3.5.8}$$


Putting inequalities (3.5.6) and (3.5.8) together gives us

$$\sin(h) < h < \frac{\sin(h)}{\cos(h)}. \tag{3.5.9}$$

Dividing through by sin(h) yields

$$1 < \frac{h}{\sin(h)} < \frac{1}{\cos(h)}, \tag{3.5.10}$$

which, after taking reciprocals, gives us

$$1 > \frac{\sin(h)}{h} > \cos(h). \tag{3.5.11}$$

Now, finally, we can see where all of this has been heading. Since

$$\lim_{h \to 0^+}\cos(h) = 1,$$

(3.5.11) implies that we must have

$$\lim_{h \to 0^+}\frac{\sin(h)}{h} = 1. \tag{3.5.12}$$


To check the limit from the other side, we make use of the identity $\sin(-x) = -\sin(x)$.
Letting t = -h, we have

$$\lim_{h \to 0^-}\frac{\sin(h)}{h} = \lim_{h \to 0^-}\frac{-\sin(h)}{-h} = \lim_{h \to 0^-}\frac{\sin(-h)}{-h} = \lim_{t \to 0^+}\frac{\sin(t)}{t} = 1. \tag{3.5.13}$$


Together (3.5.12) and (3.5.13) give us the following proposition.

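Proposition

$$\lim_{h \to 0}\frac{\sin(h)}{h} = 1. \tag{3.5.14}$$

    With this result, we may now compute

$$\begin{aligned}
\lim_{h \to 0}\frac{1 - \cos(h)}{h} &= \lim_{h \to 0}\left(\frac{1 - \cos(h)}{h}\right)\left(\frac{1 + \cos(h)}{1 + \cos(h)}\right) \\
&= \lim_{h \to 0}\frac{1 - \cos^2(h)}{h(1 + \cos(h))} \\
&= \lim_{h \to 0}\frac{\sin^2(h)}{h(1 + \cos(h))} \\
&= \left(\lim_{h \to 0}\frac{\sin(h)}{h}\right)\left(\lim_{h \to 0}\frac{\sin(h)}{1 + \cos(h)}\right) \\
&= (1)\left(\frac{0}{2}\right) = 0.
\end{aligned}$$

Proposition

$$\lim_{h \to 0}\frac{1 - \cos(h)}{h} = 0. \tag{3.5.15}$$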
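
The two limits can be illustrated numerically. The Python sketch below tabulates sin(h)/h and (1 - cos(h))/h for shrinking h; the function names and the particular values of h are illustrative choices, and the table is evidence for the limits, not a substitute for the argument above.

```python
import math


def sine_ratio(h):
    return math.sin(h) / h


def cosine_ratio(h):
    return (1 - math.cos(h)) / h


for k in range(1, 6):
    h = 10.0 ** (-k)
    print(f"h = {h:g}  sin(h)/h = {sine_ratio(h):.10f}  "
          f"(1 - cos(h))/h = {cosine_ratio(h):.10f}")
```

For 0 < h < pi/2 the first column should stay between cos(h) and 1, matching the squeeze used in the proof.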


﻿




Of course, from (3.5.15) we have


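$$\lim_{h \to 0}\frac{\cos(h) - 1}{h} = -\lim_{h \to 0}\frac{1 - \cos(h)}{h} = 0. \tag{3.5.16}$$

Putting (3.5.14) and (3.5.16) into (3.5.4) gives us

$$\begin{aligned}
\frac{d}{dx}\sin(x) &= \sin(x)\lim_{h \to 0}\frac{\cos(h) - 1}{h} + \cos(x)\lim_{h \to 0}\frac{\sin(h)}{h} \\
&= \sin(x)(0) + \cos(x)(1) \\
&= \cos(x).
\end{aligned}$$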


Proposition    The function f(x) = sin(x) is differentiable for all x in $(-\infty, \infty)$ with

$$\frac{d}{dx}\sin(x) = \cos(x). \tag{3.5.17}$$


    The derivatives of the other trigonometric functions now follow with the help of some
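basic identities. Since $\cos(x) = \sin\left(x + \frac{\pi}{2}\right)$ and $\cos\left(x + \frac{\pi}{2}\right) = -\sin(x)$, it follows that

$$\frac{d}{dx}\cos(x) = \frac{d}{dx}\sin\left(x + \frac{\pi}{2}\right) = \cos\left(x + \frac{\pi}{2}\right)\frac{d}{dx}\left(x + \frac{\pi}{2}\right) = \cos\left(x + \frac{\pi}{2}\right) = -\sin(x).$$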


The other four derivatives are as follows:


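$$\begin{aligned}
\frac{d}{dx}\tan(x) &= \frac{d}{dx}\left(\frac{\sin(x)}{\cos(x)}\right) \\
&= \frac{\cos(x)\frac{d}{dx}\sin(x) - \sin(x)\frac{d}{dx}\cos(x)}{\cos^2(x)} \\
&= \frac{\cos(x)\cos(x) - \sin(x)(-\sin(x))}{\cos^2(x)} \\
&= \frac{\cos^2(x) + \sin^2(x)}{\cos^2(x)} \\
&= \frac{1}{\cos^2(x)} \\
&= \sec^2(x),
\end{aligned}$$

$$\begin{aligned}
\frac{d}{dx}\cot(x) &= \frac{d}{dx}\left(\frac{\cos(x)}{\sin(x)}\right) \\
&= \frac{\sin(x)\frac{d}{dx}\cos(x) - \cos(x)\frac{d}{dx}\sin(x)}{\sin^2(x)} \\
&= \frac{\sin(x)(-\sin(x)) - \cos(x)\cos(x)}{\sin^2(x)} \\
&= \frac{-(\sin^2(x) + \cos^2(x))}{\sin^2(x)} \\
&= -\frac{1}{\sin^2(x)} \\
&= -\csc^2(x),
\end{aligned}$$

$$\begin{aligned}
\frac{d}{dx}\sec(x) &= \frac{d}{dx}(\cos(x))^{-1} \\
&= -(\cos(x))^{-2}\frac{d}{dx}\cos(x) \\
&= \frac{\sin(x)}{\cos^2(x)} \\
&= \left(\frac{1}{\cos(x)}\right)\left(\frac{\sin(x)}{\cos(x)}\right) \\
&= \sec(x)\tan(x),
\end{aligned}$$

and

$$\begin{aligned}
\frac{d}{dx}\csc(x) &= \frac{d}{dx}(\sin(x))^{-1} \\
&= -(\sin(x))^{-2}\frac{d}{dx}\sin(x) \\
&= -\frac{\cos(x)}{\sin^2(x)} \\
&= -\left(\frac{1}{\sin(x)}\right)\left(\frac{\cos(x)}{\sin(x)}\right) \\
&= -\csc(x)\cot(x).
\end{aligned}$$
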
The next proposition summarizes these results.
Proposition The derivatives of the trigonometric functions are as follows:


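$$\frac{d}{dx}\sin(x) = \cos(x) \tag{3.5.18}$$

$$\frac{d}{dx}\cos(x) = -\sin(x) \tag{3.5.19}$$

$$\frac{d}{dx}\tan(x) = \sec^2(x) \tag{3.5.20}$$

$$\frac{d}{dx}\cot(x) = -\csc^2(x) \tag{3.5.21}$$

$$\frac{d}{dx}\sec(x) = \sec(x)\tan(x) \tag{3.5.22}$$

$$\frac{d}{dx}\csc(x) = -\csc(x)\cot(x) \tag{3.5.23}$$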


﻿



Example Using the chain rule, we have

$$\frac{d}{dx}\sin(2x) = \cos(2x)\frac{d}{dx}(2x) = 2\cos(2x).$$

Example Using the product rule followed by the chain rule, we have
$$\begin{aligned}
\frac{d}{dx}(3\sin(5x)\cos(4x)) &= 3\sin(5x)\frac{d}{dx}\cos(4x) + 3\cos(4x)\frac{d}{dx}\sin(5x) \\
&= 3\sin(5x)\left(-\sin(4x)\frac{d}{dx}(4x)\right) + 3\cos(4x)\cos(5x)\frac{d}{dx}(5x) \\
&= -12\sin(5x)\sin(4x) + 15\cos(4x)\cos(5x).
\end{aligned}$$

Example Using the chain rule twice, we have

$$\begin{aligned}
\frac{d}{dx}\sin^2(3x) &= 2\sin(3x)\frac{d}{dx}\sin(3x) \\
&= 2\sin(3x)\cos(3x)\frac{d}{dx}(3x) \\
&= 6\sin(3x)\cos(3x).
\end{aligned}$$

Example Using the product rule followed by the chain rule, we have

$$\begin{aligned}
\frac{d}{dt}(t^2\tan(2t)) &= t^2\frac{d}{dt}\tan(2t) + \tan(2t)\frac{d}{dt}t^2 \\
&= t^2\sec^2(2t)\frac{d}{dt}(2t) + 2t\tan(2t) \\
&= 2t^2\sec^2(2t) + 2t\tan(2t).
\end{aligned}$$

Example Using the chain rule twice, we have

$$\begin{aligned}
\frac{d}{dz}\sec^3(3z) &= 3\sec^2(3z)\frac{d}{dz}\sec(3z) \\
&= 3\sec^2(3z)\sec(3z)\tan(3z)\frac{d}{dz}(3z) \\
&= 9\sec^3(3z)\tan(3z).
\end{aligned}$$

Example If $f(x) = 8\cot^4(3x^2)$, then

$$\begin{aligned}
f'(x) &= 32\cot^3(3x^2)\frac{d}{dx}\cot(3x^2) \\
&= 32\cot^3(3x^2)\left(-\csc^2(3x^2)\frac{d}{dx}(3x^2)\right) \\
&= -192x\cot^3(3x^2)\csc^2(3x^2).
\end{aligned}$$
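
The worked examples above can be checked numerically. In the sketch below, each pair holds a function and the derivative formula obtained for it, and a symmetric difference quotient serves as an independent estimate. The helper names, the test point 0.3, and the step size are assumptions made for illustration.

```python
import math


def central_difference(func, x, step=1e-6):
    """Approximate func'(x) with a symmetric difference quotient."""
    return (func(x + step) - func(x - step)) / (2 * step)


def sec(x):
    return 1 / math.cos(x)


def csc(x):
    return 1 / math.sin(x)


def cot(x):
    return math.cos(x) / math.sin(x)


# (function, derivative obtained in the corresponding example)
examples = [
    (lambda x: math.sin(2 * x),
     lambda x: 2 * math.cos(2 * x)),
    (lambda x: 3 * math.sin(5 * x) * math.cos(4 * x),
     lambda x: -12 * math.sin(5 * x) * math.sin(4 * x)
               + 15 * math.cos(4 * x) * math.cos(5 * x)),
    (lambda x: math.sin(3 * x) ** 2,
     lambda x: 6 * math.sin(3 * x) * math.cos(3 * x)),
    (lambda t: t**2 * math.tan(2 * t),
     lambda t: 2 * t**2 * sec(2 * t) ** 2 + 2 * t * math.tan(2 * t)),
    (lambda z: sec(3 * z) ** 3,
     lambda z: 9 * sec(3 * z) ** 3 * math.tan(3 * z)),
    (lambda x: 8 * cot(3 * x**2) ** 4,
     lambda x: -192 * x * cot(3 * x**2) ** 3 * csc(3 * x**2) ** 2),
]

point = 0.3
for func, derivative in examples:
    print(derivative(point), central_difference(func, point))
```

The test point is chosen away from the zeros of the sines and cosines involved, where sec, csc, and cot are undefined.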


﻿


[Figure 3.5.2: Graphs of y = sin(x) and y = x]


Example If f(x) = sin(x), then f(0) = sin(0) = 0 and f'(0) = cos(0) = 1. Hence the
best affine approximation to f (x) = sin(x) at x = 0 is

                                       T(x) = x.

This says that for small values of x, $\sin(x) \approx x$. This fact is very useful in many applications
where an equation cannot be solved exactly because of the presence of a sine term, but can
be solved exactly once the approximation $\sin(x) \approx x$ is made. For example, the formula
mentioned in Section 2.2 for the motion of a pendulum undergoing small oscillations was
derived after making this approximation. Without this approximation the underlying
equation cannot be solved exactly. See Figure 3.5.2 for the graphs of y = sin(x) and y = x.
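
To get a feel for how small "small" needs to be, the following Python sketch prints the error of the approximation of sin(x) by x at a few sample points. The sample values and the function name are arbitrary illustrative choices.

```python
import math


def affine_sine(x):
    """Best affine approximation T(x) = x to sin(x) at 0."""
    return x


for x in (0.5, 0.1, 0.01):
    error = abs(math.sin(x) - affine_sine(x))
    print(f"x = {x}: sin(x) = {math.sin(x):.8f}, error = {error:.2e}")
```

The error shrinks much faster than x itself, which is what one expects from a best affine approximation.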

Final comments on rules of differentiation
With the work of the last three sections we can now routinely differentiate any algebraic
function or any combination of an algebraic function with a trigonometric function. In
fact, the rules of these last three sections provide algorithms for differentiation which
may be incorporated into computer programs. Programs that are capable of performing
differentiation in this manner, as well as other types of algebraic procedures, are called
symbolic manipulation programs or computer algebra systems. These programs are very
useful when working with procedures that require exact knowledge of the formula for the
derivative of a given function.
    Contrasted to symbolic differentiation is numerical differentiation. Numerical differen-
tiation is performed when we approximate the derivative of a function at a specific point.
That is, whereas symbolic differentiation finds a formula for the derivative of a function,
which may then be evaluated at any point in its domain to find specific values, numerical
differentiation finds a single number which is used as an approximation to the value of the
derivative at one given point. For example, if we wish to approximate the derivative of a
function f at a point c, we might pick a small value of h, positive or negative, and compute

$$f'(c) \approx \frac{f(c + h) - f(c)}{h}. \tag{3.5.24}$$


﻿




Of course, we need some procedure for deciding when h is small enough for (3.5.24) to give
an accurate estimate for f'(c). One technique is to use (3.5.24) repeatedly, cutting h in half
each time, until the result does not change through the desired number of decimal places.
This method is subject to serious roundoff errors due to the loss of significant digits in the
numerator when two nearly equal numbers are subtracted (see Problem 13). Hence the
numerical approximation of derivatives is not recommended unless it cannot be avoided.
Problem 10 suggests an alternative to (3.5.24) which is both more stable for computations
and more accurate for a given value of h.
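
The halving procedure described above might be sketched in Python as follows. The function names, the starting value h = 0.1, the six-decimal stopping rule, and the cap on the number of halvings are all assumptions made for illustration; the cap guards against the roundoff problem mentioned in the text.

```python
import math


def forward_difference(f, c, h):
    """The difference quotient in (3.5.24)."""
    return (f(c + h) - f(c)) / h


def estimate_derivative(f, c, h=0.1, places=6, max_halvings=40):
    """Halve h until two successive estimates agree to `places` decimals."""
    previous = forward_difference(f, c, h)
    for _ in range(max_halvings):
        h /= 2
        current = forward_difference(f, c, h)
        if round(current, places) == round(previous, places):
            return current, h
        previous = current
    # No agreement within the cap: return the last estimate anyway.
    return previous, h


estimate, final_h = estimate_derivative(math.sin, 1.0)
print(estimate, final_h, math.cos(1.0))
```

Agreement of two successive estimates does not guarantee that many correct digits, so the result should be read as an approximation whose last displayed places may still be off.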

Problems

1. Find the derivative of each of the following functions.


   (a) $f(x) = x^2\sin(x)$
   (b) $g(x) = \cos(4x)$
   (c) $g(t) = 3t\cos(2t)$
   (d) $h(s) = \sin^2(s)\cos(s)$
   (e) $f(t) = \sin(3t)\cos(4t)$
   (f) $g(z) = \sin^3(4z)$


2. Find the derivative of the dependent variable with respect to the independent variable
   for each of the following.


(a) y   sin(2x)
           x
(c) x= sin(4t2 + 1)


1


col


(e) z -
        cos(2t)
(g) y = x2 csc(2x)


3. Evaluate each of the following.

   (a) d (sin2 (2x) cos2(3x))

   (c) Jqsec3(q2)

   (e)      1 + sin2(z)
       dz


(b) x = 3 tan(2t)
(d) y = 4Otan(2 -1)

(f) q = sec3(3t)

(h) s = 3t cot(2t)


(b)    (sec(x) tan(x))

     d (sin2(t)
     d t cos(t)
     d2
(f) (r cos(3r2))


4. Find the best affine approximation to f(x) = tan(2x) at 0.
5. Find the best affine approximation to g(t) = cos(t) at 0.
6. Find the best affine approximation to $f(t) = \sin^2(t)$ at 0.
7. (a) Find the best affine approximation S to $f(x) = \sqrt{1 + x}$ at 0.
   (b) Find the best affine approximation T to g(x) = sin(4x) at 0.
   (c) Find the best affine approximation U to $h(x) = \sqrt{1 + \sin(4x)}$ at 0.
   (d) What is the relationship between f, g, and h? Is there a similar relationship
       between S, T, and U?


﻿




8. Evaluate the following limits.

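   (a) $\lim_{x \to 0}\frac{\sin(2x)}{x}$
   (b) $\lim_{x \to 0}\frac{\sin(2x)}{\sin(3x)}$
   (c) $\lim_{x \to 0}\frac{\tan(x)}{x}$
   (d) $\lim_{h \to 0}\frac{\tan(2h)}{\sin(3h)}$
   (e) $\lim_{x \to 0}\frac{\sin^2(x)}{x}$
   (f) $\lim_{t \to 0}\frac{1 - \cos(t)}{t^2}$
   (g) $\lim_{t \to 0}\frac{\sin^2(3t)}{t^2}$
   (h) $\lim_{\theta \to 0}\frac{\tan^2(5\theta)}{\sin^2(3\theta)}$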


9. For each of the following, decide whether or not the given function is o(h) and whether
   or not it is O(h).

   (a) $f(x) = \sin(x)$
   (b) $f(x) = \sin^2(x)$
   (c) $g(t) = \tan(t)$
   (d) $h(t) = \tan^2(t)$
   (e) $f(t) = 1 - \cos(t)$
   (f) $g(t) = 1 - \cos^2(t)$


10. Given a function f which is differentiable at the point c, define

$$D(h) = \frac{f(c + h) - f(c)}{h}.$$

    Then, for small values of h, $f'(c) \approx D(h)$.
    (a) Let h > 0. A better approximation for f'(c) than D(h) is given by averaging D(h)
        and D(-h). Show that if we define

$$D_1(h) = \frac{D(h) + D(-h)}{2},$$

        then

$$D_1(h) = \frac{f(c + h) - f(c - h)}{2h}.$$

        What is $D_1(h)$ geometrically?
    (b) Let h > 0. Another approximation that is sometimes used for f'(c) is

$$D_2(h) = \frac{4}{3}D_1\left(\frac{h}{2}\right) - \frac{1}{3}D_1(h).$$

        Show that

$$D_2(h) = \frac{f(c - h) - 8f\left(c - \frac{h}{2}\right) + 8f\left(c + \frac{h}{2}\right) - f(c + h)}{6h}.$$


﻿




11. Using h = 0.00001, approximate the derivatives of the following functions using D(h),
    D1(h), and D2(h) (from Problem 10) at the indicated points. Compare your answers
    with the exact values.
    (a) $f(x) = x^2$ at x = 2                   (b) $f(x) = \frac{1}{x}$ at x = 2
    (c) $f(x) = \sin(x)$ at x = 0               (d) $f(x) = 3\sin(x^2)\cos(4x)$ at x = 0
12. Compute D(h), D1(h), and D2(h) (from Problem 10) for the function f(x) = |x| at
    x = 0. Use h = 0.001. Are your answers reasonable? Can you explain them?
13. For f (x) = x2 and c = 2, compute the values of

$$e_n = |4 - D(10^{-n})|$$

    (see Problem 10) for n = 1, 2, ... ,15. Note that you are computing the absolute value
    of the error in approximating f'(c) by D(h) for different values of h. Plot the ordered
    pairs (n, en). Does the absolute value of the error decrease as h decreases? Can you
    explain your results?


﻿


Section 3.6


       Differential Equations         Newton's Method


Many problems in mathematics involve, at some point or another, solving an equation
for an unknown quantity. An equation of the form f(x) = 0 may be solved for x by
simple algebra if f is an affine function and by the quadratic formula if f is a quadratic
polynomial. There are formulas similar to the quadratic formula for both cubic and quartic
polynomials, but they are, in general, very cumbersome. One of the most interesting
results of mathematics, due to Niels Henrik Abel (1802-1829), is that there does not exist
an analogue of the quadratic formula for quintic polynomials. For this and other reasons,
it turns out that in many situations solving an equation f(x) = 0 for x requires using a
method which can approximate the solutions to a predetermined level of accuracy.
    In Section 2.5 we discussed one such method, the bisection algorithm, for approximat-
ing the solutions of an equation. The strong point of the bisection algorithm is that, once
an appropriate starting interval has been found, the method will always find a solution to
any desired level of accuracy; its weakness lies in the slowness with which the successive
approximations approach the solution. In this section we will discuss another method,
known as Newton's method, for approximating solutions to an equation. In contrast to
the bisection algorithm, Newton's method does not always work, but, when it does, it is
in general remarkably fast.
    Suppose we wish to find a solution to the equation f(x) = 0 for a given function f.
Recall that, geometrically, this corresponds to finding the point where the curve y = f(x)
crosses the x-axis. To start Newton's method, we must first have an initial guess x0.
Frequently, we find the initial guess by graphing the curve y = f(x) and letting x0 be a
point close to where the curve crosses the x-axis. Given the initial guess $x_0$, let $T_0$ be the
best affine approximation to f at $x_0$. That is, define $T_0$ by

                  $$T_0(x) = f'(x_0)(x - x_0) + f(x_0).$$                  (3.6.1)

The idea behind Newton's method is to obtain an improved estimate of a solution to
f(x) = 0 by replacing the equation f(x) = 0 with the simpler equation $T_0(x) = 0$. If we
let $x_1$ denote the solution to the latter equation, then we have $T_0(x_1) = 0$, that is,

                  $$f'(x_0)(x_1 - x_0) + f(x_0) = 0,$$                      (3.6.2)

from which it follows that

                  $$x_1 = x_0 - \frac{f(x_0)}{f'(x_0)}.$$                   (3.6.3)

Geometrically, $x_1$ is the point at which the line tangent to the graph of f at $(x_0, f(x_0))$
crosses the x-axis, as shown in Figure 3.6.1. To improve upon this approximation, we solve the equation


Copyright © by Dan Sloughter 2000


                   Figure 3.6.1 Two iterations of Newton's method


$T_1(x) = 0$, where $T_1$ is the best affine approximation to f at $x_1$. If we let $x_2$ denote the
solution to this equation, then

                  $$f'(x_1)(x_2 - x_1) + f(x_1) = 0,$$                      (3.6.4)

and so

                  $$x_2 = x_1 - \frac{f(x_1)}{f'(x_1)}.$$                   (3.6.5)

We continue in this manner to generate a sequence of approximations $x_0, x_1, x_2, x_3, \ldots$
until we reach the desired degree of accuracy. Specifically, if we have found $x_n$, we find
$x_{n+1}$ by solving the equation $T_n(x) = 0$, where $T_n$ is the best affine approximation to f at
$x_n$. Hence we have

                  $$f'(x_n)(x_{n+1} - x_n) + f(x_n) = 0,$$                  (3.6.6)

which implies

                  $$x_{n+1} = x_n - \frac{f(x_n)}{f'(x_n)}.$$               (3.6.7)

In other words, beginning with an initial guess $x_0$, Newton's method generates a sequence
$\{x_n\}$ using the difference equation (3.6.7). In most cases (although certainly not all), if $x_0$
is a good initial guess, $\lim_{n \to \infty} x_n = r$, where r is a solution of f(x) = 0, that is, f(r) = 0.


                        Figure 3.6.2 Graph of f(x) = cos(x) - x


    In any practical case we need to know when to stop generating successive approxima-
tions using (3.6.7). Since we do not know the exact solutions to the equation (if we did,
we would not be using Newton's method to start with), we can never know for sure how
far a given approximation is from a solution. What is done in practice is to generate terms
until the difference between successive terms is less than a predetermined tolerance level.
That is, if we decide that we want our approximation to be off by no more than $\epsilon$, then
we stop when $|x_{n+1} - x_n| < \epsilon$.
Newton's method        To approximate a solution to an equation f(x) = 0 to within a
tolerance of $\epsilon$ beginning with an initial guess $x_0$, compute the sequence of approximations
$x_0, x_1, x_2, x_3, \ldots$, using the difference equation

                  $$x_{n+1} = x_n - \frac{f(x_n)}{f'(x_n)},$$               (3.6.8)

stopping when $|x_{n+1} - x_n| < \epsilon$.
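
As an illustration, the procedure just stated might be sketched in Python as follows. This is only a sketch under stated assumptions: the function name `newton`, the safeguard `max_iter`, and the sample equation x^3 - 2x - 5 = 0 with initial guess 2 are our own choices for illustration and do not come from the text.

```python
def newton(f, fprime, x0, tol, max_iter=50):
    """Iterate x_{n+1} = x_n - f(x_n)/f'(x_n) until successive terms differ by less than tol.

    max_iter is a hypothetical safeguard (not part of the stated method) so that
    the loop ends even when the iterates fail to settle down.
    """
    x = x0
    for _ in range(max_iter):
        slope = fprime(x)
        if slope == 0:
            # A horizontal tangent line never crosses the x-axis.
            raise ZeroDivisionError("f'(x_n) = 0, so the next iterate is undefined")
        x_next = x - f(x) / slope
        if abs(x_next - x) < tol:
            return x_next
        x = x_next
    raise RuntimeError("successive terms did not come within tol of each other")


# Illustrative equation (our own choice): x^3 - 2x - 5 = 0, starting from x0 = 2.
root = newton(lambda x: x**3 - 2 * x - 5, lambda x: 3 * x**2 - 2, x0=2.0, tol=1e-10)
print(root)
```

The cap on the number of iterations reflects the fact that Newton's method does not always work; without it, a sequence that never satisfies the stopping rule would loop forever.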
Example Suppose we wish to find a solution to the equation cos(x) = x with an error
of no more than 0.0001. Then we should let f(x) = cos(x) - x and look for solutions to
f(x) = 0. Since cos(x) will always be between -1 and 1, we know that any solution to
cos(x) = x must lie in the interval [-1, 1]. Moreover, from the graph of f in Figure 3.6.2,
we can see that the equation f(x) = 0 has only one solution, and this solution lies between
0 and 1. Alternatively, we could note that

                                     f(0) = 1 > 0

and
                                f(1) = cos(1) - 1 < 0,
which imply, by the Intermediate Value Theorem, that there is a solution in the interval
[0, 1]. In either case, we will use $x_0 = 0.5$ for our initial guess. Now

                  $$f'(x) = -\sin(x) - 1,$$


so

                  $$x_1 = x_0 - \frac{f(x_0)}{f'(x_0)} = 0.5 - \frac{f(0.5)}{f'(0.5)} = 0.755222,$$

where we have rounded the result to 6 decimal places. Substituting this back into (3.6.8)
gives us

                  $$x_2 = x_1 - \frac{f(x_1)}{f'(x_1)} = 0.755222 - \frac{f(0.755222)}{f'(0.755222)} = 0.739142.$$

Since

                  $$|x_2 - x_1| = 0.016080 > 0.0001,$$

we continue and compute

                  $$x_3 = x_2 - \frac{f(x_2)}{f'(x_2)} = 0.739142 - \frac{f(0.739142)}{f'(0.739142)} = 0.739085.$$

Now we have
                  $$|x_3 - x_2| = 0.000057 < 0.0001,$$
so we stop and use 0.7391 as our approximation to the solution of cos(x) = x. For
comparison, with the bisection algorithm starting from the initial interval [0, 1], we would
have had to iterate 13 times before obtaining an approximation to the solution with an
error less than 0.0001.
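
The computations in this example, together with the comparison to bisection, can be checked with a short Python sketch. The bisection stopping rule used here (halve the interval until the midpoint is guaranteed to lie within 0.0001 of the solution) is an assumption on our part; the precise rule of Section 2.5 may be stated differently. The variable names are ours.

```python
import math


def f(x):
    return math.cos(x) - x


def fprime(x):
    return -math.sin(x) - 1


tol = 0.0001

# Newton's method from x0 = 0.5, keeping every iterate.
newton_iterates = [0.5]
while True:
    x = newton_iterates[-1]
    x_next = x - f(x) / fprime(x)
    newton_iterates.append(x_next)
    if abs(x_next - x) < tol:
        break

# Bisection on [0, 1]: halve the interval until half its length is below tol,
# so that the midpoint is within tol of the solution (assumed stopping rule).
a, b = 0.0, 1.0
halvings = 0
while (b - a) / 2 >= tol:
    mid = (a + b) / 2
    if f(a) * f(mid) <= 0:
        b = mid  # the sign change, and hence the solution, lies in [a, mid]
    else:
        a = mid  # otherwise it lies in [mid, b]
    halvings += 1
bisection_estimate = (a + b) / 2

for n, x in enumerate(newton_iterates):
    print(f"x{n} = {x:.6f}")
print("Newton steps:", len(newton_iterates) - 1)
print("bisection halvings:", halvings)
```

Under these assumptions Newton's method stops after three steps, while bisection needs thirteen halvings of [0, 1] to give the same guarantee.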

Example As an example of where Newton's method goes wrong, consider the equation
f(x) = 0, where
                  $$f(x) = \frac{x}{1 + x^2}.$$
Clearly, x = 0 is the only solution to this equation. However, beginning with an initial
guess $x_0 = 0.75$, Newton's method yields the following sequence (where we have rounded
each value to 5 digits):

                              n        $x_n$
                              0        0.75000
                              1        -1.9286
                              2        -5.2756
                              3        -10.944
                              4        -22.073
                              5        -44.237
                              6        -88.518
                              7        -177.06
                              8        -354.13
                              9        -708.27
                              10       -1416.5


                Figure 3.6.3 Newton's method diverging from a solution


Instead of converging to the solution at 0, this sequence seems to be diverging toward $-\infty$.
In fact, geometrically this appears to be exactly the case, as can be seen in Figure 3.6.3.
Here the problem comes from the fact that the graph of f approaches 0 asymptotically as
x goes to $-\infty$. Newton's method is following the curve as it approaches 0, as it should,
but, since there is no solution in this direction, the result is that the iterates are getting
farther and farther away from the solution at x = 0.
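
The runaway behavior is easy to observe numerically. The following Python sketch (variable names are ours) applies the Newton step ten times to f(x) = x/(1 + x^2) starting from 0.75, using the derivative f'(x) = (1 - x^2)/(1 + x^2)^2. Because it carries full floating-point precision from step to step rather than rounding each iterate, its output may differ from the rounded table in the last displayed digit.

```python
def f(x):
    return x / (1 + x * x)


def fprime(x):
    # Quotient rule applied to x / (1 + x^2).
    return (1 - x * x) / (1 + x * x) ** 2


xs = [0.75]
for _ in range(10):
    x = xs[-1]
    xs.append(x - f(x) / fprime(x))

for n, x in enumerate(xs):
    # f(x_n) shrinks toward 0 even though x_n moves away from the solution x = 0.
    print(f"x{n} = {x:.5g}   f(x{n}) = {f(x):.3g}")
```

Each iterate is roughly double the previous one in size, while the values f(x_n) shrink toward 0: the method is chasing the asymptote rather than a solution.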

    Note that in the last example,

                  $$f'(x) = \frac{1 - x^2}{(1 + x^2)^2},$$

giving us f'(1) = 0. Hence if we had started with an initial guess of $x_0 = 1$, an application
of the difference equation (3.6.8) would require a division by 0, which, of course, cannot be
done. Geometrically, the tangent line to the graph of f at (1, 0.5) is horizontal, and hence
never crosses the x-axis, implying that there are no solutions to the equation T(x) = 0 if T
is the best affine approximation to f at 1. Thus we must avoid starting Newton's method
at a point where the derivative is 0.

Problems

1. For each of the following equations, use a graph to obtain initial guesses for solutions
    and then apply Newton's method to locate the solutions within 0.0001.

    (a) $x^5 - 6x^3 + 2x = 2$                 (b) $\sin(x) = x^2$
    (c) $\cos(t) = t^2$                       (d) $\cos^2(t) - t^2 = 0$
    (e) $2\sin(x) = x + 1$                    (f) $6x^4 - 12x^3 + 4x - 1 = 0$

2. Even though we know that for a positive number c the equation $x^2 - c = 0$ has the exact
   solutions $-\sqrt{c}$ and $\sqrt{c}$, we may use Newton's method to find decimal approximations
   to these square roots.
   (a) Show that the sequence $x_0, x_1, x_2, \ldots$ of Newton's method approximations to a
       solution of $x^2 - c = 0$ satisfies the difference equation

                  $$x_{n+1} = \frac{1}{2}\left(x_n + \frac{c}{x_n}\right)$$

       for n = 0, 1, 2, ....
   (b) Use the difference equation from (a) to approximate v/2, 3v/, and /1 with an
       error of less than 0.00001.
   (c) Can you see an intuitive reason why, starting with a positive initial guess, the
       sequence defined by the difference equation in part (a) might converge to $\sqrt{c}$?
   (d) Assuming that $L = \lim_{n \to \infty} x_n$ for the sequence defined in (a), show that either
       $L = -\sqrt{c}$ or $L = \sqrt{c}$.
3. Use Newton's method to approximate 2 with an error less than 0.00001.
4. Use Newton's method to approximate 7 5 with an error less than 0.00001.
5. The method outlined in Problem 2 for approximating square roots was known to the
   Greeks and perhaps to the Babylonians. For an account of this and other aspects of
   Babylonian algebra, read Chapter 3 of Mathematics in Civilization by H. L. Resnikoff
   and R. O. Wells, Jr. (Dover Publications, Inc., New York, 1984).
6. What happens when you apply Newton's method to find solutions to the equation

                                      $$x^3 - 5x = 0$$

   starting with an initial guess of $x_0 = 1$? Explain this geometrically with a graph.
7. We know that when solving an equation f(x) = 0 using Newton's method, different
   initial guesses may lead to different solutions, and some may not converge to a solu-
   tion at all. The problem of determining which initial guesses converge to a specified
   solution is surprisingly complicated, involving what mathematicians call fractals. For
   an account of this phenomenon, read pages 217-220 of Chaos by James Gleick (Viking
   Penguin, Inc., New York, 1987). Also, see the picture on the sixth color plate following
   page 114 in the same book.


﻿


Section 3.7


                to                    Rolle's Theorem      and the
       Differential Equations         Mean Value Theorem


The two theorems which are at the heart of this section draw connections between the
instantaneous rate of change and the average rate of change of a function. The Mean
Value Theorem, of which Rolle's Theorem is a special case, says that if f is differentiable
on an interval, then there is some point in that interval at which the instantaneous rate
of change of the function is equal to the average rate of change of the function over the
entire interval. For example, if f gives the position of an object moving in a straight line,
the Mean Value Theorem says that if the average velocity over some interval of time is 60
miles per hour, then at some time during that interval the object was moving at exactly
60 miles per hour. This is not a surprising fact, but it does turn out to be the key to
understanding many useful applications.
    Before we turn to a consideration of Rolle's theorem, we need to establish another
fundamental result. Suppose an object is thrown vertically into the air so that its position
at time t is given by f(t) and its velocity by v(t) = f'(t). Moreover, suppose it reaches its
maximum height at time $t_0$. On its way up, the object is moving in the positive direction,
and so v(t) > 0 for $t < t_0$; on the way down, the object is moving in the negative direction,
and so v(t) < 0 for $t > t_0$. It follows, by the Intermediate Value Theorem and the fact that
v is a continuous function, that we must have $v(t_0) = 0$. That is, at time $t_0$, when f(t)
reaches its maximum value, we have $f'(t_0) = 0$. This is an extremely useful fact which
holds in general for differentiable functions, not only at maximum values but at minimum
values as well. Before providing a general demonstration, we first need a few definitions.
Definition A function f is said to have a local maximum at a point c if there exists an
open interval I containing c such that $f(c) \ge f(x)$ for all x in I. A function f is said to
have a local minimum at a point c if there exists an open interval I containing c such that
$f(c) \le f(x)$ for all x in I. If f has either a local maximum or a local minimum at c, then
we say f has a local extremum at c.
    In short, f has a local maximum at a point c if the value of f at c is at least as large
as the value of f at any nearby point, and f has a local minimum at a point c if the value
of f at c is at least as small as the value of f at any nearby point. The next example
provides an illustration.
Example     Looking at the graph of the function $f(x) = x^3 - 3x$ in Figure 3.7.1, it appears
that f has a local maximum of 2 at x = -1 and a local minimum of -2 at x = 1. We will
confirm this observation in Section 3.8.


Copyright © by Dan Sloughter 2000


                        Figure 3.7.1 Graph of $f(x) = x^3 - 3x$


    Now suppose f has a local maximum at a point c and suppose f is differentiable at c.
For small enough h > 0, $f(c + h) \le f(c)$, so

                  $$f(c + h) - f(c) \le 0;$$

thus

                  $$\frac{f(c + h) - f(c)}{h} \le 0.$$                      (3.7.1)

Clearly, if each term in a sequence is less than or equal to 0, and the sequence has a limit,
then the limit of the sequence must be less than or equal to 0. Hence

                  $$f'(c) = \lim_{h \to 0^+} \frac{f(c + h) - f(c)}{h} \le 0.$$   (3.7.2)

Also, for h < 0 with |h| small enough, we have $f(c + h) \le f(c)$, and so

                  $$f(c + h) - f(c) \le 0.$$

However, now, since h < 0, we have

                  $$\frac{f(c + h) - f(c)}{h} \ge 0,$$                      (3.7.3)

from which it follows that

                  $$f'(c) = \lim_{h \to 0^-} \frac{f(c + h) - f(c)}{h} \ge 0.$$   (3.7.4)


Note that (3.7.1) is saying that secant lines to the right of a local maximum have negative
slope, while (3.7.3) is saying that secant lines to the left of a local maximum have positive
slope. Now the only way that both (3.7.2) and (3.7.4) can hold at the same time is if
f'(c) = 0; that is, the only number which is both less than or equal to 0 and greater than
or equal to 0 is 0 itself. Note that our argument here is just a refinement of our earlier
comments about velocity.

                      Figure 3.7.2 Illustration of Rolle's Theorem

    A similar argument gives the same result if f has a local minimum at c. The following
proposition puts these together into one statement.

Proposition    If f has a local extremum at c and f is differentiable at c, then f'(c) = 0.

Example     For $f(x) = x^3 - 3x$, as in the previous example, $f'(x) = 3x^2 - 3$, and so
f'(-1) = 0 and f'(1) = 0, consistent with our observation that f has a local maximum at
x = -1 and a local minimum at x = 1. Note, however, that this does not prove that f has
local extrema at x = -1 and x = 1. Indeed, the proposition works in one direction only:
if f has a local extremum at c, then f'(c) = 0; the converse need not hold.
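
A standard example, not taken from the text, shows how the derivative can vanish at a point that is not a local extremum:

```latex
\[
  f(x) = x^3, \qquad f'(x) = 3x^2, \qquad f'(0) = 0,
\]
\[
  \text{but } f(x) < 0 = f(0) \text{ for } x < 0
  \quad\text{and}\quad
  f(x) > 0 = f(0) \text{ for } x > 0 .
\]
```

Every open interval containing 0 therefore holds points where f is smaller than f(0) and points where it is larger, so f has neither a local maximum nor a local minimum at 0.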

    This result will be very useful in our work in the next section when we consider the
problem of finding the maximum and minimum values of a given function. For our present
purpose, consider a function f which is continuous on a closed interval [a, b] and differen-
tiable on the open interval (a, b), with f(a) = f(b) = 0. An example of such a function
is shown in Figure 3.7.2. By the Extreme Value Theorem of Section 2.5, we know that f
must have both a minimum value m and a maximum value M on [a, b]. If m = M = 0,
then f(x) = 0 for all x in (a, b), and so f'(x) = 0 for all x in (a, b). If either $m \ne 0$ or
$M \ne 0$, then f has a local extremum at some point c in (a, b), namely, either a point c
for which f(c) = m or a point c for which f(c) = M. Hence, by the previous proposition,
f' (c) = 0. We have thus established the following theorem, credited originally to Michel
Rolle (1652-1719).

Rolle's Theorem     If f is continuous on [a, b], differentiable on (a, b), and f(a) = f(b) =
0, then there exists a point c in (a, b) such that f'(c) = 0.

    Put another way, Rolle's theorem says that if f is a differentiable function, then between
any two solutions of the equation f(x) = 0 there is a point c where f'(c) = 0. Used
in conjunction with the Intermediate Value Theorem, this result can help identify intervals
where an equation has a unique solution.

                  Figure 3.7.3 Illustration of the Mean Value Theorem

Example Solving the equation
                  $$x^5 + x^4 = 1$$                                         (3.7.5)
is equivalent to solving the equation f(x) = 0 where $f(x) = x^5 + x^4 - 1$. Since f(0) = -1
and f(1) = 1, the Intermediate Value Theorem tells us that f(x) = 0 has at least one
solution in (0, 1). Moreover,
                  $$f'(x) = 5x^4 + 4x^3,$$
so f'(x) > 0 for all x in (0, 1); in particular, there does not exist a point c in (0, 1) such
that f'(c) = 0. Hence, by Rolle's Theorem, there cannot be two solutions to f(x) = 0 in
(0, 1). That is, using the Intermediate Value Theorem and Rolle's Theorem together, we
are able to conclude that there is exactly one solution to (3.7.5) in the interval (0, 1). We
may now use either the bisection algorithm or Newton's method to locate this solution.
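
For instance, Newton's method might be applied to this equation as in the Python sketch below. The initial guess of 1, the tolerance, and the cap of 50 steps are our own assumptions for illustration; the text does not specify them.

```python
def f(x):
    return x**5 + x**4 - 1


def fprime(x):
    return 5 * x**4 + 4 * x**3


# Sign change on [0, 1], as used with the Intermediate Value Theorem.
print("f(0) =", f(0), " f(1) =", f(1))

x = 1.0  # assumed initial guess at the right end of the interval
for step in range(50):  # assumed cap on the number of steps
    x_next = x - f(x) / fprime(x)
    converged = abs(x_next - x) < 1e-10
    x = x_next
    if converged:
        break

print("approximate solution:", x)
print("residual f(x):", f(x))
```

Since the uniqueness argument rules out a second solution in (0, 1), whatever value the iteration settles on inside that interval is an approximation to the solution of (3.7.5) there.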
    Geometrically, Rolle's theorem says if f is a function which satisfies the conditions of
the theorem on an interval [a, b], then there is a point c in (a, b) such that the line tangent
to the graph of f at (c, f(c)) is horizontal. In this case, that means that the line tangent
to the graph of f at (c, f(c)) is parallel to the line passing through the points (a, f(a)) and
(b, f(b)), as is seen in Figure 3.7.2. Certainly, if we took this picture and rotated or shifted
the points (a, f(a)) and (b, f(b)), rigidly moving the graph with these points, then this
conclusion would still follow. That is, if f is continuous on the closed interval [a, b] and
differentiable on (a, b), then there must exist a point c in (a, b) such that the line tangent
to the graph of f at (c, f(c)) is parallel to the line passing through the points (a, f(a)) and
(b, f(b)) (see Figure 3.7.3). In other words, there must be a point c in (a, b) such that

                  $$f'(c) = \frac{f(b) - f(a)}{b - a}.$$                    (3.7.6)

This is the content of the Mean Value Theorem.


    Although the above argument for this result seems plausible, we will present a more
precise argument. Define a new function g by

                  $$g(x) = f(x) - S(x),$$                                   (3.7.7)

where
                  $$S(x) = \left(\frac{f(b) - f(a)}{b - a}\right)(x - a) + f(a).$$   (3.7.8)

Geometrically, the graph of S is a line passing through the points (a, f(a)) and (b, f(b)),
and g(x) is the distance from the graph of f to the graph of S above the point x (see
Figure 3.7.3). Now g is continuous on [a, b] and differentiable on (a, b); moreover,

                  $$g(a) = f(a) - S(a) = f(a) - f(a) = 0$$                  (3.7.9)

and
                  $$g(b) = f(b) - S(b) = f(b) - f(b) = 0.$$                 (3.7.10)
Thus g satisfies the conditions of Rolle's theorem. Hence there exists a point c in (a, b)
such that g'(c) = 0. But
                  $$g'(x) = f'(x) - \frac{f(b) - f(a)}{b - a},$$            (3.7.11)

and so g'(c) = 0 implies
                  $$f'(c) - \frac{f(b) - f(a)}{b - a} = 0,$$                (3.7.12)

that is,
                  $$f'(c) = \frac{f(b) - f(a)}{b - a}.$$                    (3.7.13)

Mean Value Theorem         If f is continuous on [a, b] and differentiable on (a, b), then
there exists a point c in (a, b) such that

                  $$f'(c) = \frac{f(b) - f(a)}{b - a}.$$                    (3.7.14)
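
As a quick check of the statement, consider a case (our own choice, not from the text) in which the point c can be found explicitly:

```latex
\[
  f(x) = x^2 \text{ on } [a, b]: \qquad
  f'(c) = 2c, \qquad
  \frac{f(b) - f(a)}{b - a} = \frac{b^2 - a^2}{b - a} = a + b,
\]
\[
  2c = a + b \quad\Longrightarrow\quad c = \frac{a + b}{2} \in (a, b).
\]
```

For this particular function the point promised by the theorem is always the midpoint of the interval; for most functions no such formula for c is available.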


Increasing and decreasing functions
Similar to the situation with the Intermediate Value Theorem and the Extreme Value
Theorem, the Mean Value Theorem is an existence theorem. The point of interest is the
existence of c, not in being able to compute a value for c. Although not immediately
useful for computations, we will see that the Mean Value Theorem has many important
consequences. The first of these, which we will consider now, involves determining when a
function is increasing and when it is decreasing.
Definition We say a function f defined on an interval I is increasing on I if for every
two points u and v in I with u < v, f(u) < f(v). We say a function f defined on an
interval I is decreasing on I if for every two points u and v in I with u < v, f(u) > f(v).


Example      The function $f(x) = x^2$ is increasing on the interval $[0, \infty)$ since for any two
numbers u and v with $0 \le u < v$, $f(u) = u^2 < v^2 = f(v)$. Moreover, f is decreasing on
$(-\infty, 0]$ since for any two numbers u and v with $u < v \le 0$, $f(u) = u^2 > v^2 = f(v)$.
    Now suppose f is differentiable on an interval (a, b) with f'(x) > 0 for all x in (a, b).
If u and v are two points in (a, b) with u < v, then, by the Mean Value Theorem, there
exists a point c with u < c < v such that

                  $$f'(c) = \frac{f(v) - f(u)}{v - u}.$$                    (3.7.15)

Since c is in (a, b), f'(c) > 0, so, using (3.7.15),

                  $$f(v) - f(u) = f'(c)(v - u) > 0.$$                       (3.7.16)

Hence f(v) > f(u) and f is increasing on the interval (a, b). Similarly, if f'(x) < 0 for all
x in (a, b), then we would have f'(c) < 0, from which it would follow that f(v) < f(u)
and, hence, that f is decreasing on (a, b). In short, to determine the intervals on which a
differentiable function is increasing and those on which it is decreasing, we need to look
only for the intervals on which the derivative is positive and those on which it is negative,
respectively.
Proposition If f is differentiable on (a, b) and f'(x) > 0 for all x in (a, b), then f is
increasing on (a, b). If f is differentiable on (a, b) and f'(x) < 0 for all x in (a, b), then f
is decreasing on (a, b).
    Geometrically, this proposition is saying that a function is increasing where it has
positive slope and decreasing where it has negative slope. This should seem intuitively
clear, but it is the Mean Value Theorem which makes the connection between average
rates of change and instantaneous rates of change necessary for establishing the result.
Example Suppose $f(x) = 2x^3 + 3x^2 - 12x + 1$. To determine where f is increasing and
where it is decreasing, we first find

                  $$f'(x) = 6x^2 + 6x - 12 = 6(x + 2)(x - 1).$$             (3.7.17)

Hence f'(x) = 0 only when x = -2 or x = 1. Since f' is continuous, the Intermediate
Value Theorem implies that f' cannot change sign on the intervals $(-\infty, -2)$, (-2, 1), and
$(1, \infty)$. Since f'(-3) = 24 > 0, it follows that f'(x) > 0 for all x in $(-\infty, -2)$. Similarly,
since f'(0) = -12 < 0, f'(x) < 0 for all x in (-2, 1); and, since f'(2) = 24 > 0, f'(x) > 0
for all x in $(1, \infty)$. It now follows from the previous proposition that f is increasing on the
intervals $(-\infty, -2)$ and $(1, \infty)$ and decreasing on the interval (-2, 1).
    Note that we could obtain the same information about f' directly from (3.7.17) without
evaluating f' and without invoking the Intermediate Value Theorem. Namely, from the
facts that x + 2 < 0 and x - 1 < 0 whenever x < -2, we may conclude from (3.7.17)
that f'(x) > 0 for all x in $(-\infty, -2)$. Similarly, whenever -2 < x < 1, we have x + 2 > 0
and x - 1 < 0, implying that f'(x) < 0 for all x in (-2, 1); and whenever x > 1 we have
x + 2 > 0 and x - 1 > 0, implying that f'(x) > 0 for all x in $(1, \infty)$.
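
The test-point approach used in this example can be mirrored in a few lines of Python. This is a sketch only: the interval labels and variable names are ours, and the test points -3, 0, and 2 are the ones used above.

```python
def fprime(x):
    # Derivative from (3.7.17): f'(x) = 6x^2 + 6x - 12 = 6(x + 2)(x - 1).
    return 6 * x**2 + 6 * x - 12


# One test point inside each interval determined by the zeros x = -2 and x = 1.
test_points = [("(-inf, -2)", -3), ("(-2, 1)", 0), ("(1, inf)", 2)]

behavior = {}
for interval, x in test_points:
    value = fprime(x)
    # f' keeps a single sign on each interval, so one sample settles the sign.
    behavior[interval] = "increasing" if value > 0 else "decreasing"
    print(f"f'({x}) = {value}: f is {behavior[interval]} on {interval}")
```

A single sample per interval suffices only because f' is continuous and has no zeros inside the intervals; sampling alone, without first locating the zeros of f', would not justify the conclusion.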


﻿


                  Figure 3.7.4 Graph of $f(x) = 2x^3 + 3x^2 - 12x + 1$

    Combining our information on intervals where f is increasing and intervals where f is
decreasing with the facts that f(-3) = 10, f(-2) = 21, f(0) = 1, f(1) = -6, f(2) = 5,

                  $$\lim_{x \to -\infty} f(x) = \lim_{x \to -\infty} x^3\left(2 + \frac{3}{x} - \frac{12}{x^2} + \frac{1}{x^3}\right) = -\infty,$$

and

                  $$\lim_{x \to \infty} f(x) = \lim_{x \to \infty} x^3\left(2 + \frac{3}{x} - \frac{12}{x^2} + \frac{1}{x^3}\right) = \infty,$$

we can understand why the graph of f looks as it does in Figure 3.7.4.


Example Now consider $f(x) = x^5 - x^3$. Then

                  $$f'(x) = 5x^4 - 3x^2 = x^2(5x^2 - 3),$$                  (3.7.18)

so f'(x) = 0 when $x = -\sqrt{3/5}$, x = 0, or $x = \sqrt{3/5}$. Now when

                  $$x < -\sqrt{\frac{3}{5}},$$

both $x^2 > 0$ and $5x^2 - 3 > 0$, implying, from (3.7.18), that f'(x) > 0. For

                  $$-\sqrt{\frac{3}{5}} < x < 0,$$

$x^2 > 0$, but $5x^2 - 3 < 0$, so f'(x) < 0; for

                  $$0 < x < \sqrt{\frac{3}{5}},$$


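$x^2 > 0$ and $5x^2 - 3 < 0$, so f'(x) < 0; and for

                  $$x > \sqrt{\frac{3}{5}},$$

$x^2 > 0$ and $5x^2 - 3 > 0$, so f'(x) > 0. Hence f is increasing on

                  $$\left(-\infty, -\sqrt{\frac{3}{5}}\right) \quad\text{and}\quad \left(\sqrt{\frac{3}{5}}, \infty\right)$$

and decreasing on

                  $$\left(-\sqrt{\frac{3}{5}}, 0\right) \quad\text{and}\quad \left(0, \sqrt{\frac{3}{5}}\right).$$

    Note that we could have determined the sign of f' on these four intervals by evaluating
f' at a point in each interval and then applying the Intermediate Value Theorem. For
example,
                  $$f'(-1) = 2 > 0,$$

                  $$f'\left(-\frac{1}{5}\right) = -\frac{14}{125} < 0,$$

                  $$f'\left(\frac{1}{5}\right) = -\frac{14}{125} < 0,$$

and

                  $$f'(1) = 2 > 0.$$

As in the previous example, if we combine this information with the facts

                  $$f(-1) = 0,$$

                  $$f\left(-\sqrt{\frac{3}{5}}\right) = \frac{6}{25}\sqrt{\frac{3}{5}},$$

                  $$f(0) = 0,$$

                  $$f\left(\sqrt{\frac{3}{5}}\right) = -\frac{6}{25}\sqrt{\frac{3}{5}},$$

                  $$f(1) = 0,$$

                  $$\lim_{x \to -\infty} f(x) = \lim_{x \to -\infty} x^5\left(1 - \frac{1}{x^2}\right) = -\infty,$$

and
                  $$\lim_{x \to \infty} f(x) = \lim_{x \to \infty} x^5\left(1 - \frac{1}{x^2}\right) = \infty,$$

we can understand why the graph of f looks as it does in Figure 3.7.5.

                         Figure 3.7.5 Graph of $f(x) = x^5 - x^3$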

Antiderivatives
We will close this section with a look at one more important application of the Mean Value
Theorem. Although not needed for our current discussion, our result will be very useful
in the next chapter. We begin with a definition.
Definition    If F and f are functions defined on an open interval (a, b) such that F'(x) =
f(x) for all x in (a, b), then we call F an antiderivative of f.
    In other words, an antiderivative of a function f is another function whose derivative
is f. Although a given function f has at most one derivative, it is possible to have more
than one antiderivative, as the next example demonstrates.
Example     $F(x) = x^3$ is an antiderivative of $f(x) = 3x^2$ on $(-\infty, \infty)$. However, note
that $G(x) = x^3 + 4$ is also an antiderivative of f. In fact, given any constant k,

                  $$H(x) = x^3 + k$$                                        (3.7.19)

is an antiderivative of f. This should not be too surprising since specifying the derivative
of a function fixes only the slope of its graph, and the graphs of the functions in (3.7.19)
are all in a sense parallel to each other.
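
This can be checked numerically. The sketch below, in Python, estimates the slope of x^3 + k with a central difference quotient for a few values of k and compares it with 3x^2. The step size h, the sample points, and the sample constants are arbitrary choices of ours, and the agreement is only approximate because a difference quotient is itself an approximation.

```python
def central_difference(F, x, h=1e-5):
    # Slope of the secant line through (x - h, F(x - h)) and (x + h, F(x + h)).
    return (F(x + h) - F(x - h)) / (2 * h)


def f(x):
    return 3 * x**2


constants = [-2.0, 0.0, 4.0]            # sample values of k (our choice)
sample_points = [-1.5, 0.0, 0.5, 2.0]   # sample values of x (our choice)

max_error = 0.0
for k in constants:
    def H(x, k=k):
        # One member of the family in (3.7.19).
        return x**3 + k

    for x in sample_points:
        error = abs(central_difference(H, x) - f(x))
        max_error = max(max_error, error)

print("largest discrepancy between estimated slope and 3x^2:", max_error)
```

The constant k cancels in the difference H(x + h) - H(x - h), which is the numerical counterpart of the observation that shifting a graph vertically does not change its slope.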
    This example shows that a given function may have an infinite number of antiderivatives.
However, note that the difference of any two of these antiderivatives is a constant. We
will now show that this is always the case, and, in particular, that (3.7.19) specifies all
possible antiderivatives of $f(x) = 3x^2$.


    First consider a function F defined on an open interval (a, b) for which F'(x) = 0 for
all x in (a, b). That is, F is an antiderivative of the function which is 0 for all values of x
in (a, b). Now if u and v are any two points in (a, b), then, by the Mean Value Theorem,

                  $$\frac{F(v) - F(u)}{v - u} = F'(c)$$                     (3.7.20)

for some c in (a, b). But F'(c) = 0, so

                  $$\frac{F(v) - F(u)}{v - u} = 0,$$                        (3.7.21)

which implies F(u) = F(v). If we let k = F(u) for a fixed u in (a, b), we now have
F(v) = F(u) = k for all v in (a, b). In other words, if the derivative of a function is 0 on
an open interval, then the function must be constant on that interval.
    Now suppose F and G are two functions defined on an open interval (a, b) for which
F'(x) = G'(x) for all x in (a, b). Let H(x) = F(x) - G(x) for all x in (a, b). Then

                               H'(x) = F'(x) - G'(x) = 0                         (3.7.22)

for all x in (a, b). Then, by what we have just shown, there exists a constant k for which

                                k = H(x) = F(x) - G(x)                           (3.7.23)

for all x in (a, b). Hence if F and G have the same derivative on an open interval, that is,
are antiderivatives of the same function, then they can differ only by a constant.

Proposition If F and G are both antiderivatives of f on an open interval (a, b), then
there exists a constant k such that

                                    F(x) = G(x) + k                              (3.7.24)

for all x in (a, b).

Example      Since
                  $$\frac{d}{dx} \sin(x) = \cos(x),$$
we know that G(x) = sin(x) is an antiderivative of f(x) = cos(x) on $(-\infty, \infty)$. Thus if F
is any antiderivative of f, then
                  $$F(x) = \sin(x) + k$$                                    (3.7.25)

for some constant k. In other words, functions of the form given in (3.7.25) are the only
antiderivatives of f(x) = cos(x). Figure 3.7.6 shows the graphs of (3.7.25) for nine different
values of k. Although each of these graphs is the graph of a different function, they are
parallel to one another in the sense that they all have the same slope at any given value
of x, namely, cos(x).

            Figure 3.7.6 Graphs of F(x) = sin(x) + k for different values of k

Problems

1. Explain why the equation cos(x) = x has exactly one solution in the interval [0, 1].
2. Explain why the equation $x^4 - 2x^2 = 2$ has exactly one solution in the interval [1, 2]
    and exactly one solution in the interval [-2, -1].
3. Suppose f is continuous on [a, b] and differentiable on (a, b). Moreover, suppose there
    is a constant M such that $|f'(x)| \le M$ for all x in (a, b). Show that

                  $$|f(v) - f(u)| \le M|v - u|$$

    for all u and v in [a, b].
4. Use Problem 3 to show that $|\sin(x) - \sin(y)| \le |x - y|$ for all values of x and y.
5. For each of the following functions, identify the intervals where the function is increasing
    and where it is decreasing. Use this information to sketch the graph.

(a) f (x) = x2 -3                  (b) g(t) = 3t2 + t - 6

(c) h (z)=                         (d) f (x)=


(e) f(t)


  t
t2+1


(h) f(x) = 4x5 - 15x4 - 20x3 + 110x2 - 120x


   (g) y(t) = t2 _-4

6. Let f(x) = . Then


f(1) - f(-1)
  1-(-1)


0


0,


﻿


12


Rolle's Theorem and the Mean Value Theorem


Section 3.7


    but there does not exist a point c in (-1, 1) such that f'(c) = 0. Does this contradict
    the Mean Value Theorem?
 7. Suppose f is continuous on [a, b] and differentiable on (a, b).
    (a) Show that if f'(x) > 0 for all x in (a, b), then f is increasing on [a, b].
    (b) Show that if f'(x) <0 for all x in (a, b), then f is decreasing on [a, b].
 8. Suppose f and g are continuous on [a, b], differentiable on (a, b), f(a) = g(a), and
    f'(x) < g'(x) for all x in (a, b). Show that f(b) < g(b).
 9. Show that $\sqrt{1 + x} < 1 + \frac{1}{2}x$ for x > 0.
10. Show that $a + \frac{1}{a} < b + \frac{1}{b}$ whenever 1 < a < b.
11. Find antiderivatives for the following functions.
    (a) f(x) = 2x                               (b) $g(t) = t^2$
    (c) g(x) = sin(x)                           (d) f(z) = sin(2z)
    (e) $h(x) = x^2 - 3x$                       (f) f(x) = 3 cos(4x)
12. Find all antiderivatives of $f(x) = 3x^2 - 3$ and plot the graphs of six of them.
13. Find all antiderivatives of g(t) = sin(2t) and plot the graphs of six of them.
14. If $f(x) = -\sin^2(x)$ and $g(x) = \cos^2(x)$, then f'(x) = g'(x). What does this imply
    about the relationship between the functions f and g?
15. If $f(t) = \tan^2(t)$ and $g(t) = \sec^2(t)$, then f'(t) = g'(t). Thus f(t) = g(t) + k for some
    constant k. Evaluate f and g at t = 0 in order to determine k.
16. Suppose f is differentiable on an open interval containing the closed interval [a, b].
    (a) Show that for any x in (a, b),

                                   f(x) = f(a) + f'(c)(x - a)

        for some point c with a < c < x.
    (b) Let f" denote the second derivative of f. That is,

                                   $$ f''(x) = \frac{d}{dx} f'(x). $$

        Assuming that f' is continuous on [a, b] and differentiable on (a, b), show that there
        exists a point d with a < d < c such that

                        $$ f(x) = f(a) + f'(a)(x - a) + f''(d)(c - a)(x - a). $$

    (c) Compare the results in (a) and (b) to the statement that $f(x) \approx T(x)$ for x close
        to a, where T is the best affine approximation to f at a.
    (d) Let h = x - a. Show that

                         $$ f(a + h) - T(a + h) = f''(d)(c - a)h. $$

    (e) Assuming f" is continuous on [a, b], show that

                         $$ \left| \frac{f(a + h) - T(a + h)}{h^2} \right| \le M $$

        for some constant M. This statement means that the remainder function

                             R(h) = f(a + h) - T(a + h)

        is $O(h^2)$. That is, R(h) goes to 0 at least as fast as $h^2$. Note that this is a stronger
        statement than the statement that R(h) is o(h).


﻿


Section 3.8


       Differential Equations         Finding Maximum and Minimum Values


Problems involving finding the maximum or minimum value of a quantity occur frequently
in mathematics and in the applications of mathematics. A company may want to maximize
its profit or minimize its costs; a farmer may want to maximize the yield from his crop
or minimize the amount of irrigation equipment needed to water his fields; an airline may
want to maximize its fuel efficiency or minimize the length of its routes. Methods for
solving some optimization problems are so computationally intense that they challenge,
and sometimes even go beyond, the fastest computers currently available. An example of
such a problem is the famous traveling salesman problem, in which a salesman wishes to
visit a certain set of cities using the shortest possible route. In this section we will not
consider problems of this type, but rather we will confine ourselves to problems involving
continuous functions of a single independent variable.

Closed intervals
We will start with the simplest case. Suppose f is a continuous function on a closed interval
[a, b]. From the Extreme Value Theorem we know that f attains both a maximum value
and a minimum value on the interval. We now look for candidates at which these values
might occur. To start, an extreme value could occur at one of the endpoints. For example,
the maximum value of $f(x) = x^2$ on [0, 1] occurs at x = 1. If an extreme value occurs in
the open interval (a, b) at a point c where f is differentiable, then f has a local extremum
at c and so, from our work in Section 3.7, we know that f'(c) = 0. For example, the
minimum value of $f(x) = x^2$ on [-1, 1] occurs at x = 0 and f'(0) = 0. Finally, the only
other candidates for the locations of extreme values would be points where f' is undefined.
For example, the minimum value of f(x) = |x| on [-1, 1] occurs at x = 0, where f' is not
defined. Hence we are led to the following conclusion: The extreme values of a continuous
function f on a closed interval are located either at the endpoints of the interval, at points
where f' is 0, or at points where f' is undefined. The following terminology will help us
state this more easily.

Definition   If f is differentiable at c and f'(c) = 0, then we call c a critical point or
stationary point of f. A point c at which the derivative of f is not defined is called a
singular point of f.

    Thus we know that the candidates for the location of the extreme values of a continuous
function on a closed interval fall into three categories: (a) endpoints of the interval, (b)
critical points, and (c) singular points. To determine the extreme values of such a function
f, we identify all these points, evaluate f at each one, and identify the largest and smallest
values.


Copyright © by Dan Sloughter 2000


﻿


            Figure 3.8.1 Graph of g(t) = cos(t) + sin(t) on $[0, 2\pi]$


Example Suppose we wish to find the maximum and minimum values of

                                g(t) = cos(t) + sin(t)

on the interval $[0, 2\pi]$. Then

                                g'(t) = -sin(t) + cos(t),

so g'(t) = 0 when

                                cos(t) = sin(t).                                  (3.8.1)

Now cos(t) and sin(t) are never simultaneously 0, so we may divide both sides of (3.8.1)
by cos(t) to see that g'(t) = 0 when

                                     tan(t) = 1.

Considering only the interval $[0, 2\pi]$, this implies $t = \frac{\pi}{4}$ or $t = \frac{5\pi}{4}$. Since there are no
singular points, we evaluate g at the endpoints and at the critical points:

$$ \begin{aligned}
g(0) &= 1 \\
g\left(\frac{\pi}{4}\right) &= \frac{1}{\sqrt{2}} + \frac{1}{\sqrt{2}} = \sqrt{2} \\
g\left(\frac{5\pi}{4}\right) &= -\frac{1}{\sqrt{2}} - \frac{1}{\sqrt{2}} = -\sqrt{2} \\
g(2\pi) &= 1
\end{aligned} $$

Thus g has a maximum value of $\sqrt{2}$ at $t = \frac{\pi}{4}$ and a minimum value of $-\sqrt{2}$ at $t = \frac{5\pi}{4}$. See
Figure 3.8.1.
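
The closed-interval procedure used in this example (evaluate the function at the endpoints, the critical points, and the singular points, then compare) can be sketched in a few lines of Python. This is an illustrative sketch only: the function name `extreme_values_on_closed_interval` is hypothetical, and the critical points are assumed to have been found by hand, as in the text.

```python
import math

def extreme_values_on_closed_interval(f, a, b, interior_candidates):
    """Return ((t_min, f_min), (t_max, f_max)) for f on [a, b].

    interior_candidates: critical and singular points, found by hand.
    """
    # Keep only the candidates that actually lie inside the open interval.
    points = [a, b] + [c for c in interior_candidates if a < c < b]
    values = [(t, f(t)) for t in points]
    lowest = min(values, key=lambda pair: pair[1])
    highest = max(values, key=lambda pair: pair[1])
    return lowest, highest

def g(t):
    return math.cos(t) + math.sin(t)

critical_points = [math.pi / 4, 5 * math.pi / 4]  # solutions of tan(t) = 1
lowest, highest = extreme_values_on_closed_interval(g, 0.0, 2 * math.pi, critical_points)
print("minimum", lowest)
print("maximum", highest)
```

Calling the same helper with the interval from 0 to math.pi discards the candidate 5π/4 automatically, since it no longer lies inside the interval, so the minimum is then found at an endpoint.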


﻿


                  Figure 3.8.2 Graph of g(t) = cos(t) + sin(t) on $[0, \pi]$


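Example Note that if the interval in the previous example had been $[0, \pi]$, then the only
critical point would be $\frac{\pi}{4}$. In this case we would evaluate:

$$ \begin{aligned}
g(0) &= 1 \\
g\left(\frac{\pi}{4}\right) &= \frac{1}{\sqrt{2}} + \frac{1}{\sqrt{2}} = \sqrt{2} \\
g(\pi) &= -1
\end{aligned} $$

Hence the maximum value of g on $[0, \pi]$ is, as before, $\sqrt{2}$ at $t = \frac{\pi}{4}$, but the minimum value
of g on $[0, \pi]$ is -1 at $t = \pi$. See Figure 3.8.2.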

Example     Consider the function $f(x) = x^{2/3}$ on the interval [-1, 1]. Then

                        $$ f'(x) = \frac{2}{3} x^{-1/3} = \frac{2}{3x^{1/3}}, $$

which is never 0, but is undefined at 0. Thus f has a singular point at 0, but no critical
points in [-1, 1]. To find the extreme values of f, we evaluate:

                                     f(-1) = 1
                                     f(0) = 0
                                     f(1) = 1

Hence f has a minimum value of 0 at x = 0 and a maximum value of 1 which occurs at
both x = -1 and x = 1. See Figure 3.8.3.

                       Figure 3.8.3 Graph of $f(x) = x^{2/3}$ on [-1, 1]

Example A quality control engineer wishes to determine the probability that a certain
type of light bulb will fail within 1000 hours of use. To do so, she tests 100 such light bulbs
for 1000 hours each and finds that 20 of them failed within that time period. If p is the
probability that a single light bulb fails within 1000 hours, the probability of the observed
sequence of successes and failures is given by

                                  $$ L(p) = p^{20}(1 - p)^{80} $$

with $0 \le p \le 1$. Note here that 1 - p is the probability that a single light bulb does not
fail in the 1000 hour test. If we think of p as representing the proportion of all such light
bulbs that will fail within 1000 hours, then L(p) represents the proportion of times that a
sequence of 100 tests will yield the observed sequence of successes and failures. The quality
control engineer would like to use this information to estimate p, the true probability of
failure for this particular type of light bulb. One common procedure is to estimate p by the
value $\hat{p}$ which maximizes the probability of the observed sequence; that is, $\hat{p}$ is the value
of p that makes the given observations most likely to occur. Hence we want to maximize
the function L on the interval [0, 1]. To find the critical points, we compute

$$ \begin{aligned}
L'(p) &= p^{20}\left(80(1 - p)^{79}(-1)\right) + (1 - p)^{80}\left(20p^{19}\right) \\
      &= -80p^{20}(1 - p)^{79} + 20p^{19}(1 - p)^{80} \\
      &= 20p^{19}(1 - p)^{79}\left(-4p + (1 - p)\right) \\
      &= 20p^{19}(1 - p)^{79}(1 - 5p).
\end{aligned} $$

Thus L'(p) = 0 when p = 0, p = 0.2, or p = 1, and so the only critical point in the interval
(0, 1) is 0.2. Evaluating L at the endpoints and at the critical point yields:

$$ \begin{aligned}
L(0) &= 0 \\
L(0.2) &= (0.2)^{20}(0.8)^{80} \approx 1.853 \times 10^{-22} \\
L(1) &= 0
\end{aligned} $$

Thus the quality control engineer would take the value $\hat{p} = 0.2$ as her estimate of the
probability that this type of light bulb will fail within the first 1000 hours of use. See
Figure 3.8.4.
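
The maximization in this example can be checked numerically. The sketch below is illustrative and makes a few assumptions of its own: the function names are hypothetical, and the grid of step 0.001 used for the independent check is an arbitrary choice, not something taken from the text.

```python
def likelihood(p, failures=20, survivors=80):
    """L(p) = p^failures * (1 - p)^survivors."""
    return p ** failures * (1 - p) ** survivors

def likelihood_derivative(p):
    """L'(p) = 20 p^19 (1 - p)^79 (1 - 5p), the factored form from the text."""
    return 20 * p ** 19 * (1 - p) ** 79 * (1 - 5 * p)

# Candidates on [0, 1]: the two endpoints and the critical point p = 0.2.
candidates = [0.0, 0.2, 1.0]
best_p = max(candidates, key=likelihood)

# Independent check on a grid of step 0.001 (grid size is an arbitrary choice).
grid = [i / 1000 for i in range(1001)]
grid_best = max(grid, key=likelihood)

print(best_p, likelihood(best_p))
print(grid_best)
```

Both the candidate comparison and the grid search should single out p = 0.2, which agrees with the observed failure fraction of 20 out of 100.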


﻿


                   Figure 3.8.4 Graph of $L(p) = p^{20}(1 - p)^{80}$ on [0, 1]


    The answer in the previous example should not be surprising: Given that 20 out of
100 light bulbs in the sample failed within 1000 hours, it seems evident that our best
guess for the probability that a randomly chosen light bulb manufactured by this process
will fail in less than 1000 hours is $\frac{20}{100} = \frac{1}{5}$. However, our example illustrates a general
technique, called maximum likelihood estimation, which is widely used in applications to
estimate statistical parameters. Moreover, there are other methods which would yield
different answers, one popular alternative being 10.

Open intervals
Finding the extreme values of a continuous function f on an interval I which is not closed
introduces some new problems. Foremost among these is that there is no guarantee, like
the Extreme Value Theorem, that extreme values even exist. However, the following special
case arises frequently in practice and may be handled routinely. Suppose I = (a, b) is an
open interval, f and f' are continuous on I, f has a critical point at a point c in I which
has been determined to be the location of a local minimum, and f has no other critical
points in I. Then, by the Intermediate Value Theorem, f' cannot change sign on (a, c).
Hence, since there is a local minimum at c, f'(x) < 0 for all x in (a, c). Similarly, we must
have f'(x) > 0 for all x in (c, b). Thus f is decreasing on (a, c) and increasing on (c, b),
so f(c) must be the minimum value of f on (a, b). An analogous argument shows that if,
under these conditions, f has a local maximum at c, then f(c) is the maximum value of f
on (a, b). The following proposition summarizes these statements.
Proposition Suppose f and f' are continuous on an open interval (a, b) and c is the
only critical point of f in (a, b). If f has a local minimum at c, then f(c) is the minimum
value of f on (a, b); if f has a local maximum at c, then f(c) is the maximum value of f
on (a, b).
    Of course, to make use of the proposition, we must first have a method for determining
the location of local extreme values. Given a differentiable function f on an open interval
I, we know that the local extreme values will occur only at critical points; hence, the first
step in determining the location of local extreme values is to find all the critical points of
f in I. A given critical point c may then be classified as the location of a local maximum,
a local minimum, or neither by examining the behavior of f' on either side of c. That is,
if f' is negative to the left of c and positive to the right of c, then f is decreasing before
c and increasing after c, making c the location of a local minimum. Conversely, if f' is
positive to the left of c and negative to the right of c, then f is increasing before c and
decreasing after c, making c the location of a local maximum. If f' is either negative on
both sides of c or positive on both sides of c, then f has neither a local minimum nor a
local maximum at c. The procedure just described is sometimes referred to as the first
derivative test for local extrema.

                         Figure 3.8.5 Graph of $f(x) = x^3 - 3x$

Example Suppose $f(x) = x^3 - 3x$. Then

                     $$ f'(x) = 3x^2 - 3 = 3(x^2 - 1) = 3(x - 1)(x + 1), $$

so f'(x) = 0 when x = -1 or x = 1. Now $x^2 - 1 < 0$ when -1 < x < 1 and $x^2 - 1 > 0$
for all other values of x. Hence f'(x) > 0 when x < -1 or x > 1, and f'(x) < 0 when
-1 < x < 1. Hence f is increasing on $(-\infty, -1)$ and on $(1, \infty)$, and f is decreasing on
(-1, 1). Thus f changes from increasing to decreasing at x = -1, implying that f has
a local maximum at this point, and f changes from decreasing to increasing at x = 1,
implying that f has a local minimum at this point. Since f(-1) = 2 and f(1) = -2, we
conclude that f has a local maximum of 2 at x = -1 and a local minimum of -2 at x = 1.
Note that this verifies the claim we made in Section 3.7 after looking at the graph of f
(which is repeated in Figure 3.8.5).
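
The first derivative test lends itself to a short numerical sketch. The code below is only an illustration under stated assumptions: the function name `classify_by_first_derivative` and the offset `delta` are hypothetical, and sampling f' at a single point on each side of c is a shortcut that works only when no other critical point lies within `delta` of c.

```python
def classify_by_first_derivative(f_prime, c, delta=1e-3):
    """Classify the critical point c from the sign of f' just left and right of c.

    delta is an assumed offset; it must be small enough that no other
    critical point lies within delta of c.
    """
    left = f_prime(c - delta)
    right = f_prime(c + delta)
    if left > 0 and right < 0:
        return "local maximum"
    if left < 0 and right > 0:
        return "local minimum"
    return "neither"

def f(x):
    return x ** 3 - 3 * x

def f_prime(x):
    return 3 * x ** 2 - 3

for c in (-1.0, 1.0):
    print(c, classify_by_first_derivative(f_prime, c), f(c))
```

The same helper reports "neither" for a derivative such as 3x^2 at 0, where f' is positive on both sides of the critical point.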
    Our next step will require the introduction of the derivative of f', called the second
derivative of f, and denoted f".
Example If $f(x) = x^3 - 3x^2$, then

                                   $$ f'(x) = 3x^2 - 6x $$

and
                                    f"(x) = 6x - 6.


﻿




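    In Leibniz notation, if y = f(x), then f"(x) is denoted by $\frac{d^2 y}{dx^2}$, which may be thought
of as

                        $$ \frac{d^2}{dx^2} y = \frac{d}{dx}\left(\frac{d}{dx} y\right). $$

In Newton's notation, if x = f(t), then $\ddot{x} = f''(t)$.

Example If y = x sin(3x), then

                        $$ \frac{dy}{dx} = 3x\cos(3x) + \sin(3x), $$

so

$$ \begin{aligned}
\frac{d^2 y}{dx^2} &= \frac{d}{dx}\left(3x\cos(3x) + \sin(3x)\right) \\
                   &= -9x\sin(3x) + 3\cos(3x) + 3\cos(3x) \\
                   &= -9x\sin(3x) + 6\cos(3x).
\end{aligned} $$

Example If $x = 3t^6 - 2\cos(6t)$, then

                        $$ \dot{x} = 18t^5 + 12\sin(6t) $$

and

                        $$ \ddot{x} = 90t^4 + 72\cos(6t). $$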

    Now suppose f, f', and f" are all continuous on an interval (a, b) containing a critical
point c. Moreover, suppose f"(c) < 0. Since f" is continuous, this assumption in fact
implies that f"(x) < 0 for all x in some open interval about c, and hence that f' is a decreasing
function on some open interval about c. But f'(c) = 0, so for f' to be decreasing it must
be the case that f' is positive to the left of c and negative to the right of c. This means, by
the first derivative test discussed above, that f must have a local maximum at c. Similarly,
if f"(c) > 0, then f' is negative to the left of c and positive to the right of c, showing that
f has a local minimum at c. This important result is known as the second derivative test
for local extrema.

Second Derivative Test Suppose f, f', and f" are all continuous on an open interval
(a, b) and that c is a critical point of f in (a, b). Then f has a local maximum at c if f"(c) < 0
and f has a local minimum at c if f"(c) > 0.

Example     Consider the function $g(x) = \frac{x}{1 + x^2}$. Then

          $$ g'(x) = \frac{(1 + x^2)(1) - (x)(2x)}{(1 + x^2)^2} = \frac{1 - x^2}{(1 + x^2)^2}, $$

so g'(x) = 0 when $1 - x^2 = 0$. Thus g has two critical points, x = -1 and x = 1. Now

$$ \begin{aligned}
g''(x) &= \frac{(1 + x^2)^2(-2x) - (1 - x^2)\left(2(1 + x^2)(2x)\right)}{(1 + x^2)^4} \\
       &= \frac{-2x(1 + x^2) - 4x(1 - x^2)}{(1 + x^2)^3} \\
       &= \frac{-6x + 2x^3}{(1 + x^2)^3} \\
       &= \frac{2x(x^2 - 3)}{(1 + x^2)^3}.
\end{aligned} $$

Thus g"(-1) = 0.5 > 0 and g"(1) = -0.5 < 0, implying that g has a local minimum at
x = -1 and a local maximum at x = 1. Since g(-1) = -0.5 and g(1) = 0.5, we conclude
that g has a local minimum of -0.5 at x = -1 and a local maximum of 0.5 at x = 1.
    Using the facts that the critical points of g are -1 and 1 and that g has a local minimum
at x = -1, we may conclude that g must be decreasing on $(-\infty, -1)$ and increasing on
(-1, 1). Similarly, g having a local maximum at 1 implies that g must be increasing on
(-1, 1) and decreasing on $(1, \infty)$. Moreover,

$$ \lim_{x \to -\infty} g(x) = \lim_{x \to -\infty} \frac{x}{1 + x^2} = \lim_{x \to -\infty} \frac{\frac{1}{x}}{\frac{1}{x^2} + 1} = \frac{0}{1} = 0 $$

and

$$ \lim_{x \to \infty} g(x) = \lim_{x \to \infty} \frac{x}{1 + x^2} = \lim_{x \to \infty} \frac{\frac{1}{x}}{\frac{1}{x^2} + 1} = \frac{0}{1} = 0, $$

showing that the x-axis is a horizontal asymptote for the graph of g. Putting these
observations together, we can see why the graph of g looks as it does in Figure 3.8.6.

                         Figure 3.8.6 Graph of $g(x) = \frac{x}{1 + x^2}$


﻿


                        Figure 3.8.7 Graph of $f(x) = 5x^3 - 3x^5$


Example Suppose $f(x) = 5x^3 - 3x^5$. Then

               $$ f'(x) = 15x^2 - 15x^4 = 15x^2(1 - x^2) = 15x^2(1 - x)(1 + x), $$

from which we see that the critical points of f are -1, 0, and 1. Now

                                 $$ f''(x) = 30x - 60x^3, $$

so f"(-1) = 30 > 0, f"(0) = 0, and f"(1) = -30 < 0. Thus f has a local minimum of -2
at x = -1 and a local maximum of 2 at x = 1. Unfortunately, f"(0) = 0, so the second
derivative test gives us no information about the nature of the critical point 0. However,
since f has a local minimum at x = -1, f must be decreasing on $(-\infty, -1)$ and increasing
on (-1, 0); moreover, since f has a local maximum at x = 1, f must be increasing on (0, 1)
and decreasing on $(1, \infty)$. Thus f is increasing on both (-1, 0) and (0, 1), from which we
conclude that f has neither a local minimum nor a local maximum at x = 0. If we add in
that f(0) = 0,

          $$ \lim_{x \to -\infty} f(x) = \lim_{x \to -\infty} x^5\left(\frac{5}{x^2} - 3\right) = \infty, $$

and

          $$ \lim_{x \to \infty} f(x) = \lim_{x \to \infty} x^5\left(\frac{5}{x^2} - 3\right) = -\infty, $$

we can see why the graph of f looks as it does in Figure 3.8.7.

    It is worth emphasizing that, for a critical point c, f"(c) = 0 gives us no information
about the nature of the critical point. The point may be the location of a local minimum,
as 0 is for $f(x) = x^4$; a local maximum, as 0 is for $f(x) = -x^4$; or neither, as 0 is in the
previous example. Thus, if f"(c) = 0 for a critical point c, the second derivative test is
not applicable and some other method, such as the first derivative test, must be used to
determine the nature of the point.
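
The way the second derivative test can fail to decide is easy to see in a small sketch. The code below is illustrative rather than definitive: the function name `second_derivative_test` and the tolerance `tol` (used to decide when a computed value counts as zero) are assumptions, and the second derivatives are entered by hand.

```python
def second_derivative_test(f_double_prime, c, tol=1e-12):
    """Apply the second derivative test at a critical point c.

    Returns "inconclusive" when f''(c) is zero (within the assumed tolerance tol).
    """
    value = f_double_prime(c)
    if value > tol:
        return "local minimum"
    if value < -tol:
        return "local maximum"
    return "inconclusive"

def f_double_prime(x):
    # Second derivative of f(x) = 5x^3 - 3x^5.
    return 30 * x - 60 * x ** 3

for c in (-1.0, 0.0, 1.0):
    print(c, second_derivative_test(f_double_prime, c))

# The functions x^4 and -x^4 named in the text also have f''(0) = 0,
# yet they behave differently at 0, so the test cannot decide.
print(second_derivative_test(lambda x: 12 * x ** 2, 0.0))   # f(x) = x^4
print(second_derivative_test(lambda x: -12 * x ** 2, 0.0))  # f(x) = -x^4
```

Whenever the result is "inconclusive", some other method, such as examining the sign of f' on either side of c, is needed.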


﻿


                     Figure 3.8.8 Graph of g(t) = t cos(t) - 2 sin(t)


    Now that we have techniques for determining the location of local minimums and max-
imums, we can return to our original problem of determining the maximum or minimum
value of a function on an open interval.
Example Suppose we want to find the minimum value of

                                    $$ f(t) = \frac{\sin(t)}{t^2} $$

on the interval $(0, 2\pi)$. Then

          $$ f'(t) = \frac{t^2\cos(t) - \sin(t)(2t)}{t^4} = \frac{t\cos(t) - 2\sin(t)}{t^3}, $$

so f'(t) = 0 when t cos(t) - 2 sin(t) = 0. We cannot solve this equation exactly, but, from
the graph of g(t) = t cos(t) - 2 sin(t) in Figure 3.8.8, we can see that it has only one solution
in the interval $(0, 2\pi)$. Applying Newton's method with initial guess $t_0 = 4$, we obtain
the approximation 4.2748, to four decimal places. Hence f has exactly one critical point
in $(0, 2\pi)$, namely, 4.2748. Now

$$ \begin{aligned}
f''(t) &= \frac{t^3\left(-t\sin(t) + \cos(t) - 2\cos(t)\right) - \left(t\cos(t) - 2\sin(t)\right)(3t^2)}{t^6} \\
       &= \frac{-t^4\sin(t) - 4t^3\cos(t) + 6t^2\sin(t)}{t^6} \\
       &= \frac{6\sin(t) - 4t\cos(t) - t^2\sin(t)}{t^4},
\end{aligned} $$

so f"(4.2748) = 0.05499 > 0. Thus f has a local minimum at t = 4.2748. Moreover, since
4.2748 is the only critical point in $(0, 2\pi)$ and f(4.2748) = -0.04957, we may conclude
that the minimum value of f on $(0, 2\pi)$ is -0.04957, and this value occurs at t = 4.2748.
See Figure 3.8.9.
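
The Newton's method step in this example can be reproduced with a short script. This is a sketch under stated assumptions: the helper `newton`, its stopping tolerance, and its step limit are illustrative choices, the starting value 4.0 is the initial guess as read from the text, and the derivative of g is computed by hand.

```python
import math

def newton(func, func_prime, start, tol=1e-10, max_steps=50):
    """Newton's method: repeat t <- t - func(t) / func_prime(t) until the step is tiny."""
    t = start
    for _ in range(max_steps):
        step = func(t) / func_prime(t)
        t -= step
        if abs(step) < tol:
            return t
    raise RuntimeError("Newton's method did not converge")

def g(t):
    # Numerator of f'(t) for f(t) = sin(t) / t^2.
    return t * math.cos(t) - 2 * math.sin(t)

def g_prime(t):
    # d/dt [t cos(t) - 2 sin(t)] = cos(t) - t sin(t) - 2 cos(t)
    return -math.cos(t) - t * math.sin(t)

def f(t):
    return math.sin(t) / t ** 2

def f_double_prime(t):
    return (6 * math.sin(t) - 4 * t * math.cos(t) - t ** 2 * math.sin(t)) / t ** 4

root = newton(g, g_prime, start=4.0)
print(round(root, 4), f(root), f_double_prime(root))
```

The printed values can be compared with the critical point, the minimum value, and the positive second derivative quoted in the example.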


﻿


                         Figure 3.8.9 Graph of $f(t) = \frac{\sin(t)}{t^2}$


Example A company wishes to produce a metal can in the shape of a right circular
cylinder which minimizes the amount of metal needed in its construction, yet will have a
volume of V cubic centimeters. If we denote the radius of the base of the can by r, the
height of the can by h, and the surface area of the can by S, then

                              $$ S = 2\pi r^2 + 2\pi r h, $$                        (3.8.2)

where the first term represents the combined area of the base and the top of the can and
the second term represents the area of the side of the can, which, when flattened out, is a
rectangle of length $2\pi r$ (the circumference of the base of the can) and width h. Our goal
is to find the values of r and h which minimize S subject to the constraint that the can
has to hold a volume V. This constraint translates into the condition

                              $$ V = \pi r^2 h, $$

which means we must have

                              $$ h = \frac{V}{\pi r^2}. $$                           (3.8.3)

                         Figure 3.8.10 A cylindrical can

Substituting the value of h in (3.8.3) into (3.8.2), we have

                              $$ S = 2\pi r^2 + \frac{2V}{r}, $$                     (3.8.4)

giving S as a function of r which we want to minimize on the interval $(0, \infty)$. Now

                              $$ \frac{dS}{dr} = 4\pi r - \frac{2V}{r^2}, $$

so $\frac{dS}{dr} = 0$ when

                              $$ 4\pi r = \frac{2V}{r^2}, $$

that is, when

                              $$ r^3 = \frac{2V}{4\pi} = \frac{V}{2\pi}. $$

Hence

                              $$ r = \sqrt[3]{\frac{V}{2\pi}} $$

is the only critical point of S in $(0, \infty)$. Moreover,

                              $$ \frac{d^2 S}{dr^2} = 4\pi + \frac{4V}{r^3}, $$

so

          $$ \left. \frac{d^2 S}{dr^2} \right|_{r = \sqrt[3]{V/(2\pi)}} = 4\pi + 8\pi = 12\pi > 0. $$

Hence S has a local minimum at

                              $$ r = \sqrt[3]{\frac{V}{2\pi}}. $$

Since this is the only critical point in $(0, \infty)$, this is in fact the location of the minimum
value of S. Now for

                              $$ r = \sqrt[3]{\frac{V}{2\pi}}, $$

we have, using (3.8.3),

$$ h = \frac{V}{\pi r^2} = \frac{V}{\pi \left(\frac{V}{2\pi}\right)^{2/3}} = \frac{2^{2/3} V^{1/3}}{\pi^{1/3}} = 2\sqrt[3]{\frac{V}{2\pi}} = 2r. $$

That is, the surface area of the can, and hence the amount of metal used in the can, is
minimized when the height of the can is equal to the diameter of the can. Figure 3.8.11
shows the graph of S in the case V = 1000 cubic centimeters, in which case the surface
area is minimized when r is approximately 5.42 centimeters.

                        Figure 3.8.11 Graph of $S = 2\pi r^2 + \frac{2000}{r}$
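
The conclusion of the can example can be checked for a specific volume. The sketch below is illustrative: the function names `surface_area` and `optimal_can` are hypothetical, and the comparison against radii 10 percent smaller and larger is just a spot check, not a proof of minimality.

```python
import math

def surface_area(r, volume):
    """S(r) = 2*pi*r^2 + 2V/r, equation (3.8.4)."""
    return 2 * math.pi * r ** 2 + 2 * volume / r

def optimal_can(volume):
    """Return (r, h) at the critical point r = (V / (2*pi))^(1/3), with h from (3.8.3)."""
    r = (volume / (2 * math.pi)) ** (1 / 3)
    h = volume / (math.pi * r ** 2)
    return r, h

r, h = optimal_can(1000.0)
print(round(r, 2), round(h, 2), round(surface_area(r, 1000.0), 1))

# Spot check: nearby radii should not give a smaller area.
for nearby in (0.9 * r, 1.1 * r):
    assert surface_area(nearby, 1000.0) > surface_area(r, 1000.0)
```

For V = 1000 the computed radius should round to the 5.42 centimeters quoted above, with the height equal to twice the radius up to rounding error.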

Problems

1. Find the minimum and maximum values, and their locations, for each of the following
    functions on the given intervals.
    (a) $f(x) = x^2 - 4$ on [-3, 4]             (b) $f(x) = x^3 - 3x$ on [-2, 4]
    (c) g(t) = cos(t) - sin(t) on $[-\pi, \pi]$   (d) $f(t) = 2t^3 + 3t^2 - 36t$ on [-4, 3]
    (e) $g(x) = x^2 \cos(x)$ on [-2, 2]         (f) f(t) = cos(t) + sin(2t) on $[0, \pi]$
    (g) $f(t) = t^2 \sin(t)$ on $[0, \pi]$
 2. A farmer wishes to fence in a rectangular field with 600 yards of fencing. What should
    the dimensions of the field be in order to maximize the area of the field?
 3. A farmer wishes to fence in a rectangular field, using a straight river for one side, with
    500 yards of fencing. What should the dimensions of the field be in order to maximize
    the area of the field?
 4. Suppose the farmer in Problem 2 wishes to divide his field into two equal rectangular
    fields using a fence parallel to two of the sides. What should the dimensions of the
    field be in order to maximize the combined areas of the fields?
 5. When a potter sells his pots for p dollars apiece, he can sell D(p)  ,2500 - p2 of them.
    Suppose the pots cost him $6.00 apiece to make. What price should the potter charge
    in order to maximize his profit?
 6. A wire of unit length is to be cut into two pieces. One of the pieces will be used to
    form a square, the other a circle. Where should the wire be cut in order to maximize
    the total area enclosed by the square and the circle? Where should it be cut in order
    to minimize the total area enclosed by the square and the circle?


﻿




7. Find all local maximums and minimums, and their locations, for the following func-
    tions.

    (a) $f(x) = 3x^2 + 5$                      (b) $f(t) = t^4 + 3t^2$
    (c) $g(t) = t^3 + 3t^2$                    (d) g(x) = sin(x) cos(x)
    (e) $f(x) = \frac{1}{1 + x^2}$             (f) $h(x) = \frac{x^2}{1 + x^2}$

    (g) $g(x) = 5 - x^3$                       (h) $f(t) = t^4 - 2t^3$
    (i) $g(t) = 3t^5 - 5t^4$                   (j) f()    1 + 3x2

 8. Find the second derivative of each of the following functions.
    (a) $f(x) = 3x^2 + 2x - 3$                 (b) $g(t) = 13t^4 - 3t^3 + t^2 - 45$
    (c) $s = \frac{1}{2t - 1}$                 (d) $g(x) = \sin^2(3x)$
    (e) x = sin(2t) cos(4t)                    (f) $y = x^2 \tan^2(3x)$

 9. Find the maximum value of

                                      $$ f(x) = \frac{x - 1}{x^2} $$

    on the interval $(0, \infty)$. Does f have a minimum value on $(0, \infty)$?

10. We found the minimum value of

                                      $$ f(t) = \frac{\sin(t)}{t^2} $$

    on $(0, 2\pi)$ in an example. Does f have a maximum value on $(0, 2\pi)$?

11. A farmer wishes to construct a rectangular storage bin with a volume of 1000 cubic
    feet. Both the top and the bottom of the bin are to be squares. Find the dimensions
    of the bin which will minimize its surface area.

12. Suppose the bin in Problem 11 does not require a bottom. Find the dimensions of the
    bin which minimize surface area in this case.

13. Suppose the material for the top and the bottom of the bin in Problem 11 costs $2.00
    per square foot while the material for the sides costs $3.00 per square foot. Find the
    dimensions of the bin which minimize its cost.
14. A metal can in the shape of a right circular cylinder without a top is to be made so
    that it holds 100 cubic centimeters. Find the dimensions of the can which minimize
    its surface area.

15. Suppose the material for the top and bottom of a can in the shape of a right circular
    cylinder costs $0.04 per square centimeter and the material for the side costs $0.02 per
    square centimeter. If the can must hold 1000 cubic centimeters, for what dimensions
    is the cost of the can minimized?


﻿




16. A metal can in the shape of a right circular cylinder is to be made so that it holds 500
    cubic centimeters. Suppose the top and bottom of the can are cut from square pieces
    of metal, with the scraps being discarded afterwards. Assuming there is no waste in
    making the side of the can, find the dimensions of the can which minimize the amount
    of material needed to make it.
17. Show that the rectangle with maximum area for a given perimeter P is a square.
18. Show that the rectangle with minimum perimeter for a given area A is a square.
19. A quality control engineer is studying the failure rate of a certain type of beam under
    stress. Test beams are put under a steady stress for 1000 hours. If p is the probability
    that the beam passes the test, then 1 - p is the probability that the beam fails the
    test. Suppose that in 50 trials, only 5 beams fail the test.
    (a) If L(p) is the probability of the observed sequence of successes and failures, explain
        why
                                       $$ L(p) = p^{45}(1 - p)^5 $$
        for $0 \le p \le 1$.
    (b) If $\hat{p}$ is the value of p which maximizes L(p) on [0, 1], show that $\hat{p} = 0.9$.
20. According to genetic theory, if a parent provides the gene A with probability $\theta$ and the
    gene a with probability $1 - \theta$, then the offspring is of genotype AA with probability
    $\theta^2$, of genotype Aa with probability $2\theta(1 - \theta)$, and of genotype aa with probability
    $(1 - \theta)^2$. Suppose that in a sample of 100 people, 31 were observed to be of type AA,
    48 of type Aa, and 21 of type aa. Let $L(\theta)$ be the probability of observing this specific
    sequence of genotypes.
    (a) Explain why
                                    $$ L(\theta) = 2^{48}\theta^{110}(1 - \theta)^{90} $$
        for $0 \le \theta \le 1$.
    (b) Find the value of $\theta$ which maximizes $L(\theta)$ on [0, 1]. What are the corresponding
        values for the probabilities of AA, Aa, and aa?
21. In the final example of this section, we showed that the surface area of a right circular
    cylindrical can is minimized when the height of the can is equal to the diameter of the
    can. Check out a local supermarket to see how many cans satisfy this condition. What
    other considerations might be important in the design of a can?
22. We have seen that if x(t) is the position of an object moving on a straight line at time
    t, then the velocity of the object is given by $v(t) = \dot{x}(t)$ and the acceleration is given
    by $a(t) = \dot{v}(t)$. Hence a is the second derivative of x; that is, $a(t) = \ddot{x}(t)$. Suppose an
    object is oscillating at the end of a spring so that its position at time t is $x = 3\sin(\pi t)$.
    (a) Find v(t).
    (b) Find a(t).
    (c) Discuss the behavior of the object over the interval [0, 2], taking into account the
        values of x(t), v(t), and a(t).


﻿


Section 3.9


       Differential Equations         The Geometry of Graphs


In Section 2.1 we discussed the graph of a function y = f(x) in terms of plotting points
(x, f(x)) for many different values of x and connecting the resulting points with straight
lines. This is a standard procedure when using a computer and, if the function is well
behaved and sufficiently many points are plotted, will produce a reasonable picture of the
graph. However, as we noted at that time, this method assumes that the behavior of the
graph between any two successive points is approximated well by a straight line. With a
sufficient number of points and a differentiable function, this assumption will be reasonable.
Yet to understand a graph fully, it is important to have alternative techniques to verify
the picture at least qualitatively. We have already developed several important aids for
understanding the shape of a graph, including techniques for determining the location of
local extreme values and techniques for finding intervals where the function is increasing
and intervals where it is decreasing. In this section we will use this information, along
with additional information contained in the second derivative, to piece together a picture
of the graph of a given function.
    To see the importance of the second derivative, consider the graphs of f(x) = x^2 and
g(x) = sqrt(x) on the interval (0, ∞). Now

\[ f'(x) = 2x \]

and

\[ g'(x) = \frac{1}{2\sqrt{x}}, \]

so f'(x) > 0 and g'(x) > 0 for all x in (0, ∞). Thus f and g are both increasing on (0, ∞).
However, the graphs of f and g, as shown in Figure 3.9.1, are dramatically different. The
graph of f is not only increasing, but is becoming steeper and steeper as x increases,
whereas the graph of g is increasing, but flattening out as x increases. In other words, f'
is itself an increasing function, causing the rate of growth of the function to increase with
x, while g' is a decreasing function, resulting in a decrease in the rate of growth of g and
a flattening out of the graph. In the terminology of the next definition, we say that the
graph of f is concave up on (0, ∞) and the graph of g is concave down on (0, ∞).

Figure 3.9.1 Graphs of y = x^2 and y = sqrt(x)

Figure 3.9.2 Graphs of y = x^2 and y = -x^2 on (-∞, ∞)

Copyright © by Dan Sloughter 2000

Definition Suppose f is differentiable on the open interval (a, b). If f' is an increasing
function on (a, b), then we say the graph of f is concave up on (a, b). If f' is a decreasing
function on (a, b), then we say the graph of f is concave down on (a, b).

    Of course, to check for the intervals where f' is increasing and the intervals where f' is
decreasing, we consider where f", the derivative of f', is positive and where it is negative.

Proposition Suppose f is twice differentiable on the interval (a, b). If f"(x) > 0 for all
x in (a, b), then the graph of f is concave up on (a, b); if f"(x) <0 for all x in (a, b), then
the graph of f is concave down on (a, b).

Example     Two basic examples to keep in mind are f(x) = x^2 and g(x) = -x^2. Since
f"(x) = 2 > 0 and g"(x) = -2 < 0 for all values of x, the graph of f is concave up on
(-∞, ∞) and the graph of g is concave down on (-∞, ∞). See Figure 3.9.2.

Example     Consider g(t) = t^3. Then g"(t) = 6t, so g"(t) < 0 when t < 0 and g"(t) > 0
when t > 0. Hence the graph of g is concave down on (-∞, 0) and concave up on (0, ∞).
Notice in Figure 3.9.3 how, even though g is increasing on (-∞, ∞), the change in concavity
at (0, 0) changes the shape of the graph.
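
The sign test in the proposition can be tried numerically. The following is only a sketch: the helper names `second_difference` and `concavity`, the step size, and the tolerance are illustrative assumptions, and a central difference merely estimates the second derivative rather than computing it exactly.

```python
def second_difference(f, x, h=1e-3):
    """Central-difference estimate of the second derivative of f at x.

    The step size h is an assumed value chosen for illustration.
    """
    return (f(x + h) - 2.0 * f(x) + f(x - h)) / (h * h)


def concavity(f, x, tol=1e-6):
    """Classify the graph of f near x from the sign of the estimated second derivative."""
    d2 = second_difference(f, x)
    if d2 > tol:
        return "up"
    if d2 < -tol:
        return "down"
    return "flat"


def square(x):
    return x ** 2


def neg_square(x):
    return -(x ** 2)


def cube(t):
    return t ** 3


# Sample each function at a few points on either side of 0.
for name, func in [("x^2", square), ("-x^2", neg_square), ("t^3", cube)]:
    print(name, [concavity(func, x) for x in (-2.0, -0.5, 0.5, 2.0)])
```

For the cube, the classification switches from "down" to "up" as the sample point crosses 0, which is the change in concavity described in the example.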

Definition A point on the graph of a function f where the concavity changes from up
to down or from down to up is called an inflection point.

Example In our previous example, (0, 0) is an inflection point for the graph of g(t) = t^3.

Figure 3.9.3 Graph of g(t) = t^3

Example     Let f(x) = 1/x. Then

\[ f'(x) = -\frac{1}{x^2} \]

and

\[ f''(x) = \frac{2}{x^3}. \]


Hence f'(x) < 0 on both (-∞, 0) and (0, ∞), while f"(x) < 0 when x < 0 and f"(x) > 0
when x > 0. Thus f is decreasing on both (-∞, 0) and (0, ∞), but the fact that the graph
is concave down on (-∞, 0) shows up in the way the steepness of the graph increases as x
approaches 0 from the left, while the fact that the graph is concave up on (0, ∞) shows
up in the way the graph flattens out as x increases toward ∞. See Figure 3.9.4. Also
note that, although the concavity of the graph of f changes, the graph does not have an
inflection point since f is not defined at 0.


Figure 3.9.4 Graph of f(x) = 1/x


    Note that if (c, f(c)) is an inflection point on the graph of a function f, then either
f"(c) = 0 or f" is not defined at c. However, the converse does not hold. For example, if
f(x) = x^4, then f"(0) = 0, even though f"(x) = 12x^2 is positive for all x in both (-∞, 0)
and (0, ∞), so the concavity does not change at 0.
    From the foregoing, it is clear that f' and f" provide enough information to obtain a
good understanding of the shape of the graph of f. Specifically, to sketch the graph of f,
we use the first derivative to find (1) intervals where f is increasing, (2) intervals where f
is decreasing, and (3) locations of any local extreme values; we use the second derivative
to find (1) intervals where the graph of f is concave up, (2) intervals where the graph of f
is concave down, and (3) any inflection points. Combining this information with a few
values of the function, the location of any asymptotes, and information on the behavior of
f(x) as x goes to -∞ and as x goes to ∞, we can piece together a qualitatively accurate
picture of the graph of f.
Example     Consider f(x) = 3x^2 - x^3 + 2. Then

\[ f'(x) = 6x - 3x^2 = 3x(2 - x), \]

so the critical points of f are 0 and 2. Since f'(-1) = -9 < 0, f'(1) = 3 > 0, and
f'(3) = -9 < 0, f is decreasing on the intervals (-∞, 0) and (2, ∞) and increasing on
(0, 2). Moreover, this shows that f has a local minimum of 2 at x = 0 and a local maximum
of 6 at x = 2.
    Next, we have

\[ f''(x) = 6 - 6x = 6(1 - x), \]

so f"(x) = 0 when x = 1. Now 1 - x > 0 when x < 1 and 1 - x < 0 when x > 1, so
f"(x) > 0 on (-∞, 1) and f"(x) < 0 on (1, ∞). Hence the graph of f is concave up on
(-∞, 1) and concave down on (1, ∞), and (1, 4) is an inflection point.
    Combining this information with the values f(-1) = 6, f(3) = 2,

\[ \lim_{x \to -\infty} f(x) = \lim_{x \to -\infty} (3x^2 - x^3 + 2) = \lim_{x \to -\infty} x^3 \left( \frac{3}{x} - 1 + \frac{2}{x^3} \right) = \infty, \]

and

\[ \lim_{x \to \infty} f(x) = \lim_{x \to \infty} (3x^2 - x^3 + 2) = \lim_{x \to \infty} x^3 \left( \frac{3}{x} - 1 + \frac{2}{x^3} \right) = -\infty, \]

we can easily draw a graph which, even though we are only plotting five points (the two
local extreme values, the inflection point, and one point on each side of these points),
captures the shape of the graph of f very well. See Figure 3.9.5.
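
The arithmetic in this example is easy to check by machine. The sketch below hard-codes the function and its two derivatives as computed in the text; the names `f_prime`, `f_second`, `signs`, and `points` are illustrative choices.

```python
def f(x):
    return 3 * x**2 - x**3 + 2


def f_prime(x):
    # First derivative from the text: 6x - 3x^2 = 3x(2 - x).
    return 6 * x - 3 * x**2


def f_second(x):
    # Second derivative from the text: 6 - 6x.
    return 6 - 6 * x


# Sign of f' at one test point in each interval cut out by the critical points 0 and 2.
signs = {x: ("+" if f_prime(x) > 0 else "-") for x in (-1, 1, 3)}

# The five points used for the sketch: one point on each side, the two local
# extreme values, and the inflection point.
points = [(x, f(x)) for x in (-1, 0, 1, 2, 3)]

print(signs)
print(points)
```

The sign pattern of f' (negative, positive, negative) matches the decreasing, increasing, decreasing behavior described above, and the sign of `f_second` changes at x = 1.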
Figure 3.9.5 Graph of f(x) = 3x^2 - x^3 + 2

Example     Consider g(x) = 12x^5 + 15x^4 - 40x^3 - 10. Then

\[ g'(x) = 60x^4 + 60x^3 - 120x^2 = 60x^2(x^2 + x - 2) = 60x^2(x + 2)(x - 1), \]

implying that g has three critical points, namely, x = -2, x = 0, and x = 1. Now 60x^2 ≥ 0
for all values of x; x + 2 < 0 when x < -2 and x + 2 > 0 when x > -2; and x - 1 < 0
when x < 1 and x - 1 > 0 when x > 1. Thus g'(x) > 0 when x < -2 and when x > 1,
and g'(x) < 0 when -2 < x < 0 and when 0 < x < 1. So g is increasing on (-∞, -2) and
(1, ∞), and g is decreasing on (-2, 0) and (0, 1). In particular, g has a local maximum
of 166 at x = -2 and a local minimum of -23 at x = 1. Although g has neither a local
maximum nor a local minimum at the critical point 0, for drawing the graph of g it is
important to note that the slope of the curve is 0 at (0, -10).
    Next,

\[ g''(x) = 240x^3 + 180x^2 - 240x = 60x(4x^2 + 3x - 4), \]

so g"(x) = 0 when x = 0 and when 4x^2 + 3x - 4 = 0. Using the quadratic formula, the
latter equation has solutions

\[ x = \frac{-3 - \sqrt{73}}{8} \approx -1.4430 \]

and

\[ x = \frac{-3 + \sqrt{73}}{8} \approx 0.6930, \]

rounding to four decimal places. Now 4x^2 + 3x - 4 < 0 only when x is between the two
roots -1.4430 and 0.6930. Since 60x < 0 when x < 0 and 60x > 0 when x > 0, we
may conclude that g"(x) < 0 for x < -1.4430 and 0 < x < 0.6930, and g"(x) > 0 for
-1.4430 < x < 0 and x > 0.6930. Hence the graph of g is concave down on (-∞, -1.4430)
and (0, 0.6930) and concave up on (-1.4430, 0) and (0.6930, ∞). In particular, g has three
inflection points: (-1.4430, 100.1459), (0, -10), and (0.6930, -17.9349).
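
The rounded values above can be reproduced with a few lines of code. This is a sketch only: the names `g_second`, `disc`, and `roots` are illustrative, and the offset 0.1 used to test for a sign change is an assumed value, small enough that no other root of the second derivative lies within it.

```python
import math


def g(x):
    return 12 * x**5 + 15 * x**4 - 40 * x**3 - 10


def g_second(x):
    # Second derivative from the text: 240x^3 + 180x^2 - 240x.
    return 240 * x**3 + 180 * x**2 - 240 * x


# Discriminant of 4x^2 + 3x - 4, which equals 73.
disc = 3**2 - 4 * 4 * (-4)

# Candidate inflection points: the two quadratic roots and x = 0.
roots = [(-3 - math.sqrt(disc)) / 8, 0.0, (-3 + math.sqrt(disc)) / 8]

for r in roots:
    # A sign change of g'' across r confirms a change in concavity there.
    changes_sign = g_second(r - 0.1) * g_second(r + 0.1) < 0
    print(round(r, 4), round(g(r), 4), changes_sign)
```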
    Adding to this information the values g(-3) = -631, g(2) = 294,

\[ \lim_{x \to -\infty} g(x) = \lim_{x \to -\infty} (12x^5 + 15x^4 - 40x^3 - 10) = \lim_{x \to -\infty} x^5 \left( 12 + \frac{15}{x} - \frac{40}{x^2} - \frac{10}{x^5} \right) = -\infty, \]

and

\[ \lim_{x \to \infty} g(x) = \lim_{x \to \infty} (12x^5 + 15x^4 - 40x^3 - 10) = \lim_{x \to \infty} x^5 \left( 12 + \frac{15}{x} - \frac{40}{x^2} - \frac{10}{x^5} \right) = \infty, \]

we can now sketch the graph of g. See Figure 3.9.6.

Figure 3.9.6 Graph of g(x) = 12x^5 + 15x^4 - 40x^3 - 10

Example For our final example, consider

\[ h(t) = \frac{t^2}{t^2 - 1}. \]

Then

\[ h'(t) = \frac{(t^2 - 1)(2t) - (t^2)(2t)}{(t^2 - 1)^2} = -\frac{2t}{(t^2 - 1)^2}, \]

so h'(t) = 0 when 2t = 0. Thus h has one critical point, t = 0. However, we must also
take into consideration the two points where h and h' are not defined, namely, t = -1 and
t = 1. Now (t^2 - 1)^2 > 0 for all t other than -1 and 1, so the sign of h' is determined by
the sign of -2t. Thus
h'(t) > 0 when t < -1 and when -1 < t < 0, and h'(t) < 0 when 0 < t < 1 and when
t > 1. In other words, h is increasing on (-∞, -1) and (-1, 0), and h is decreasing on
(0, 1) and (1, ∞). From this we see that h has a local maximum of 0 at t = 0. For the
second derivative, we have


\[ h''(t) = \frac{(t^2 - 1)^2(-2) - (-2t)\bigl(2(t^2 - 1)(2t)\bigr)}{(t^2 - 1)^4} = \frac{-2(t^2 - 1) + 8t^2}{(t^2 - 1)^3} = \frac{6t^2 + 2}{(t^2 - 1)^3}. \]

Since 6t^2 + 2 > 0 for all values of t, it follows that h"(t) ≠ 0 for all t. However, as with the
first derivative, we need to consider the points t = -1 and t = 1 where h" is not defined.
Now t^2 - 1 < 0 only when -1 < t < 1, so h"(t) < 0 when -1 < t < 1 and h"(t) > 0 when
t < -1 and when t > 1. Hence the graph of h is concave down on (-1, 1) and concave up
on (-∞, -1) and (1, ∞). Note, however, that there are no points of inflection.

Figure 3.9.7 Graph of h(t) = t^2/(t^2 - 1)

    Since h is not defined at t = -1 and t = 1, we need to check for vertical asymptotes
at these points. We have


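\[ \lim_{t \to -1^-} h(t) = \lim_{t \to -1^-} \frac{t^2}{t^2 - 1} = \infty, \]

\[ \lim_{t \to -1^+} h(t) = \lim_{t \to -1^+} \frac{t^2}{t^2 - 1} = -\infty, \]

\[ \lim_{t \to 1^-} h(t) = \lim_{t \to 1^-} \frac{t^2}{t^2 - 1} = -\infty, \]

and

\[ \lim_{t \to 1^+} h(t) = \lim_{t \to 1^+} \frac{t^2}{t^2 - 1} = \infty, \]

showing that the graph of h has vertical asymptotes at t = -1 and t = 1. Finally,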


\[ \lim_{t \to -\infty} h(t) = \lim_{t \to -\infty} \frac{t^2}{t^2 - 1} = \lim_{t \to -\infty} \frac{1}{1 - \frac{1}{t^2}} = 1 \]

and

\[ \lim_{t \to \infty} h(t) = \lim_{t \to \infty} \frac{t^2}{t^2 - 1} = \lim_{t \to \infty} \frac{1}{1 - \frac{1}{t^2}} = 1 \]

show that the graph of h has a horizontal asymptote at y = 1. With all of this geometric
information, we may now draw the graph of h, as shown in Figure 3.9.7.
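
The asymptotic behavior of h(t) = t^2/(t^2 - 1) can be illustrated by evaluating it near t = -1 and t = 1 and at large |t|. The sample points below are arbitrary choices made for illustration, and a numerical table of this kind suggests the limits but does not prove them.

```python
def h(t):
    return t**2 / (t**2 - 1)


# Just to the left and right of the points where h is undefined:
# the magnitudes are large, and the signs show the direction of each one-sided limit.
for t in (-1.001, -0.999, 0.999, 1.001):
    print(t, h(t))

# Far from the origin the values settle toward the horizontal asymptote y = 1.
for t in (-1e3, 1e3, 1e6):
    print(t, h(t))
```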


Problems

1. Discuss the geometry of the graphs of each of the following functions. That is, find
    the intervals where the function is increasing and where it is decreasing, find the
    intervals where the graph is concave up and where it is concave down, find all local
    extreme values and where they are located, find all inflection points, find any vertical
    or horizontal asymptotes, and use this information to sketch the graph.


(a) f(x) = x^2 - x
(c) g(x) = x^3 + 3x^2
(e) f(x) = x^3 - 3x
(g) h(x) = x^5 - x^3


              1
 (i) g(z)   z - 1

 (k) f(x) = x^4 - 2x^3
              t
(m) h(t) =  2t

(o) f ()=        2
            2t + 1
 (q) x(t)   t -


(b) g(t) = 3t^2 + 2t - 6
(d) f(t) = t^4 + 2t^2
(f) g(x) = 3x^5 - 5x^3
(h) f(x) = 3x^5 - 5x^4
             1
 (j) y(t) - t2 +
             t
 (1) h(t) = +2

(n) g(x)        c2
            1+3x

(p) f (X) c21
             z2
(r) f(z)  2
             z-4


2. Suppose the function f has the following properties:

                      f(0) = 0
                      f'(x) > 0 for x in (-∞, 2)
                      f'(x) < 0 for x in (2, ∞)
                      f"(x) < 0 for x in (-2, 6)
                      f"(x) > 0 for x in (-∞, -2) and for x in (6, ∞)
                      lim f(x) = -2 as x → -∞
                      lim f(x) = 0 as x → ∞

   Sketch the graph of a function satisfying these conditions.


3. Suppose f (0) = 0 and f'(x) = x2


1.


(a) Sketch what the graph of f must look like.
(b) Graph f' on the same axes with f.
(c) Is there more than one function f which satisfies these conditions?

4. Suppose f(0) = 0 and f'(x) = x^3 + x^2 - 6x.
   (a) Sketch what the graph of f must look like.
   (b) Graph f' on the same axes with f.
   (c) Is there more than one function f which satisfies these conditions?
                                1
5. Suppose g(1) = 0 and g'(t) -

   (a) Sketch what the graph of g must look like on (0, ∞).
   (b) Graph g' on the same axes with g.
   (c) Is there more than one function g which satisfies these conditions?
6. Suppose f(0) = 1 and f'(x) = f(x). What must the graph of f look like? Is this
   enough information to determine the graph of f?


﻿


Section 4.1


       Differential Equations         The Definite Integral


As we discussed in Section 1.1, and mentioned again at the beginning of Section 3.1, there
are two basic problems in calculus. In Chapter 3 we considered one of these, the problem
of finding tangent lines to curves in the plane; we are now ready to turn to the second,
quadrature, the problem of finding the area of a region in the plane. Although at first these
problems would seem to have no connection, in Section 4.3 we shall see that the Fundamental
Theorem of Calculus relates them in an interesting and useful way. This theorem, first fully
utilized by Newton and Leibniz, reveals that the problem of quadrature involves reversing
the process of differentiation; as a consequence, the facility we developed in Chapter 3 for
handling derivatives will be very helpful in many basic quadrature problems.


Figure 4.1.1 Region beneath the graph of y = f(x) and over the interval [a, b]


    As illustrated in Figure 4.1.1, our basic example for studying quadrature will be the
problem of finding the area of a region R in the plane which is bounded above by the
graph of a continuous function f and below by an interval [a, b] on the x-axis. Later we
will see how to extend our techniques to more complicated planar regions. Recall that in
Section 1.1 we considered the problem of finding the area of the unit circle. In that case,
we attacked the problem by approximating the area of the circle by the area of inscribed
regular polygons, which were themselves divided into triangles. We used these to find the
area of the circle by taking the limit of the areas of the inscribed polygons as the number
of sides went to infinity. Here we will see that it is sufficient to use rectangles, rather
than triangles, as our units of approximation. That is, we will approximate the area of
the desired region by the area of rectangles and then ask about the limit as the number of
rectangles used in the approximation goes to infinity. We begin with an example.

Figure 4.1.2 Inscribed and circumscribed rectangles for f(x) = x^2 + 1

Example     Consider the region R beneath the graph of the function f(x) = x^2 + 1 and
above the interval [-1, 2] on the x-axis. Let A be the area of R. If R_1 is the rectangle
with base on the interval [-1, 2] and height f(2) = 5, then, since 5 is the maximum value
of f on [-1, 2], R_1 contains R. We call R_1 a circumscribed rectangle for the region R.
Hence the area of R is less than the area of R_1, showing that A < 15. Similarly, if R_2 is
the rectangle with base on the interval [-1, 2] and height f(0) = 1, then, since 1 is the
minimum value of f on [-1, 2], R contains R_2. We call R_2 an inscribed rectangle for the
region R. Hence the area of R is greater than the area of R_2, showing that A > 3. See the
figure on the left in Figure 4.1.2.
    At this point we know that

\[ 3 < A < 15. \]

To improve our approximations for A, we begin by subdividing the interval [-1, 2] into two
equal intervals, namely [-1, 0.5] and [0.5, 2]. If A_1 is the area of the region beneath the
curve over the interval [-1, 0.5], we can construct inscribed and circumscribed rectangles
as we did in the last paragraph and obtain bounds for A_1. Indeed, the rectangle
with base on [-1, 0.5] and height f(-1) = 2 circumscribes this region, while the rectangle
with base on [-1, 0.5] and height f(0) = 1 is inscribed in it. Hence we have

\[ \frac{3}{2} \le A_1 \le 3. \]

Moreover, the region beneath the curve over the interval [0.5, 2] is circumscribed by a
rectangle of height f(2) = 5 and has inscribed within it a rectangle of height f(0.5) = 1.25.
So if A_2 is the area of this region, we have

\[ \frac{15}{8} \le A_2 \le \frac{15}{2}. \]

Since

\[ A = A_1 + A_2, \]

putting these last two results together gives us

\[ \frac{27}{8} \le A \le \frac{21}{2}, \]

an improvement on our previous approximation. See the figure on the right in Figure 4.1.2.

Figure 4.1.3 Inscribed and circumscribed rectangles for f(x) = x^2 + 1
    To improve our approximation further, divide [-1, 2] into three equal intervals: [-1, 0],
[0, 1], and [1, 2]. You should check that the heights of the inscribed rectangles over these
intervals are 1, 1, and 2, respectively. Since each rectangle has a base of length 1, we have

                           A > (1) (1) + (1) (1) + (2) (1) = 4.

Moreover, the heights of the circumscribed rectangles are 2, 2, and 5, respectively, and so

                           A < (2)(1) + (2)(1) + (5)(1) = 9.

Hence we now have
                                      4<A<9.


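See the figure on the left in Figure 4.1.3.
    It is clear that we can approximate A using inscribed and circumscribed rectangles for
any number of intervals. For example, you might check that if we use six intervals of equal
length we would have

\[ A \ge \left(\tfrac{5}{4}\right)\left(\tfrac{1}{2}\right) + (1)\left(\tfrac{1}{2}\right) + (1)\left(\tfrac{1}{2}\right) + \left(\tfrac{5}{4}\right)\left(\tfrac{1}{2}\right) + (2)\left(\tfrac{1}{2}\right) + \left(\tfrac{13}{4}\right)\left(\tfrac{1}{2}\right) = \frac{39}{8} \]

and

\[ A \le (2)\left(\tfrac{1}{2}\right) + \left(\tfrac{5}{4}\right)\left(\tfrac{1}{2}\right) + \left(\tfrac{5}{4}\right)\left(\tfrac{1}{2}\right) + (2)\left(\tfrac{1}{2}\right) + \left(\tfrac{13}{4}\right)\left(\tfrac{1}{2}\right) + (5)\left(\tfrac{1}{2}\right) = \frac{59}{8}, \]

showing that

\[ 4.875 \le A \le 7.375 \]

(see the figure on the right in Figure 4.1.3). Continuing in this manner, subdividing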
the interval [-1, 2] into smaller and smaller intervals, we would expect that we could
approximate A to any desired level of accuracy. Moreover, we would expect that the area
of the inscribed rectangles would increase toward A as the number of intervals increases,
and that the area of the circumscribed rectangles would decrease toward A. Put another
way, we might think of the area A as the unique number which is at once larger than the
area of any set of inscribed rectangles and smaller than the area of any set of circumscribed
rectangles. We will use this idea as the basis for our definition of the definite integral.
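
The bounds obtained above for one, two, three, and six intervals can be reproduced exactly with rational arithmetic. The function name `rectangle_sums` is an illustrative choice, and the way the heights are picked relies on a property specific to f(x) = x^2 + 1 (it decreases for x < 0 and increases for x > 0), so this sketch is not a general-purpose routine.

```python
from fractions import Fraction


def f(x):
    return x * x + 1


def rectangle_sums(n, a=Fraction(-1), b=Fraction(2)):
    """Return (inscribed, circumscribed) area sums for n equal intervals on [a, b]."""
    width = (b - a) / n
    lower = Fraction(0)
    upper = Fraction(0)
    for i in range(n):
        left = a + i * width
        right = a + (i + 1) * width
        # f decreases for x < 0 and increases for x > 0, so the maximum on each
        # interval is at an endpoint, and the minimum is f(0) when 0 is inside.
        high = max(f(left), f(right))
        low = f(Fraction(0)) if left <= 0 <= right else min(f(left), f(right))
        lower += low * width
        upper += high * width
    return lower, upper


for n in (1, 2, 3, 6):
    inscribed, circumscribed = rectangle_sums(n)
    print(n, inscribed, circumscribed)
```

Using `Fraction` keeps the results as exact fractions such as 27/8 and 21/2, so they can be compared directly with the hand computations.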

The definite integral
We now want to take the ideas of the previous example and develop a general procedure
which, when applied to the appropriate function, will yield the area of certain types of
regions in the plane. To do so, we require some preliminary terminology and notation.
    Let f be a function defined on an interval [a, b]. We will not require that f be positive
on [a, b], although it will be necessary to require f(x) ≥ 0 for all x in [a, b] in order to talk
about the area between the graph of f and the interval [a, b], as in the previous example.
However, we will assume that f is bounded on [a, b]; that is, we assume there exist numbers
m and M such that m ≤ f(x) ≤ M for all x in [a, b]. In particular, by the Extreme
Value Theorem, f is bounded if f is continuous on [a, b]. We will return to the problem of
unbounded functions in Section 4.7.
    We call a set P = {x_0, x_1, ... , x_n} a partition of the interval [a, b] if

\[ a = x_0 < x_1 < x_2 < \cdots < x_n = b. \]

Such a partition P divides [a, b] into n intervals, [x_{i-1}, x_i], of lengths

\[ \Delta x_i = x_i - x_{i-1}, \]

where i = 1, 2, 3, ... , n. For each such interval [x_{i-1}, x_i], let M_i be the smallest number such
that f(x) ≤ M_i for all x in [x_{i-1}, x_i] and let m_i be the largest number such that f(x) ≥ m_i
for all x in [x_{i-1}, x_i]. Note that if f is continuous on [a, b], then M_i is the maximum value
of f on [x_{i-1}, x_i] and m_i is the minimum value of f on [x_{i-1}, x_i], both of which are
guaranteed to exist by the Extreme Value Theorem. If f is not continuous, properties
of bounded sets of real numbers, alluded to in our discussion of bounded sequences in
Section 1.2, nevertheless guarantee the existence of the values M_i and m_i. Also, note
that if f(x) ≥ 0 for all x in [x_{i-1}, x_i], then, in the language of our previous example, the
rectangle with base [x_{i-1}, x_i] and height M_i is a circumscribed rectangle and the rectangle
with base [x_{i-1}, x_i] and height m_i is an inscribed rectangle.
    Now let

\[ U(f, P) = M_1 \Delta x_1 + M_2 \Delta x_2 + \cdots + M_n \Delta x_n = \sum_{i=1}^{n} M_i \Delta x_i, \tag{4.1.1} \]

the upper sum of f with respect to the partition P, and

\[ L(f, P) = m_1 \Delta x_1 + m_2 \Delta x_2 + \cdots + m_n \Delta x_n = \sum_{i=1}^{n} m_i \Delta x_i, \tag{4.1.2} \]

the lower sum of f with respect to the partition P. Note that we always have

\[ L(f, P) \le U(f, P). \tag{4.1.3} \]

Also, if f(x) ≥ 0 for all x in [a, b], then U(f, P) is the sum of the areas of the circumscribed
rectangles for the partition P and L(f, P) is the sum of the areas of the inscribed rectangles.
In that case, if A is the area beneath the graph of f and above the interval [a, b], we would
expect that we could make U(f, P) and L(f, P) arbitrarily close to A. This would imply
that A is the only number with the property that

\[ L(f, P) \le A \le U(f, P) \tag{4.1.4} \]

for all partitions P. This is the motivation for the following definition.
Definition Using the above notation, we say a function f is integrable on an interval
[a, b] if there exists a unique number I such that

\[ L(f, P) \le I \le U(f, P) \tag{4.1.5} \]

for all partitions P of [a, b]. If f is integrable on [a, b], we call I the definite integral of f
on [a, b], which we denote

\[ I = \int_a^b f(x)\,dx. \tag{4.1.6} \]

Example Consider again our example of finding the area of the region beneath the graph
of f(x) = x^2 + 1 and above the interval [-1, 2] on the x-axis. Let P_n denote the partition
using n + 1 equally spaced points (giving us n intervals of equal length). For example,

\[ P_2 = \{-1, 0.5, 2\} \]

and

\[ P_6 = \{-1, -0.5, 0, 0.5, 1, 1.5, 2\}. \]

Our work above shows that, in our current notation,

\[ U(f, P_6) = 7.375 \]

and

\[ L(f, P_6) = 4.875. \]

Using 100 intervals, and a computer to ease the computations, we find that

\[ U(f, P_{100}) = 6.075 \]

and

\[ L(f, P_{100}) = 5.925, \]

where the results have been rounded to three decimal places. This shows that if f is
integrable on [-1, 2], then

\[ 5.925 \le \int_{-1}^{2} (x^2 + 1)\,dx \le 6.075. \]
                                     -1

Of course, we expect f to be integrable, and for the value of the definite integral to be the
sought for area under the graph.
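
The text does not show the computer calculation behind the values for 100 intervals; the following is one possible sketch of it, not the author's program. The function name `upper_lower` is an illustrative assumption, and the choice of M_i and m_i again uses the fact that f(x) = x^2 + 1 decreases to the left of 0 and increases to the right.

```python
def upper_lower(n, a=-1.0, b=2.0):
    """Upper and lower sums of f(x) = x^2 + 1 on [a, b] over n equal intervals."""

    def f(x):
        return x * x + 1.0

    dx = (b - a) / n
    upper = 0.0
    lower = 0.0
    for i in range(1, n + 1):
        left = a + (i - 1) * dx
        right = a + i * dx
        # Maximum at an endpoint; minimum at 0 when the interval contains 0.
        big_m = max(f(left), f(right))
        small_m = f(0.0) if left <= 0.0 <= right else min(f(left), f(right))
        upper += big_m * dx
        lower += small_m * dx
    return upper, lower


for n in (6, 100, 1000):
    u, l = upper_lower(n)
    print(n, round(u, 3), round(l, 3), round(u - l, 5))
```

The last printed column is the gap between the upper and lower sums, which shrinks as the number of intervals grows.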
    It is not easy to verify directly from the definition that a given function is integrable on
some interval. However, it may be shown that any continuous function is integrable. The
reasons for this are rather technical, but we can give some feeling for why this should be
so. Suppose f is continuous on [a, b] and let P_n = {x_0, x_1, x_2, ... , x_n} denote the partition
of [a, b] using n + 1 equally spaced points. Let M_i and m_i be as defined above, and let

\[ \Delta x = \Delta x_i = \frac{b - a}{n} \]

be the length of the intervals [x_{i-1}, x_i], i = 1, 2, 3, ... , n. Given any number ε > 0, we
can choose n large enough (equivalently, Δx small enough) so that M_i - m_i < ε for
i = 1, 2, 3, ... , n. This fact is a consequence of the continuity of f on [a, b], although it
requires a deeper property of continuous functions on closed intervals known as uniform
continuity. We then have

\[ 0 \le U(f, P_n) - L(f, P_n) = \sum_{i=1}^{n} M_i \Delta x - \sum_{i=1}^{n} m_i \Delta x = \sum_{i=1}^{n} (M_i - m_i) \Delta x < \sum_{i=1}^{n} \epsilon \Delta x = n \epsilon \Delta x = \epsilon (b - a). \]

Since ε may be made arbitrarily small, it follows that the difference between upper sums
and lower sums may be made arbitrarily small, and hence that there must be only one
number which is between the upper and lower sums for all possible partitions.

Figure 4.1.4 Region beneath y = 3 over the interval [0, 5]

Proposition If f is continuous on [a, b], then f is integrable on [a, b].
Example     We now know that f(x) = x^2 + 1 is integrable on [-1, 2].

    Although our motivation for this section has been the computation of area, we have
not actually defined the term. We do so now for the special case we have been considering.

Definition Given an integrable function f with f(x) ≥ 0 for all x in an interval [a, b], let
R be the region in the plane bounded above by the curve y = f(x), below by the interval
[a, b] on the x-axis, and on the sides by the vertical lines x = a and x = b. Then we define
the area A of R to be

\[ A = \int_a^b f(x)\,dx. \tag{4.1.8} \]


Example Of course, we should verify that the above definition of area agrees with our
previous notion of area. For example, if f(x) = 3 for all x in [0, 5], then the region beneath
the graph of f and above the x-axis is a rectangle with base of length 5 and height of 3,
as shown in Figure 4.1.4. Hence we should have

\[ \int_0^5 3\,dx = 15. \]

To verify this, let P = {x_0, x_1, x_2, ... , x_n} be any partition of [0, 5]. Then on any interval
[x_{i-1}, x_i], i = 1, 2, 3, ... , n, the maximum value of f is M_i = 3 and the minimum value of
f is m_i = 3. Hence

\[ U(f, P) = L(f, P) = \sum_{i=1}^{n} 3 \Delta x_i = 3 \sum_{i=1}^{n} \Delta x_i = (3)(5) = 15, \]

where we have used the fact that the sum of the lengths of the partition intervals must
equal the length of the entire interval [0, 5]. Thus I = 15 is the only number satisfying

\[ L(f, P) \le I \le U(f, P) \]

for all partitions P, and so

\[ \int_0^5 3\,dx = 15, \]

as expected.
    Note that the previous example could be generalized to show that for any constant c
and any interval [a, b],

\[ \int_a^b c\,dx = c(b - a). \tag{4.1.9} \]

Figure 4.1.5 Region beneath y = 2x over the interval [0, 4]


Example     To verify another previously known area, consider the function f(x) = 2x on
the interval [0, 4]. Then the region beneath the graph of f and above the interval [0, 4] on
the x-axis is a triangle with base of length 4 and height 8, as shown in Figure 4.1.5. Thus
it has area

\[ \frac{1}{2}(4)(8) = 16, \]

and so we should have

\[ \int_0^4 2x\,dx = 16. \]

To verify this, let P = {x_0, x_1, x_2, ... , x_n} be a partition of [0, 4] and let m_i and M_i be the
minimum and maximum values, respectively, of f on [x_{i-1}, x_i], i = 1, 2, 3, ... , n. Since f
is an increasing function on [0, 4], we have m_i = f(x_{i-1}) and M_i = f(x_i). Thus

\[ L(f, P) = \sum_{i=1}^{n} f(x_{i-1}) \Delta x_i \]

and

\[ U(f, P) = \sum_{i=1}^{n} f(x_i) \Delta x_i. \]
We will now use a technique which will be useful in the proof of the Fundamental Theorem
of Calculus in Section 4.3. Let F(x) = x^2. Then F'(x) = 2x, so F is an antiderivative
of f. By the Mean Value Theorem, for every interval [x_{i-1}, x_i] there exists a point c_i in
[x_{i-1}, x_i] such that

\[ \frac{F(x_i) - F(x_{i-1})}{x_i - x_{i-1}} = F'(c_i) = f(c_i). \tag{4.1.10} \]

Now x_i - x_{i-1} = Δx_i, so from (4.1.10) we obtain

\[ f(c_i) \Delta x_i = F(x_i) - F(x_{i-1}). \tag{4.1.11} \]


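Moreover, f(x_{i-1}) ≤ f(c_i) ≤ f(x_i), so

\[ L(f, P) = \sum_{i=1}^{n} f(x_{i-1}) \Delta x_i \le \sum_{i=1}^{n} f(c_i) \Delta x_i \le \sum_{i=1}^{n} f(x_i) \Delta x_i = U(f, P). \tag{4.1.12} \]

But, using (4.1.11),

\begin{align*}
\sum_{i=1}^{n} f(c_i) \Delta x_i &= \sum_{i=1}^{n} (F(x_i) - F(x_{i-1})) \\
&= (F(x_1) - F(x_0)) + (F(x_2) - F(x_1)) + (F(x_3) - F(x_2)) + \cdots + (F(x_n) - F(x_{n-1})) \\
&= -F(x_0) + (F(x_1) - F(x_1)) + (F(x_2) - F(x_2)) + (F(x_3) - F(x_3)) + \cdots + (F(x_{n-1}) - F(x_{n-1})) + F(x_n) \\
&= F(x_n) - F(x_0).
\end{align*}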


Now xo = 0 and xn = 4, so

                    F(xn) - F(xo) = F(4) - F(0) = 16 - 0 = 16.

It now follows from (4.1.11) that, for any partition P,


L(fP)  <16  <U(fP).


(4.1.13)


Since we know that $f$ is integrable on $[0, 4]$ (it is continuous on $[0, 4]$), the definite integral
of $f$ is the only number which satisfies the inequalities in (4.1.13) for any partition $P$.
Hence we must have

$$\int_0^4 2x\,dx = 16,$$

in agreement with our geometric argument above.
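The squeeze $L(f, P) \le 16 \le U(f, P)$ can also be checked numerically. The following is a minimal sketch, not part of the original text: the helper name `lower_upper_sums` and the randomly generated partition are illustrative assumptions, and the helper relies on $f$ being increasing so that the minimum and maximum on each interval occur at the endpoints.

```python
import random


def lower_upper_sums(f, partition):
    """Return (L, U) for an increasing function f over the given partition.

    Because f is increasing, its minimum on [x_{i-1}, x_i] is f(x_{i-1})
    and its maximum is f(x_i).
    """
    lower = 0.0
    upper = 0.0
    for left, right in zip(partition, partition[1:]):
        width = right - left  # Delta x_i
        lower += f(left) * width
        upper += f(right) * width
    return lower, upper


def f(x):
    return 2 * x


# Four equal intervals: the partition {0, 1, 2, 3, 4}.
even_lower, even_upper = lower_upper_sums(f, [0, 1, 2, 3, 4])
print(even_lower, even_upper)  # 12.0 20.0

# A hypothetical, unevenly spaced partition of [0, 4] (fixed seed for repeatability).
random.seed(0)
interior = sorted(random.uniform(0, 4) for _ in range(20))
rand_lower, rand_upper = lower_upper_sums(f, [0.0] + interior + [4.0])
print(rand_lower <= 16 <= rand_upper)  # True
```

Whatever partition is used, 16 sits between the two sums, which is exactly what (4.1.13) asserts.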


﻿


Figure 4.1.6  Region beneath $y = \sqrt{4 - x^2}$ over the interval $[0, 2]$


    As we proceed with our study of integration we shall from time to time have occasion
to verify that areas computed using a definite integral are consistent with areas computed
by other geometric means. At the same time, we shall take this consistency as a given.
For example, we shall accept that

$$\int_0^2 \sqrt{4 - x^2}\,dx = \pi,$$

since the region beneath the curve $y = \sqrt{4 - x^2}$ and above the interval $[0, 2]$ is one-quarter
of a circle of radius 2 centered at the origin, as shown in Figure 4.1.6.

Example It is important to realize that not all bounded functions are integrable. As an
example, consider the function


$$f(x) = \begin{cases} 1, & \text{if } x \text{ is a rational number}, \\ 0, & \text{if } x \text{ is an irrational number}. \end{cases}$$


For example, $f(0.12345) = 1$ and $f(\sqrt{2}) = 0$. Let $P = \{x_0, x_1, x_2, \ldots, x_n\}$ be a partition
of $[0, 1]$. Since every interval $[x_{i-1}, x_i]$, $i = 1, 2, 3, \ldots, n$, contains both rational and
irrational numbers, the minimum value of $f$ on $[x_{i-1}, x_i]$ is $m_i = 0$ and the maximum value
of $f$ on $[x_{i-1}, x_i]$ is $M_i = 1$. Thus

$$L(f, P) = \sum_{i=1}^{n} m_i\,\Delta x_i = 0$$

and

$$U(f, P) = \sum_{i=1}^{n} M_i\,\Delta x_i = \sum_{i=1}^{n} \Delta x_i = 1.$$
Hence any number between 0 and 1 lies between L(f, P) and U(f, P) for any partition P.
Since there is not a unique such number, we conclude that f is not integrable on [0, 1].
    Computing a definite integral directly from the definition is usually a daunting task.
We shall take a first look at approximating definite integrals in this section, and then refine
these techniques in Section 4.2. In Section 4.3 we will look at the Fundamental Theorem
of Calculus, a result which will, in certain cases, allow us to compute definite integrals exactly
with relative ease.

Riemann sums
Again let $f$ be a function defined on an interval $[a, b]$ and let $P = \{x_0, x_1, x_2, \ldots, x_n\}$ be
a partition of $[a, b]$. Recall that in the definition of the upper and lower sums, $M_i$ and
$m_i$ are chosen, in part, so that $m_i \le f(x) \le M_i$ for all $x$ in $[x_{i-1}, x_i]$, $i = 1, 2, 3, \ldots, n$.
It follows that if we choose values $c_1, c_2, c_3, \ldots, c_n$ so that $c_i$ is in the $i$th interval of the
partition (that is, $x_{i-1} \le c_i \le x_i$), then

$$m_i \le f(c_i) \le M_i \tag{4.1.14}$$

for $i = 1, 2, 3, \ldots, n$, and so

$$L(f, P) = \sum_{i=1}^{n} m_i\,\Delta x_i \le \sum_{i=1}^{n} f(c_i)\,\Delta x_i \le \sum_{i=1}^{n} M_i\,\Delta x_i = U(f, P). \tag{4.1.15}$$

If $f$ is integrable, it may be shown that it is always possible to choose a partition $P$ so that

$$|U(f, P) - L(f, P)| < \epsilon \tag{4.1.16}$$

for any specified $\epsilon > 0$. It follows that, when $f$ is integrable, it is always possible to find,
for any given $\epsilon > 0$, partitions for which

$$\left| \int_a^b f(x)\,dx - \sum_{i=1}^{n} f(c_i)\,\Delta x_i \right| < \epsilon \tag{4.1.17}$$

for any choice of the points $c_1, c_2, c_3, \ldots, c_n$. In fact, it may be shown that if we let $L$ be
the maximum length of the intervals $[x_{i-1}, x_i]$, then it is possible to find a $\delta > 0$ such that
(4.1.17) will hold for any partition with $L < \delta$.
    The sum

$$\sum_{i=1}^{n} f(c_i)\,\Delta x_i$$

is called a Riemann sum, after the German mathematician G. B. F. Riemann (1826-1866).
From what we have just seen, we may use Riemann sums to approximate definite
integrals. We will consider two important special cases of Riemann sums here (we will
look at another in Section 4.2). First, to make calculations simpler, we will restrict to
partitions with intervals of equal length. As above, let $P_n = \{x_0, x_1, x_2, \ldots, x_n\}$ be the
partition of $[a, b]$ using $n + 1$ equally spaced points and let

$$\Delta x = \frac{b - a}{n}$$

be the length of the intervals $[x_{i-1}, x_i]$, $i = 1, 2, 3, \ldots, n$. Note that

$$\lim_{n \to \infty} \Delta x = 0.$$

Hence, if we choose points $c_1, c_2, c_3, \ldots, c_n$ with $x_{i-1} \le c_i \le x_i$, then we have, for an
integrable $f$,

$$\int_a^b f(x)\,dx = \lim_{n \to \infty} \sum_{i=1}^{n} f(c_i)\,\Delta x. \tag{4.1.19}$$

In other words, we may approximate the definite integral $\int_a^b f(x)\,dx$ to any desired level of
accuracy using Riemann sums

$$\sum_{i=1}^{n} f(c_i)\,\Delta x \tag{4.1.20}$$

with sufficiently large $n$. To do this efficiently requires specifying how the points $c_1, c_2, c_3,
\ldots, c_n$ are to be chosen. One method is to simply choose $c_i$ to be the right-hand endpoint
of the interval $[x_{i-1}, x_i]$. In that case, since the points in the partition are equally spaced,
we have

$$\begin{aligned}
c_1 &= x_1 = x_0 + \Delta x = a + \Delta x, \\
c_2 &= x_2 = x_1 + \Delta x = a + 2\Delta x, \\
c_3 &= x_3 = x_2 + \Delta x = a + 3\Delta x, \\
&\;\;\vdots \\
c_n &= x_n = x_{n-1} + \Delta x = a + n\Delta x.
\end{aligned} \tag{4.1.21}$$

Using these points in (4.1.20), we have

$$\sum_{i=1}^{n} f(c_i)\,\Delta x = \Delta x \sum_{i=1}^{n} f(a + i\Delta x). \tag{4.1.22}$$

This approximation is known as the right-hand rule approximation for $\int_a^b f(x)\,dx$.


﻿


Definition  If $f$ is integrable on $[a, b]$, the right-hand rule approximation for the definite
integral

$$\int_a^b f(x)\,dx$$

using $n$ intervals is given by

$$A_R = \Delta x \sum_{i=1}^{n} f(a + i\Delta x), \tag{4.1.23}$$

where

$$\Delta x = \frac{b - a}{n}.$$

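    A similar rule is derived by using the left-hand endpoints of the intervals. In this case
we choose

$$\begin{aligned}
c_1 &= x_0 = a, \\
c_2 &= x_1 = x_0 + \Delta x = a + \Delta x, \\
c_3 &= x_2 = x_1 + \Delta x = a + 2\Delta x, \\
&\;\;\vdots \\
c_n &= x_{n-1} = x_{n-2} + \Delta x = a + (n - 1)\Delta x.
\end{aligned} \tag{4.1.24}$$

Definition  If $f$ is integrable on $[a, b]$, the left-hand rule approximation for the definite
integral

$$\int_a^b f(x)\,dx$$

using $n$ intervals is given by

$$A_L = \Delta x \sum_{i=0}^{n-1} f(a + i\Delta x), \tag{4.1.25}$$

where

$$\Delta x = \frac{b - a}{n}.$$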
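The two definitions translate almost directly into code. The sketch below is illustrative only: the function names `left_rule` and `right_rule` are assumptions, not notation from the text, and the test function $f(x) = 2x$ on $[0, 4]$ is borrowed from the earlier example, where the exact integral is 16.

```python
def left_rule(f, a, b, n):
    """Left-hand rule: dx times the sum of f(a + i*dx) for i = 0, ..., n-1."""
    dx = (b - a) / n
    return dx * sum(f(a + i * dx) for i in range(n))


def right_rule(f, a, b, n):
    """Right-hand rule: dx times the sum of f(a + i*dx) for i = 1, ..., n."""
    dx = (b - a) / n
    return dx * sum(f(a + i * dx) for i in range(1, n + 1))


def f(x):
    return 2 * x


print(left_rule(f, 0, 4, 4))   # 12.0
print(right_rule(f, 0, 4, 4))  # 20.0

# With more intervals both approximations close in on the exact value 16.
left_1000 = left_rule(f, 0, 4, 1000)
right_1000 = right_rule(f, 0, 4, 1000)
print(left_1000, right_1000)
```

The only difference between the two functions is the index range, which mirrors the shift from $i = 0, \ldots, n-1$ to $i = 1, \ldots, n$ in the two definitions.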


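Example  Returning to our first example, suppose $f(x) = x^2 + 1$ and let $A$ be the area
of the region beneath the graph of $f$ and above the interval $[-1, 2]$. With $n = 6$, we have

$$\Delta x = \frac{2 - (-1)}{6} = \frac{1}{2}$$

and the left-hand rule approximation for $A$ is

$$\begin{aligned}
A_L &= \frac{1}{2} \sum_{i=0}^{5} f\!\left(-1 + \frac{i}{2}\right) \\
&= \frac{1}{2}\left( f(-1) + f\!\left(-\tfrac{1}{2}\right) + f(0) + f\!\left(\tfrac{1}{2}\right) + f(1) + f\!\left(\tfrac{3}{2}\right) \right) \\
&= \frac{1}{2}\left( 2 + \frac{5}{4} + 1 + \frac{5}{4} + 2 + \frac{13}{4} \right) \\
&= \frac{1}{2}\left( \frac{43}{4} \right) \\
&= \frac{43}{8} = 5.375.
\end{aligned}$$

See the figure on the left in Figure 4.1.7. Similarly, the right-hand rule approximation is

$$\begin{aligned}
A_R &= \frac{1}{2} \sum_{i=1}^{6} f\!\left(-1 + \frac{i}{2}\right) \\
&= \frac{1}{2}\left( f\!\left(-\tfrac{1}{2}\right) + f(0) + f\!\left(\tfrac{1}{2}\right) + f(1) + f\!\left(\tfrac{3}{2}\right) + f(2) \right) \\
&= \frac{1}{2}\left( \frac{5}{4} + 1 + \frac{5}{4} + 2 + \frac{13}{4} + 5 \right) \\
&= \frac{1}{2}\left( \frac{55}{4} \right) \\
&= \frac{55}{8} = 6.875.
\end{aligned}$$

Figure 4.1.7  Left-hand and right-hand rule approximations for $\int_{-1}^{2} (x^2 + 1)\,dx$
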
See the figure on the right in Figure 4.1.7. Recall that, for a partition of 6 intervals of
equal length, we computed a lower sum of 4.875 and an upper sum of 7.375. Hence, as we
would expect for any Riemann sums, AL and AR lie between the lower and upper sums.
    Using n = 100 and a computer, we find AL = 5.955 and AR = 6.045, which again lie
between the lower sum of 5.925 and the upper sum of 6.075.
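
The values quoted in this example can be reproduced with a few lines of code. This is a sketch under stated assumptions: the function names are illustrative, and exact rational arithmetic from Python's standard `fractions` module is used (a choice made here, not in the text) so that the $n = 6$ results come out as exact fractions.

```python
from fractions import Fraction


def left_rule(f, a, b, n):
    """Left-hand rule with exact rational arithmetic (a, b, n are integers)."""
    dx = Fraction(b - a, n)
    return dx * sum(f(a + i * dx) for i in range(n))


def right_rule(f, a, b, n):
    """Right-hand rule with exact rational arithmetic (a, b, n are integers)."""
    dx = Fraction(b - a, n)
    return dx * sum(f(a + i * dx) for i in range(1, n + 1))


def f(x):
    return x * x + 1


left_6 = left_rule(f, -1, 2, 6)
right_6 = right_rule(f, -1, 2, 6)
print(left_6, right_6)                # 43/8 55/8
print(float(left_6), float(right_6))  # 5.375 6.875

left_100 = left_rule(f, -1, 2, 100)
right_100 = right_rule(f, -1, 2, 100)
print(float(left_100), float(right_100))
```

Rounded to three decimal places, the $n = 100$ values agree with the 5.955 and 6.045 reported above.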
Example  Now let $A$ be the area of the region beneath the graph of

$$g(t) = \frac{1}{t}$$

over the interval $[1, 10]$, as shown in Figure 4.1.8. Then

$$A = \int_1^{10} \frac{1}{t}\,dt.$$
In Section 6.2 we will see that this integral is equal to the natural logarithm of 10, which, to
6 decimal places, is 2.302585. The following table summarizes the left-hand and right-hand
rule approximations for A:
             n            A_R         A_L       |A - A_R|   |A - A_L|
             10         1.960214    2.770214    0.342371    0.467629
             20         2.116477    2.521477    0.186108    0.218892
             40         2.205491    2.407991    0.097094    0.105406
             80         2.253003    2.354253    0.049582    0.051668
             160        2.277534    2.328159    0.025052    0.025574
             320        2.289994    2.315307    0.012591    0.012722


Figure 4.1.8  Region beneath $y = \frac{1}{t}$ over the interval $[1, 10]$



As we should expect, the error in our approximations decreases as the number of subdi-
visions increases. What is more interesting to note is that, in this particular case, when
the number of subdivisions is doubled, the error committed by both the right-hand and
the left-hand rules decreases by a factor of, roughly, $\frac{1}{2}$. For example, this might lead us to
predict that the error in using 640 intervals would be about 0.0063; in fact, it turns out
to be 0.006312 for the right-hand rule and 0.006344 for the left-hand rule. This type of
behavior is typical for this method of approximation, a point we will come back to when
we investigate other methods of approximation in Section 4.2.
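
The halving of the error can be observed directly. The sketch below recomputes the errors for $g(t) = 1/t$ on $[1, 10]$ and prints the ratio of successive errors; the function names are illustrative assumptions, and `math.log(10)` stands in for the exact value of the integral, which the text quotes as the natural logarithm of 10.

```python
import math


def left_rule(f, a, b, n):
    dx = (b - a) / n
    return dx * sum(f(a + i * dx) for i in range(n))


def right_rule(f, a, b, n):
    dx = (b - a) / n
    return dx * sum(f(a + i * dx) for i in range(1, n + 1))


def g(t):
    return 1.0 / t


exact = math.log(10)  # natural logarithm of 10
sizes = (10, 20, 40, 80, 160, 320, 640)

# errors[n] = (|A - A_R|, |A - A_L|)
errors = {}
for n in sizes:
    err_right = abs(exact - right_rule(g, 1, 10, n))
    err_left = abs(exact - left_rule(g, 1, 10, n))
    errors[n] = (err_right, err_left)
    print(f"{n:4d}  {err_right:.6f}  {err_left:.6f}")

# Ratio of the error with 2n intervals to the error with n intervals.
ratios = []
for n in sizes[:-1]:
    ratio_right = errors[2 * n][0] / errors[n][0]
    ratio_left = errors[2 * n][1] / errors[n][1]
    ratios.append((ratio_right, ratio_left))
    print(f"{n:4d} -> {2 * n:4d}  {ratio_right:.3f}  {ratio_left:.3f}")
```

The ratios drift toward 0.5 from above for the right-hand rule and from below for the left-hand rule, consistent with the table.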

Properties of the definite integral
Since the integral of an integrable function may be computed as the limit of Riemann
sums, the basic properties of limits and sums hold true for integrals as well. In particular,
if f and g are integrable on [a, b] and k is any constant, then

$$\int_a^b (f(x) + g(x))\,dx = \int_a^b f(x)\,dx + \int_a^b g(x)\,dx, \tag{4.1.26}$$

$$\int_a^b (f(x) - g(x))\,dx = \int_a^b f(x)\,dx - \int_a^b g(x)\,dx, \tag{4.1.27}$$

and

$$\int_a^b k f(x)\,dx = k \int_a^b f(x)\,dx. \tag{4.1.28}$$

Example  We know that

$$\int_0^3 x\,dx = \frac{9}{2}$$

(since the region under the graph of $y = x$ over the interval $[0, 3]$ is a triangle with base of
length 3 and a height of 3) and

$$\int_0^3 4\,dx = 12$$

(either using (4.1.9) or the fact that the region under the graph of $y = 4$ is a rectangle
with base of length 3 and a height of 4), so it follows from (4.1.26) that

$$\int_0^3 (x + 4)\,dx = \int_0^3 x\,dx + \int_0^3 4\,dx = \frac{9}{2} + 12 = \frac{33}{2}.$$


Example  The graph of $g(t) = \sqrt{1 - t^2}$ over the interval $[-1, 1]$ is a semicircle of radius
1 centered at the origin, so

$$\int_{-1}^{1} 5\sqrt{1 - t^2}\,dt = 5 \int_{-1}^{1} \sqrt{1 - t^2}\,dt = \frac{5\pi}{2}.$$


﻿


Figure 4.1.9  $\int_a^b f(x)\,dx = \int_a^c f(x)\,dx + \int_c^b f(x)\,dx$


    Now suppose f is integrable on [a, b] and c is a point with a < c < b. It may be shown
that $f$ is integrable on both $[a, c]$ and $[c, b]$. Moreover, using partitions which include $c$,
we may write a Riemann sum for $f$ over $[a, b]$ as the sum of two Riemann sums, the first
over the interval $[a, c]$ and the second over the interval $[c, b]$. After taking limits, it follows
that

$$\int_a^b f(x)\,dx = \int_a^c f(x)\,dx + \int_c^b f(x)\,dx. \tag{4.1.29}$$


If f (x) > 0 for all x in [a, b], we may think of (4.1.29) as saying that the area under the
graph of f over the interval [a, b] is equal to the area under the graph of f over the interval
[a, c] plus the area under the graph of f over the interval [c, b]. See Figure 4.1.9.
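Example  Suppose

$$f(x) = \begin{cases} x, & \text{if } 0 \le x \le 1, \\ 3, & \text{if } 1 < x \le 2. \end{cases}$$

The region under the graph of $f$ is shown in Figure 4.1.10. Now

$$\int_0^2 f(x)\,dx = \int_0^1 f(x)\,dx + \int_1^2 f(x)\,dx = \int_0^1 x\,dx + \int_1^2 3\,dx = \frac{1}{2} + 3 = \frac{7}{2}.$$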


    Technically, before applying (4.1.29) in the previous example we should have verified
that f is integrable on [0, 2]. Since f is not continuous on [0, 2], its integrability does not
follow from our previous results. However, f is an example of what is known as a piecewise
continuous function, which we will now define.


﻿


Figure 4.1.10  $\int_0^2 f(x)\,dx = \int_0^1 f(x)\,dx + \int_1^2 f(x)\,dx$

Definition  A function $f$ is said to be piecewise continuous on an interval $[a, b]$ if there is a
partition $P = \{x_0, x_1, x_2, \ldots, x_n\}$ of $[a, b]$ such that $f$ is continuous on each open interval
$(x_{i-1}, x_i)$, $i = 1, 2, 3, \ldots, n$; has limits from both the right and the left at each partition
point $x_i$, $i = 1, 2, 3, \ldots, n - 1$; and has a right-hand limit at $a$ and a left-hand limit at $b$.


Proposition If f is piecewise continuous on [a, b], then f is integrable on [a, b].

Example     The function f in the previous example is piecewise continuous on [0, 2], and
hence integrable on [0, 2] by the previous proposition.

    Now suppose $f$ and $g$ are both integrable on $[a, b]$ and $f(x) \le g(x)$ for all $x$ in $[a, b]$.
It follows that for any given partition $P$, the upper sum of $g$ will be greater than or equal
to the corresponding upper sum of $f$. Since the definite integral is the largest number less
than or equal to the value of any upper sum, it follows that

$$\int_a^b f(x)\,dx \le \int_a^b g(x)\,dx. \tag{4.1.30}$$


Example  Since $0 \le x^2 \le x$ for all $x$ in $[0, 1]$, we have

$$\int_0^1 0\,dx \le \int_0^1 x^2\,dx \le \int_0^1 x\,dx.$$

Now

$$\int_0^1 0\,dx = 0(1 - 0) = 0$$

and

$$\int_0^1 x\,dx = \frac{1}{2},$$

so it follows that

$$0 \le \int_0^1 x^2\,dx \le \frac{1}{2}.$$

See Figure 4.1.11.

Figure 4.1.11  $0 \le \int_0^1 x^2\,dx \le \int_0^1 x\,dx$

Geometric interpretations
The original motivation for this section was the problem of finding the area of a region in
the plane. Given an integrable function $f$ with $f(x) \ge 0$ for all $x$ in an interval $[a, b]$, we
eventually defined the area of the region beneath the graph of $f$ and above the interval
$[a, b]$ to be $\int_a^b f(x)\,dx$. Now suppose $f(x) \le 0$ for all $x$ in $[a, b]$ and let $R$ be the region
between the graph of $f$ and the interval $[a, b]$. If $S$ is the region beneath the graph of
$y = -f(x)$ and above $[a, b]$, then we have

$$\text{area of } R = \text{area of } S = \int_a^b -f(x)\,dx = -\int_a^b f(x)\,dx. \tag{4.1.31}$$

Hence, in this case, $\int_a^b f(x)\,dx$ is not the area of the region $R$, but rather

$$\int_a^b f(x)\,dx = -(\text{area of } R). \tag{4.1.32}$$


﻿


Example  If $f(x) = x - 1$, then $f(x) \le 0$ for all $x$ in $[0, 1]$. Since the region between the
graph of $f$ and the interval $[0, 1]$ on the x-axis is a triangle with area $\frac{1}{2}$, we must have

$$\int_0^1 (x - 1)\,dx = -\frac{1}{2}.$$

See Figure 4.1.12.

Figure 4.1.12  Region between the graph of $y = x - 1$ and the interval $[0, 1]$

    More generally, we may think of $\int_a^b f(x)\,dx$ as representing the difference of the area
of any regions between the graph of $f$ and the x-axis which lie above the x-axis and the
area of those regions which lie below the x-axis. For example, we have

$$\int_{-\pi}^{\pi} \sin(x)\,dx = 0$$

because the area of the region beneath the graph of $y = \sin(x)$ over the interval $[0, \pi]$ is
negated by the area of the region between the graph of $y = \sin(x)$ and the interval $[-\pi, 0]$,
as shown in Figure 4.1.13.

Figure 4.1.13  Area above the x-axis cancels area beneath the x-axis
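
The cancellation can be seen numerically by approximating the integral of $\sin(x)$ over each half of $[-\pi, \pi]$ separately. This is a small illustrative sketch; the right-hand rule helper and the choice of 1000 intervals per half are assumptions made for the demonstration.

```python
import math


def right_rule(f, a, b, n):
    """Right-hand rule approximation using n intervals of equal length."""
    dx = (b - a) / n
    return dx * sum(f(a + i * dx) for i in range(1, n + 1))


# Contribution from the region above the x-axis, over [0, pi].
above = right_rule(math.sin, 0.0, math.pi, 1000)

# Contribution from the region below the x-axis, over [-pi, 0].
below = right_rule(math.sin, -math.pi, 0.0, 1000)

# Approximation over the whole interval [-pi, pi].
total = right_rule(math.sin, -math.pi, math.pi, 2000)

print(above, below, total)
```

The first value is positive, the second is its negative up to rounding, and the third is zero up to rounding, matching the geometric picture of one region cancelling the other.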

Problems

1. For each of the following, find upper and lower bounds for the area of the region
    beneath the given curve over the given interval using four inscribed rectangles and
    four circumscribed rectangles.


   (a) y = - on [1, 5]
           x
   (c) y = x2 + 1 on [-
2. Find the upper and
   equal intervals.

   (a)    3xdx
       41

   (c) cos(x)dx
         1
   (e)    (x3 - x)dx
       l0


2, 2]


(b) y = x2 on [0, 4]

(d) y = sin(x) on [0, 7r]


lower sums for the following integrals using a partition with six

                             4
                       (b) J x2dx
                            -2
                            2
                       (d )   (4 - x2)dx
                            -2
                            1
                       ( f)   sin(27t dt
                            0


3. For each of the following, approximate the area beneath the graph of the function over
   the given interval using the right-hand and left-hand rules with four intervals.
   (a) f(x) = x2 on [0, 4]                   (b) f(x) = x2 on [-2, 2]


           1
(c) g(t) = - on [1, 9]
          t
(e) h~x) = x3 on [0, 1]


           1
(d) g(t) = - on [1, 2]
           t
(f) f (t) =1 -t2 on [-1, 1]


4. For each of the following, approximate the area beneath the graph of the function over
   the given interval using the right-hand and left-hand rules with 100 intervals.


(a) f(x)
(c) f(t)


: x2 on [0, 1]
ts on [0, 2]


(b) g(x) = sin(x) on [0, 7r]
(d) g(z) = z2 on [-2, 2]


﻿




           1
(e) h (x) =- on [1, 2]
           x
(g) h (O)=sec (O) on[i,]


(f) f(x) =   1-x2 on [-1,1]

(h) g(t) = sin(2t) on L0, 2]


5. Use the right-hand and left-hand rules with four intervals to approximate the following
   definite integrals.


      2
(a)    x2dx


(c) jcos(t)dt

      1
(e) f(x2 - 1)dz


      3
(b)     - dx
     2

(d) js3ds
     -3

(f) fsin(z)dz


6. Use the right-hand and left-hand rules with 100 intervals to approximate the following
   definite integrals.


      3
(a)    x2dx
    0n
      2
(c) f     4 - t2 dt
     -2
     1
(e) f1(x2 - 1)dx

      -1d
(g)           dx


      2
(b) jx3dx

     /27r
(d) f   sin(x)dx
     0

(f)    sin(36)d6
     0


      -2
(h)
      _4


1
- dt
t


7. Use geometric arguments to determine the value of each of the following definite inte-
   grals.


        4
   (a) fxdx

        3
   (c) j    9 - x2 dx
        2
   (e)    x2dzo

8. Suppose


      3
(b)    (2x + 3)dx

     /2
(d) f4     4 -t2 dt
     -2

  (f)   sin(t)dt
     0


S+1,
4,


if 0 <x < 1,
if 1 <x < 3.


Combine geometric arguments with properties of definite integrals to determine the
value of the following definite integrals.


(a) 1f(x)dx
     0


      3
(b)    f(x)dx
     i


﻿




          3                                           2
    (c)j   f(x)dx                              (d)     f(x)dx

 9. The definition of $\int_a^b f(x)\,dx$ assumes $a < b$.

    (a) Explain why it would be reasonable to define

        $$\int_a^a f(x)\,dx = 0.$$

    (b) Explain why it would be reasonable to define

        $$\int_a^b f(x)\,dx = -\int_b^a f(x)\,dx$$

        whenever $a > b$.
    (c) Using the definitions given in (a) and (b), and assuming that $f$ is integrable on
        the appropriate intervals, show that

        $$\int_a^b f(x)\,dx = \int_a^c f(x)\,dx + \int_c^b f(x)\,dx$$

        whether $a < c < b$, $a < b < c$, $b < a < c$, $b < c < a$, $c < a < b$, or $c < b < a$. Note
        that this generalizes (4.1.29).

10. Suppose $f$ is integrable on $[a, b]$ and $m$ and $M$ are constants such that $m \le f(x) \le M$
    for all $x$ in $[a, b]$. Show that

    $$m(b - a) \le \int_a^b f(x)\,dx \le M(b - a).$$

11. Given that $f$ is integrable on $[a, b]$, it may be shown that $g(x) = |f(x)|$ is also integrable
    on $[a, b]$. Show that

    $$\left| \int_a^b f(x)\,dx \right| \le \int_a^b |f(x)|\,dx.$$

    Hint: Use the fact that $-|f(x)| \le f(x) \le |f(x)|$ for all $x$ in $[a, b]$.

12. In this section we showed that

    $$\int_0^4 2x\,dx = 16$$

    directly from the definition of the definite integral (with some help from the Mean
    Value Theorem).


﻿




(a) Use these ideas to show that


(b) More generally, show that


(c) Let


xdx


1
2


  b

I


xdx


                                 F(x) =     tdt.

What is the relationship between F and the function f(x) = x?


13. In this section we showed that

    $$\int_0^4 2x\,dx = 16$$

    directly from the definition of the definite integral (with some help from the Mean
    Value Theorem). Use these ideas to show that


1 l~d


1


﻿


Section 4.2

Numerical Approximations of Definite Integrals


Computing a definite integral of a function f over an interval [a, b] using upper and lower
sums, or even as the limit of Riemann sums, is, for all but the simplest cases, a difficult task.
As a result, definite integrals are almost never computed in that manner. For the most
part, definite integrals are evaluated either using the Fundamental Theorem of Calculus
or using numerical approximation techniques. We will take up the Fundamental Theorem
of Calculus approach in the next section; in this section we consider several methods for
numerical approximation.

The left-hand and right-hand rules
Recall that for an integrable function $f$ on an interval $[a, b]$, the left-hand rule approximation
for $\int_a^b f(x)\,dx$, using $n$ intervals, is given by

$$A_L = h \sum_{i=0}^{n-1} f(a + ih) \tag{4.2.1}$$

and the right-hand rule approximation by

$$A_R = h \sum_{i=1}^{n} f(a + ih), \tag{4.2.2}$$

where

$$h = \frac{b - a}{n}.$$
    We now look at the accuracy of these approximations. Let $x_i = a + ih$, $i = 0, 1, 2, \ldots, n$, be
the endpoints for a partition of $[a, b]$ using $n$ intervals of equal length $h$. Assume $f$ is
continuous on $[a, b]$ and differentiable on $(a, b)$ and that $x$ is a point in the $i$th interval,
that is, $x_{i-1} < x \le x_i$. Then the Mean Value Theorem tells us that there exists a point $c_i$
in the interval $(x_{i-1}, x)$ such that

$$f'(c_i) = \frac{f(x) - f(x_{i-1})}{x - x_{i-1}}. \tag{4.2.3}$$

Solving for $f(x)$ in (4.2.3), we have

$$f(x) = f(x_{i-1}) + f'(c_i)(x - x_{i-1}). \tag{4.2.4}$$


1


Copyright Q by Dan Sloughter 2000


﻿


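Integrating both sides of (4.2.4) over the interval $[x_{i-1}, x_i]$ gives us

$$\begin{aligned}
\int_{x_{i-1}}^{x_i} f(x)\,dx &= \int_{x_{i-1}}^{x_i} f(x_{i-1})\,dx + \int_{x_{i-1}}^{x_i} f'(c_i)(x - x_{i-1})\,dx \\
&= f(x_{i-1})(x_i - x_{i-1}) + \int_{x_{i-1}}^{x_i} f'(c_i)(x - x_{i-1})\,dx \\
&= f(x_{i-1})h + \int_{x_{i-1}}^{x_i} f'(c_i)(x - x_{i-1})\,dx,
\end{aligned} \tag{4.2.5}$$

where we have used the fact that the integral of a constant equals the constant multiplied
by the length of the interval. Hence

$$\begin{aligned}
\int_a^b f(x)\,dx &= \sum_{i=1}^{n} \int_{x_{i-1}}^{x_i} f(x)\,dx \\
&= \sum_{i=1}^{n} f(x_{i-1})h + \sum_{i=1}^{n} \int_{x_{i-1}}^{x_i} f'(c_i)(x - x_{i-1})\,dx \\
&= A_L + \sum_{i=1}^{n} \int_{x_{i-1}}^{x_i} f'(c_i)(x - x_{i-1})\,dx.
\end{aligned} \tag{4.2.6}$$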

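Thus we have

$$\left| \int_a^b f(x)\,dx - A_L \right| = \left| \sum_{i=1}^{n} \int_{x_{i-1}}^{x_i} f'(c_i)(x - x_{i-1})\,dx \right| \le \sum_{i=1}^{n} \left| \int_{x_{i-1}}^{x_i} f'(c_i)(x - x_{i-1})\,dx \right|. \tag{4.2.7}$$

Now

$$\left| \int_{x_{i-1}}^{x_i} f'(c_i)(x - x_{i-1})\,dx \right| \le \int_{x_{i-1}}^{x_i} |f'(c_i)(x - x_{i-1})|\,dx \tag{4.2.8}$$

(see Problem 11 in Section 4.1), so

$$\left| \int_a^b f(x)\,dx - A_L \right| \le \sum_{i=1}^{n} \int_{x_{i-1}}^{x_i} |f'(c_i)|(x - x_{i-1})\,dx \le \sum_{i=1}^{n} \int_{x_{i-1}}^{x_i} M(x - x_{i-1})\,dx, \tag{4.2.9}$$

where the last inequality follows from the fact that $|f'(c_i)| \le M$ for all $i$, $M$ being a
constant with $|f'(x)| \le M$ for all $x$ in $(a, b)$.

Figure 4.2.1  Graph of $y = x - x_{i-1}$ over the interval $[x_{i-1}, x_i]$

Now

$$\int_{x_{i-1}}^{x_i} (x - x_{i-1})\,dx = \frac{h^2}{2}, \tag{4.2.10}$$

since the region beneath the graph of $y = x - x_{i-1}$ over the interval $[x_{i-1}, x_i]$ is a triangle
with base and height both of length $h = x_i - x_{i-1}$ (see Figure 4.2.1). Substituting (4.2.10)
into (4.2.9), and recalling that $h = \frac{b-a}{n}$, we have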

$$\left| \int_a^b f(x)\,dx - A_L \right| \le \sum_{i=1}^{n} \frac{Mh^2}{2} = \frac{nMh^2}{2} = \frac{nM(b - a)^2}{2n^2} = \frac{M(b - a)^2}{2n}. \tag{4.2.11}$$

In other words, the absolute value of the error of the left-hand rule approximation is
bounded by a constant multiplied by $\frac{1}{n}$. This explains the behavior of the example in
Section 4.1 where we saw that doubling the number of intervals would decrease the error
by a factor of $\frac{1}{2}$. The same techniques yield a similar result for the right-hand rule.
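
As a sanity check on the bound $M(b - a)^2/(2n)$, the sketch below compares it with the actual left-hand rule error for $f(t) = 1/t$ on $[1, 10]$, the example from Section 4.1. The choice $M = 1$ is an assumption justified by $|f'(t)| = 1/t^2 \le 1$ on this interval; the function names are illustrative, and `math.log(10)` is used as the exact value of the integral.

```python
import math


def left_rule(f, a, b, n):
    h = (b - a) / n
    return h * sum(f(a + i * h) for i in range(n))


def f(t):
    return 1.0 / t


a, b = 1.0, 10.0
M = 1.0  # |f'(t)| = 1 / t**2 is largest at t = 1 on [1, 10]
exact = math.log(10)

# Each entry is (n, actual error, error bound M (b - a)^2 / (2n)).
results = []
for n in (10, 20, 40, 80, 160, 320):
    error = abs(exact - left_rule(f, a, b, n))
    bound = M * (b - a) ** 2 / (2 * n)
    results.append((n, error, bound))
    print(f"{n:4d}  error = {error:.6f}  bound = {bound:.6f}")
```

The bound is far from tight here, since $|f'|$ is much smaller than $M$ over most of the interval, but both columns shrink in proportion to $1/n$.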

The trapezoidal rule
For a decreasing function, the left-hand rule is an upper sum and the right-hand rule is a
lower sum; for an increasing function, the left-hand rule is a lower sum and the right-hand
rule is an upper sum. Hence, for such functions, it would seem that the average of the
left-hand and right-hand rules, that is,

                                       AL + AR
                                          2

should provide a better approximation to $\int_a^b f(x)\,dx$ than either $A_L$ or $A_R$. We will now
show that this is true in general.
    Suppose $f$, $f'$, and $f''$ are all defined and continuous on $[a, b]$. From (4.2.4) we know
that for any $x$ in the interval $[x_{i-1}, x_i]$ there exists a point $c_i$ in $(x_{i-1}, x_i)$ such that

$$f(x) = f(x_{i-1}) + f'(c_i)(x - x_{i-1}). \tag{4.2.12}$$

Similarly, there exists a point $d_i$ in $(x_{i-1}, x_i)$ such that

$$f(x) = f(x_i) + f'(d_i)(x - x_i). \tag{4.2.13}$$
﻿


Using $f'$ in place of $f$ in (4.2.4), there exists a point $p_i$ in $(x_{i-1}, c_i)$ such that

$$f'(c_i) = f'(x_{i-1}) + f''(p_i)(c_i - x_{i-1}) \tag{4.2.14}$$

and a point $q_i$ such that

$$f'(d_i) = f'(x_{i-1}) + f''(q_i)(d_i - x_{i-1}). \tag{4.2.15}$$

Substituting (4.2.14) into (4.2.12) and (4.2.15) into (4.2.13), we have

$$\begin{aligned}
f(x) &= f(x_{i-1}) + \bigl(f'(x_{i-1}) + f''(p_i)(c_i - x_{i-1})\bigr)(x - x_{i-1}) \\
&= f(x_{i-1}) + f'(x_{i-1})(x - x_{i-1}) + f''(p_i)(c_i - x_{i-1})(x - x_{i-1})
\end{aligned} \tag{4.2.16}$$

and

$$\begin{aligned}
f(x) &= f(x_i) + \bigl(f'(x_{i-1}) + f''(q_i)(d_i - x_{i-1})\bigr)(x - x_i) \\
&= f(x_i) + f'(x_{i-1})(x - x_i) + f''(q_i)(d_i - x_{i-1})(x - x_i).
\end{aligned} \tag{4.2.17}$$

Taking the average of (4.2.16) and (4.2.17) gives us

$$\begin{aligned}
f(x) &= \frac{f(x) + f(x)}{2} \\
&= \frac{f(x_{i-1}) + f(x_i)}{2} + \frac{f'(x_{i-1})\bigl((x - x_{i-1}) + (x - x_i)\bigr)}{2} \\
&\qquad + \frac{f''(p_i)(c_i - x_{i-1})(x - x_{i-1}) + f''(q_i)(d_i - x_{i-1})(x - x_i)}{2}.
\end{aligned} \tag{4.2.18}$$

Now

$$\int_{x_{i-1}}^{x_i} \frac{f(x_{i-1}) + f(x_i)}{2}\,dx = \frac{f(x_{i-1}) + f(x_i)}{2}(x_i - x_{i-1}) = \frac{f(x_{i-1}) + f(x_i)}{2}\,h \tag{4.2.19}$$
      2x2-2

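and

$$\int_{x_{i-1}}^{x_i} \frac{f'(x_{i-1})\bigl((x - x_{i-1}) + (x - x_i)\bigr)}{2}\,dx = f'(x_{i-1}) \int_{x_{i-1}}^{x_i} \left( x - \frac{x_{i-1} + x_i}{2} \right) dx = 0, \tag{4.2.20}$$

where the final equality follows from the fact that the region between the graph of

$$y = x - \frac{x_{i-1} + x_i}{2}$$

and the interval $[x_{i-1}, x_i]$ forms two triangles of equal area, one above the x-axis and one
below (see Figure 4.2.2).

Figure 4.2.2  Graph of $y = x - \frac{x_{i-1} + x_i}{2}$ over the interval $[x_{i-1}, x_i]$

Moreover, if $K$ is the maximum value of $|f''(x)|$ for $x$ in $[a, b]$,
then

$$\left| \frac{f''(p_i)(c_i - x_{i-1})(x - x_{i-1}) + f''(q_i)(d_i - x_{i-1})(x - x_i)}{2} \right| \le \frac{K}{2}\bigl( |c_i - x_{i-1}||x - x_{i-1}| + |d_i - x_{i-1}||x - x_i| \bigr).$$

Since the points $c_i$, $d_i$, and $x$ are all in $[x_{i-1}, x_i]$, each of the four absolute values on the
right is at most $h$, and it follows that

$$\left| \frac{f''(p_i)(c_i - x_{i-1})(x - x_{i-1}) + f''(q_i)(d_i - x_{i-1})(x - x_i)}{2} \right| \le \frac{K}{2}(h^2 + h^2) = Kh^2. \tag{4.2.21}$$

Hence

$$\left| \int_{x_{i-1}}^{x_i} \frac{f''(p_i)(c_i - x_{i-1})(x - x_{i-1}) + f''(q_i)(d_i - x_{i-1})(x - x_i)}{2}\,dx \right| \le \int_{x_{i-1}}^{x_i} Kh^2\,dx.$$

Now

$$\int_{x_{i-1}}^{x_i} Kh^2\,dx = Kh^2(x_i - x_{i-1}) = Kh^3, \tag{4.2.22}$$

so we have

$$\left| \int_{x_{i-1}}^{x_i} \frac{f''(p_i)(c_i - x_{i-1})(x - x_{i-1}) + f''(q_i)(d_i - x_{i-1})(x - x_i)}{2}\,dx \right| \le Kh^3. \tag{4.2.23}$$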


﻿


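Putting this all together, we have

$$\begin{aligned}
\int_a^b f(x)\,dx &= \sum_{i=1}^{n} \int_{x_{i-1}}^{x_i} f(x)\,dx \\
&= \sum_{i=1}^{n} \int_{x_{i-1}}^{x_i} \frac{f(x_{i-1}) + f(x_i)}{2}\,dx
 + \sum_{i=1}^{n} \int_{x_{i-1}}^{x_i} \frac{f'(x_{i-1})\bigl((x - x_{i-1}) + (x - x_i)\bigr)}{2}\,dx \\
&\qquad + \sum_{i=1}^{n} \int_{x_{i-1}}^{x_i} \frac{f''(p_i)(c_i - x_{i-1})(x - x_{i-1}) + f''(q_i)(d_i - x_{i-1})(x - x_i)}{2}\,dx \\
&= \sum_{i=1}^{n} \frac{f(x_{i-1}) + f(x_i)}{2}\,h + 0
 + \sum_{i=1}^{n} \int_{x_{i-1}}^{x_i} \frac{f''(p_i)(c_i - x_{i-1})(x - x_{i-1}) + f''(q_i)(d_i - x_{i-1})(x - x_i)}{2}\,dx \\
&= \frac{h}{2} \sum_{i=1}^{n} f(x_{i-1}) + \frac{h}{2} \sum_{i=1}^{n} f(x_i)
 + \sum_{i=1}^{n} \int_{x_{i-1}}^{x_i} \frac{f''(p_i)(c_i - x_{i-1})(x - x_{i-1}) + f''(q_i)(d_i - x_{i-1})(x - x_i)}{2}\,dx \\
&= \frac{A_L + A_R}{2}
 + \sum_{i=1}^{n} \int_{x_{i-1}}^{x_i} \frac{f''(p_i)(c_i - x_{i-1})(x - x_{i-1}) + f''(q_i)(d_i - x_{i-1})(x - x_i)}{2}\,dx,
\end{aligned}$$

from which it follows, using (4.2.23), that

$$\left| \int_a^b f(x)\,dx - \frac{A_L + A_R}{2} \right| \le \sum_{i=1}^{n} Kh^3 = nKh^3 = nK\,\frac{(b - a)^3}{n^3} = \frac{K(b - a)^3}{n^2}. \tag{4.2.24}$$

That is, if we approximate $\int_a^b f(x)\,dx$ by

$$\frac{A_L + A_R}{2},$$

then the absolute value of the error is bounded by a constant multiplied by $\frac{1}{n^2}$. In
particular, if we double the number of intervals, we should expect the error to decrease by a
factor of

$$\left( \frac{1}{2} \right)^2 = \frac{1}{4}.$$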

﻿


We call

$$A_T = \frac{A_L + A_R}{2} \tag{4.2.25}$$

a trapezoidal rule approximation for $\int_a^b f(x)\,dx$. The name comes from the fact that (4.2.25)
may also be derived by replacing the areas of rectangles by the areas of trapezoids in the
Riemann sum approximations (see Problem 6).
Example  In Section 4.1 we saw that the left-hand and right-hand rule approximations
for $\int_{-1}^{2} (x^2 + 1)\,dx$ using $n = 6$ intervals are $A_L = 5.375$ and $A_R = 6.875$. Hence the
corresponding trapezoidal rule approximation is

$$A_T = \frac{5.375 + 6.875}{2} = 6.125.$$

With $n = 100$, the left-hand and right-hand rules give us $A_L = 5.95545$ and $A_R = 6.04545$,
yielding a trapezoidal rule approximation of

$$A_T = \frac{5.95545 + 6.04545}{2} = 6.00045.$$
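
A direct implementation of (4.2.25) reproduces both values. The sketch below is illustrative: the function name `trapezoid_rule` is an assumption, and the rule is computed literally as the average of the left-hand and right-hand sums rather than with any optimized formula.

```python
def trapezoid_rule(f, a, b, n):
    """Trapezoidal rule as the average of the left-hand and right-hand rules."""
    h = (b - a) / n
    left = h * sum(f(a + i * h) for i in range(n))          # A_L
    right = h * sum(f(a + i * h) for i in range(1, n + 1))  # A_R
    return (left + right) / 2


def f(x):
    return x * x + 1


trap_6 = trapezoid_rule(f, -1, 2, 6)
trap_100 = trapezoid_rule(f, -1, 2, 100)
print(trap_6)              # 6.125
print(round(trap_100, 5))  # 6.00045
```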


Example In Section 4.1 we approximated

    A = \int_1^{10} \frac{1}{t}\,dt

using both left-hand and right-hand rules and, after noting that to 6 decimal places
A = 2.302585, obtained the following table of values:

          n          A_R          A_L       |A - A_R|    |A - A_L|
          10       1.960214     2.770214    0.342371     0.467629
          20       2.116477     2.521477    0.186108     0.218892
          40       2.205491     2.407991    0.097094     0.105406
          80       2.253003     2.354253    0.049582     0.051668
          160      2.277534     2.328159    0.025052     0.025574
          320      2.289994     2.315307    0.012591     0.012722

Using these results, we may compute the following trapezoidal rule approximations:

          n            A_T        |A - A_T|
          10         2.365214     0.062629
          20         2.318977     0.016392
          40         2.306741     0.004156
          80         2.303628     0.001043
          160        2.302846     0.000261
          320        2.302650     0.000065


We see that the errors in the trapezoidal rule approximations are significantly smaller than
the corresponding errors for the left-hand and right-hand rule approximations. Moreover,
in agreement with our work above, the errors in the trapezoidal rule approximations
decrease by a factor of, roughly, 4 when we double the number of intervals. Hence we see
the trapezoidal rule approximations converging to the value of the definite integral at a
significantly faster rate than do the left-hand and right-hand rule approximations.
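The factor of roughly 4 can be checked numerically. The sketch below assumes that the
exact value of the integral is ln 10 (which agrees with the value 2.302585 quoted above);
the name trapezoid is an illustrative choice.

```python
import math


def trapezoid(f, a, b, n):
    """Trapezoidal rule with n equal intervals."""
    h = (b - a) / n
    interior = sum(f(a + i * h) for i in range(1, n))
    return h * (0.5 * f(a) + interior + 0.5 * f(b))


exact = math.log(10.0)  # assumed exact value of the integral of 1/t from 1 to 10
errors = [abs(exact - trapezoid(lambda t: 1.0 / t, 1.0, 10.0, n))
          for n in (10, 20, 40, 80, 160, 320)]
# Ratio of each error to the next one, after doubling n
ratios = [errors[i] / errors[i + 1] for i in range(len(errors) - 1)]
print(errors)
print(ratios)
```

The ratios approach 4 as n grows, which is what a bound proportional to $1/n^2$ suggests,
although the bound by itself only limits the error from above.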

The midpoint rule
As above, let f be an integrable function on an interval [a, b], n a positive integer,
$h = \frac{b - a}{n}$, and, for i = 0, 1, 2, ..., n, $x_i = a + ih$ (the endpoints of a
partition of [a, b] with n intervals of equal length h). We may think of the trapezoidal
rule as improving on the left-hand and right-hand rules by approximating the area of the
region beneath the graph of f and above the interval $[x_{i-1}, x_i]$ using a rectangle with
height equal to the average of $f(x_{i-1})$ and $f(x_i)$. Another approach is to average
$x_{i-1}$ and $x_i$ before evaluating f. Since each interval is of length h, we may find the
midpoint by adding $\frac{h}{2}$ to the left-hand endpoint. Namely, if we let

    c_1 = a + \frac{h}{2},
    c_2 = x_1 + \frac{h}{2} = a + h + \frac{h}{2} = a + \frac{3}{2}h,
    c_3 = x_2 + \frac{h}{2} = a + 2h + \frac{h}{2} = a + \frac{5}{2}h,
        \vdots
    c_i = x_{i-1} + \frac{h}{2} = a + (i - 1)h + \frac{h}{2} = a + \left(i - \frac{1}{2}\right)h,      (4.2.26)
        \vdots
    c_n = x_{n-1} + \frac{h}{2} = a + (n - 1)h + \frac{h}{2} = a + \left(n - \frac{1}{2}\right)h,

then $c_i$ is the midpoint of the ith interval $[x_{i-1}, x_i]$. We call

    A_M = \sum_{i=1}^{n} f(c_i)h = h \sum_{i=1}^{n} f\left(a + \left(i - \frac{1}{2}\right)h\right),      (4.2.27)

the Riemann sum formed by evaluating f at the points $c_1, c_2, c_3, \ldots, c_n$, a midpoint
rule approximation of the definite integral $\int_a^b f(x)\,dx$.


Example To find the midpoint rule approximation for $\int_{-1}^{2} (x^2 + 1)\,dx$ using
n = 6 intervals, we would have

    h = \frac{2 - (-1)}{6} = \frac{1}{2}.

Then the interval endpoints are $x_0 = -1$, $x_1 = -0.5$, $x_2 = 0$, $x_3 = 0.5$, $x_4 = 1$,
$x_5 = 1.5$, and $x_6 = 2$, from which we find the midpoints $c_1 = -0.75$, $c_2 = -0.25$,
$c_3 = 0.25$, $c_4 = 0.75$, $c_5 = 1.25$, and $c_6 = 1.75$ (see Figure 4.2.3). Thus, letting
$f(x) = x^2 + 1$, we have

    A_M = 0.5(f(-0.75) + f(-0.25) + f(0.25) + f(0.75) + f(1.25) + f(1.75))
        = 0.5(1.5625 + 1.0625 + 1.0625 + 1.5625 + 2.5625 + 4.0625)
        = 5.9375.

    Figure 4.2.3 Midpoint rule approximation for $\int_{-1}^{2} (x^2 + 1)\,dx$

With n = 100 intervals, using (4.2.27) with a computer, we have $A_M = 5.999775$.
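The computer calculation mentioned here might look like the following sketch, which
evaluates (4.2.27) directly. The function name midpoint is an assumption made for
illustration.

```python
def midpoint(f, a, b, n):
    """Midpoint rule (4.2.27): h times the sum of f at a + (i - 1/2)h, i = 1..n."""
    h = (b - a) / n
    return h * sum(f(a + (i - 0.5) * h) for i in range(1, n + 1))


def f(x):
    return x * x + 1


print(midpoint(f, -1.0, 2.0, 6))
print(round(midpoint(f, -1.0, 2.0, 100), 6))
```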

Example Applying the midpoint rule to the problem of approximating

    A = \int_1^{10} \frac{1}{t}\,dt,

we obtain the following table (again rounded to 6 decimal places):

                          n            A_M       |A - A_M|
                          10         2.272740    0.029845
                          20         2.294504    0.008081
                          40         2.300515    0.002070
                          80         2.302064    0.000521
                          160        2.302455    0.000130
                          320        2.302552    0.000033

    Figure 4.2.4 Midpoint rule approximation for $\int_1^{10} \frac{1}{t}\,dt$

Notice that, as with the trapezoidal rule, doubling the number of intervals decreases the
error by a factor of about $\frac{1}{4}$. Moreover, note that the error in each approximation
is approximately $\frac{1}{2}$ of the corresponding error for the trapezoidal rule.

    An analysis of the error in the midpoint rule, similar to that which we did above
for the left-hand, right-hand, and trapezoidal rules, would show that the absolute value
of the error is bounded by a constant multiplied by $1/n^2$. Hence doubling the number of
intervals will decrease the error by, roughly, a factor of $\frac{1}{4}$, as was evidenced in
the previous example. Moreover, a more careful examination of the error (one requiring the
use of Taylor polynomials, which we will discuss in Chapter 5) would show that there is a
sense in which it is typically on the order of $\frac{1}{2}$ the size of the error of the
trapezoidal rule.

Simpson's rule
We saw above that averaging the left-hand and right-hand rules, two approximation
methods with errors bounded by a constant multiple of $1/n$, resulted in an approximation
method, the trapezoidal rule, with an error bounded by a constant multiple of $1/n^2$. We
might now think that we could improve on the trapezoidal and midpoint rules, two rules with
errors bounded by a constant multiple of $1/n^2$, by computing their average. However, it
turns out that the relationship between these two rules is not as simple as with the
left-hand and right-hand rules; in fact, one really needs to use Taylor polynomials in order
to understand the error terms fully. At the same time, there is a hint in our previous
example. Given that the error from the midpoint rule was about $\frac{1}{2}$ of the error of
the trapezoidal rule, it would be reasonable to guess that perhaps an average of the two
which gives twice as much weight to the midpoint rule would be appropriate. This in fact
turns out to be the right mixture, and we define

    A_S = \frac{1}{3}A_T + \frac{2}{3}A_M = \frac{A_T + 2A_M}{3}.                  (4.2.28)

We call $A_S$ a Simpson's rule approximation for $\int_a^b f(x)\,dx$. This method of
approximating definite integrals is named for Thomas Simpson (1710-1761). Simpson developed
this rule in 1743 as a method for approximating the area under a curve after first
approximating the curve with a number of parabolic arcs.
Example Using n = 6 intervals, we saw above that the trapezoidal rule approximation
for $\int_{-1}^{2} (x^2 + 1)\,dx$ is $A_T = 6.1250$ and the midpoint rule approximation is
$A_M = 5.9375$. Thus the corresponding Simpson's rule approximation is

    A_S = \frac{6.1250 + (2)(5.9375)}{3} = \frac{18}{3} = 6.

With n = 100 intervals, we have $A_T = 6.000450$ and $A_M = 5.999775$, giving us

    A_S = \frac{6.000450 + (2)(5.999775)}{3} = \frac{18}{3} = 6.

It may seem surprising in this example that we get the same result with 100 intervals as
we do with 6, but in fact this is the exact answer. It may be shown, either by careful
examination of the error or by deriving the rule from parabolic approximations, that
Simpson's rule will find the exact value for the definite integral of any quadratic polynomial.
What is even more surprising, careful examination of the error using Taylor polynomials
shows that Simpson's rule is exact for cubic polynomials as well.
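Definition (4.2.28) translates directly into code. The sketch below uses hypothetical
helper names (trapezoid, midpoint, simpson) and tries the rule on the quadratic from the
example and on one cubic, chosen here only for illustration.

```python
def trapezoid(f, a, b, n):
    h = (b - a) / n
    return h * (0.5 * f(a) + sum(f(a + i * h) for i in range(1, n)) + 0.5 * f(b))


def midpoint(f, a, b, n):
    h = (b - a) / n
    return h * sum(f(a + (i - 0.5) * h) for i in range(1, n + 1))


def simpson(f, a, b, n):
    """Simpson's rule as the weighted average (A_T + 2 A_M) / 3 of (4.2.28)."""
    return (trapezoid(f, a, b, n) + 2.0 * midpoint(f, a, b, n)) / 3.0


# Quadratic from the example: the exact value is 6 for any n
print(simpson(lambda x: x * x + 1, -1.0, 2.0, 6))
# A cubic: the integral of x^3 over [0, 2] is 4, recovered even with one interval
print(simpson(lambda x: x ** 3, 0.0, 2.0, 1))
```

Note that n here counts the intervals used by the trapezoidal and midpoint rules, so each
call evaluates the function at 2n + 1 points.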
Example Using the values for the trapezoidal and midpoint rule approximations
obtained above, we have the following approximations for

    A = \int_1^{10} \frac{1}{t}\,dt

using Simpson's rule:

                          n            A_S       |A - A_S|
                          10         2.303565    0.000980
                          20         2.302662    0.000077
                          40         2.302590    0.000005
                          80         2.302585    0.000000

We have stopped the table at 80 intervals because at that point the approximation is
accurate to 6 decimal places.


﻿


    Figure 4.2.5 Region beneath y = sin(x) over the interval $[0, \pi]$


    It may be shown that the absolute value of the error using Simpson's rule is bounded
by a constant multiple of $1/n^4$, resulting in a dramatic improvement over both the
trapezoidal and midpoint rules. For Simpson's rule, doubling the number of intervals
typically decreases the error by a factor of

    \left(\frac{1}{2}\right)^4 = \frac{1}{16},

a general fact for which we can see some evidence in the preceding example. To be fair,
since Simpson's rule makes use of both the trapezoidal rule and the midpoint rule approxi-
mations, the function being integrated must be evaluated both at the endpoints and at the
midpoint of each interval. This requires evaluating the function at 2n + 1 points, whereas
the trapezoidal rule evaluates the function at n + 1 points and the midpoint rule evalu-
ates the function at n points. Thus, for direct comparison of errors, one should compare,
for example, Simpson's rule with 10 subdivisions to the other rules using 20 subdivisions.
Nevertheless, Simpson's rule converges to the value of the integral much faster than the
other methods and, hence, is the method of preference among the ones we have discussed.
Even faster methods exist, but we will leave them for a more advanced course.
    When approximating the value of an integral, there is in general no practical way to
know how many intervals are necessary in order to obtain an approximation to a desired
level of accuracy. Analogous to the way in which we applied Newton's method, we normally
compute a sequence of approximations, perhaps starting with only two intervals and then
doubling the number of intervals from one approximation to the next, stopping when we
obtain two successive approximations whose difference, in absolute value, is less than the
desired level of accuracy. The next example illustrates this procedure.

Example Suppose we wish to approximate, with an error less than 0.0001, the area A
of the region between one arch of the curve y = sin(x) and the x-axis, as shown in Figure
4.2.5. That is, we want to find

    A = \int_0^{\pi} \sin(x)\,dx.

Starting with n = 2 intervals and using Simpson's rule, we generate the following table of
approximations, rounding to 6 decimal places:

                          n            A_S
                          2          2.004560
                          4          2.000269
                          8          2.000017
                          16         2.000001

Since the absolute value of the difference between the last two approximations is less than
0.0001, we stop at this point and use 2.0000 as our approximation for A.
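The doubling procedure used in this example can be sketched as follows. The names
simpson and integrate_by_doubling are illustrative, and the stopping test is the one
described in the text: it compares successive approximations, which is a practical
heuristic rather than a guaranteed bound on the error.

```python
import math


def simpson(f, a, b, n):
    """Simpson's rule as (A_T + 2 A_M) / 3 on n equal intervals."""
    h = (b - a) / n
    trap = h * (0.5 * f(a) + sum(f(a + i * h) for i in range(1, n)) + 0.5 * f(b))
    mid = h * sum(f(a + (i - 0.5) * h) for i in range(1, n + 1))
    return (trap + 2.0 * mid) / 3.0


def integrate_by_doubling(f, a, b, tol, n=2):
    """Double n until two successive approximations differ by less than tol."""
    previous = simpson(f, a, b, n)
    while True:
        n *= 2
        current = simpson(f, a, b, n)
        if abs(current - previous) < tol:
            return current, n
        previous = current


value, n_used = integrate_by_doubling(math.sin, 0.0, math.pi, 0.0001)
print(n_used, round(value, 6))
```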

Problems

1. Approximate each of the following integrals using the trapezoidal and midpoint rules
    with n = 4 intervals.


(a) fx2dx


(c)     - dt


(e)    (t2 + t)dt


(b) fsin(x)dx


(d) f z3dz

      2
(f ) f4 -x2dz


2. Use your results from Problem 1 to compute the corresponding Simpson's rule approx-
   imation for each integral.

3. Approximate each of the following integrals using the trapezoidal and midpoint rules
   with n = 50 intervals.


(a) x2dx

(c) f(4x2 + 3x - 6)dx
     /-2

(e) fx2cos(x)dx


(g) i2+1 dx


(b) fsin(x)dx


(d) f 2 sin(x) dx


    (f)    1 - sin2(t) dt

(h) f   sin(3x)cos(x)dx


4. Use your results from Problem 3 to compute the corresponding Simpson's rule approx-
   imation for each integral.

5. Approximate the following definite integrals using Simpson's rule. Starting with n = 2
   intervals, compute a sequence of approximations by doubling the number of intervals
   from one approximation to the next. Stop when the absolute value of the difference
   between two successive approximations is less than 0.00001.


﻿




(a) f x2dz


(c) fsin2(x)dz


(e) J    /1 + cos2(O) dO


      6
(b)    (3x2 +44x - 3)dz


(d)     sin2(x) cos2(x)dz


(f)   j     )d


6. Suppose f is integrable on the interval [a, b]. Divide [a, b] into n equal intervals of
   length $h = \frac{b - a}{n}$ and let $x_0, x_1, x_2, \ldots, x_n$ be the endpoints of these
   intervals. Let $A_T$ be the trapezoidal rule approximation for $\int_a^b f(x)\,dx$.
   (a) Show that

          A_T = \frac{h}{2}(f(x_0) + 2f(x_1) + 2f(x_2) + \cdots + 2f(x_{n-1}) + f(x_n)).

   (b) Assume $f(x) \ge 0$ for all x in [a, b]. For i = 1, 2, 3, ..., n, let $A_i$ be the area
      of the trapezoid with vertices at $(x_{i-1}, 0)$, $(x_{i-1}, f(x_{i-1}))$, $(x_i, f(x_i))$,
      and $(x_i, 0)$ (that is, $A_i$ is the area of a trapezoid with one side being the interval
      $[x_{i-1}, x_i]$ and parallel sides extending from the x-axis up to the graph of f). Then
      we could approximate $\int_a^b f(x)\,dx$ by $A_1 + A_2 + A_3 + \cdots + A_n$. Show that

          A_T = A_1 + A_2 + A_3 + \cdots + A_n.

7. Suppose f is integrable on the interval [a, b]. Divide [a, b] into 2n equal intervals of
   length $h = \frac{b - a}{2n}$ and let $x_0, x_1, x_2, \ldots, x_{2n}$ be the endpoints of these
   intervals. Let $A_T$ and $A_M$ be the trapezoidal rule and midpoint rule approximations for
   $\int_a^b f(x)\,dx$ using the n intervals with endpoints $x_0, x_2, x_4, \ldots, x_{2n}$. Let
   $A_S$ be the corresponding Simpson's rule approximation. Show that

          A_S = \frac{h}{3}(f(x_0) + 4f(x_1) + 2f(x_2) + 4f(x_3) + \cdots + 2f(x_{2n-2}) + 4f(x_{2n-1}) + f(x_{2n})).


8. Let T(t) be the temperature at t hours after midnight at the Kalispell airport and
   suppose the following values for T were recorded on March 15, 1955:

     Time (t)           0.0   0.5   1.0   1.5   2.0   2.5   3.0   3.5   4.0   4.5
     Temperature (T)     40    38    37    36    33    30    28    28    27    26

     Time (t)           5.0   5.5   6.0   6.5   7.0   7.5   8.0   8.5   9.0   9.5
     Temperature (T)     24    26    30    30    32    35    37    38    40    45

     Time (t)          10.0  10.5  11.0  11.5  12.0
     Temperature (T)     47    47    48    49    50

   (a) Approximate $\int_0^{12} T(t)\,dt$ using Simpson's rule. You may wish to use the
       formula in Problem 7.
   (b) What does

          A = \frac{1}{12} \int_0^{12} T(t)\,dt

       represent?
   (c) How does A compare with $\frac{1}{25} \sum_{t=0}^{24} T\left(\frac{t}{2}\right)$?
 9. Find the area beneath one arch of the curve $y = \sin^2(x)$.
10. Let R be the region in the plane bounded by the curves $y = x^2$ and $y = (x - 2)^2$ and
    the x-axis. Find the area of R.


﻿


                                    Section 4.3
       Difference Equations
               to                   The Fundamental Theorem
       Differential Equations       of Calculus


We are now ready to make the long-promised connection between differentiation and in-
tegration, between areas and tangent lines. We will look at two closely related theorems,
both of which are known as the Fundamental Theorem of Calculus. We will call the first
of these the Fundamental Theorem of Integral Calculus.
    Suppose f is integrable on [a, b] and F is an antiderivative of f on (a, b) which is
continuous on [a, b]. In particular, F'(x) = f(x) for all x in (a, b). Let
$P = \{x_0, x_1, x_2, \ldots, x_n\}$ be a partition of [a, b] and, as usual, let
$\Delta x_i = x_i - x_{i-1}$, i = 1, 2, 3, ..., n. Now

    F(b) - F(a) = F(x_n) - F(x_0)
                = F(x_n) + (F(x_{n-1}) - F(x_{n-1})) + (F(x_{n-2}) - F(x_{n-2})) + \cdots
                      + (F(x_1) - F(x_1)) - F(x_0)
                = (F(x_n) - F(x_{n-1})) + (F(x_{n-1}) - F(x_{n-2})) + \cdots
                      + (F(x_1) - F(x_0))
                = \sum_{i=1}^{n} (F(x_i) - F(x_{i-1})).                            (4.3.1)

By the Mean Value Theorem, for every i = 1, 2, 3, ..., n, there exists a point $c_i$ in the
interval $[x_{i-1}, x_i]$ such that

    F'(c_i) = \frac{F(x_i) - F(x_{i-1})}{x_i - x_{i-1}}.                           (4.3.2)

Since $F'(c_i) = f(c_i)$ and $x_i - x_{i-1} = \Delta x_i$, it follows that

    F(x_i) - F(x_{i-1}) = f(c_i)\Delta x_i.                                        (4.3.3)

Hence, putting (4.3.3) into (4.3.1),

    F(b) - F(a) = \sum_{i=1}^{n} f(c_i)\Delta x_i.                                 (4.3.4)

Thus F(b) - F(a) is equal to the value of a Riemann sum using the partition P, and so
must lie between the upper and lower sums for P. That is, we have shown that for any
partition P,

    L(f, P) \le F(b) - F(a) \le U(f, P).                                           (4.3.5)



Copyright @ by Dan Sloughter 2000


﻿


But, since f is integrable, there is only one number that has this property, namely,
$\int_a^b f(x)\,dx$. In other words, we have shown that

    \int_a^b f(x)\,dx = F(b) - F(a).                                               (4.3.6)

    Figure 4.3.1 Region beneath the graph of $f(x) = x^2$ over the interval [0, 1]


Fundamental Theorem of Integral Calculus If f is integrable on [a, b] and F is an
antiderivative of f on (a, b) which is continuous on [a, b], then


    \int_a^b f(x)\,dx = F(b) - F(a).                                               (4.3.7)

    This result reveals a sense in which integration is the inverse of differentiation: The
definite integral of a function f may be evaluated easily, using (4.3.7), provided we can
find a function F whose derivative is f.
    It is common to write

    F(x) \Big|_a^b

for F(b) - F(a). With this notation, (4.3.7) becomes

    \int_a^b f(x)\,dx = F(x) \Big|_a^b.                                            (4.3.8)


Example Since

    F(x) = \frac{1}{3}x^3

is an antiderivative of $f(x) = x^2$, we have

    \int_0^1 x^2\,dx = \frac{1}{3}x^3 \Big|_0^1 = \frac{1}{3} - 0 = \frac{1}{3}.

Thus the area under the parabola $y = x^2$ and above the interval [0, 1] on the x-axis is
exactly $\frac{1}{3}$. See Figure 4.3.1.


﻿


    Figure 4.3.2 Region beneath the graph of y = sin(x) over the interval $[0, \pi]$


    Note that F in the previous example is but one of an infinite number of antiderivatives
of f. We can in fact use any antiderivative of f we want in applying (4.3.7), although we
typically choose the simplest one we can find.


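Example Since

    G(x) = \frac{1}{3}x^3 + x

is an antiderivative of $g(x) = x^2 + 1$ (you may check by differentiating G), we have

    \int_{-1}^{2} (x^2 + 1)\,dx = \left(\frac{1}{3}x^3 + x\right) \Big|_{-1}^{2}
        = \left(\frac{8}{3} + 2\right) - \left(-\frac{1}{3} - 1\right) = 6,

as we claimed in Section 4.2.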
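As a quick numerical cross-check of this example, the sketch below compares G(2) - G(-1)
with a midpoint rule approximation of the same integral. The function names and the choice
of 1000 intervals are illustrative assumptions.

```python
def g(x):
    return x * x + 1


def G(x):
    return x ** 3 / 3.0 + x  # antiderivative of g from the example


def midpoint(f, a, b, n):
    h = (b - a) / n
    return h * sum(f(a + (i - 0.5) * h) for i in range(1, n + 1))


exact = G(2.0) - G(-1.0)               # Fundamental Theorem of Integral Calculus
approx = midpoint(g, -1.0, 2.0, 1000)  # numerical approximation from Section 4.2
print(exact, approx)
```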

Example If A is the area under one arch of the curve y = sin(x), then

    A = \int_0^{\pi} \sin(x)\,dx.

Since $F(x) = -\cos(x)$ is an antiderivative of $f(x) = \sin(x)$, we have

    A = \int_0^{\pi} \sin(x)\,dx = -\cos(x) \Big|_0^{\pi} = -\cos(\pi) - (-\cos(0)) = 1 + 1 = 2.

See Figure 4.3.2.

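Example Since

    F(x) = \frac{4}{3}x^3 - \frac{1}{2}x^2 + 2x

is an antiderivative of $f(x) = 4x^2 - x + 2$ (again, you may check this by differentiating
F), we have

    \int_{-2}^{3} (4x^2 - x + 2)\,dx = \left(\frac{4}{3}x^3 - \frac{1}{2}x^2 + 2x\right) \Big|_{-2}^{3}
        = \left(36 - \frac{9}{2} + 6\right) - \left(-\frac{32}{3} - 2 - 4\right)
        = \frac{325}{6}.

Example Since

    F(t) = \frac{2}{3}t^{3/2}

is an antiderivative of $f(t) = \sqrt{t}$, we have

    \int_0^4 \sqrt{t}\,dt = \frac{2}{3}t^{3/2} \Big|_0^4 = \frac{16}{3} - 0 = \frac{16}{3}.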

    As can be seen from these examples, the Fundamental Theorem of Integral Calculus
provides us with a powerful tool for evaluating definite integrals exactly. However, to utilize
the theorem we must first find an antiderivative for the function we are integrating. This
turns out to be a difficult problem in general, and we will devote the next two sections, as
well as parts of Chapter 6, to developing techniques to aid in finding antiderivatives. For
example,
    F(x) = -\frac{1}{2}x^3\cos(2x) + \frac{3}{4}x^2\sin(2x) + \frac{3}{4}x\cos(2x) - \frac{3}{8}\sin(2x)

is an antiderivative of $f(x) = x^3\sin(2x)$, as may be checked by differentiation, but at this
point it is not clear how to find such an antiderivative in the first place. Moreover, there
are integrable functions, even relatively simple ones such as

    f(x) = \frac{\sin(x)}{x},

which do not have antiderivatives expressible in terms of the elementary functions studied
in calculus.
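The claim that this F is an antiderivative of $x^3\sin(2x)$ can also be spot-checked
numerically. The sketch below approximates F'(x) by a central difference quotient; the
helper name central_difference, the step size, and the sample points are illustrative
assumptions, and agreement at a few points is evidence rather than proof.

```python
import math


def F(x):
    return (-0.5 * x ** 3 * math.cos(2 * x) + 0.75 * x ** 2 * math.sin(2 * x)
            + 0.75 * x * math.cos(2 * x) - 0.375 * math.sin(2 * x))


def f(x):
    return x ** 3 * math.sin(2 * x)


def central_difference(func, x, h=1e-5):
    """Approximate func'(x) by (func(x + h) - func(x - h)) / (2h)."""
    return (func(x + h) - func(x - h)) / (2.0 * h)


for x in (-2.0, -0.5, 0.3, 1.0, 2.5):
    print(x, central_difference(F, x), f(x))
```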
    The Fundamental Theorem of Integral Calculus tells us that if a function f has an
antiderivative, then we may use that antiderivative to evaluate a definite integral of f, but
it does not tell us which functions have antiderivatives. The Fundamental Theorem of Dif-
ferential Calculus will tell us, in part, that every continuous function has an antiderivative.
Before beginning that discussion, we need to extend the definition of the definite integral
slightly.
    The definition of $\int_a^b f(x)\,dx$ in Section 4.1 implicitly assumes that a < b. For
the work we are about to do, we need to extend the definition to include $a \ge b$, as we did
in Problem 9 of Section 4.1. First of all, if a = b, it would seem reasonable for the value of
the definite integral to be 0 since the region between the graph of the function and the
x-axis has been reduced to a line segment. Hence we make the following definition.

Definition For any function f defined at a point a, we define

    \int_a^a f(x)\,dx = 0.                                                         (4.3.9)

Note that with this definition, the statement

    \int_a^b f(x)\,dx = \int_a^c f(x)\,dx + \int_c^b f(x)\,dx,                     (4.3.10)

which we discussed in Section 4.1 in the case a < c < b, holds true even if a = c, b = c, or
a = b = c. Now suppose we have a < b < c. Then

    \int_a^c f(x)\,dx = \int_a^b f(x)\,dx + \int_b^c f(x)\,dx,                     (4.3.11)

from which it follows that

    \int_a^b f(x)\,dx = \int_a^c f(x)\,dx - \int_b^c f(x)\,dx.                     (4.3.12)

If we define

    \int_c^b f(x)\,dx = -\int_b^c f(x)\,dx,                                        (4.3.13)

then we may rewrite (4.3.12) in the form of (4.3.10). For this reason, we make the following
definition.

Definition If b < a and f is integrable on [b, a], we define

    \int_a^b f(x)\,dx = -\int_b^a f(x)\,dx.                                        (4.3.14)

    You may check that with these two extensions to the definition of the definite integral,
we may now state the following proposition.

Proposition If f is integrable on a closed interval containing the points a, b, and c, then

    \int_a^b f(x)\,dx = \int_a^c f(x)\,dx + \int_c^b f(x)\,dx.                     (4.3.15)


﻿


    Figure 4.3.3 $F(x) = \int_a^x f(t)\,dt$ is the area from a to x


    We may now return to our discussion of antiderivatives and the Fundamental Theorem
of Differential Calculus. Suppose f is continuous on the interval [a, b]. We want to construct
an antiderivative for f on (a, b). From the Fundamental Theorem of Integral Calculus, we
know that if F is an antiderivative of f on (a, b) which is continuous on [a, b], then for any
x in (a, b) we would have

    \int_a^x f(t)\,dt = F(x) - F(a),                                               (4.3.16)

that is,

    F(x) = F(a) + \int_a^x f(t)\,dt.                                               (4.3.17)

Hence, if we are seeking an antiderivative for f, it makes sense to define

    F(x) = \int_a^x f(t)\,dt                                                       (4.3.18)

and verify that F'(x) = f(x) for all x in (a, b). Note that F(x), geometrically, is the
cumulative area between the graph of f and the x-axis from a to x, as shown in Figure
4.3.3. We need to compute

    F'(x) = \lim_{h \to 0} \frac{F(x + h) - F(x)}{h}
          = \lim_{h \to 0} \frac{1}{h}\left(\int_a^{x+h} f(t)\,dt - \int_a^x f(t)\,dt\right)      (4.3.19)

for x in (a, b). Now

    \int_a^{x+h} f(t)\,dt = \int_a^x f(t)\,dt + \int_x^{x+h} f(t)\,dt,             (4.3.20)

so

    \int_a^{x+h} f(t)\,dt - \int_a^x f(t)\,dt = \int_x^{x+h} f(t)\,dt.             (4.3.21)


﻿


    Figure 4.3.4 $\int_a^{x+h} f(t)\,dt - \int_a^x f(t)\,dt = \int_x^{x+h} f(t)\,dt$

See Figure 4.3.4. Thus

    F'(x) = \lim_{h \to 0} \frac{1}{h} \int_x^{x+h} f(t)\,dt.                      (4.3.22)

Suppose h > 0. Since f is continuous, f has a minimum value m(h) and a maximum value
M(h) on the interval [x, x + h]. Hence $m(h) \le f(t) \le M(h)$ for all t in [x, x + h], from
which it follows that

    \int_x^{x+h} m(h)\,dt \le \int_x^{x+h} f(t)\,dt \le \int_x^{x+h} M(h)\,dt.     (4.3.23)

Since m(h) and M(h) are constants, (4.3.23) implies

    m(h)h \le \int_x^{x+h} f(t)\,dt \le M(h)h.                                     (4.3.24)

Thus

    m(h) \le \frac{1}{h} \int_x^{x+h} f(t)\,dt \le M(h).                           (4.3.25)

Now m(h) = f(c) for some c in [x, x + h]. Moreover, as h approaches 0, x + h approaches
x, and so c must also approach x. Hence, since f is continuous,

    \lim_{h \to 0^+} m(h) = \lim_{h \to 0^+} f(c) = f(x).                          (4.3.26)

Similarly,

    \lim_{h \to 0^+} M(h) = f(x).                                                  (4.3.27)


﻿


It now follows from (4.3.25) that

    \lim_{h \to 0^+} \frac{1}{h} \int_x^{x+h} f(t)\,dt = f(x).                     (4.3.28)

A similar argument shows that

    \lim_{h \to 0^-} \frac{1}{h} \int_x^{x+h} f(t)\,dt = f(x),                     (4.3.29)

and so we may conclude that

    F'(x) = \lim_{h \to 0} \frac{1}{h} \int_x^{x+h} f(t)\,dt = f(x).               (4.3.30)

That is,

    F(x) = \int_a^x f(t)\,dt

is an antiderivative of f on (a, b).

Fundamental Theorem of Differential Calculus If f is continuous on the closed
interval [a, b] and F is the function on (a, b) defined by

    F(x) = \int_a^x f(t)\,dt,                                                      (4.3.31)

then F is differentiable on (a, b) with F'(x) = f(x) for all x in (a, b). In other words,

    \frac{d}{dx} \int_a^x f(t)\,dt = f(x)                                          (4.3.32)

for all x in (a, b).
    It is worth noting that (4.3.32) holds for x < a as well, as long as f is continuous on
a closed interval which contains both x and a.
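The theorem can be illustrated numerically. In the sketch below, F is built from Simpson's
rule approximations of the integral from a = 0 to x, and its derivative is estimated with a
central difference quotient. The integrand, the step sizes, and all function names are
assumptions chosen for illustration; a negative step lets the same routine handle x < a,
in line with (4.3.14).

```python
import math


def simpson(f, a, b, n=200):
    """Simpson's rule as (A_T + 2 A_M) / 3; when b < a the step h is negative."""
    h = (b - a) / n
    trap = h * (0.5 * f(a) + sum(f(a + i * h) for i in range(1, n)) + 0.5 * f(b))
    mid = h * sum(f(a + (i - 0.5) * h) for i in range(1, n + 1))
    return (trap + 2.0 * mid) / 3.0


def f(t):
    return math.exp(-t * t)  # hypothetical continuous integrand


def F(x, a=0.0):
    """Numerical stand-in for the integral of f from a to x."""
    return simpson(f, a, x)


def F_prime(x, h=1e-4):
    """Central difference estimate of F'(x)."""
    return (F(x + h) - F(x - h)) / (2.0 * h)


for x in (-1.5, 0.5, 2.0):
    print(x, F_prime(x), f(x))
```

The two printed columns should agree closely, including at the point x = -1.5, which lies
to the left of a.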


Example Let

    f(x) = \begin{cases} \dfrac{\sin(x)}{x}, & \text{if } x \ne 0, \\ 1, & \text{if } x = 0. \end{cases}

Then f is continuous on $(-\infty, \infty)$, so

    F(x) = \int_0^x f(t)\,dt

is an antiderivative of f on $(-\infty, \infty)$. In particular,

    F'(x) = \frac{\sin(x)}{x}

for all $x \ne 0$. The graphs of F and f are shown in Figure 4.3.5. Geometrically, F(x) is the
cumulative area between the graph of f and the x-axis from 0 to x and F'(x) is the rate
at which area is accumulating as x increases. Since the rate at which area is accumulating
depends on the height of the curve, it is natural to expect, and the Fundamental Theorem
of Differential Calculus confirms, that F'(x) = f(x). The function F is known as the sine
integral function. It may be shown that it is not representable in closed form in terms of
the elementary functions of calculus.

    Figure 4.3.5 Graphs of $f(x) = \frac{\sin(x)}{x}$ and $F(x) = \int_0^x \frac{\sin(t)}{t}\,dt$
Example Using Leibniz notation,

    \frac{d}{dx} \int_0^x \sin(t^2)\,dt = \sin(x^2).

Example Suppose

    G(x) = \int_0^{3x} \sin(t^2)\,dt.

Then G(x) = F(h(x)), where h(x) = 3x and

    F(x) = \int_0^x \sin(t^2)\,dt.

Hence, using the chain rule,

    G'(x) = F'(h(x))h'(x) = \sin((3x)^2)(3) = 3\sin(9x^2).

Example Suppose

    H(x) = \int_x^0 \frac{1}{1 + t^4}\,dt.

Then, using (4.3.14),

    H(x) = -\int_0^x \frac{1}{1 + t^4}\,dt,

so

    H'(x) = -\frac{d}{dx} \int_0^x \frac{1}{1 + t^4}\,dt = -\frac{1}{1 + x^4}.

Example Suppose

    F(x) = \int_{2x}^{x^2} \sqrt{1 + t^4}\,dt.

Then, using (4.3.15) and (4.3.14),

    F(x) = \int_{2x}^{0} \sqrt{1 + t^4}\,dt + \int_0^{x^2} \sqrt{1 + t^4}\,dt
         = -\int_0^{2x} \sqrt{1 + t^4}\,dt + \int_0^{x^2} \sqrt{1 + t^4}\,dt.

Note that there is nothing special about using 0 in this decomposition, other than the
requirement that the function $f(t) = \sqrt{1 + t^4}$ be integrable on all of the relevant
intervals. Now we have

    F'(x) = -\frac{d}{dx} \int_0^{2x} \sqrt{1 + t^4}\,dt + \frac{d}{dx} \int_0^{x^2} \sqrt{1 + t^4}\,dt
          = -\sqrt{1 + (2x)^4}\,(2) + \sqrt{1 + (x^2)^4}\,(2x)
          = 2x\sqrt{1 + x^8} - 2\sqrt{1 + 16x^4}.
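The formula just obtained can be compared with a direct numerical estimate. The sketch
below approximates F(x) with Simpson's rule over the interval from 2x to $x^2$ and
differentiates it with a central difference; the helper names, the number of intervals, and
the sample points are illustrative assumptions.

```python
import math


def simpson(f, a, b, n=200):
    """Simpson's rule as (A_T + 2 A_M) / 3; when b < a the step h is negative."""
    h = (b - a) / n
    trap = h * (0.5 * f(a) + sum(f(a + i * h) for i in range(1, n)) + 0.5 * f(b))
    mid = h * sum(f(a + (i - 0.5) * h) for i in range(1, n + 1))
    return (trap + 2.0 * mid) / 3.0


def integrand(t):
    return math.sqrt(1.0 + t ** 4)


def F(x):
    """Numerical stand-in for the integral from 2x to x^2."""
    return simpson(integrand, 2.0 * x, x * x)


def F_prime_formula(x):
    """Derivative obtained in the example via the chain rule."""
    return 2.0 * x * math.sqrt(1.0 + x ** 8) - 2.0 * math.sqrt(1.0 + 16.0 * x ** 4)


def F_prime_numeric(x, h=1e-4):
    return (F(x + h) - F(x - h)) / (2.0 * h)


for x in (0.5, 1.0, 1.5):
    print(x, F_prime_numeric(x), F_prime_formula(x))
```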

   To summarize this section, the Fundamental Theorem of Integral Calculus provides
us with an elegant method for evaluating definite integrals, but is useful only when we
can find an antiderivative for the function being integrated. The Fundamental Theorem
of Differential Calculus tells us that every continuous function has an antiderivative and
shows how to construct one using the definite integral. Unfortunately, this brings us full
circle and does not provide us with an effective means for finding antiderivatives to use in
applying the Fundamental Theorem of Integral Calculus. For example, we know that

    F(x) = \int_0^x \frac{\sin(t)}{t}\,dt

is an antiderivative of

    f(x) = \frac{\sin(x)}{x},

but this is of no help in evaluating, say,


                                f4 sin(w)dx

Hence in order to fully utilize the Fundamental Theorem of Integral Calculus in the evalua-
tion of definite integrals, we must develop some procedures to aid in finding antiderivatives.
We will turn to this problem in the next section.


﻿




Problems

1. Evaluate the following definite integrals using the Fundamental Theorem of Integral
    Calculus.


(a) fx3dx


(e) f(2x3 + 3x2 + x


(g) j4 / dt

(i)    sin(x)dx


4)dx


      3
(b)    (x2+2x)dx

(d) fx3dx


(f)     2 dx

(h) f     + 1id

(j) jcos(z)dz


2. Evaluate the following definite integrals using the Fundamental Theorem of Integral
   Calculus.


(a)    (x + 1)2dx

(c)      1 +2t dt

(e) fsin(2x)dzx

(g) 4 sin(3x)dzx


(i) f   2x sin(x2)dzx


(k) f   x sin(x2)dzx
     0//


      2
(b)    (2x+1)2dx

(d)    sec2(x)dx
     0
(f)  f5cos(3x)dx

(h) j8cos(5)d

      2
 (j) f 2x(1 + x2)5dx
      2
 (1)J x(1 + x2)5dz
     -1


3. For each of the following functions, graph both f and

                                 F(x) =_    f(t)dt
                                          0


together over the given interval.
(a) f (x) = sin(x) on [-27, 27]
             1
(c) f (x)        o 4n [-3,3]1


(b) f (x) = sin(x2) on [0, 10]
             1
(d) f (x) =     o +1n [0, 10]


﻿




4. Find the derivatives of each of the following functions.

   (a) F(x) = fsin2(4t)dt                   (b) g()   ft     2 dt

   (c) F(x) = fcos3(t)dt                    (d) G(t)        4 - z2 dz


   (e) f(x)  f    1+s2 ds                   (f) h(z)         1 + t2 dt

5. Evaluate the following derivatives.
       d       1   dt                                3h sdn(3t)
   (a)      1+t2                                dI            dt

   (c) i JI sin2(3x)dz                      (d) _         1    dt
       dt J                                     dx 3x   /1+ t2
6. Find the area of the region beneath one arch of the curve y = 3 sin(2x).
7. Let R be the region bounded by the curves y = x^2 and y = (x - 2)^2 and the x-axis.
   Find the area of R.
8. Explain why the integral

                                    \int_0^1 (x - x^2)\,dx

   is the area of the region bounded by the curves y = x^2 and y = x. Find this area.
9. Explain why the integral

                                    \int_{-1}^{1} (2 - 2x^2)\,dx

   is the area of the region bounded by the curves y = 1 - x^2 and y = x^2 - 1. Find this
   area.


﻿


Section 4.4


       Differential Equations          Using the Fundamental Theorem


As we saw in Section 4.3, using the Fundamental Theorem of Integral Calculus reduces
the problem of evaluating a definite integral to the problem of finding an antiderivative.
Unfortunately, finding antiderivatives, even for relatively simple functions, cannot be done
as routinely as the computation of derivatives. For example, suppose we let f(x) = sin(x),
g(x) = x, and

                                    h(x) = \frac{f(x)}{g(x)} = \frac{\sin(x)}{x}.

Then, since we know the derivative of f and we know the derivative of g, it is a simple
matter to find the derivative of h using the quotient rule. However, knowing the an-
tiderivatives of f and g in no way helps us find the antiderivative of h. In fact, it has been
shown that the antiderivative of h is not expressible in terms of any finite combination of
algebraic and elementary transcendental functions. Because of results like this, many of
the definite integrals that are encountered in applications cannot be evaluated using the
Fundamental Theorem of Integral Calculus; instead, they must be approximated using nu-
merical techniques such as those we studied in Section 4.2. Of course, when antiderivatives
are available, the Fundamental Theorem is the best way to evaluate an integral. To this
end, we will investigate, in this section and in the next, techniques for evaluating definite
integrals by finding antiderivatives and applying the Fundamental Theorem.
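
To illustrate the numerical fallback mentioned above, the following sketch approximates a definite integral of h(x) = sin(x)/x, for which no elementary antiderivative is available. The interval [1, 2], the helper names `simpson` and `h_quotient`, and the choice of Simpson's rule are assumptions made here for illustration; they are not taken from the text.

```python
import math

def simpson(f, a, b, n=100):
    """Approximate the integral of f over [a, b] with composite Simpson's rule.

    n is the number of subintervals and must be even.
    """
    if n % 2 != 0:
        raise ValueError("n must be even")
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        # Interior points alternate between weight 4 (odd i) and weight 2 (even i).
        total += (4 if i % 2 == 1 else 2) * f(a + i * h)
    return total * h / 3

def h_quotient(x):
    # h(x) = sin(x)/x from the text; the chosen interval avoids x = 0.
    return math.sin(x) / x

coarse = simpson(h_quotient, 1.0, 2.0, n=10)    # few subintervals
fine = simpson(h_quotient, 1.0, 2.0, n=1000)    # many subintervals
print(coarse, fine)
```

Comparing the coarse and fine values gives a rough sense of how quickly the approximation settles as the number of subintervals grows.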
    Before we begin, we need to introduce some additional notation and terminology.
First of all, we will call the collection of all antiderivatives of a given function f the general
antiderivative of f. For example, if f(x) = 3x^2, then the general antiderivative of f is
given by F(x) = x^3 + c, where c is an arbitrary constant.
    Second, since the Fundamental Theorem of Calculus draws a close connection between
antiderivatives and definite integrals, it is customary to borrow the notation for the general
antiderivative from the notation for the definite integral. Hence the general antiderivative
of a function f with respect to the variable x is denoted by

                                     \int f(x)\,dx.                             (4.4.1)

This is usually referred to as the indefinite integral of f with respect to x. Thus the terms
indefinite integral and general antiderivative are synonymous, and from this point on we
will prefer the former to the latter.
Example In this notation, we write

                                     \int 3x^2\,dx = x^3 + c,

where c is assumed to be an arbitrary constant.


Copyright © by Dan Sloughter 2000


    Since finding an indefinite integral involves reversing the process of differentiation, we
can rewrite our basic results about derivatives in terms of indefinite integrals. Hence we
have the following list of integration formulas:


                \int x^n\,dx = \frac{x^{n+1}}{n+1} + c  (where n \neq -1 is a rational number),   (4.4.2)

                \int \sin(x)\,dx = -\cos(x) + c,                                  (4.4.3)

                \int \cos(x)\,dx = \sin(x) + c,                                   (4.4.4)

                \int \sec^2(x)\,dx = \tan(x) + c,                                 (4.4.5)

                \int \csc^2(x)\,dx = -\cot(x) + c,                                (4.4.6)

                \int \sec(x)\tan(x)\,dx = \sec(x) + c,                            (4.4.7)

                \int \csc(x)\cot(x)\,dx = -\csc(x) + c.                           (4.4.8)

    Note that each one of these formulas may be verified by checking that the derivative
of the right-hand side is equal to the function inside the integral sign on the left-hand side.
Also notice that we have not used any special techniques to find these results; rather, we
know these formulas only because they are the inverses of differentiation formulas that we
learned in Chapter 3. Thus, for example, we know that


                               \int \sec^2(x)\,dx = \tan(x) + c,

but we do not know, nor do we even know how to begin to find, \int \sec(x)\,dx, which would
at first seem to be an easier problem.
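
The remark that each formula can be verified by differentiation can also be checked numerically. The sketch below compares a central-difference approximation of the derivative of each right-hand side in (4.4.3) through (4.4.8) with the corresponding integrand. The sample point `x0 = 0.7`, the step size, and the helper name `central_difference` are illustrative assumptions, and a small numerical error is evidence of agreement rather than a proof.

```python
import math

def central_difference(F, x, h=1e-5):
    """Numerically approximate F'(x) with a symmetric difference quotient."""
    return (F(x + h) - F(x - h)) / (2 * h)

# Each entry pairs a right-hand side F (without the constant c) with its integrand f.
formulas = {
    "(4.4.3)": (lambda x: -math.cos(x), lambda x: math.sin(x)),
    "(4.4.4)": (lambda x: math.sin(x), lambda x: math.cos(x)),
    "(4.4.5)": (lambda x: math.tan(x), lambda x: 1 / math.cos(x) ** 2),
    "(4.4.6)": (lambda x: -1 / math.tan(x), lambda x: 1 / math.sin(x) ** 2),
    "(4.4.7)": (lambda x: 1 / math.cos(x), lambda x: math.tan(x) / math.cos(x)),
    "(4.4.8)": (lambda x: -1 / math.sin(x), lambda x: 1 / (math.tan(x) * math.sin(x))),
}

x0 = 0.7  # arbitrary sample point where all six functions are defined
errors = {
    label: abs(central_difference(F, x0) - f(x0))
    for label, (F, f) in formulas.items()
}
for label, err in errors.items():
    print(label, err)
```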
    The following proposition is a consequence of the corresponding basic properties of
differentiation.
Proposition If the indefinite integrals of f and g exist, then


                 \int (f(x) + g(x))\,dx = \int f(x)\,dx + \int g(x)\,dx          (4.4.9)

and
                 \int (f(x) - g(x))\,dx = \int f(x)\,dx - \int g(x)\,dx.         (4.4.10)

Moreover, for any constant k,

                 \int k f(x)\,dx = k \int f(x)\,dx.                              (4.4.11)



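Example Using (4.4.2) with the results of the previous proposition, we have

      \int (5x^3 - 3x + 2)\,dx = 5\int x^3\,dx - 3\int x\,dx + 2\int 1\,dx
                              = \frac{5}{4}x^4 - \frac{3}{2}x^2 + 2x + c.

    It is worth noting that \int 1\,dx is typically denoted simply by \int dx.
Example Using (4.4.2),

      \int \frac{1}{\sqrt{t}}\,dt = \int t^{-1/2}\,dt = \frac{t^{1/2}}{1/2} + c = 2\sqrt{t} + c.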

Example Using (4.4.2), (4.4.4), and (4.4.9), we have

      \int \left(\cos(x) + \frac{4}{x^2}\right) dx = \int \cos(x)\,dx + 4\int x^{-2}\,dx
                                                  = \sin(x) - 4x^{-1} + c
                                                  = \sin(x) - \frac{4}{x} + c.


    Sometimes the indefinite integral of a function, although not itself in the list (4.4.2)
through (4.4.8), may be found with the use of some intelligent guessing. For example,
F(x) = sin(2x) is not an antiderivative of f (x) = cos(2x) since F'(x) = 2 cos(2x). However,
since F' and f differ only by a factor of 2, we can correct for this by dividing F by 2. That
is,
                              \int \cos(2x)\,dx = \frac{1}{2}\sin(2x) + c.

Again, as with all indefinite integrals, you may verify this result by differentiation.
Example To find \int 3\sin(4x)\,dx, we might begin with a guess using F(x) = -3 cos(4x).
However, F'(x) = 12 sin(4x), which differs from the function we are integrating by a factor
of 4. Thus, dividing our initial guess by 4, we have

      \int 3\sin(4x)\,dx = -\frac{3}{4}\cos(4x) + c.


Example To find \int \sqrt{2t+3}\,dt, we might begin with a guess using

      F(t) = \frac{(2t+3)^{3/2}}{3/2} = \frac{2}{3}(2t+3)^{3/2}.

However,

      F'(t) = (2t+3)^{1/2}(2) = 2\sqrt{2t+3},

so we need to divide our guess by 2. Hence

      \int \sqrt{2t+3}\,dt = \frac{1}{3}(2t+3)^{3/2} + c.


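Example To find
      \int \frac{1}{\sqrt{3z+1}}\,dz,
we might start with an initial guess of

      F(z) = \frac{(3z+1)^{1/2}}{1/2} = 2\sqrt{3z+1}.

Since
      F'(z) = \frac{3}{\sqrt{3z+1}},
we find that
      \int \frac{dz}{\sqrt{3z+1}} = \frac{2}{3}\sqrt{3z+1} + c.
Thus, for example,

      \int_0^5 \frac{1}{\sqrt{3z+1}}\,dz = \frac{2}{3}\sqrt{3z+1}\,\Big|_0^5 = \frac{8}{3} - \frac{2}{3} = 2.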
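
As a sanity check on the guess-and-correct approach, the sketch below evaluates the definite integral of 1/sqrt(3z + 1) over [0, 5] in two ways: from the antiderivative (2/3)sqrt(3z + 1) found above, and from a midpoint-rule approximation. The helper name `midpoint_rule` and the number of subintervals are illustrative assumptions, not part of the text.

```python
import math

def midpoint_rule(f, a, b, n=100_000):
    """Approximate the integral of f over [a, b] using n midpoint rectangles."""
    h = (b - a) / n
    return h * sum(f(a + (i + 0.5) * h) for i in range(n))

def integrand(z):
    return 1.0 / math.sqrt(3.0 * z + 1.0)

def antiderivative(z):
    # The corrected guess from the example: (2/3) * sqrt(3z + 1).
    return (2.0 / 3.0) * math.sqrt(3.0 * z + 1.0)

exact = antiderivative(5.0) - antiderivative(0.0)  # Fundamental Theorem
approx = midpoint_rule(integrand, 0.0, 5.0)        # numerical approximation
print(exact, approx)
```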

    The common thread in the previous examples is the need to modify an initial guess
because of the chain rule. For example, F(x) = sin(2x) is not an antiderivative of f(x) =
cos(2x) because the chain rule comes into play when differentiating F, resulting in an
extra factor of 2. This process of reversing the chain rule can be taken a step further to
help evaluate integrals in even more complicated situations. For example, consider the
indefinite integral
                                     \int 2x\sqrt{1+x^2}\,dx.

The key to evaluating this integral is recognizing that the factor 2x is the derivative of the
function inside the square root. That is, if we let

                                    f(x) = \sqrt{x}

and
                                    g(x) = 1 + x^2,

then
                          \int 2x\sqrt{1+x^2}\,dx = \int f(g(x))g'(x)\,dx.



Thus we are trying to find an antiderivative of a function which is in the form of the result
from a chain rule differentiation. Now if F is an antiderivative of f, then, using the chain
rule,
                        \frac{d}{dx}F(g(x)) = F'(g(x))g'(x) = f(g(x))g'(x).

Hence, thinking of u as 1 + x^2, we really only need to find the antiderivative of f with
respect to u. Now
                              \int f(u)\,du = \int \sqrt{u}\,du = \frac{2}{3}u^{3/2} + c,

so, substituting 1 + x^2 back in for u, we should have

                          \int 2x\sqrt{1+x^2}\,dx = \frac{2}{3}(1+x^2)^{3/2} + c.

You should check this result by differentiation, noting in particular that the factor of 2x
comes from the use of the chain rule.
    In general, if F is an antiderivative of f and u = g(x) is some differentiable function
of x, then, by the chain rule,

                             \frac{d}{dx}F(u) = F'(u)\frac{du}{dx} = f(u)\frac{du}{dx}.          (4.4.12)

Writing this as an integration formula, we have

                         \int f(u)\frac{du}{dx}\,dx = F(u) + c = \int f(u)\,du.                  (4.4.13)

This technique to help find indefinite integrals is called integration by substitution.
Example To find
                                    \int 2x\sin(x^2)\,dx,

we should let u = x^2. Then
                                    \frac{du}{dx} = 2x,
so, using (4.4.13) with f(u) = sin(u),

                              \int 2x\sin(x^2)\,dx = \int \sin(u)\frac{du}{dx}\,dx
                                                 = \int \sin(u)\,du
                                                 = -\cos(u) + c
                                                 = -\cos(x^2) + c.


We summarize this technique as follows.


﻿


Integration by substitution To evaluate an indefinite integral of the form

      \int f(g(x))g'(x)\,dx,                                                    (4.4.14)

we may make the substitution u = g(x). We then have

      \int f(g(x))g'(x)\,dx = \int f(u)\frac{du}{dx}\,dx = \int f(u)\,du.        (4.4.15)


    Of course, this technique will work only if we know an antiderivative for f. Indeed, all
we have done is replace one indefinite integral with another, with the hope that the new
integral will be simpler than the original. In our notation, we can think of the transition
from
      \int f(g(x))g'(x)\,dx

to

      \int f(u)\,du

as replacing g(x) by u and g'(x)dx by du. Thus in practice we often denote the process of
substitution by writing

      u = g(x)
      du = g'(x)dx                                                              (4.4.16)

and directly substituting into the integral

      \int f(g(x))g'(x)\,dx

to obtain the integral

      \int f(u)\,du.


We will illustrate this in the next examples.
Example To evaluate the indefinite integral

      \int \frac{2x}{\sqrt{2+x^2}}\,dx

we may let u = 2 + x^2. Then

      \frac{du}{dx} = 2x,

which we write in the form

      du = 2x\,dx.

Substituting, we have

      \int \frac{2x}{\sqrt{2+x^2}}\,dx = \int \frac{1}{\sqrt{u}}\,du = 2\sqrt{u} + c = 2\sqrt{2+x^2} + c.   (4.4.17)


Example From (4.4.17), it is easy to see, after dividing through by 2, that

      \int \frac{x}{\sqrt{2+x^2}}\,dx = \sqrt{2+x^2} + c.

We could also see this directly when making the substitution. Namely, if we let u = 2 + x^2,
then du = 2x dx may be written as

      \frac{1}{2}\,du = x\,dx.

Hence, if we substitute u for 2 + x^2 and \frac{1}{2}du for x dx, we obtain

      \int \frac{x}{\sqrt{2+x^2}}\,dx = \int \frac{1}{2}\,\frac{1}{\sqrt{u}}\,du
                                     = \frac{1}{2}\int \frac{1}{\sqrt{u}}\,du
                                     = \frac{1}{2}(2\sqrt{u}) + c
                                     = \sqrt{2+x^2} + c.


Example To evaluate the indefinite integral

      \int 5x\cos(x^2)\,dx,

we may make the substitution

      u = x^2
      du = 2x\,dx.

Then

      \frac{1}{2}\,du = x\,dx,

so we have

      \int 5x\cos(x^2)\,dx = \frac{5}{2}\int \cos(u)\,du = \frac{5}{2}\sin(u) + c = \frac{5}{2}\sin(x^2) + c.


Example To evaluate the indefinite integral

      \int \tan(3x)\sec^2(3x)\,dx,

we may make the substitution

      u = \tan(3x)
      du = 3\sec^2(3x)\,dx.

Then

      \frac{1}{3}\,du = \sec^2(3x)\,dx,

so we have

      \int \tan(3x)\sec^2(3x)\,dx = \frac{1}{3}\int u\,du = \frac{1}{6}u^2 + c = \frac{1}{6}\tan^2(3x) + c.


Now suppose we want to evaluate the definite integral

      \int_a^b f(g(x))g'(x)\,dx.

If F is an antiderivative of f, then we know that

      \int f(g(x))g'(x)\,dx = F(g(x)) + c.                                       (4.4.18)

Hence

      \int_a^b f(g(x))g'(x)\,dx = F(g(x))\,\Big|_a^b = F(g(b)) - F(g(a)).         (4.4.19)

Now we also have

      \int_{g(a)}^{g(b)} f(u)\,du = F(u)\,\Big|_{g(a)}^{g(b)} = F(g(b)) - F(g(a)).  (4.4.20)

Putting (4.4.19) and (4.4.20) together, we see that

      \int_a^b f(g(x))g'(x)\,dx = \int_{g(a)}^{g(b)} f(u)\,du.                    (4.4.21)

That is, similar to our work with indefinite integrals, we may evaluate the definite integral

      \int_a^b f(g(x))g'(x)\,dx

by making a substitution

      u = g(x)
      du = g'(x)dx,                                                             (4.4.22)


the only difference being that in the definite integral we must also change the limits of
integration. Note that the new limits of integration correspond to the range of values for
u given that x is ranging from a to b.
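
The change of limits in (4.4.21) can be checked numerically. The sketch below takes g(x) = x^2 and f(u) = sin(u), as in the earlier indefinite example, and compares the integral in x over [a, b] with the integral in u over [g(a), g(b)]. The endpoints a = 0 and b = 2, the helper name `simpson`, and the use of Simpson's rule are assumptions made for illustration.

```python
import math

def simpson(f, a, b, n=1000):
    """Composite Simpson's rule on [a, b] with n (even) subintervals."""
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 == 1 else 2) * f(a + i * h)
    return total * h / 3

def g(x):
    return x * x          # inner function u = g(x)

def g_prime(x):
    return 2 * x          # its derivative, so du = 2x dx

f = math.sin              # outer function f(u)

a, b = 0.0, 2.0           # illustrative limits for x
left = simpson(lambda x: f(g(x)) * g_prime(x), a, b)   # integral in x from a to b
right = simpson(f, g(a), g(b))                         # integral in u from g(a) to g(b)
exact = -math.cos(g(b)) + math.cos(g(a))               # F(g(b)) - F(g(a)) with F = -cos
print(left, right, exact)
```

Both numerical values should agree with F(g(b)) - F(g(a)) up to the accuracy of the rule, which is the content of (4.4.19) and (4.4.20).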


﻿


Example To evaluate

      \int_0^1 \frac{x^2}{(1+x^3)^2}\,dx

we may make the substitution

      u = 1 + x^3
      du = 3x^2\,dx.

Then

      \frac{1}{3}\,du = x^2\,dx

and u varies from

      1 + 0^3 = 1

to

      1 + 1^3 = 2

as x varies from 0 to 1, so

      \int_0^1 \frac{x^2}{(1+x^3)^2}\,dx = \frac{1}{3}\int_1^2 \frac{1}{u^2}\,du
                                        = -\frac{1}{3}\,\frac{1}{u}\,\Big|_1^2
                                        = -\frac{1}{6} + \frac{1}{3}
                                        = \frac{1}{6}.

Example To evaluate

      \int_0^{\pi/2} \sin^2(x)\cos(x)\,dx,

we may make the substitution
                           u = sin(x)
                           du = cos(x)dx.

Then u varies from 0 to 1 as x varies from 0 to \pi/2, so

      \int_0^{\pi/2} \sin^2(x)\cos(x)\,dx = \int_0^1 u^2\,du = \frac{1}{3}u^3\,\Big|_0^1 = \frac{1}{3}.


Example To evaluate

      \int_0^{\pi/2} \cos^3(x)\sin(x)\,dx,

we may make the substitution

      u = cos(x)
      du = -sin(x)dx.

Then -du = sin(x)dx and u varies from 1 to 0 as x varies from 0 to \pi/2, so

      \int_0^{\pi/2} \cos^3(x)\sin(x)\,dx = -\int_1^0 u^3\,du = \int_0^1 u^3\,du = \frac{1}{4}u^4\,\Big|_0^1 = \frac{1}{4}.



Example So far all of our examples of substitution have involved reversing the results
of the chain rule. However, substitutions can be useful in other situations as well. For
example, to evaluate

      \int_0^3 x\sqrt{1+x}\,dx,

the substitution

      u = 1 + x
      du = dx

turns out to be useful for rearranging the integral into a form which can be evaluated.
Namely, since u = 1 + x implies that x = u - 1, we may substitute to obtain

      \int_0^3 x\sqrt{1+x}\,dx = \int_1^4 (u-1)\sqrt{u}\,du
                             = \int_1^4 (u^{3/2} - u^{1/2})\,du
                             = \left(\frac{2}{5}u^{5/2} - \frac{2}{3}u^{3/2}\right)\Big|_1^4
                             = \left(\frac{64}{5} - \frac{16}{3}\right) - \left(\frac{2}{5} - \frac{2}{3}\right)
                             = \frac{116}{15}.
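
The arithmetic in this last example can be reproduced exactly. The sketch below evaluates the antiderivative (2/5)u^(5/2) - (2/3)u^(3/2) at u = 4 and u = 1 with rational arithmetic, then compares the result with a direct numerical approximation of the original integral in x. The helper names `antiderivative_at_square` and `simpson` are hypothetical and chosen here for illustration.

```python
from fractions import Fraction
import math

def antiderivative_at_square(root):
    """Evaluate (2/5)u^(5/2) - (2/3)u^(3/2) exactly at u = root**2.

    For u a perfect square, u^(5/2) = root**5 and u^(3/2) = root**3 are integers.
    """
    return Fraction(2, 5) * root**5 - Fraction(2, 3) * root**3

# u runs from 1 = 1**2 to 4 = 2**2 as x runs from 0 to 3.
exact = antiderivative_at_square(2) - antiderivative_at_square(1)

def simpson(f, a, b, n=1000):
    """Composite Simpson's rule on [a, b] with n (even) subintervals."""
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 == 1 else 2) * f(a + i * h)
    return total * h / 3

# Direct numerical approximation of the original integral in x.
approx = simpson(lambda x: x * math.sqrt(1 + x), 0.0, 3.0)
print(exact, float(exact), approx)
```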


    We will continue the discussion of techniques for using the Fundamental Theorem of
Integral Calculus in Section 4.5.

Problems

1. Evaluate the following indefinite integrals.


   (a) f(x3 + 3x - 6)dx

   (c) J    dx

   (e) J     dt

   (g) f7 x+5dx

2. Evaluate the following indefinite integrals.

   (a) fsin(3x)dx

   (c)   v3t -1 dt


(b) J(3t2


- 4t + 5)dt


(d) J(3z       ) dz

(f)     2     dx

(h) J(sin(6) - 2 cos(O))dO


(b) fcos(4x)dx

(d)            dz
     J/1 + 5z


﻿



    (e) f7sec2(2x)dx

    (g) f2csc2(7x)dx

 3. Evaluate the following indefinite integrals.

   (a) f6x1+3x2do

   (c) fx2(3+ax3)Odx

   (e) f4tsin(t2)dt

   (g) f sin3(t) cos(t)dt

 4. Evaluate the following indefinite integrals.

   (a) fsin( ) dx

   (c) f sec3(4x) tan(4x)dx

   (e)    sin(x) dx
       J cos2(x)

   (g) ft t-2dt

 5. Evaluate the following definite integrals.

    (a)   (4x2 - 3x - 5)dx

    (c) f3sin(2x)dx

    (e)     6 sec(3t) tan(3t)dt


    (g)   x   x2+1dx
 6. Evaluate the following definite integrals.

    (a) f1    5x2   dx
       J-1 (x3 + 2)2

    (c) f3x sin(x2)dx

    (e) fsin3 (2t) cos(2t)dt



  (f) f 3 sec(4x) tan(4x)dx

  (h) fsin(4x+1)dx




(b)

(d)

(f)

(h)


(b)

(d)

(f)

(h)


(b)

(d)

(f)

(h)


(b)

(d)


(f)


f 4x3 cos(x4)dx
      7     dx
    4+3x2
f 7z cos(3z2 + 1)dz

f4 cos4(3t) sin(3t)dt


f sec2(x) tan2(x)dx

f sin(O) cos(O)dO

f    cos (3t)  dt
1    1 + sin(3t)
f z dz
J   z + 1d


  3x+1

     2t -1dt
     1 d

     (7z +6)2 dz

f sin4 (t) cos(t)dt


  2 3x dx
Jo   x2+1  

f   cos2(t) sin(t)dt
f 5x(2 + x2)1dx


﻿




   (g) f-x(1 + x2)25dx
        /1-1
7. Evaluate the following definite integrals.

   (a) f     X+    dx
          Jo x+1

   (c)      tan4 (4x) sec2(4x)dx


   (e)    4x(1 + x)25dx


(h)      sec2(u)tan(u)du


(b) f       x+d
     (b x/2 + 1 d

(d)     cot (t) csc2(t)dt

    (f)  f sin(20) dO
    o  cos3(20)


8.
9.


(g) f   5uv2u - 1 du                        (h) f   sin5 (w) cos(w)dw

Find the area beneath one arch of the curve y = 4sin(6t).
(a) Plot the graph of y =sin2(x) cos(x) over the interval [0,7r].
(b) Find the area of the region beneath the graph of y =sin2(x) cos(x) over the interval
    [0, 7].
(c) Verify that


                             f  sin2(x) cos(x)dc

and justify your result geometrically.


0


﻿


Section 4.5


        Differential Equations         More Techniques of Integration


In the last section we saw how we could exploit our knowledge of the chain rule to develop
a technique for simplifying integrals using suitably chosen substitutions. In this section
we shall see how we can develop a second technique, called integration by parts, using the
product rule. Outside of algebraic manipulation and the use of various functional identities,
like the trigonometric identities, substitution and parts are the only basic techniques we
have available to us for simplifying the process of evaluating an integral.

Example Suppose we wish to find \int x\cos(x)\,dx. Since

                                  \int \cos(x)\,dx = \sin(x) + c,

we might make an initial guess of F(x) = x sin(x) for an antiderivative of f(x) = x cos(x).
But, of course, differentiation of F, using the product rule, yields

                               F'(x) = x cos(x) + sin(x),

which differs from the desired result, f(x), by the term sin(x). However, since

                                  \int \sin(x)\,dx = -\cos(x) + c,

we can obtain an antiderivative of f(x) by adding on the term cos(x) to F(x). That is,

                                G(x) = x sin(x) + cos(x)

is an antiderivative of f(x) since the derivative of cos(x) will cancel the sin(x) term in
F'(x). Explicitly,
                      G'(x) = x cos(x) + sin(x) - sin(x) = x cos(x).

Thus
                            \int x\cos(x)\,dx = x\sin(x) + \cos(x) + c.




In general, suppose f and g are differentiable functions and we want to evaluate

      \int f(x)g'(x)\,dx.                                                        (4.5.1)

For example, in our previous example we would have f(x) = x and g(x) = sin(x). From
the product rule we know that

      \frac{d}{dx} f(x)g(x) = f(x)g'(x) + g(x)f'(x).                             (4.5.2)

Thus, integrating both sides of (4.5.2), we have

      \int \frac{d}{dx} f(x)g(x)\,dx = \int f(x)g'(x)\,dx + \int g(x)f'(x)\,dx,   (4.5.3)

from which it follows that

      f(x)g(x) = \int f(x)g'(x)\,dx + \int g(x)f'(x)\,dx.                        (4.5.4)

Rearranging (4.5.4) gives us

      \int f(x)g'(x)\,dx = f(x)g(x) - \int g(x)f'(x)\,dx.                        (4.5.5)

Applying (4.5.5) to our example, with f(x) = x and g(x) = sin(x), we have

      \int x\cos(x)\,dx = x\sin(x) - \int \sin(x)\,dx = x\sin(x) + \cos(x) + c.

In effect, using (4.5.5), we have replaced the problem of evaluating \int x\cos(x)\,dx with the
simpler problem of evaluating \int \sin(x)\,dx. In general, the success of this method always
depends on the integral

      \int g(x)f'(x)\,dx                                                         (4.5.6)

being easier to evaluate than the integral

      \int f(x)g'(x)\,dx.                                                        (4.5.7)

    It is common with this technique to let u = f(x) and v = g(x) along with the notation,
as we did with substitution,

      dv = g'(x)dx                                                              (4.5.8)

and

      du = f'(x)dx.                                                             (4.5.9)



With this notation, (4.5.5) becomes

                               \int u\,dv = uv - \int v\,du,                     (4.5.10)

the standard form for what is known as integration by parts.
Example     To evaluate the integral \int x\sin(x)\,dx by parts, we must first make a choice for u
and dv. Here we might choose

                               u = x      dv = sin(x)dx.

It follows then that du = dx. However, there are many possible choices for v; all that
we require is that the derivative of v must be sin(x). The simplest choice is to take
v = -cos(x). Then we have, applying (4.5.10),

           \int x\sin(x)\,dx = -x\cos(x) + \int \cos(x)\,dx = -x\cos(x) + \sin(x) + c.

Example     To evaluate the integral \int x^2\cos(2x)\,dx, we might choose

                              u = x^2      dv = cos(2x)dx,

from which we obtain
                             du = 2x dx    v = \frac{1}{2}\sin(2x).

Thus
                    \int x^2\cos(2x)\,dx = \frac{1}{2}x^2\sin(2x) - \int x\sin(2x)\,dx.

This time we do not immediately know the value of the integral on the right, but we know
we can find it using integration by parts. Namely, to evaluate \int x\sin(2x)\,dx, we let

                              u = x      dv = sin(2x)dx
                             du = dx     v = -\frac{1}{2}\cos(2x).

Then

     \int x\sin(2x)\,dx = -\frac{1}{2}x\cos(2x) + \frac{1}{2}\int \cos(2x)\,dx
                       = -\frac{1}{2}x\cos(2x) + \frac{1}{4}\sin(2x) + c.

Hence
     \int x^2\cos(2x)\,dx = \frac{1}{2}x^2\sin(2x) + \frac{1}{2}x\cos(2x) - \frac{1}{4}\sin(2x) + c.
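
Since two rounds of integration by parts leave room for sign slips, it may help to check the final antiderivative by differentiating it numerically. The sketch below compares a central-difference derivative of G(x) = (1/2)x^2 sin(2x) + (1/2)x cos(2x) - (1/4)sin(2x) with the integrand x^2 cos(2x) at a few sample points. The sample points, step size, and helper names are illustrative assumptions.

```python
import math

def central_difference(F, x, h=1e-5):
    """Numerically approximate F'(x) with a symmetric difference quotient."""
    return (F(x + h) - F(x - h)) / (2 * h)

def G(x):
    # Candidate antiderivative obtained by integrating by parts twice (constant omitted).
    return (0.5 * x**2 * math.sin(2 * x)
            + 0.5 * x * math.cos(2 * x)
            - 0.25 * math.sin(2 * x))

def integrand(x):
    return x**2 * math.cos(2 * x)

sample_points = [-2.0, -0.5, 0.0, 0.3, 1.0, 2.5]
max_error = max(abs(central_difference(G, x) - integrand(x)) for x in sample_points)
print(max_error)
```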

    The key to success with integration by parts is in the choice of the parts, u and dv.
For example, we saw in an example that the choices

                               u = x      dv = sin(x)dx
                              du = dx     v = -cos(x)

work well for evaluating \int x\sin(x)\,dx. Alternatively, we could have chosen

                                u = sin(x)        dv = x dx
                               du = cos(x)dx      v = \frac{1}{2}x^2,

which would yield

                     \int x\sin(x)\,dx = \frac{1}{2}x^2\sin(x) - \frac{1}{2}\int x^2\cos(x)\,dx.


All of this is correct, but useless (at least for our present purpose) since the resulting
integral on the right is more complicated than the integral with which we started. If we
had started to work the problem this way, we would probably stop at this point and rethink
our strategy.
Example In using integration by parts to evaluate a definite integral, we must remember
to evaluate all the pieces of the resulting antiderivative. For example, to evaluate


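                                  \int_0^{\pi} 4x\cos(3x)\,dx,

we might choose
                               u = 4x      dv = cos(3x)dx
                              du = 4dx     v = \frac{1}{3}\sin(3x).

Then
      \int_0^{\pi} 4x\cos(3x)\,dx = \frac{4}{3}x\sin(3x)\,\Big|_0^{\pi} - \frac{4}{3}\int_0^{\pi} \sin(3x)\,dx
                                 = (0 - 0) + \frac{4}{9}\cos(3x)\,\Big|_0^{\pi}
                                 = -\frac{4}{9} - \frac{4}{9}
                                 = -\frac{8}{9}.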


Example     Although integration by parts is most frequently of use when integrating func-
tions involving transcendental functions, such as the trigonometric functions, there are
other times when the technique may be used. For example, to compute

                                    \int_0^1 x(1+x)^{10}\,dx,

we could use
                               u = x       dv = (1+x)^{10}dx
                              du = dx      v = \frac{1}{11}(1+x)^{11}.

Then

      \int_0^1 x(1+x)^{10}\,dx = \frac{1}{11}x(1+x)^{11}\,\Big|_0^1 - \frac{1}{11}\int_0^1 (1+x)^{11}\,dx
                              = \frac{2048}{11} - \frac{1}{132}(1+x)^{12}\,\Big|_0^1
                              = \frac{2048}{11} - \frac{4096}{132} + \frac{1}{132}
                              = \frac{6827}{44}.

Notice that we could also evaluate this integral using the substitution u = 1 + x.
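
Because the integrand x(1 + x)^10 is a polynomial, the value above can be confirmed exactly by a third route: expand with the binomial theorem and integrate term by term. The sketch below does this with rational arithmetic and compares the result with the expression obtained from integration by parts. The function name `integral_x_times_one_plus_x_power` is hypothetical, and the expansion approach is a check added here rather than a method used in the text.

```python
from fractions import Fraction
from math import comb

def integral_x_times_one_plus_x_power(n):
    """Exact value of the integral of x(1+x)^n over [0, 1].

    Expanding (1+x)^n with the binomial theorem gives terms C(n, k) x^(k+1),
    each of which integrates to C(n, k)/(k+2) on [0, 1].
    """
    return sum(Fraction(comb(n, k), k + 2) for k in range(n + 1))

by_expansion = integral_x_times_one_plus_x_power(10)

# Expression obtained from integration by parts in the example.
by_parts = Fraction(2048, 11) - Fraction(4096, 132) + Fraction(1, 132)

print(by_expansion, by_parts)
```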

Miscellaneous examples
The techniques of substitution and parts are often useful for putting an integral into a form
that can be readily evaluated by the Fundamental Theorem of Integral Calculus. The next
several examples illustrate how basic trigonometric identities are also useful for rewriting
integrals in more easily evaluated forms.

Example To evaluate \int \sin^2(x)\,dx, we may use the identity

      \sin^2(x) = \frac{1 - \cos(2x)}{2}                                         (4.5.11)

(see Problem 5, Section 2.2). Then

      \int \sin^2(x)\,dx = \frac{1}{2}\int (1 - \cos(2x))\,dx = \frac{1}{2}x - \frac{1}{4}\sin(2x) + c.


Example Similarly, to evaluate

      \int_0^{\pi/4} \cos^2(2t)\,dt,

we use the identity

      \cos^2(x) = \frac{1 + \cos(2x)}{2}.                                        (4.5.12)

Then

      \int_0^{\pi/4} \cos^2(2t)\,dt = \frac{1}{2}\int_0^{\pi/4} (1 + \cos(4t))\,dt
                                   = \left(\frac{1}{2}t + \frac{1}{8}\sin(4t)\right)\Big|_0^{\pi/4}
                                   = \frac{\pi}{8}.


Example     To evaluate \int \sin^2(x)\cos^2(x)\,dx, the identity

                                \sin(x)\cos(x) = \frac{1}{2}\sin(2x)             (4.5.13)

is useful (see Problem 4, Section 2.2). From it we obtain

      \int \sin^2(x)\cos^2(x)\,dx = \int (\sin(x)\cos(x))^2\,dx
                                 = \int \left(\frac{1}{2}\sin(2x)\right)^2 dx
                                 = \frac{1}{4}\int \sin^2(2x)\,dx
                                 = \frac{1}{8}\int (1 - \cos(4x))\,dx
                                 = \frac{1}{8}x - \frac{1}{32}\sin(4x) + c.

Example To evaluate \int \sin^3(x)\,dx we may use the identity

                                  \sin^2(x) = 1 - \cos^2(x)

to write
                     \sin^3(x) = \sin^2(x)\sin(x) = (1 - \cos^2(x))\sin(x).
Then the substitution
                                     u = cos(x)
                                     du = -sin(x)dx
gives us

      \int \sin^3(x)\,dx = \int (1 - \cos^2(x))\sin(x)\,dx
                        = -\int (1 - u^2)\,du
                        = -u + \frac{1}{3}u^3 + c
                        = -\cos(x) + \frac{1}{3}\cos^3(x) + c.

This manipulation is useful in evaluating any integral of the form \int \sin^n(x)\,dx or, in a
similar fashion, \int \cos^n(x)\,dx, provided n is a positive odd integer.
Example As a final example, note that the identity

      \tan^2(x) = \sec^2(x) - 1



(see Problem 3, Section 2.2) is useful in evaluating


                              \int_0^{\pi/4} \tan^2(x)\,dx.

Namely,

      \int_0^{\pi/4} \tan^2(x)\,dx = \int_0^{\pi/4} (\sec^2(x) - 1)\,dx
                                  = \tan(x)\,\Big|_0^{\pi/4} - x\,\Big|_0^{\pi/4}
                                  = 1 - \frac{\pi}{4}.
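
A result obtained through a trigonometric identity can be cross-checked without using the identity at all. The sketch below approximates the integral of tan^2(x) over [0, pi/4] directly and compares it with 1 - pi/4. The helper name `simpson` and the number of subintervals are illustrative assumptions.

```python
import math

def simpson(f, a, b, n=1000):
    """Composite Simpson's rule on [a, b] with n (even) subintervals."""
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 == 1 else 2) * f(a + i * h)
    return total * h / 3

upper = math.pi / 4
approx = simpson(lambda x: math.tan(x) ** 2, 0.0, upper)  # direct numerical value
exact = 1 - math.pi / 4                                   # value from the identity
print(approx, exact)
```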

    This concludes our discussion of techniques of integration. As we noted above, there
are basically only two techniques for evaluating indefinite integrals, substitution and parts,
and even these rely on an ability to reduce a given integral to a form where an antiderivative
is recognizable. Hence the situation is not nearly as straightforward as it was for finding
derivatives and best affine approximations. For this reason, in the past tables of indefinite
integrals were compiled to aid in the evaluation of integrals; when faced with an integral
more involved than the basic ones we have investigated in these last two sections, one
could hope to find it, or one related to it through a substitution or an integration by
parts, in a table. For the most part, tables of integrals have been replaced by computer
programs, such as computer algebra systems, which are capable of finding antiderivatives
symbolically. Such programs are then able to evaluate definite integrals exactly using
the Fundamental Theorem. Although these programs are immensely useful and are an
everyday tool for those working with applications of mathematics, one must use them with
care. In particular, whenever possible, you should check your answer for reasonableness.
Moreover, there are integrals which the system will not be able to evaluate symbolically,
either because the given integral is beyond the capabilities of the system, or because a
symbolic answer does not even exist. In such cases, one must, of necessity, fall back on
numerical approximation techniques.

Problems

1. Evaluate the following indefinite integrals.

    (a) f3csin(x)d                              (b)   2xcos(5)dx


    (c f/4w sin(3w)dz                           (d) fxc2 cos(3cc)dzc

    (e) f2c2 sin(4c)dzc                         (f) fw3cos(x)dz


    (g f/3) sin(2xc)dzc                         (h) f   v/1 + cc dcc


﻿


2. Evaluate the following definite integrals.

    (a)    4x sin(x)dx                       (b)f   3x cos(2x)dx

    (c) f32t sin(3t)dt                      (d) fox2 cos(x)dx

    (e) f  2x2 sin(2x)dx                     (f)    z3 cos(4z)dz
 3. Evaluate the following indefinite integrals.
    (a) fsin2(2x)dz                          (b) fcos2(3t)dt

    (c) f 5 sin2(2t) cos2(2t)dt             (d) f sin3 (3x)dx

    (e) f6cos3(2z)dz                         (f) fsin5 (t)dt

    (g) fcos(2x)dz                          (h) ftan2(3)dO
 4. Evaluate the following definite integrals.

    (a) fsin2(x)dz                           (b) f4cos2(2t)dt
        0                                        0
    (c) f23sin2(z) cos2(z)dz                 (d)     cos3(t)dt

    (e) ]  sin3(3t)dt                        (f)     tan2(2t)dt
                  0- s
 5. Evaluate the following integrals using a computer algebra system.
    (a) fcos(x)dz                           (b) fJsin2(2t) cos4(2t)dt

    (c) fsin(2t)cos4(3t)dt                  (d) fsec4(3x)dz

    (e) f    1 - 2 do                        (f) fc2 1-c2 do

    (g) f2Fsin8 (2t)dt                       (h) f  tan6 dt

 6. Evaluate the following integrals with any method at your disposal.

    (a) j  sin4 (x)dx                        (b) [in(x) do

    (c) f  sin(3x2)dx                       (d) f     1 + cc2 dcc




   (e) j   1 dx                                   j(f1)  dt

   (g) f 5- 3 sin2(t) dt                      (h)                   dt
                                                      o1 + sin2 (t)
7. If a pendulum of length b is held, at rest, at an angle $\alpha$ from the perpendicular,
   $0 < \alpha < \pi$, and then released, its period T, the time required for one complete
   oscillation, is given by

   \[ T = 4\sqrt{\frac{b}{g}} \int_0^{\pi/2} \frac{1}{\sqrt{1 - k^2 \sin^2(\varphi)}}\,d\varphi, \]

   where g = 980 cm/sec^2 (the acceleration due to gravity) and $k = \sin(\alpha/2)$.
   (a) Find the period of a pendulum of length 50 centimeters which is released initially
       from an angle of a = j
   (b) Repeat (a) for a =4 }, , 6,and l.
   (c) In Section 2.2 we noted that for small values of $\alpha$, if x(t) represents the angle the
       pendulum makes with the perpendicular at time t, then, to a good approximation,

       \[ x(t) = \alpha \cos\left(\sqrt{\frac{g}{b}}\,t\right). \]

       Thus, in this approximation, x has a period of $2\pi\sqrt{b/g}$. For a pendulum of length
       50 centimeters, compare this result with your results in parts (a) and (b).
   (d) For a pendulum of length 50 centimeters, graph T as a function of a for - 4


﻿


Section 4.6


       Differential Equations          Improper Integrals


In this section we will make two extensions to our definition of the definite integral. The
first will cover integrals of functions over intervals of the form $[a, \infty)$ and $(-\infty, b]$, where a
and b are fixed real numbers, as well as the interval $(-\infty, \infty)$, while the second will cover
integrals of functions which have infinite discontinuities. An integral of either one of these
two types is called an improper integral.


Figure 4.6.1 Area of $R_b$ approaches area under $y = 1/x^2$ as b increases


    First, consider a function f defined on an interval $[a, \infty)$ with the property that f is
integrable on every interval [a, b] with $a < b < \infty$. For example, the function

\[ f(x) = \frac{1}{x^2} \]

is defined for all x in $[1, \infty)$ and, since it is continuous on $[1, \infty)$, is integrable on any
interval [1, b] with $1 < b < \infty$. If we let $R_b$ be the region beneath the graph of f over the
interval [1, b] and we let R be the region beneath the graph of f over the interval $[1, \infty)$,
then we would expect that the area of $R_b$ would approach the area of R in the limit as b
goes to infinity (see Figure 4.6.1). In terms of integrals, this is saying that it would seem
reasonable to define

\[ \int_1^\infty \frac{1}{x^2}\,dx = \lim_{b \to \infty} \int_1^b \frac{1}{x^2}\,dx. \]




Copyright @ by Dan Sloughter 2000


﻿




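That is, we should have

\[
\begin{aligned}
\int_1^\infty \frac{1}{x^2}\,dx &= \lim_{b \to \infty} \int_1^b \frac{1}{x^2}\,dx \\
&= \lim_{b \to \infty} \left. -\frac{1}{x} \right|_1^b \\
&= \lim_{b \to \infty} \left( 1 - \frac{1}{b} \right) = 1.
\end{aligned}
\]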

Geometrically, this result says that R has finite area, namely, 1, even though it has infinite
length.
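
The limit can also be watched numerically. The following Python sketch is an illustration only and is not part of the original text; the helper names `midpoint_integral` and `area_R_b` are hypothetical. It approximates the area of $R_b$ with a midpoint rule and prints it next to the exact value $1 - 1/b$.

```python
def midpoint_integral(f, a, b, n=100_000):
    """Approximate the integral of f over [a, b] with the midpoint rule."""
    dx = (b - a) / n
    return sum(f(a + (i + 0.5) * dx) for i in range(n)) * dx


def area_R_b(b):
    """Approximate area of R_b, the region beneath y = 1/x^2 over [1, b]."""
    return midpoint_integral(lambda x: 1.0 / x**2, 1.0, b)


# The areas increase toward 1 as b grows, matching 1 - 1/b.
for b in (2, 10, 100, 1000):
    print(b, area_R_b(b), 1 - 1 / b)
```

The printed areas should creep up toward 1 without ever reaching it, which is the behavior the limit describes.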
    We now state a general definition for this type of integral.
Definition If f is defined on $[a, \infty)$ and integrable on [a, b] for all $a < b < \infty$, then we
define

\[ \int_a^\infty f(x)\,dx = \lim_{b \to \infty} \int_a^b f(x)\,dx \tag{4.6.1} \]

provided the limit exists. Similarly, if f is defined on $(-\infty, b]$ and integrable on [a, b] for
all $-\infty < a < b$, then we define

\[ \int_{-\infty}^b f(x)\,dx = \lim_{a \to -\infty} \int_a^b f(x)\,dx, \tag{4.6.2} \]

provided the limit exists. Finally, if f is defined on $(-\infty, \infty)$ and integrable on any finite
interval [a, b], then we define

\[ \int_{-\infty}^\infty f(x)\,dx = \int_{-\infty}^0 f(x)\,dx + \int_0^\infty f(x)\,dx, \tag{4.6.3} \]

provided both of the integrals on the right exist. In each case where the appropriate limit
exists, we say the integral converges; otherwise, the integral is said to diverge.
    Note that the use of 0 in (4.6.3) is not crucial; all that is important is that the integral
is broken into two pieces, the meaning of each of the pieces already having been covered
in the earlier parts of the definition.

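Example The integral

\[ \int_3^\infty \frac{1}{x^3}\,dx \]

converges, since

\[
\begin{aligned}
\int_3^\infty \frac{1}{x^3}\,dx &= \lim_{b \to \infty} \int_3^b \frac{1}{x^3}\,dx \\
&= \lim_{b \to \infty} \left. -\frac{1}{2x^2} \right|_3^b \\
&= \lim_{b \to \infty} \left( \frac{1}{18} - \frac{1}{2b^2} \right) \\
&= \frac{1}{18}.
\end{aligned}
\]

See Figure 4.6.2.

Figure 4.6.2 Region beneath $y = 1/x^3$ beginning at 3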


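Example The integral

\[ \int_2^\infty \frac{1}{\sqrt{x}}\,dx \]

diverges, since

\[
\lim_{b \to \infty} \int_2^b \frac{1}{\sqrt{x}}\,dx
= \lim_{b \to \infty} \left. 2\sqrt{x} \right|_2^b
= \lim_{b \to \infty} \left( 2\sqrt{b} - 2\sqrt{2} \right) = \infty.
\]

See Figure 4.6.3.

Figure 4.6.3 Region beneath $y = 1/\sqrt{x}$ beginning at 2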
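
To contrast a convergent integral with a divergent one, the sketch below tabulates the partial integrals of $1/x^3$ over [3, b] and of $1/\sqrt{x}$ over [2, b], using the antiderivatives $-1/(2x^2)$ and $2\sqrt{x}$. It is an illustrative aid rather than part of the original text, and the function names `F_cubic` and `F_root` are hypothetical.

```python
import math


def F_cubic(b):
    """Integral of 1/x^3 over [3, b], from the antiderivative -1/(2x^2)."""
    return 1 / 18 - 1 / (2 * b**2)


def F_root(b):
    """Integral of 1/sqrt(x) over [2, b], from the antiderivative 2*sqrt(x)."""
    return 2 * math.sqrt(b) - 2 * math.sqrt(2)


# F_cubic settles near 1/18 while F_root keeps growing without bound.
for b in (10, 10**3, 10**6):
    print(b, F_cubic(b), F_root(b))
```

The first column of values levels off, while the second grows roughly like $2\sqrt{b}$, which is what divergence to infinity looks like in practice.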


﻿


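Example The integral

\[ \int_{-\infty}^{-1} \frac{2}{x^5}\,dx \]

converges, since

\[
\begin{aligned}
\int_{-\infty}^{-1} \frac{2}{x^5}\,dx &= \lim_{a \to -\infty} \int_a^{-1} \frac{2}{x^5}\,dx \\
&= \lim_{a \to -\infty} \left. -\frac{2}{4x^4} \right|_a^{-1} \\
&= \lim_{a \to -\infty} \left( -\frac{1}{2} + \frac{1}{2a^4} \right) \\
&= -\frac{1}{2}.
\end{aligned}
\]

See Figure 4.6.4.

Figure 4.6.4 Region above $y = 2/x^5$ ending at -1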


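Example The integral

\[ \int_{-\infty}^\infty \frac{x}{(1 + x^2)^2}\,dx \]

converges, since

\[
\begin{aligned}
\int_{-\infty}^\infty \frac{x}{(1 + x^2)^2}\,dx
&= \int_{-\infty}^0 \frac{x}{(1 + x^2)^2}\,dx + \int_0^\infty \frac{x}{(1 + x^2)^2}\,dx \\
&= \lim_{a \to -\infty} \int_a^0 \frac{x}{(1 + x^2)^2}\,dx + \lim_{b \to \infty} \int_0^b \frac{x}{(1 + x^2)^2}\,dx \\
&= \lim_{a \to -\infty} \left. -\frac{1}{2(1 + x^2)} \right|_a^0 + \lim_{b \to \infty} \left. -\frac{1}{2(1 + x^2)} \right|_0^b \\
&= \lim_{a \to -\infty} \left( -\frac{1}{2} + \frac{1}{2(1 + a^2)} \right) + \lim_{b \to \infty} \left( -\frac{1}{2(1 + b^2)} + \frac{1}{2} \right) \\
&= -\frac{1}{2} + \frac{1}{2} \\
&= 0.
\end{aligned}
\]

Note that you could use the substitution $u = 1 + x^2$ to help evaluate the integral in this
example. See Figure 4.6.5.

Figure 4.6.5 Region between $y = x/(1 + x^2)^2$ and the x-axis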

    It is frequently important to know that an integral $\int_a^\infty f(x)\,dx$ converges even if we
cannot compute its value exactly. For example, before trying to find numerical approximations
for such an integral one should first check that it converges. We will first consider
the following situation: Suppose f and g are defined on $[a, \infty)$, integrable on [a, b] for all
$a < b < \infty$, and $0 \le f(x) \le g(x)$ for all x in $[a, \infty)$. Moreover, suppose we know that
$\int_a^\infty g(x)\,dx$ converges. Let

\[ M = \int_a^\infty g(x)\,dx, \tag{4.6.4} \]

\[ G(b) = \int_a^b g(x)\,dx, \tag{4.6.5} \]

and

\[ F(b) = \int_a^b f(x)\,dx \tag{4.6.6} \]

for all b > a. Now for any b > a,

\[ M = \int_a^\infty g(x)\,dx = \int_a^b g(x)\,dx + \int_b^\infty g(x)\,dx = G(b) + \int_b^\infty g(x)\,dx. \tag{4.6.7} \]

Since $g(x) \ge 0$ for all $x \ge a$,

\[ \int_b^\infty g(x)\,dx \ge 0. \tag{4.6.8} \]


﻿


Thus (4.6.7) implies that

\[ G(b) = M - \int_b^\infty g(x)\,dx \le M \tag{4.6.9} \]

for all b > a. Moreover, $f(x) \le g(x)$ for all $x \ge a$, so

\[ F(b) = \int_a^b f(x)\,dx \le \int_a^b g(x)\,dx = G(b) \tag{4.6.10} \]

for all b > a. Putting (4.6.9) and (4.6.10) together, we have $F(b) \le M$ for all b > a.
Furthermore, for any c > b > a,

\[ F(c) = \int_a^c f(x)\,dx = \int_a^b f(x)\,dx + \int_b^c f(x)\,dx \ge F(b), \tag{4.6.11} \]

where we know

\[ \int_b^c f(x)\,dx \ge 0 \tag{4.6.12} \]

because $f(x) \ge 0$ for all $x \ge a$. From (4.6.11) we conclude that F is a nondecreasing
function. Since we already know that F is bounded by M, it follows from our result about
bounded nondecreasing sequences in Section 1.2 that

\[ \lim_{b \to \infty} F(b) = \lim_{b \to \infty} \int_a^b f(x)\,dx \tag{4.6.13} \]

exists. That is,

\[ \int_a^\infty f(x)\,dx = \lim_{b \to \infty} \int_a^b f(x)\,dx \tag{4.6.14} \]

converges. Moreover, since $F(b) \le M$ for all $b \ge a$,

\[ \int_a^\infty f(x)\,dx = \lim_{b \to \infty} F(b) \le M = \int_a^\infty g(x)\,dx. \tag{4.6.15} \]


    On the other hand, suppose f and g are defined on $[a, \infty)$, integrable on [a, b] for all
$a < b < \infty$, $0 \le f(x) \le g(x)$ for all x in $[a, \infty)$, and $\int_a^\infty f(x)\,dx$ diverges. If we define F
and G as above, then F(b) is nondecreasing and without a limit as b increases toward $\infty$.
Hence it follows, again from our results in Section 1.2, that we must have

\[ \lim_{b \to \infty} F(b) = \infty. \tag{4.6.16} \]

Since, as above, $G(b) \ge F(b)$ for all b > a, (4.6.16) implies that

\[ \lim_{b \to \infty} \int_a^b g(x)\,dx = \lim_{b \to \infty} G(b) = \infty. \tag{4.6.17} \]

In particular, $\int_a^\infty g(x)\,dx$ diverges.


﻿




    We summarize the previous results in the next proposition.
Proposition Suppose f and g are defined on $[a, \infty)$, integrable on [a, b] for all $a < b < \infty$,
and $0 \le f(x) \le g(x)$ for all x in $[a, \infty)$. If $\int_a^\infty g(x)\,dx$ converges, then $\int_a^\infty f(x)\,dx$ converges
and

\[ 0 \le \int_a^\infty f(x)\,dx \le \int_a^\infty g(x)\,dx. \tag{4.6.18} \]

If $\int_a^\infty f(x)\,dx$ diverges, then $\int_a^\infty g(x)\,dx$ diverges.
    Similar results hold for integrals on intervals of the form $(-\infty, b]$ and $(-\infty, \infty)$.
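Example At present we cannot use the Fundamental Theorem to evaluate

\[ \int_0^\infty \frac{1}{1 + x^2}\,dx \]

because we do not know an antiderivative for

\[ f(x) = \frac{1}{1 + x^2} \]

(although we will find one in Section 6.5). However, since $x^2 < 1 + x^2$ for all values of x,
we know that

\[ 0 < \frac{1}{1 + x^2} < \frac{1}{x^2} \]

for all x > 0. Now we saw above that

\[ \int_1^\infty \frac{1}{x^2}\,dx = 1, \]

so we know, by the previous proposition, that

\[ \int_1^\infty \frac{1}{1 + x^2}\,dx \]

converges with

\[ \int_1^\infty \frac{1}{1 + x^2}\,dx \le 1. \tag{4.6.19} \]

Moreover, $1 + x^2 \ge 1$ for all x, so

\[ \frac{1}{1 + x^2} \le 1 \]

for all x, and hence

\[ \int_0^1 \frac{1}{1 + x^2}\,dx \le \int_0^1 1\,dx = 1. \tag{4.6.20} \]

Putting (4.6.19) and (4.6.20) together, we have

\[ \int_0^\infty \frac{1}{1 + x^2}\,dx = \int_0^1 \frac{1}{1 + x^2}\,dx + \int_1^\infty \frac{1}{1 + x^2}\,dx \le 1 + 1 = 2. \]

In the problems for Section 6.5, you will be asked to show that

\[ \int_0^\infty \frac{1}{1 + x^2}\,dx = \frac{\pi}{2}. \]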
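
The comparison argument can be illustrated numerically. The sketch below is an illustration under stated assumptions, not part of the original text: the helper `midpoint_integral` and the name `F` are hypothetical, and the midpoint rule is only one of several ways to approximate the partial integrals $F(b) = \int_0^b \frac{1}{1+x^2}\,dx$. The values should be nondecreasing in b and stay below the bound 2.

```python
import math


def midpoint_integral(f, a, b, n=200_000):
    """Approximate the integral of f over [a, b] with the midpoint rule."""
    dx = (b - a) / n
    return sum(f(a + (i + 0.5) * dx) for i in range(n)) * dx


def F(b):
    """Approximate partial integral of 1/(1 + x^2) over [0, b]."""
    return midpoint_integral(lambda x: 1.0 / (1.0 + x * x), 0.0, b)


# F(b) increases with b, stays below 2, and approaches pi/2.
for b in (1, 10, 100, 1000):
    print(b, F(b))
print("pi/2 =", math.pi / 2)
```

The bound 2 from the comparison is not sharp; the partial integrals level off near $\pi/2$, roughly 1.5708.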

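Example Since $\sqrt{x} - 1 < \sqrt{x}$ for all $x \ge 0$, we see that

\[ 0 < \frac{1}{\sqrt{x}} < \frac{1}{\sqrt{x} - 1} \]

for all x > 1. Thus, by the previous proposition,

\[ \int_2^\infty \frac{1}{\sqrt{x} - 1}\,dx \]

diverges since we saw above that

\[ \int_2^\infty \frac{1}{\sqrt{x}}\,dx \]

diverges.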
    Although we will not go into the details, the previous proposition may be generalized
as follows.
Proposition Suppose $h(x) \le f(x) \le g(x)$ for all x in an interval $[a, \infty)$ and f, g, and
h are integrable on [a, b] for all $a < b < \infty$. If both $\int_a^\infty h(x)\,dx$ and $\int_a^\infty g(x)\,dx$ converge,
then $\int_a^\infty f(x)\,dx$ converges as well. Moreover, in that case,

\[ \int_a^\infty h(x)\,dx \le \int_a^\infty f(x)\,dx \le \int_a^\infty g(x)\,dx. \tag{4.6.21} \]

    Note that our previous proposition is a special case of this proposition with h(x) = 0
for all $x \ge a$. As before, similar results hold for integrals on intervals of the form $(-\infty, b]$
and $(-\infty, \infty)$.
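Example Since $-1 \le \sin(x) \le 1$ for all x, it follows that

\[ -\frac{1}{x^2} \le \frac{\sin(x)}{x^2} \le \frac{1}{x^2} \]

for all $x \ge 1$. Moreover,

\[ \int_1^\infty \frac{1}{x^2}\,dx = 1 \]

and

\[ \int_1^\infty -\frac{1}{x^2}\,dx = -\int_1^\infty \frac{1}{x^2}\,dx = -1. \]

Hence it follows that

\[ \int_1^\infty \frac{\sin(x)}{x^2}\,dx \]

converges and

\[ -1 \le \int_1^\infty \frac{\sin(x)}{x^2}\,dx \le 1. \]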

    Since, for any function f, $-|f(x)| \le f(x) \le |f(x)|$ for all values of x, the
following proposition is a special case of the previous proposition.

Proposition If f is defined on $[a, \infty)$ and $\int_a^\infty |f(x)|\,dx$ converges, then $\int_a^\infty f(x)\,dx$ converges.

Example Another way to see that

\[ \int_1^\infty \frac{\sin(x)}{x^2}\,dx \]

converges is to note that

\[ \int_1^\infty \left| \frac{\sin(x)}{x^2} \right| dx \]

converges since

\[ 0 \le \left| \frac{\sin(x)}{x^2} \right| = \frac{|\sin(x)|}{x^2} \le \frac{1}{x^2} \]

for all $x \ge 1$.

    Once again, similar results hold for integrals on intervals of the form $(-\infty, b]$ and
$(-\infty, \infty)$.
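
As a numerical illustration of the last two examples (a sketch only; the helper `midpoint_integral` and the names `signed` and `absolute` are hypothetical and not from the text), the following Python code approximates the partial integrals of $\sin(x)/x^2$ and of $|\sin(x)|/x^2$ over [1, b]. Both should stay between -1 and 1, and the second should never exceed $1 - 1/b$, the partial integral of $1/x^2$.

```python
import math


def midpoint_integral(f, a, b, n=200_000):
    """Approximate the integral of f over [a, b] with the midpoint rule."""
    dx = (b - a) / n
    return sum(f(a + (i + 0.5) * dx) for i in range(n)) * dx


def signed(b):
    """Approximate integral of sin(x)/x^2 over [1, b]."""
    return midpoint_integral(lambda x: math.sin(x) / x**2, 1.0, b)


def absolute(b):
    """Approximate integral of |sin(x)|/x^2 over [1, b]."""
    return midpoint_integral(lambda x: abs(math.sin(x)) / x**2, 1.0, b)


# Columns: b, signed integral, absolute integral, comparison bound 1 - 1/b.
for b in (10, 100, 1000):
    print(b, signed(b), absolute(b), 1 - 1 / b)
```

The signed and absolute columns both level off as b grows, consistent with convergence of the two improper integrals.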
    We now consider another extension to our definition of the definite integral. Suppose
the function f is defined on the interval (a, b] with

\[ \lim_{x \to a^+} |f(x)| = \infty. \]

If f is integrable on every interval of the form [c, b] with a < c < b, then we may, analogous
to our earlier definitions, define

\[ \int_a^b f(x)\,dx = \lim_{c \to a^+} \int_c^b f(x)\,dx, \tag{4.6.22} \]


provided the limit exists. See Figure 4.6.6.


﻿


Figure 4.6.6 Area over (a, b] is the limit of the area over [c, b] as c approaches a

Definition If f is defined on the interval (a, b] with

\[ \lim_{x \to a^+} |f(x)| = \infty, \]

and is integrable on every interval of the form [c, b] with a < c < b, then we define

\[ \int_a^b f(x)\,dx = \lim_{c \to a^+} \int_c^b f(x)\,dx, \tag{4.6.23} \]

provided the limit exists. Similarly, if f is defined on the interval [a, b) with

\[ \lim_{x \to b^-} |f(x)| = \infty, \]

and is integrable on every interval of the form [a, c] with a < c < b, then we define

\[ \int_a^b f(x)\,dx = \lim_{c \to b^-} \int_a^c f(x)\,dx, \tag{4.6.24} \]

provided the limit exists. Finally, if f is defined on [a, d) and (d, b] with either

\[ \lim_{x \to d^-} |f(x)| = \infty, \]

or

\[ \lim_{x \to d^+} |f(x)| = \infty, \]

or both, and f is integrable on all intervals of the form [a, c] with a < c < d and of the
form [c, b] with d < c < b, then we define

\[ \int_a^b f(x)\,dx = \int_a^d f(x)\,dx + \int_d^b f(x)\,dx, \tag{4.6.25} \]

provided both the integrals on the right exist. In each case where the appropriate limit
exists, we say the integral converges; otherwise, the integral is said to diverge.


﻿


Example The integral

\[ \int_0^1 \frac{1}{\sqrt{x}}\,dx \]

converges since

\[
\int_0^1 \frac{1}{\sqrt{x}}\,dx
= \lim_{c \to 0^+} \int_c^1 \frac{1}{\sqrt{x}}\,dx
= \lim_{c \to 0^+} \left. 2\sqrt{x} \right|_c^1
= \lim_{c \to 0^+} \left( 2 - 2\sqrt{c} \right) = 2.
\]

See Figure 4.6.7.

Figure 4.6.7 Region beneath $y = 1/\sqrt{x}$ over [0, 1]


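Example The integral

\[ \int_0^1 \frac{1}{x^2}\,dx \]

diverges since

\[
\lim_{c \to 0^+} \int_c^1 \frac{1}{x^2}\,dx
= \lim_{c \to 0^+} \left. -\frac{1}{x} \right|_c^1
= \lim_{c \to 0^+} \left( -1 + \frac{1}{c} \right) = \infty.
\]

See Figure 4.6.8.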
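
The difference between these two examples shows up clearly when the integrals over [c, 1] are tabulated for shrinking c. The sketch below uses the antiderivatives $2\sqrt{x}$ and $-1/x$; it is illustrative only, and the function names `tail_sqrt` and `tail_square` are hypothetical.

```python
import math


def tail_sqrt(c):
    """Integral of 1/sqrt(x) over [c, 1], from the antiderivative 2*sqrt(x)."""
    return 2 - 2 * math.sqrt(c)


def tail_square(c):
    """Integral of 1/x^2 over [c, 1], from the antiderivative -1/x."""
    return 1 / c - 1


# As c -> 0+, tail_sqrt approaches 2 while tail_square grows without bound.
for c in (1e-1, 1e-3, 1e-6):
    print(c, tail_sqrt(c), tail_square(c))
```

Both integrands are unbounded near 0, yet only the first region has finite area; the rate at which the integrand blows up is what matters.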


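Figure 4.6.8 Region beneath $y = 1/x^2$ over [0, 1]

Example The integral

\[ \int_0^2 \frac{1}{(x - 1)^{2/3}}\,dx \]

is improper since

\[ \lim_{x \to 1^-} \frac{1}{(x - 1)^{2/3}} = \infty \]

and

\[ \lim_{x \to 1^+} \frac{1}{(x - 1)^{2/3}} = \infty. \]

Moreover, the integral converges since

\[
\begin{aligned}
\int_0^1 \frac{1}{(x - 1)^{2/3}}\,dx &= \lim_{c \to 1^-} \int_0^c \frac{1}{(x - 1)^{2/3}}\,dx \\
&= \lim_{c \to 1^-} \left. 3(x - 1)^{1/3} \right|_0^c \\
&= \lim_{c \to 1^-} \left( 3(c - 1)^{1/3} + 3 \right) \\
&= 3
\end{aligned}
\]

and

\[
\begin{aligned}
\int_1^2 \frac{1}{(x - 1)^{2/3}}\,dx &= \lim_{c \to 1^+} \int_c^2 \frac{1}{(x - 1)^{2/3}}\,dx \\
&= \lim_{c \to 1^+} \left. 3(x - 1)^{1/3} \right|_c^2 \\
&= \lim_{c \to 1^+} \left( 3 - 3(c - 1)^{1/3} \right) \\
&= 3,
\end{aligned}
\]

which together imply that

\[ \int_0^2 \frac{1}{(x - 1)^{2/3}}\,dx = \int_0^1 \frac{1}{(x - 1)^{2/3}}\,dx + \int_1^2 \frac{1}{(x - 1)^{2/3}}\,dx = 3 + 3 = 6. \]

See Figure 4.6.9.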


﻿


Figure 4.6.9 Region beneath $y = 1/(x - 1)^{2/3}$ over [0, 2]


Problems

1. Evaluate the following integrals.

    (a) j3° dx


    (c) j  54 dx


    (e)     (2x+ 3)2 dx
 2. Evaluate the following integrals.

    (a) j    2 dx


    (c)J           dx

 3. For each of the following, decide, without
    or diverges.

    (a) f         dx
       /*x3+2

    (c) f1            dz
       2(z2 - 2)1/3

    (e)    sin3(t) dt
    fe°1      t2  dt


(b)      y dx


(d) f      + 1 dx


(f) f   sin(x)dx


(b) f c     +(x2+4)4 dx


(d)f        5t    dt
      _c /t2+1
evaluating, whether the integral converges

          *1
 (b)J      2+5 dx


 (d)             dt

 (f) j    cos(z)
    (f) &z) dz


﻿



4. Evaluate the following integrals.

   (a) f   1  dx                           (b)J   4 dx
         0Xs                                   0 X

   (c) f     1   dx                        (d) f o   52 dt
        0 v1-x                                 0(t -2)5


   (e) J-20   222 dz                       (f) f  i2 dx
        2(z + 2) - X
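 5. (a) Show that

        \[ \int_1^\infty \frac{1}{x^p}\,dx \]

        converges for p > 1. Find its value.
    (b) Show that

        \[ \int_1^\infty \frac{1}{x^p}\,dx \]

        diverges for $p \le 1$.

 6. (a) Show that

        \[ \int_0^1 \frac{1}{x^p}\,dx \]

        converges for p < 1. Find its value.
    (b) Show that

        \[ \int_0^1 \frac{1}{x^p}\,dx \]

        diverges for $p \ge 1$.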

 7. Let

    \[ s_n = 1 + \frac{1}{2} + \frac{1}{3} + \cdots + \frac{1}{n} \]

    for n = 1, 2, 3, .... That is, $s_n$ is the nth partial sum of the harmonic series (see Section
    1.3).

    (a) Show that

        \[ s_n \le 1 + \int_1^n \frac{1}{x}\,dx \]

        for n = 1, 2, 3, .... (Hint: Use the right-hand rule to approximate the integral.)
    (b) Show that

        \[ \int_1^\infty \frac{1}{x}\,dx \]

        diverges.
    (c) Use a geometric argument to conclude that

        \[ \int_0^1 \frac{1}{x}\,dx \]

        also diverges.
 8. For constants $\mu > 0$ and $\alpha > 0$, the function

    \[ p(x) = \frac{\alpha \mu^\alpha}{x^{\alpha + 1}}, \]

    where $x \ge \mu$, is called a Pareto distribution. It is often used in modeling the distribution
    of incomes or wealth in a population. In the income interpretation, the function

    \[ P(x) = \int_x^\infty p(t)\,dt, \]

    $x \ge \mu$, gives the proportion of the population whose income exceeds x. Here $\mu$ represents
    the minimum income of any person in the population and $\alpha$ controls how rapidly
    the income distribution diminishes as x increases.
    (a) Find P(x).
    (b) If $\alpha > 1$, the average income of a population described by this model is

        \[ A = \int_\mu^\infty x\,p(x)\,dx. \]

        Find A.
    (c) Why is the condition $\alpha > 1$ needed in (b)?
    (d) Suppose $\mu = 10{,}000$ and $\alpha = 1.2$. Find A, P(A), and P(2A). Interpret the
        meaning of these values.
    (e) Find the general expression for P(A) as a function of $\alpha$ and graph it. Use this
        graph to interpret the fairness of the income distribution for different values of $\alpha$.
 9. If f is integrable on [-b, b] for all b > 0 and

    \[ \lim_{b \to \infty} \int_{-b}^b f(x)\,dx \]

    exists, then we call

    \[ I(f) = \lim_{b \to \infty} \int_{-b}^b f(x)\,dx \]

    the Cauchy integral of f.


    (b) Find I(f) and I(g) for f(x) = x and g(x) = sin(x).
    (c) Show that the Cauchy integral of f may exist even though $\int_{-\infty}^\infty f(x)\,dx$ diverges.

﻿


Section 4.7


       Differential Equations         More on Area


In Section 4.1 we motivated the definition of the definite integral with the idea of finding
the area of a region in the plane. However, to solve the problem we restricted to a very
special type of region, namely, a region lying between the graph of a function f and an
interval on the x-axis. We will now consider the more general problem of the area of a
region lying between the graphs of two functions.


Figure 4.7.1 Approximating the area between y = f(x) and y = g(x)

    Suppose f and g are functions defined on an interval [a, b] with $g(x) \le f(x)$ for all x
in [a, b]. We suppose that f and g are integrable on [a, b], from which it follows that the
function k defined by

\[ k(x) = f(x) - g(x) \]

is also integrable on [a, b]. Let R be the region lying between the graphs of f and g over
the interval [a, b] and let A be the area of R. In other words, A is the area of the region of
the plane bounded by the curves y = f(x), y = g(x), x = a, and x = b. We begin with an
approximation for A. First, we divide [a, b] into n intervals of equal length

\[ \Delta x = \frac{b - a}{n} \]

and let $a = x_0 < x_1 < x_2 < x_3 < \cdots < x_n = b$ be the endpoints of these intervals. Next,
for i = 1, 2, 3, ..., n, let $R_i$ be the region lying between the graphs of f and g over the
interval $[x_{i-1}, x_i]$. If $A_i$ is the area of $R_i$, then

\[ A = \sum_{i=1}^n A_i. \tag{4.7.1} \]

Now $f(x_i) - g(x_i)$ is the distance between the graphs of f and g at $x_i$, and so

\[ (f(x_i) - g(x_i))\,\Delta x \]

should approximate $A_i$ reasonably well when $\Delta x$ is small. Thus

\[ \sum_{i=1}^n (f(x_i) - g(x_i))\,\Delta x = \sum_{i=1}^n k(x_i)\,\Delta x \tag{4.7.2} \]

will approximate A. Moreover, we should expect that this approximation will improve as
$\Delta x$ decreases, that is, as n increases, and so we should have

\[ A = \lim_{n \to \infty} \sum_{i=1}^n k(x_i)\,\Delta x. \tag{4.7.3} \]

But now the right-hand side of (4.7.2) is a Riemann sum, in particular, the right-hand rule
sum, and so the limit on the right-hand side of (4.7.3) is the definite integral of k on [a, b].
Hence we have

\[ A = \int_a^b k(x)\,dx = \int_a^b (f(x) - g(x))\,dx. \tag{4.7.4} \]

Figure 4.7.2 Region bounded by the graphs of $y = 2 - x^2$ and $y = x^2$

Example Let R be the region bounded by the curves $y = 2 - x^2$ and $y = x^2$, as shown
in Figure 4.7.2. Note that these curves intersect when

\[ 2 - x^2 = x^2, \]

which implies that $2x^2 = 2$, that is, x = -1 or x = 1. Hence the two curves intersect at
(-1, 1) and (1, 1), and so we may describe R as the region between the curves $y = 2 - x^2$
and $y = x^2$ which lies above the interval [-1, 1]. Thus if A is the area of R, we have

\[
\begin{aligned}
A &= \int_{-1}^1 ((2 - x^2) - x^2)\,dx \\
&= \int_{-1}^1 (2 - 2x^2)\,dx \\
&= \left. \left( 2x - \frac{2}{3}x^3 \right) \right|_{-1}^1 \\
&= \left( 2 - \frac{2}{3} \right) - \left( -2 + \frac{2}{3} \right) \\
&= \frac{8}{3}.
\end{aligned}
\]

Figure 4.7.3 Region bounded by the graphs of $x = y^2$ and $x = y + 2$
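
The right-hand rule sum in (4.7.2) can be computed directly for the region between $y = 2 - x^2$ and $y = x^2$. The following Python sketch is an illustration, not part of the original text; the function name `area_between` and the names `upper` and `lower` are hypothetical.

```python
def area_between(f, g, a, b, n=100_000):
    """Right-hand rule sum of (f(x_i) - g(x_i)) * dx over [a, b], as in (4.7.2)."""
    dx = (b - a) / n
    return sum(f(a + i * dx) - g(a + i * dx) for i in range(1, n + 1)) * dx


def upper(x):
    """Upper boundary curve y = 2 - x^2."""
    return 2 - x**2


def lower(x):
    """Lower boundary curve y = x^2."""
    return x**2


# The sums approach the exact area 8/3 as n increases.
for n in (10, 100, 1000):
    print(n, area_between(upper, lower, -1.0, 1.0, n))
print("8/3 =", 8 / 3)
```

Even modest values of n should give sums close to 8/3, since the integrand vanishes at both endpoints.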


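Example Let R be the region bounded by the curves $x = y^2$ and $x = y + 2$. These two
curves intersect when

\[ y^2 = y + 2, \]

which implies that

\[ 0 = y^2 - y - 2 = (y + 1)(y - 2). \]

Hence the two curves intersect when y = -1 and y = 2, that is, at the points (1, -1) and
(4, 2). However, looking at Figure 4.7.3, we see that not all of R lies over the interval [1, 4].
In fact, R may be broken up into two regions, $R_1$ and $R_2$, where $R_1$ is the region between
the curves $y = \sqrt{x}$ and $y = -\sqrt{x}$ over the interval [0, 1] and $R_2$ is the region between the
curves $y = \sqrt{x}$ and $y = x - 2$ over the interval [1, 4]. Thus, if A is the area of R, $A_1$ is the
area of $R_1$, and $A_2$ is the area of $R_2$, then

\[
\begin{aligned}
A &= A_1 + A_2 \\
&= \int_0^1 (\sqrt{x} - (-\sqrt{x}))\,dx + \int_1^4 (\sqrt{x} - (x - 2))\,dx \\
&= \int_0^1 2\sqrt{x}\,dx + \int_1^4 (\sqrt{x} - x + 2)\,dx \\
&= \left. \frac{4}{3}x^{3/2} \right|_0^1 + \left. \left( \frac{2}{3}x^{3/2} - \frac{1}{2}x^2 + 2x \right) \right|_1^4 \\
&= \frac{4}{3} + \left( \frac{16}{3} - 8 + 8 \right) - \left( \frac{2}{3} - \frac{1}{2} + 2 \right) \\
&= \frac{9}{2}.
\end{aligned}
\]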


    The region R in the previous example may also be described as the region lying between
the curves x = y2 and x = y+2 over the interval [-1, 2] on the y-axis. In general, analogous
to our development above, if f and g are functions defined on an interval [c, d] on the y-axis
with $g(y) \le f(y)$ for all y in [c, d], then the area A of the region bounded by x = f(y),
x = g(y), y = c, and y = d (see Figure 4.7.4), is given by

\[ A = \int_c^d (f(y) - g(y))\,dy. \tag{4.7.5} \]

Figure 4.7.4 Region between the curves x = f(y) and x = g(y)


﻿


In particular, for our previous example we have

\[
\begin{aligned}
A &= \int_{-1}^2 (y + 2 - y^2)\,dy \\
&= \left. \left( \frac{1}{2}y^2 + 2y - \frac{1}{3}y^3 \right) \right|_{-1}^2 \\
&= \left( 2 + 4 - \frac{8}{3} \right) - \left( \frac{1}{2} - 2 + \frac{1}{3} \right) \\
&= \frac{9}{2}.
\end{aligned}
\]

In this case, the second method for solving the problem is a little simpler than the first; in
general, it is often useful to look at a problem both ways and evaluate using the simpler
of the two approaches.

Figure 4.7.5 Region bounded by the graphs of $y = x^3 - x$ and $y = x^2$
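Example Let R be the region bounded by the curves $y = x^3 - x$ and $y = x^2$. These
curves intersect when

\[ x^3 - x = x^2, \]

that is, when

\[ 0 = x^3 - x^2 - x = x(x^2 - x - 1). \]

Hence the curves intersect when x = 0,

\[ x = \frac{1 - \sqrt{5}}{2}, \]

or

\[ x = \frac{1 + \sqrt{5}}{2}, \]

where the latter two values were found using the quadratic formula. From the graphs in
Figure 4.7.5, we see that R may be divided into two regions, $R_1$ and $R_2$, where $R_1$ extends
from $x = \frac{1 - \sqrt{5}}{2}$ to x = 0 and $R_2$ extends from x = 0 to $x = \frac{1 + \sqrt{5}}{2}$. Note that in $R_1$ we
have $x^3 - x \ge x^2$, whereas $x^2 \ge x^3 - x$ in $R_2$. Thus, if A is the area of R, $A_1$ is the area
of $R_1$, and $A_2$ is the area of $R_2$, then

\[
\begin{aligned}
A &= A_1 + A_2 \\
&= \int_{\frac{1 - \sqrt{5}}{2}}^0 (x^3 - x - x^2)\,dx + \int_0^{\frac{1 + \sqrt{5}}{2}} (x^2 - x^3 + x)\,dx \\
&= \left. \left( \frac{1}{4}x^4 - \frac{1}{2}x^2 - \frac{1}{3}x^3 \right) \right|_{\frac{1 - \sqrt{5}}{2}}^0 + \left. \left( \frac{1}{3}x^3 - \frac{1}{4}x^4 + \frac{1}{2}x^2 \right) \right|_0^{\frac{1 + \sqrt{5}}{2}} \\
&= \frac{13}{12}.
\end{aligned}
\]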

Problems

1. Find the area of the region bounded by the curves y = x and y = x2.
                                                                     1
 2. Find the area of the region bounded by the curves y = wx2 and y =x+.

 3. Find the area of the region bounded by the curves y 9  and y cc +2.
 4. Find the area of the region bounded by the curves y = cos(x) and $y = x^2$.
 5. Find the area of the region bounded by the curves y =sin(cc) and y =9x.
 6. Find the area of one of the regions lying between the curves y = cos(x) and y = sin(x)
    between two consecutive points of intersection.
 7. Find the area of the region in the first quadrant bounded by y = cos(x), y = sin(x),
    and x =0.
 8. Find the area of one of the regions lying between the curves y =cos2(x) and y =sin2(x)
    between two consecutive points of intersection.
 9. Let R be the region bounded by the curves y = x2 and y = 2 - x.
    (a) Set up an integral to find the area of R using functions of x.
    (b) Set up an integral to find the area of R using functions of y.
    (c) Evaluate the simpler of the integrals in (a) and (b).
10. Find the area of the region bounded by the curves x = y2 - 1 and x = 1 - y2.
11. Find the area of the region bounded by the curves x = y2 and x = 6 - y.
12. Find the area of the region bounded by the curves $y = x^3 - 2x$ and $y = x^2$.
13. Find the area of the region bounded by the curves $y = x^4 - 4x^2$ and $y = 3x^3$.
14. To estimate the surface area of a lake, 21 measurements of the width of the lake are
    made at points spaced 50 yards apart from one end of the lake to the other. Suppose
    the measurements are, in order, 0, 50, 100, 120, 180, 240, 300, 250, 220, 295, 305, 265,
    240, 275, 225, 180, 120, 90, 63, 40, and 0, all measured in yards. Use Simpson's rule
    to approximate the surface area of the lake.


﻿


Section 4.8


       Differential Equations          Distance, Position, and the Length of Curves


Although we motivated the definition of the definite integral with the notion of area, there
are many applications of integration to problems unrelated to the computation of area.
Depending on the context, the definite integral of a function f from a to b could represent
the total mass of a wire, the total electric charge on such a wire, or the probability that a
light bulb will fail sometime in the time interval from a to b. In this section we will consider
three applications of definite integrals: finding the distance traveled by an object over an
interval of time if we are given its velocity as a function of time, finding the position of an
object at any time if we are given its initial position and its velocity as a function of time,
and finding the length of a curve.

Distance
Suppose the function v is continuous on the interval [a, b] and, for any $a \le t \le b$, v(t)
represents the velocity at time t of an object traveling along a line. Divide [a, b] into n
time intervals of equal length

\[ \Delta t = \frac{b - a}{n} \]

with endpoints $a = t_0 < t_1 < t_2 < \cdots < t_n = b$. Then, for j = 1, 2, 3, ..., n, $|v(t_{j-1})|$ is the
speed of the object at the beginning of the jth time interval. Hence, for small enough $\Delta t$,
$|v(t_{j-1})|\,\Delta t$ will give a good approximation of the distance the object will travel during
the jth time interval. Thus if D represents the total distance the object travels from time
t = a to time t = b, then

\[ D \approx \sum_{j=1}^n |v(t_{j-1})|\,\Delta t. \tag{4.8.1} \]

Moreover, we expect that as $\Delta t$ decreases, or, equivalently, as n increases, this approximation
should approach the exact value of D. That is, we should have

\[ D = \lim_{n \to \infty} \sum_{j=1}^n |v(t_{j-1})|\,\Delta t. \tag{4.8.2} \]

Now the right-hand side of (4.8.1) is a Riemann sum (in particular, a left-hand rule sum)
which approximates the definite integral

\[ \int_a^b |v(t)|\,dt. \]

Hence this integral is the value of the limit in (4.8.2), and so we have

\[ D = \lim_{n \to \infty} \sum_{j=1}^n |v(t_{j-1})|\,\Delta t = \int_a^b |v(t)|\,dt. \tag{4.8.3} \]

Figure 4.8.1 Graph of the velocity function $v(t) = 4\cos(2\pi t)$


Example Suppose an object is oscillating at the end of a spring so that its velocity at
time $t$ is given by $v(t) = 4\cos(2\pi t)$. Then the distance $D$ traveled by the object from time
$t = 0$ to time $t = 1$ is given by

$$D = \int_0^1 |4\cos(2\pi t)| \, dt = 4 \int_0^1 |\cos(2\pi t)| \, dt.$$


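Now

$$\cos(2\pi t) \geq 0 \text{ when } 0 \leq t \leq \frac{1}{4} \text{ or } \frac{3}{4} \leq t \leq 1$$

and

$$\cos(2\pi t) < 0 \text{ when } \frac{1}{4} < t < \frac{3}{4}.$$

Hence

$$|\cos(2\pi t)| = \cos(2\pi t) \text{ when } 0 \leq t \leq \frac{1}{4} \text{ or } \frac{3}{4} \leq t \leq 1,$$

and

$$|\cos(2\pi t)| = -\cos(2\pi t) \text{ when } \frac{1}{4} < t < \frac{3}{4}.$$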


﻿




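Thus

$$\begin{aligned}
\int_0^1 |\cos(2\pi t)| \, dt &= \int_0^{1/4} |\cos(2\pi t)| \, dt + \int_{1/4}^{3/4} |\cos(2\pi t)| \, dt + \int_{3/4}^{1} |\cos(2\pi t)| \, dt \\
&= \int_0^{1/4} \cos(2\pi t) \, dt - \int_{1/4}^{3/4} \cos(2\pi t) \, dt + \int_{3/4}^{1} \cos(2\pi t) \, dt \\
&= \frac{1}{2\pi} \sin(2\pi t) \Big|_0^{1/4} - \frac{1}{2\pi} \sin(2\pi t) \Big|_{1/4}^{3/4} + \frac{1}{2\pi} \sin(2\pi t) \Big|_{3/4}^{1} \\
&= \left( \frac{1}{2\pi} - 0 \right) - \left( -\frac{1}{2\pi} - \frac{1}{2\pi} \right) + \left( 0 + \frac{1}{2\pi} \right) \\
&= \frac{2}{\pi}.
\end{aligned}$$

Hence

$$D = 4 \int_0^1 |\cos(2\pi t)| \, dt = \frac{8}{\pi}.$$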
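The limit in (4.8.2) can also be observed numerically. The following is a minimal sketch, not part of the original text: the helper name `distance_left_sum` is our own, and it simply evaluates the left-hand rule sum of (4.8.1) for the spring example with increasing n.

```python
import math

def distance_left_sum(v, a, b, n):
    """Left-hand rule sum of |v(t)| over [a, b] using n equal subintervals."""
    dt = (b - a) / n
    # t_{j-1} = a + (j - 1) * dt for j = 1, ..., n
    return sum(abs(v(a + j * dt)) for j in range(n)) * dt

def v(t):
    # Velocity from the example: v(t) = 4 cos(2 pi t)
    return 4 * math.cos(2 * math.pi * t)

for n in (10, 100, 1000, 10000):
    print(n, distance_left_sum(v, 0.0, 1.0, n))
print("8/pi =", 8 / math.pi)
```

As n grows, the printed sums should settle toward the exact value 8/π found above.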


Position
Again suppose v is continuous on [a, b] and v(t) represents, for a < t < b, the velocity at
time t of an object moving on a line. Let x(t) be the position of the object at time t and
suppose we know the value of x(a), the position of the object at the beginning of the time
interval. It follows that
$$\dot{x}(t) = \frac{d}{dt} x(t) = v(t), \tag{4.8.4}$$

and so, by the Fundamental Theorem of Integral Calculus, for any $t$ between $a$ and $b$,

$$\int_a^t v(s) \, ds = x(t) - x(a). \tag{4.8.5}$$

Thus we have

$$x(t) = \int_a^t v(s) \, ds + x(a) \tag{4.8.6}$$

for a < t < b. In other words, if we are given the velocity of an object for every time t in
the interval [a, b] and the position of the object at time t = a, then we may use (4.8.6) to
compute the position of the object at any time t in [a, b].
Example As in the previous example, consider an object oscillating at the end of a
spring so that its velocity is given by $v(t) = 4\cos(2\pi t)$. If $x(t)$ is the position of the object
at time $t$ and, initially, $x(0) = 3$, then

$$x(t) = \int_0^t 4\cos(2\pi s) \, ds + 3 = \frac{2}{\pi} \sin(2\pi s) \Big|_0^t + 3 = \frac{2}{\pi} \sin(2\pi t) + 3.$$


﻿


Figure 4.8.2 Velocity $v(t) = 4\cos(2\pi t)$ and position $x(t) = \frac{2}{\pi} \sin(2\pi t) + 3$

You should compare the graphs of the velocity function v and the position function x in
Figure 4.8.2. Note that the object will oscillate between $3 - \frac{2}{\pi}$ and $3 + \frac{2}{\pi}$. In particular,
the distance between these two extremes is $\frac{4}{\pi}$, and so the object will travel a distance of $\frac{8}{\pi}$
during a complete oscillation, in agreement with our computation in the previous example.

Example Suppose the velocity of an object at time $t$ is given by $v(t) = 4\sin(t^2)$. If $x(t)$
is the position of the object at time $t$ and its position at time 0 is $x(0) = -1$, then

$$x(t) = \int_0^t 4\sin(s^2) \, ds - 1.$$

However, unlike the previous example, there does not exist a simple antiderivative for v;
hence, the best we can do is approximate x(t) for a specified value of t using numerical
integration. For example, we can compute numerically that

$$x(2) = \int_0^2 4\sin(s^2) \, ds - 1 \approx 2.219,$$

where we have rounded the result to the third decimal place. If we do this for enough
points, we can plot the graph of x, as shown in Figure 4.8.3. Again, you should compare
this graph with the graph of v, also shown in Figure 4.8.3.
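One way to carry out such a numerical integration is sketched below. The text does not say which numerical method was used, so the choice of the composite Simpson's rule here is an assumption, and the helper names `simpson` and `position` are our own.

```python
import math

def simpson(f, a, b, n=1000):
    """Composite Simpson's rule on [a, b] with n subintervals (n made even)."""
    if n % 2:
        n += 1
    h = (b - a) / n
    total = f(a) + f(b)
    for j in range(1, n):
        # Interior points alternate between weights 4 and 2
        total += (4 if j % 2 else 2) * f(a + j * h)
    return total * h / 3

def position(t, x0=-1.0):
    """x(t) = integral from 0 to t of 4 sin(s^2) ds + x(0), as in (4.8.6)."""
    return simpson(lambda s: 4 * math.sin(s * s), 0.0, t) + x0

print(round(position(2.0), 3))
```

Calling `position` on a list of times gives the points needed to plot the graph of x.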


Figure 4.8.3 Velocity $v(t) = 4\sin(t^2)$ and position $x(t) = \int_0^t 4\sin(s^2) \, ds - 1$

Figure 4.8.4 Approximating a curve with line segments


Length of a curve
Here we will consider the problem of finding the length of a curve which is the graph of
some differentiable function. So suppose the function f is continuous on the closed interval
[a, b] and differentiable on the open interval (a, b). Let C be the graph of f and let L be
the length of C. As we have done previously, we will first describe a method for finding
good approximations to L. To begin, divide [a, b] into n intervals of equal length


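$$\Delta x = \frac{b - a}{n}$$

with endpoints $a = x_0 < x_1 < x_2 < \cdots < x_n = b$. For $j = 1, 2, 3, \ldots, n$, we can
approximate the length of the piece of C lying over the jth interval by the distance between
the endpoints of this piece, as shown in Figure 4.8.4. That is, since the endpoints of the
jth piece are $(x_{j-1}, f(x_{j-1}))$ and $(x_j, f(x_j))$, we can approximate the length of the piece
of C lying over the interval $[x_{j-1}, x_j]$ by

$$\sqrt{(x_j - x_{j-1})^2 + (f(x_j) - f(x_{j-1}))^2}.$$

Since $\Delta x = x_j - x_{j-1}$,

$$\begin{aligned}
\sqrt{(x_j - x_{j-1})^2 + (f(x_j) - f(x_{j-1}))^2} &= \sqrt{(\Delta x)^2 + (f(x_j) - f(x_{j-1}))^2} \\
&= \sqrt{(\Delta x)^2 \left( 1 + \left( \frac{f(x_j) - f(x_{j-1})}{\Delta x} \right)^2 \right)} \\
&= \sqrt{1 + \left( \frac{f(x_j) - f(x_{j-1})}{\Delta x} \right)^2} \, \Delta x.
\end{aligned} \tag{4.8.7}$$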


﻿




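Hence, when n is large (equivalently, when $\Delta x$ is small), a good approximation for L is
given by

$$L \approx \sum_{j=1}^{n} \sqrt{1 + \left( \frac{f(x_j) - f(x_{j-1})}{\Delta x} \right)^2} \, \Delta x. \tag{4.8.8}$$

Moreover, we expect that

$$L = \lim_{n \to \infty} \sum_{j=1}^{n} \sqrt{1 + \left( \frac{f(x_j) - f(x_{j-1})}{\Delta x} \right)^2} \, \Delta x, \tag{4.8.9}$$

provided this limit exists. By the Mean Value Theorem, for each $j = 1, 2, 3, \ldots, n$, there
exists a point $c_j$ in the interval $(x_{j-1}, x_j)$ such that

$$f'(c_j) = \frac{f(x_j) - f(x_{j-1})}{\Delta x}. \tag{4.8.10}$$

Hence

$$L = \lim_{n \to \infty} \sum_{j=1}^{n} \sqrt{1 + (f'(c_j))^2} \, \Delta x. \tag{4.8.11}$$

Now the sum in (4.8.11) is a Riemann sum for the integral

$$\int_a^b \sqrt{1 + (f'(x))^2} \, dx,$$

and so the limit, if it exists, converges to the value of this integral. Thus the length of C
is given by

$$L = \int_a^b \sqrt{1 + (f'(x))^2} \, dx. \tag{4.8.12}$$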


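Example Let L be the length of the graph of $f(x) = x^{3/2}$ on the interval [0, 1], as shown
in Figure 4.8.5. Then

$$f'(x) = \frac{3}{2} x^{1/2},$$

so

$$\begin{aligned}
L &= \int_0^1 \sqrt{1 + \left( \frac{3}{2} x^{1/2} \right)^2} \, dx \\
&= \int_0^1 \sqrt{1 + \frac{9}{4} x} \, dx \\
&= \frac{8}{27} \left( 1 + \frac{9}{4} x \right)^{3/2} \Big|_0^1 \\
&= \frac{13\sqrt{13} - 8}{27} \approx 1.4397,
\end{aligned}$$

where we have rounded the result to four decimal places.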
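The line-segment approximation that led to (4.8.8) can be compared directly with this exact value. The sketch below is illustrative only; the function name `polygonal_length` is our own, and it sums the lengths of the chords joining consecutive points on the graph.

```python
import math

def polygonal_length(f, a, b, n):
    """Sum of the chord lengths joining (x_{j-1}, f(x_{j-1})) to (x_j, f(x_j))."""
    dx = (b - a) / n
    total = 0.0
    for j in range(1, n + 1):
        x_prev, x_next = a + (j - 1) * dx, a + j * dx
        total += math.hypot(x_next - x_prev, f(x_next) - f(x_prev))
    return total

def f(x):
    return x ** 1.5

exact = (13 * math.sqrt(13) - 8) / 27
for n in (5, 50, 500):
    print(n, polygonal_length(f, 0.0, 1.0, n))
print("exact:", exact)
```

Each polygonal length is slightly smaller than the exact length, since a chord is never longer than the arc it spans, and the gap shrinks as n increases.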


﻿


Figure 4.8.5 Graphs of $y = x^{3/2}$ and $y = x^2$

Example Let L be the length of the parabola $y = x^2$ from (0, 0) to (2, 4), as shown in
Figure 4.8.5. Then

$$\frac{dy}{dx} = 2x,$$

so

$$L = \int_0^2 \sqrt{1 + (2x)^2} \, dx = \int_0^2 \sqrt{1 + 4x^2} \, dx.$$

At this point we do not have the techniques to evaluate this integral exactly using the
Fundamental Theorem (although we will see such techniques in Chapter 6); however, we
may use a computer algebra system to find that

$$L = \sqrt{17} + \frac{1}{4} \sinh^{-1}(4) = \sqrt{17} + \frac{1}{4} \log(4 + \sqrt{17}),$$

where log(x) is the natural logarithm of x and $\sinh^{-1}(x)$ is the inverse hyperbolic sine of x.
Since we will not study either of these functions until Chapter 6, we will use a numerical
approximation to give us $L \approx 4.6468$ to four decimal places, the same answer we would
obtain by using numerical integration to evaluate the integral.

Example To find the length L of one arch of the curve y = sin(x), as shown in Figure
4.8.6, we need to evaluate

$$L = \int_0^{\pi} \sqrt{1 + \cos^2(x)} \, dx,$$

an integral which is even more difficult than the one in the previous example. However,
using numerical integration, we find that $L \approx 3.8202$ to four decimal places.
    The last two examples illustrate some of the difficulties in finding the length of a curve.
In general, the integrals involved in these problems require more sophisticated techniques
than we have available at this time, and frequently require the use of numerical techniques.
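As an illustration of such a numerical technique, the sketch below approximates the arc length integral (4.8.12) for the last two examples. The midpoint rule is our own choice of method (the text does not specify one), and `arc_length` is a hypothetical helper name.

```python
import math

def arc_length(fprime, a, b, n=20000):
    """Midpoint-rule approximation of the integral of sqrt(1 + f'(x)^2) over [a, b]."""
    dx = (b - a) / n
    total = 0.0
    for j in range(n):
        x = a + (j + 0.5) * dx  # midpoint of the jth subinterval
        total += math.sqrt(1.0 + fprime(x) ** 2)
    return total * dx

parabola = arc_length(lambda x: 2 * x, 0.0, 2.0)   # y = x^2 on [0, 2]
sine_arch = arc_length(math.cos, 0.0, math.pi)     # y = sin(x) on [0, pi]
closed_form = math.sqrt(17) + math.asinh(4) / 4    # value quoted for the parabola

print(round(parabola, 4), round(closed_form, 4), round(sine_arch, 4))
```

The first two printed values should agree, which checks the computer algebra result against direct numerical integration.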


﻿


Figure 4.8.6 Graph of $y = \sin(x)$ over the interval $[0, \pi]$


Problems

1. For each of the following, assume that v(t) is the velocity at time t of an object moving
    on a line and find the distance traveled by the object over the given time period.
    (a) $v(t) = 32t$ over $0 < t < 3$              (b) $v(t) = -32t + 16$ over $0 < t < 3$
    (c) $v(t) = t^2 - t - 6$ over $0 < t < 2$      (d) $v(t) = t^2 - t - 6$ over $0 < t < 4$
    (e) $v(t) = 2\sin(2t)$ over $0 < t < \pi$      (f) $v(t) = 3\cos(2\pi t)$ over $0 < t < 2$
 2. Suppose the velocity of a falling object is given by v(t) = -32t feet per second. If
    the object is at a height of 100 feet at time t = 0, find the height of the object at an
    arbitrary time t.
 3. Suppose x(t) and v(t) are the position and velocity, respectively, at time t of an object
    moving on a line. If $x(0) = 5$ and $v(t) = 3t^2 - 6$, find $x(t)$.
 4. If an object of mass m is connected to a spring, pulled a distance $x_0$ away from its
    equilibrium position and released, then, ignoring the effects of friction, the velocity of
    the object at time t will be given by

    $$v(t) = -x_0 \sqrt{\frac{k}{m}} \sin\left( \sqrt{\frac{k}{m}} \, t \right),$$

    where k is a constant that depends on the strength of the spring. Find x(t), the
    position of the object at time t.
 5. Show that if $\dot{x}(t) = f(t)$ and f is continuous on [a, b], then

    $$x(t) = \int_a^t f(s) \, ds + x(a).$$


﻿




6. For each of the following, use the result from Problem 5 to find x(t).
    (a) $\dot{x}(t) = 3t^3 + 6t - 17$ with $x(2) = 4$    (b) $\dot{x}(t) = 3\cos(6t) - t$ with $x(0) = -1$
    (c) $\dot{x}(t) = \sin^2(2t)$ with $x(0) = 2$        (d) $\dot{x}(t) = 3t^2 \sin(2t)$ with $x(0) = 0$
    (e) $\dot{x}(t) = \sqrt{1 + 2t}$ with $x(4) = 3$
 7. Let x(t), v(t), and a(t) be the height, velocity, and acceleration, respectively, at time t
    of an object of mass m in free fall near the surface of the earth. Let $x_0$ and $v_0$ be the
    height and velocity, respectively, of the object at time t = 0. If we ignore the effects
    of air resistance, the force acting on the body is -mg, where g is a constant (g = 9.8
    meters per second per second, or 32 feet per second per second). Thus, by Newton's
    second law of motion,
                                      -mg = ma(t),

    from which we obtain
                                        a(t) = -g.

    Using Problem 5, show that

    $$x(t) = -\frac{1}{2} g t^2 + v_0 t + x_0.$$


 8. Suppose an object is projected vertically upward from a height of 100 feet with an
    initial velocity of 20 feet per second. Use Problem 7 to answer the following questions.
    (a) Find x(t), the height of the object at time t.
    (b) At what time does the object reach its maximum height?
    (c) What is the maximum height reached by the object?
    (d) At what time will the object strike the ground?
 9. For each of the following, find the length of the graph of the given function over the
    given interval.

    (a) f(x) = 2x 1 over [0, 2]                (b) f(x) = sin(2x) over [0, 2 ]

    (c) g(x) = x3 over [-1, 1]                 (d) g(t) = tan(t) over [-i, -
    (e) f(t) =1sin2(t) over [0,r]           (f) g(O) = sin(02) over [0, \/7]
10. A sheet of corrugated aluminum is to be made from a flat sheet of aluminum. Suppose
    a cross section of the corrugated sheet, when measured in inches, is in the shape of the
    curve
                                      y =2sin (t).

    Find the length of a flat sheet that would be needed to make a corrugated sheet that
    is 10 feet long.


﻿


Section 5.1


       Differential Equations      Polynomial Approximations


In Chapter 3 we discussed the problem of finding the affine function which best approx-
imates a given function about some point. In particular, we found that the best affine
approximation to a function f at a point c is given by


$$T(x) = f'(c)(x - c) + f(c), \tag{5.1.1}$$


provided that f is differentiable at c. In this section and the next, we will extend the
ideas of Sections 3.1 and 3.2 to the problem of finding polynomial approximations of any
given degree to a function about some specified point. We shall see that many nonlinear
functions can be approximated to any desired level of accuracy over a specified interval
if we use polynomials of sufficiently high degree. As an example, compare the graphs of
$f(x) = \sin(x)$ and

$$P(x) = x - \frac{1}{6} x^3 + \frac{1}{120} x^5 - \frac{1}{5040} x^7 + \frac{1}{362{,}880} x^9$$


in Figure 5.1.1. They are almost indistinguishable over the interval $[-\pi, \pi]$. In practical
terms, this means there is little difference in working with P(x) instead of f(x) for x
in $[-\pi, \pi]$. Moreover, since polynomials are the simplest of functions, involving only the
arithmetic operations of addition, subtraction, and multiplication, the substitution of P
for f can be a very helpful step in simplifying a problem.


Figure 5.1.1 Graphs of $f(x) = \sin(x)$ and an approximating polynomial


    To begin, we need to recall, and then generalize, some definitions and facts from
Sections 3.1 and 3.2. First, recall that a function f is said to be o(h) if

$$\lim_{h \to 0} \frac{f(h)}{h} = 0; \tag{5.1.2}$$

a function f is said to be O(h) if there exist constants M and $\epsilon$ such that

$$\left| \frac{f(h)}{h} \right| \leq M \tag{5.1.3}$$

whenever $-\epsilon < h < \epsilon$. In particular, we saw that f is O(h) if

$$\lim_{h \to 0} \frac{f(h)}{h}$$

exists. The following definition generalizes to other powers of h this method of characterizing
the rate at which a function converges to 0.

Definition For any n > 0, a function f is said to be $o(h^n)$ if

$$\lim_{h \to 0} \frac{f(h)}{h^n} = 0. \tag{5.1.4}$$

For any n > 0, a function f is said to be $O(h^n)$ if there exist constants M and $\epsilon$ such that

$$\left| \frac{f(h)}{h^n} \right| \leq M \tag{5.1.5}$$

whenever $-\epsilon < h < \epsilon$.

    Similar to our result in Section 3.1, f is $O(h^n)$ if

$$\lim_{h \to 0} \frac{f(h)}{h^n}$$

exists.
    As before, we use this notation as a means of comparing the rates at which functions
approach 0. As h approaches 0, a function which is $O(h^n)$ approaches 0 at least as fast as
$h^n$ does. Note that for n > m > 0,

$$\lim_{h \to 0} \frac{h^n}{h^m} = \lim_{h \to 0} h^{n-m} = 0 \tag{5.1.6}$$

since n - m > 0, and so $h^n$ goes to 0 faster than $h^m$ as h approaches 0. Thus if n > m > 0,
as h goes to 0, a function which is $O(h^n)$ approaches 0 faster than does a function which
is $O(h^m)$ but not $O(h^n)$. Figure 5.1.2 illustrates this fact with the graphs of $f(h) = h^n$ for
n = 2, 4, 6, and 8.

Figure 5.1.2 Graphs of $f(h) = h^n$ for n = 2, 4, 6, and 8


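Example Since

$$\lim_{h \to 0} \frac{\sin(h)}{h} = 1,$$

it follows that sin(h) is O(h).

Example Since

$$\lim_{h \to 0} \frac{\sin^2(h)}{h} = \left( \lim_{h \to 0} \frac{\sin(h)}{h} \right) \left( \lim_{h \to 0} \sin(h) \right) = (1)(0) = 0,$$

it follows that $\sin^2(h)$ is o(h).

Example Since

$$\lim_{h \to 0} \frac{\sin^2(h)}{h^2} = \left( \lim_{h \to 0} \frac{\sin(h)}{h} \right) \left( \lim_{h \to 0} \frac{\sin(h)}{h} \right) = (1)(1) = 1,$$

it follows that $\sin^2(h)$ is $O(h^2)$.

Example Since

$$\lim_{h \to 0} \frac{\sin^3(h)}{h^3} = \left( \lim_{h \to 0} \frac{\sin(h)}{h} \right) \left( \lim_{h \to 0} \frac{\sin(h)}{h} \right) \left( \lim_{h \to 0} \frac{\sin(h)}{h} \right) = (1)(1)(1) = 1,$$

it follows that $\sin^3(h)$ is $O(h^3)$.

    Hence, for example, we would say that as h goes to 0, $\sin^2(h)$ approaches 0 faster than
h, but at about the same rate as $h^2$.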
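These rates can be seen in a small numerical experiment. The sketch below is only suggestive (a table of values cannot prove a limit), and the helper name `ratio` is our own; it tabulates f(h)/h^n for f(h) = sin²(h) with n = 1 and n = 2.

```python
import math

def ratio(f, n, h):
    """Return f(h) / h**n, the quantity whose limit decides o(h^n) and O(h^n)."""
    return f(h) / h ** n

def sin_squared(h):
    return math.sin(h) ** 2

print("h        sin^2(h)/h     sin^2(h)/h^2")
for h in (0.1, 0.01, 0.001, 0.0001):
    print(h, ratio(sin_squared, 1, h), ratio(sin_squared, 2, h))
```

The second column shrinks toward 0 while the third stays near 1, matching the statements that sin²(h) is o(h) and O(h²).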


﻿




    Now suppose that f is $O(h^n)$ for some n > 0. This means that as h goes to 0, f
approaches 0 at least as fast as $h^n$ does. It should follow that f goes to 0 faster than $h^m$,
and so is $o(h^m)$, for any 0 < m < n. To see this, let M and $\epsilon > 0$ be numbers such that

$$\left| \frac{f(h)}{h^n} \right| \leq M \tag{5.1.7}$$

for all h in the interval $(-\epsilon, \epsilon)$. Then

$$\left| \frac{f(h)}{h^m} \right| = |h^{n-m}| \left| \frac{f(h)}{h^n} \right| \leq |h|^{n-m} M \tag{5.1.8}$$

for all h in $(-\epsilon, \epsilon)$. Since

$$\lim_{h \to 0} |h|^{n-m} M = 0, \tag{5.1.9}$$

it follows that

$$\lim_{h \to 0} \left| \frac{f(h)}{h^m} \right| = 0, \tag{5.1.10}$$

and so

$$\lim_{h \to 0} \frac{f(h)}{h^m} = 0. \tag{5.1.11}$$

Thus f is $o(h^m)$.

Proposition If n > m > 0 and f is $O(h^n)$, then f is $o(h^m)$.

Example We saw above that $\sin^3(h)$ is $O(h^3)$, from which it now follows, for example,
that $\sin^3(h)$ is $o(h^2)$.
    Next, recall that if f is a function defined in an open interval about a point c and T
is an affine function such that T(c) = f(c) and

                              R(h) = f (c + h) - T(c + h)                       (5.1.12)

is o(h), then T is the best affine approximation to f at c. Moreover, as mentioned above,
we saw in Chapter 3 that a function f has a best affine approximation at a point c if and
only if f is differentiable at c and, in that case, the best affine approximation is given by

                               T(x) = f (c) + f'(c)(x - c).                     (5.1.13)

Putting (5.1.12) and (5.1.13) together and letting x = c + h, or, equivalently, h = x - c,
we have that

$$f(x) - f(c) - f'(c)(x - c)$$

is o(x - c). We may express this by writing

$$f(x) - f(c) - f'(c)(x - c) = o(x - c), \tag{5.1.14}$$


﻿




or, solving for f(x), simply

$$f(x) = f(c) + f'(c)(x - c) + o(x - c). \tag{5.1.15}$$

In words, (5.1.15) says that f(x) is equal to f(c) + f'(c)(x - c) plus some function which
is o(x - c), that is, some function which approaches 0 faster than x - c as x approaches c.

Example Let $f(x) = \sqrt{x}$. Then

$$f'(1) = \frac{1}{2},$$

so the best affine approximation to f at 1 is

$$T(x) = 1 + \frac{1}{2}(x - 1).$$

That is,

$$\sqrt{x} = 1 + \frac{1}{2}(x - 1) + o(x - 1).$$

In words, this statement says that $\sqrt{x}$ is equal to

$$1 + \frac{1}{2}(x - 1)$$

plus a term of order higher than x - 1, that is, plus a term which goes to 0 faster than
x - 1 as x approaches 1.
Example Let f (x) = sin(x). Then f'(0) = cos(0) = 1, so the best affine approximation
to f at 0 is
                                       T(x) = x.
Thus
                                   sin(x) = x + o(x),
a fact which is often used in applications to justify replacing the function sin(x) by the
function x for calculations involving only values of x close to 0.
    Now suppose that f is twice continuously differentiable on an interval $(c - \delta, c + \delta)$ for
some $\delta > 0$; that is, suppose both f' and f'' exist and are continuous on $(c - \delta, c + \delta)$. If
T is the best affine approximation to f at c, then, as we have seen,

$$R(h) = f(c + h) - T(c + h) \tag{5.1.16}$$

is o(h). We will show that R is in fact $O(h^2)$. Suppose $0 < \epsilon < \delta$ and $-\epsilon < h < \epsilon$. First,
note that, since T(c + h) = f(c) + f'(c)h,

$$f(c + h) - T(c + h) = f(c + h) - f(c) - f'(c)h. \tag{5.1.17}$$

By the Mean Value Theorem, there is a point u between c and c + h such that

$$f(c + h) - f(c) = f'(u)h. \tag{5.1.18}$$


﻿




Hence

$$f(c + h) - T(c + h) = f'(u)h - f'(c)h = h(f'(u) - f'(c)). \tag{5.1.19}$$

Applying the Mean Value Theorem again, there exists a point v between c and u such that

$$f'(u) - f'(c) = f''(v)(u - c). \tag{5.1.20}$$

Thus

$$\frac{R(h)}{h^2} = \frac{f(c + h) - T(c + h)}{h^2} = \frac{h(u - c) f''(v)}{h^2} = \frac{(u - c) f''(v)}{h}. \tag{5.1.21}$$

If we let M be the maximum value of $|f''(x)|$ on $[c - \epsilon, c + \epsilon]$ and note that $|u - c| < |h|$,
then we see that

$$\left| \frac{R(h)}{h^2} \right| = \frac{|u - c| \, |f''(v)|}{|h|} \leq \frac{|h| M}{|h|} = M \tag{5.1.22}$$

for all h with $-\epsilon < h < \epsilon$. Hence R(h) is $O(h^2)$.

Proposition    If f is twice continuously differentiable on an open interval containing the
point c and T is the best affine approximation to f at c, then

                            R(h) = f(c + h) - T(c + h)                    (5.1.23)

is $O(h^2)$.

    Letting x = c + h, we can rephrase the proposition to say that

$$r(x) = f(x) - T(x) \tag{5.1.24}$$

is $O((x - c)^2)$. Similar to our notation above, we may write

$$f(x) = f(c) + f'(c)(x - c) + O((x - c)^2). \tag{5.1.25}$$

For our previous examples, this means that

$$\sqrt{x} = 1 + \frac{1}{2}(x - 1) + O((x - 1)^2)$$

and

$$\sin(x) = x + O(x^2).$$

This is the type of formulation that we wish to generalize to higher order polynomial
approximations. We will introduce these polynomials, called Taylor polynomials, here,
but save the verification that they provide the sought-for approximations until the next
section.
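The proposition on the remainder of the best affine approximation can be illustrated numerically. In the sketch below, `remainder_ratio` is a hypothetical helper of our own that computes R(h)/h² for a function f whose derivative at c is supplied by hand; bounded output for small h is consistent with R being O(h²), although it is not a proof.

```python
import math

def remainder_ratio(f, c, fprime_c, h):
    """R(h) / h^2, where R(h) = f(c + h) - T(c + h) and T(x) = f(c) + f'(c)(x - c)."""
    affine = f(c) + fprime_c * h
    return (f(c + h) - affine) / h ** 2

for h in (0.1, 0.01, 0.001):
    # sqrt(x) at c = 1 with f'(1) = 1/2, and sin(x) at c = 0 with f'(0) = 1
    print(h,
          remainder_ratio(math.sqrt, 1.0, 0.5, h),
          remainder_ratio(math.sin, 0.0, 1.0, h))
```

For the square root the ratio stays close to -1/8, which is half of the second derivative of the square root at 1; for the sine it tends to 0, so in that case the remainder is even smaller than the proposition requires.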


﻿




Taylor polynomials
The best affine approximation T to a function f at a point c may be described as the only
first degree polynomial satisfying both T(c) = f(c) and T'(c) = f'(c). This provides a clue
as to where to look for higher order polynomial approximations. Namely, given a function
f which is n times differentiable at a point c, we will look for a polynomial $P_n$ of degree
at most n with the property that $P_n(c) = f(c)$ and the first n derivatives of $P_n$ at c agree
with the first n derivatives of f at c. Hence we want to find constants $b_0, b_1, b_2, \ldots, b_n$ so
that the polynomial

$$P_n(x) = b_0 + b_1(x - c) + b_2(x - c)^2 + \cdots + b_n(x - c)^n \tag{5.1.26}$$

satisfies

$$P_n^{(j)}(c) = f^{(j)}(c) \tag{5.1.27}$$

for j = 0, 1, 2, ..., n, where $P_n^{(0)} = P_n$ and, for j > 0, $P_n^{(j)}$ is the jth derivative of $P_n$. Now

$$\begin{aligned}
P_n(c) &= b_0 \\
P_n'(c) &= b_1 \\
P_n''(c) &= 2 b_2 \\
P_n'''(c) &= (3)(2) b_3 \\
P_n^{(4)}(c) &= (4)(3)(2) b_4 \\
&\ \ \vdots \\
P_n^{(n)}(c) &= n! \, b_n.
\end{aligned}$$

Thus, to satisfy (5.1.27), we must have

$$\begin{aligned}
f(c) &= b_0 \\
f'(c) &= b_1 \\
f''(c) &= 2 b_2 \\
f'''(c) &= 3! \, b_3 \\
f^{(4)}(c) &= 4! \, b_4 \\
&\ \ \vdots \\
f^{(n)}(c) &= n! \, b_n.
\end{aligned}$$

Solving for $b_0, b_1, b_2, \ldots, b_n$, we have

$$\begin{aligned}
b_0 &= f(c) \\
b_1 &= f'(c) \\
b_2 &= \frac{f''(c)}{2} \\
b_3 &= \frac{f'''(c)}{3!} \\
b_4 &= \frac{f^{(4)}(c)}{4!} \\
&\ \ \vdots \\
b_n &= \frac{f^{(n)}(c)}{n!}.
\end{aligned}$$

That is,

$$b_j = \frac{f^{(j)}(c)}{j!} \tag{5.1.28}$$


for j = 0,1, 2, ... ,n. The resulting polynomial is named after Brook Taylor (1685-1731),
an English mathematician who was the first to publish work on the related infinite series
that we will consider later in this chapter.

Definition Suppose f is n times differentiable at a point c. Then the polynomial


$$P_n(x) = f(c) + f'(c)(x - c) + \frac{f''(c)}{2}(x - c)^2 + \frac{f'''(c)}{3!}(x - c)^3 + \cdots + \frac{f^{(n)}(c)}{n!}(x - c)^n$$

is called the Taylor polynomial of order n for f at c.
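The definition translates directly into a short computation once the derivative values at c are known. The sketch below is one possible arrangement, not a prescribed method: `taylor_poly` is our own helper, it expects the values f(c), f'(c), ..., f^(n)(c) to be supplied by hand, and the exponential function is used as an illustration only because all of its derivatives at 0 equal 1 (a fact assumed here, not developed in this section).

```python
import math

def taylor_poly(derivs_at_c, c):
    """Build P_n from derivs_at_c[j] = f^(j)(c), using b_j = f^(j)(c) / j! as in (5.1.28)."""
    coeffs = [d / math.factorial(j) for j, d in enumerate(derivs_at_c)]

    def P(x):
        # Horner's scheme in powers of (x - c)
        result = 0.0
        for b in reversed(coeffs):
            result = result * (x - c) + b
        return result

    return P

# Illustration: f(x) = exp(x) at c = 0, where every derivative equals 1.
P10 = taylor_poly([1.0] * 11, 0.0)
print(P10(1.0), math.e)
```

Evaluating by Horner's scheme avoids computing each power of (x - c) separately, which keeps the work proportional to n.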

Example Consider $f(x) = \sin(x)$ and c = 0. Then

$$\begin{aligned}
f'(x) &= \cos(x) \\
f''(x) &= -\sin(x) \\
f'''(x) &= -\cos(x) \\
f^{(4)}(x) &= \sin(x).
\end{aligned}$$

Notice that, if we were to continue finding higher derivatives, this cycle would repeat itself.
Evaluating the function and its derivatives at 0, we obtain

$$\begin{aligned}
f(0) &= 0 \\
f'(0) &= 1 \\
f''(0) &= 0 \\
f'''(0) &= -1,
\end{aligned}$$

a cycle which would repeat itself if we were to continue evaluating higher-order derivatives.


﻿


Figure 5.1.3 Graphs of $f(x) = \sin(x)$ with Taylor polynomials $P_1$ (left) and $P_3$ (right)


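Thus we obtain the following Taylor polynomials for sin(x) at x = 0:

$$\begin{aligned}
P_1(x) &= x \\
P_2(x) &= x \\
P_3(x) &= x - \frac{x^3}{3!} \\
P_4(x) &= x - \frac{x^3}{3!} \\
P_5(x) &= x - \frac{x^3}{3!} + \frac{x^5}{5!} \\
P_6(x) &= x - \frac{x^3}{3!} + \frac{x^5}{5!} \\
P_7(x) &= x - \frac{x^3}{3!} + \frac{x^5}{5!} - \frac{x^7}{7!} \\
P_8(x) &= x - \frac{x^3}{3!} + \frac{x^5}{5!} - \frac{x^7}{7!} \\
P_9(x) &= x - \frac{x^3}{3!} + \frac{x^5}{5!} - \frac{x^7}{7!} + \frac{x^9}{9!}.
\end{aligned}$$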

The graphs of $P_1$, $P_3$, $P_5$, and $P_7$, along with the graph of f, are shown in Figures
5.1.3 and 5.1.4. We have already seen the graph of $P_9$ in Figure 5.1.1. Notice how the
Taylor polynomials give increasingly better approximations to sin(x) as the order increases.
Finally, since the values of the derivatives repeat the pattern 0, 1, 0, -1 endlessly, in this case
we can write down a simple general expression for the Taylor polynomial of any order.
Namely, for any integer $n \geq 0$,

$$P_{2n+1}(x) = x - \frac{x^3}{3!} + \frac{x^5}{5!} - \frac{x^7}{7!} + \cdots + (-1)^n \frac{x^{2n+1}}{(2n+1)!}$$

and $P_{2n+2}(x) = P_{2n+1}(x)$, the latter following from the fact that all the even-order
derivatives are 0.
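The general expression for the odd-order Taylor polynomials of sin(x) at 0 is easy to evaluate directly. The following sketch uses a helper name of our own, `sin_taylor`, and compares the polynomials of orders 1, 3, 5, 7, and 9 with sin(x) at a few sample points chosen for illustration.

```python
import math

def sin_taylor(x, n):
    """Evaluate P_{2n+1}(x) = sum over k = 0..n of (-1)^k x^(2k+1) / (2k+1)!."""
    return sum((-1) ** k * x ** (2 * k + 1) / math.factorial(2 * k + 1)
               for k in range(n + 1))

for x in (0.5, 2.0, math.pi):
    # Absolute error of P_1, P_3, P_5, P_7, P_9 at x
    errors = [abs(sin_taylor(x, n) - math.sin(x)) for n in range(5)]
    print(x, errors)
```

At each sample point the errors decrease as the order increases, and for a fixed order they grow as x moves away from 0.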


﻿


Figure 5.1.4 Graphs of $f(x) = \sin(x)$ with Taylor polynomials $P_5$ (left) and $P_7$ (right)

Example Now we will find the Taylor polynomial of order 4 for $g(x) = \sqrt{x}$ at x = 1.
First we find that

$$\begin{aligned}
g'(x) &= \frac{1}{2} x^{-1/2} \\
g''(x) &= -\frac{1}{4} x^{-3/2} \\
g'''(x) &= \frac{3}{8} x^{-5/2} \\
g^{(4)}(x) &= -\frac{15}{16} x^{-7/2}.
\end{aligned}$$

Hence

$$\begin{aligned}
g(1) &= 1 \\
g'(1) &= \frac{1}{2} \\
g''(1) &= -\frac{1}{4} \\
g'''(1) &= \frac{3}{8} \\
g^{(4)}(1) &= -\frac{15}{16}.
\end{aligned}$$

Thus we have

$$\begin{aligned}
P_4(x) &= 1 + \frac{1}{2}(x - 1) - \frac{1/4}{2!}(x - 1)^2 + \frac{3/8}{3!}(x - 1)^3 - \frac{15/16}{4!}(x - 1)^4 \\
&= 1 + \frac{1}{2}(x - 1) - \frac{1}{8}(x - 1)^2 + \frac{1}{16}(x - 1)^3 - \frac{5}{128}(x - 1)^4.
\end{aligned}$$

The graphs of $P_4$ and g are shown in Figure 5.1.5. As we hoped, $P_4(x)$ provides a good
approximation to $\sqrt{x}$ for values of x close to 1. For example, to 8 decimal places,

$$P_4(1.1) = 1.04880859,$$

while

$$\sqrt{1.1} = 1.04880885.$$


﻿


Figure 5.1.5 Graphs of $g(x) = \sqrt{x}$ with Taylor polynomial of order 4

As we should expect, the approximation worsens for x farther away from 1. For example,
to 3 decimal places,

$$P_4(2) = 1.398,$$

while

$$\sqrt{2} = 1.414.$$
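The loss of accuracy away from 1 can be tabulated with a few lines of code. This is a small sketch in which `p4` is our own name for the order 4 Taylor polynomial of the square root at 1, written out with the coefficients 1, 1/2, -1/8, 1/16, and -5/128; the sample points are chosen only for illustration.

```python
import math

def p4(x):
    """Order 4 Taylor polynomial of sqrt(x) at c = 1."""
    h = x - 1.0
    return 1 + h / 2 - h ** 2 / 8 + h ** 3 / 16 - 5 * h ** 4 / 128

for x in (1.1, 1.5, 2.0, 3.0):
    print(x, p4(x), math.sqrt(x), abs(p4(x) - math.sqrt(x)))
```

The last column grows steadily as x moves away from 1, in line with the comparison of $P_4(2)$ and $\sqrt{2}$.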


    Although finding a Taylor polynomial of order n for a given function f involves only
evaluating the derivatives of f at a specified point, nevertheless, the required computations
may become unwieldy, especially if f is itself complicated or n is large. In such cases, a
computer algebra system may prove useful. For example, you may find a computer algebra
system helpful in working Problems 10 through 12.
    In the next section we will see that the Taylor polynomials provide polynomial approx-
imations that generalize best affine approximations. That is, we shall show that, under
suitable conditions, if Pn is the Taylor polynomial of order n for f at c, then the remainder
function


$$R(h) = f(c + h) - P_n(c + h) \tag{5.1.29}$$

is $O(h^{n+1})$, in agreement with our previous result that the remainder function for the best
affine approximation, $P_1$, is $O(h^2)$.

Problems


 1. Show that $f(x) = \tan^2(x)$ is o(h).
 2. Show that $g(x) = \tan^2(x)$ is $O(h^2)$.
 3. Show that $f(z) = z^2 \sin(z)$ is $o(h^2)$ and $O(h^3)$.
 4. Show that $h(t) = 1 - \cos(t)$ is $O(h^2)$.
 5. Show that $f(x) = \sin^2(3x)$ is $O(h^2)$.


﻿




 6. Show that $f(x) = x^{3/2}$ is o(h), but not $O(h^2)$.
7. For each of the following functions, find the Taylor polynomial of order 4 at the given
    point c.
    (a) f (x) = sin(2x) at c = 0 (b) g(x) = cos(x) at c = 0
    (c) f (z) = V/sz at c = 4  (d) f (9) = tan(g) at 8 = 0
    (e) h(x) = sin(x) at c =w7  (f) g(t) = cos(2t) at c =w7
    (g) $f(x) = \frac{1}{x}$ at c = 1      (h) $f(x) = 3x^2 + 2x - 9$ at c = 0
                 1
    (i) g(x) = 1 2at c = 0   (j) h(x) = x4 + 5x3 + 4x2 - 9x - 20 at c = 0
    (k) $g(t) = \frac{1}{t^2}$ at c = 1    (l) $h(z) = 8z^5 - 3z^3 + 6z$ at c = 0
    (m) $x(t) = \sec(t)$ at c = 0          (n) $f(x) = 3x^4 - 4x^3 + x^2 - 3x - 2$ at c = 1
 8. Let P9 be the 9th order Taylor polynomial for f(x) = cos(x) at 0. Graph f and P9 on
    the same axes. On what interval does P9(x) appear to give a good approximation to
    cos(x)?
 9. Let P13 be the 13th order Taylor polynomial for f(x) = sin(x) at 0. Graph f and P13
    on the same axes. On what interval does P13 (x) appear to give a good approximation
    to sin(x)?
10. Let $P_n$ be the nth order Taylor polynomial for $f(x) = \frac{1}{x^2 + 1}$ at 0.
    (a) Graph f and $P_6$ on the same axes. On what interval does $P_6(x)$ appear to give a
       good approximation to f(x)?
    (b) Repeat part (a) for $P_{10}$ and $P_{20}$.
    (c) Do any of the polynomials in parts (a) and (b) appear to give a good approximation
       to f on the interval [1, 2]?
11. Let P6 be the 6th order Taylor polynomial for x(t) = tan(t) at 0. Graph x and P6 on
    the same axes and comment.
12. Let $P_{15}$ be the 15th order Taylor polynomial for $g(x) = \sqrt{x}$ at 1. Graph g and $P_{15}$ on
    the same axes. On what interval does $P_{15}(x)$ appear to give a good approximation to
    $\sqrt{x}$?


﻿


        Difference Equations         Section 5.2
               to
       Differential Equations        Taylor's Theorem


The goal of this section is to prove that if $P_n$ is the $n$th order Taylor polynomial for a
function $f$ at a point $c$, then, under suitable conditions, the remainder function

$$R_n(h) = f(c + h) - P_n(c + h) \tag{5.2.1}$$

is $O(h^{n+1})$. This result is a consequence of Taylor's theorem, which we now state and
prove.
Taylor's Theorem Suppose $f$ is continuous on the closed interval $[a, b]$ and has $n + 1$
continuous derivatives on the open interval $(a, b)$. If $x$ and $c$ are points in $(a, b)$, then

$$f(x) = f(c) + f'(c)(x - c) + \frac{f''(c)}{2!}(x - c)^2 + \cdots + \frac{f^{(n)}(c)}{n!}(x - c)^n + r_n(x), \tag{5.2.2}$$

where

$$r_n(x) = \int_c^x \frac{(x - t)^n}{n!} f^{(n+1)}(t)\,dt. \tag{5.2.3}$$

That is, if $P_n$ is the $n$th order Taylor polynomial for $f$ at some point $c$ in $(a, b)$ and $x$ is
any point in $(a, b)$, then

$$f(x) = P_n(x) + r_n(x), \tag{5.2.4}$$

where $r_n$ is given by (5.2.3).
    We will show that Taylor's theorem follows from the Fundamental Theorem of Integral
Calculus combined with repeated applications of integration by parts. Let $f$ be a function
satisfying the conditions of the theorem. Since $f$ is an antiderivative of $f'$, by the
Fundamental Theorem of Integral Calculus we have

$$f(x) - f(c) = \int_c^x f'(t)\,dt. \tag{5.2.5}$$

Hence

$$f(x) = f(c) + \int_c^x f'(t)\,dt, \tag{5.2.6}$$

which is the statement of Taylor's theorem when $n = 0$. For $n = 1$, we perform an
integration by parts on the integral in (5.2.6) using

$$u = f'(t), \qquad dv = dt,$$
$$du = f''(t)\,dt, \qquad v = -(x - t).$$




Copyright © by Dan Sloughter 2000


﻿




Note that this is not the most obvious choice for $v$ (certainly, $v = t$ would be a simpler
choice), but it is a valid choice and one that leads to the result we desire. Namely, this
gives us

$$\begin{aligned}
f(x) &= f(c) - f'(t)(x - t)\Big|_c^x + \int_c^x (x - t) f''(t)\,dt \\
     &= f(c) + f'(c)(x - c) + \int_c^x (x - t) f''(t)\,dt,
\end{aligned}$$


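which is the statement of Taylor's theorem for the case $n = 1$. For $n = 2$, we perform
another integration by parts using

$$u = f''(t), \qquad dv = (x - t)\,dt,$$
$$du = f'''(t)\,dt, \qquad v = -\frac{(x - t)^2}{2},$$

from which we obtain

$$\begin{aligned}
f(x) &= f(c) + f'(c)(x - c) - \frac{f''(t)(x - t)^2}{2}\Big|_c^x + \int_c^x \frac{(x - t)^2}{2} f'''(t)\,dt \\
     &= f(c) + f'(c)(x - c) + \frac{f''(c)}{2}(x - c)^2 + \int_c^x \frac{(x - t)^2}{2} f'''(t)\,dt.
\end{aligned}$$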


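Similarly, we obtain Taylor's theorem for $n = 3$ by another integration by parts. This time
we have

$$u = f'''(t), \qquad dv = \frac{(x - t)^2}{2}\,dt,$$
$$du = f^{(4)}(t)\,dt, \qquad v = -\frac{(x - t)^3}{3!},$$

which yields

$$\begin{aligned}
f(x) &= f(c) + f'(c)(x - c) + \frac{f''(c)}{2}(x - c)^2 - \frac{f'''(t)(x - t)^3}{3!}\Big|_c^x + \int_c^x \frac{(x - t)^3}{3!} f^{(4)}(t)\,dt \\
     &= f(c) + f'(c)(x - c) + \frac{f''(c)}{2}(x - c)^2 + \frac{f'''(c)}{3!}(x - c)^3 + \int_c^x \frac{(x - t)^3}{3!} f^{(4)}(t)\,dt.
\end{aligned}$$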


From this we can see that, for any nonnegative integer $n$, performing integration by parts
$n$ times will yield

$$f(x) = f(c) + f'(c)(x - c) + \frac{f''(c)}{2!}(x - c)^2 + \frac{f'''(c)}{3!}(x - c)^3 + \cdots + \frac{f^{(n)}(c)}{n!}(x - c)^n + \int_c^x \frac{(x - t)^n}{n!} f^{(n+1)}(t)\,dt, \tag{5.2.7}$$

which is the general statement of Taylor's theorem.
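As an informal numerical check of (5.2.7), the following sketch compares $f(x)$ with $P_n(x) + r_n(x)$ for the particular choice $f(x) = e^x$, $c = 0$, $n = 3$, $x = 1.5$. The function names, the choice of $f$, and the use of composite Simpson's rule to approximate the integral in (5.2.3) are all assumptions made for illustration; none of them come from the text.

```python
import math


def taylor_poly_exp(x, c, n):
    """P_n(x) for f = exp at c; every derivative of exp at c equals exp(c)."""
    return sum(math.exp(c) * (x - c) ** k / math.factorial(k) for k in range(n + 1))


def integral_remainder_exp(x, c, n, steps=2000):
    """Approximate r_n(x) = integral_c^x (x-t)^n / n! * exp(t) dt.

    Uses composite Simpson's rule; `steps` must be even.
    """
    h = (x - c) / steps

    def integrand(t):
        return (x - t) ** n / math.factorial(n) * math.exp(t)

    total = integrand(c) + integrand(x)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * integrand(c + i * h)
    return total * h / 3


x, c, n = 1.5, 0.0, 3
lhs = math.exp(x)
rhs = taylor_poly_exp(x, c, n) + integral_remainder_exp(x, c, n)
print(lhs, rhs)
```

The two printed values should agree up to the error of the numerical integration, which is what (5.2.4) predicts.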
    In applying Taylor's theorem, it is seldom the case that the remainder term $r_n(x)$ can
be evaluated exactly. In most cases, we try to find an upper bound for $|r_n(x)|$ so that
we know what is the worst possible error that we could commit in approximating $f(x)$
by $P_n(x)$. For these purposes, there is an alternative formulation of the remainder term
which is often more useful than the one given in Taylor's theorem.
Lagrange's form of the remainder term Using the same notation as in the statement
of Taylor's theorem, there exists a number $k$ between $c$ and $x$ such that

$$r_n(x) = \frac{f^{(n+1)}(k)}{(n + 1)!}(x - c)^{n+1}. \tag{5.2.8}$$

    To show this, we will assume $x > c$, the argument in the case $x < c$ being similar. So
let $u$ be the point where $f^{(n+1)}$ attains its maximum value on $[c, x]$ and let $v$ be the point
where $f^{(n+1)}$ attains its minimum value on $[c, x]$. Note that we know such points exist
because we have assumed $f^{(n+1)}$ to be a continuous function on $(a, b)$, and hence on $[c, x]$.
Then, since $x - t \ge 0$ for $t$ in $[c, x]$, we have

$$\frac{(x - t)^n}{n!} f^{(n+1)}(v) \le \frac{(x - t)^n}{n!} f^{(n+1)}(t) \le \frac{(x - t)^n}{n!} f^{(n+1)}(u) \tag{5.2.9}$$

for all $t$ in $[c, x]$. Integrating each of the terms in (5.2.9) from $c$ to $x$, we have

$$f^{(n+1)}(v) \int_c^x \frac{(x - t)^n}{n!}\,dt \le \int_c^x \frac{(x - t)^n}{n!} f^{(n+1)}(t)\,dt \le f^{(n+1)}(u) \int_c^x \frac{(x - t)^n}{n!}\,dt. \tag{5.2.10}$$

Now

$$\int_c^x \frac{(x - t)^n}{n!}\,dt = -\frac{(x - t)^{n+1}}{(n + 1)!}\Big|_c^x = \frac{(x - c)^{n+1}}{(n + 1)!} \tag{5.2.11}$$

and

$$\int_c^x \frac{(x - t)^n}{n!} f^{(n+1)}(t)\,dt = r_n(x), \tag{5.2.12}$$

so (5.2.10) implies that

$$\frac{f^{(n+1)}(v)}{(n + 1)!}(x - c)^{n+1} \le r_n(x) \le \frac{f^{(n+1)}(u)}{(n + 1)!}(x - c)^{n+1}. \tag{5.2.13}$$

Finally, since

$$g(t) = \frac{f^{(n+1)}(t)}{(n + 1)!}(x - c)^{n+1} \tag{5.2.14}$$

is a continuous function of $t$ on the interval $[c, x]$, it follows from (5.2.13) and the
Intermediate Value Theorem that there exists a number $k$ in $[c, x]$ such that $g(k) = r_n(x)$, that
is,

$$r_n(x) = \frac{f^{(n+1)}(k)}{(n + 1)!}(x - c)^{n+1}, \tag{5.2.15}$$

which is (5.2.8).


﻿




    Of course, we cannot calculate $r_n(x)$ exactly without knowing the value of $k$. However,
if we can find a number $M$ such that

$$|f^{(n+1)}(t)| \le M \tag{5.2.16}$$

for all $t$ between $c$ and $x$, then (5.2.8) implies that

$$|r_n(x)| \le \frac{M}{(n + 1)!}|x - c|^{n+1}. \tag{5.2.17}$$

Hence, although usually we cannot hope to know the exact amount of error in our
approximation, in this case we can at least find an upper bound for the size of the error.
    We can now show that $r_n(x)$ is $O((x - c)^{n+1})$, or, equivalently, that

$$R_n(h) = f(c + h) - P_n(c + h) \tag{5.2.18}$$

is $O(h^{n+1})$. First choose $\epsilon > 0$ so that the interval $I = [c - \epsilon, c + \epsilon]$ is contained in the
interval $(a, b)$, and let $M$ be the maximum value of $|f^{(n+1)}|$ on $I$. Then, using (5.2.17), we
have

$$|r_n(x)| \le \frac{M}{(n + 1)!}|x - c|^{n+1} \tag{5.2.19}$$

for all $x$ in $I$. Thus if $|h| < \epsilon$,

$$|R_n(h)| \le \frac{M}{(n + 1)!}|h|^{n+1}, \tag{5.2.20}$$

from which it follows that, for $h \ne 0$,

$$\frac{|R_n(h)|}{|h|^{n+1}} \le \frac{M}{(n + 1)!}. \tag{5.2.21}$$

That is, $R_n(h)$ is $O(h^{n+1})$.

Proposition    If $f$ satisfies the conditions of Taylor's theorem and $P_n$ is the $n$th order
Taylor polynomial for $f$ at $c$, then

$$R_n(h) = f(c + h) - P_n(c + h) \tag{5.2.22}$$

is $O(h^{n+1})$.

    Of course, from our previous work we know that this statement implies that $R_n(h)$ is
also $o(h^n)$.


﻿




    With this proposition we may write

$$f(c + h) = f(c) + f'(c)h + \frac{f''(c)}{2!}h^2 + \cdots + \frac{f^{(n)}(c)}{n!}h^n + O(h^{n+1}), \tag{5.2.23}$$

or, in terms of $x = c + h$,

$$f(x) = f(c) + f'(c)(x - c) + \frac{f''(c)}{2!}(x - c)^2 + \cdots + \frac{f^{(n)}(c)}{n!}(x - c)^n + O((x - c)^{n+1}), \tag{5.2.24}$$

as long as $f$ is $n + 1$ times continuously differentiable on an open interval containing $c$.
Example Recall that the fifth order Taylor polynomial for $\sin(x)$ at 0 is

$$P_5(x) = x - \frac{x^3}{3!} + \frac{x^5}{5!}.$$

Since in this case $P_5 = P_6$, we now know that

$$\sin(x) = x - \frac{x^3}{3!} + \frac{x^5}{5!} + O(x^7).$$

More explicitly, since

$$\frac{d^7}{dx^7}\sin(x) = -\cos(x),$$

we have

$$\left|\frac{d^7}{dx^7}\sin(x)\right| \le 1$$

for any value of $x$. Hence if

$$r_5(x) = \sin(x) - P_5(x),$$

then, because $P_5 = P_6$, applying (5.2.17) with $n = 6$ and $M = 1$ gives

$$|r_5(x)| \le \frac{|x|^7}{7!}$$

for any value of $x$. For example,

$$|\sin(1) - P_5(1)| \le \frac{1}{7!} = \frac{1}{5040} \approx 0.000198,$$

to 6 decimal places. That is, the error in approximating $\sin(1)$ by $P_5(1)$ is no more than
0.000198. In this case the error bound is very close to the actual error, for, to six decimal
place accuracy,

$$\sin(1) = 0.841471$$

and

$$P_5(1) = 1 - \frac{1}{6} + \frac{1}{120} = 0.841667,$$

which gives an error of

$$|\sin(1) - P_5(1)| = 0.000196.$$
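The arithmetic in this example is easy to reproduce. The short sketch below (the names `p5`, `bound`, and `actual` are illustrative, not from the text) evaluates $P_5(1)$, the bound $1/7!$, and the actual error using Python's standard `math` module.

```python
import math


def p5(x):
    """Fifth order Taylor polynomial for sin at 0."""
    return x - x**3 / math.factorial(3) + x**5 / math.factorial(5)


bound = 1 / math.factorial(7)          # |x|^7 / 7! evaluated at x = 1
actual = abs(math.sin(1) - p5(1))      # true error of the approximation

print(f"P5(1)  = {p5(1):.6f}")
print(f"bound  = {bound:.6f}")
print(f"actual = {actual:.6f}")
```

The actual error falls just under the bound, in line with the figures quoted in the example.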


﻿




    Note that, in general, for any nonnegative integer $n$,

$$\left|\frac{d^{2n+3}}{dx^{2n+3}}\sin(x)\right| = |\cos(x)|,$$

and so

$$\left|\frac{d^{2n+3}}{dx^{2n+3}}\sin(x)\right| \le 1$$

for any value of $x$. Hence, using the fact that in this case $P_{2n+1} = P_{2n+2}$,

$$|\sin(x) - P_{2n+1}(x)| \le \frac{|x|^{2n+3}}{(2n + 3)!} \tag{5.2.25}$$

for any value of $x$. With this inequality we can determine the order necessary for a Taylor
polynomial to give some desired level of accuracy for a particular approximation. For
example, if we wish to estimate $\sin(1.4)$ with an error of less than 0.001 using a Taylor
polynomial about 0, then (5.2.25) says we need only find a nonnegative integer $n$ such that

$$\frac{1.4^{2n+3}}{(2n + 3)!} < 0.001,$$

in which case $P_{2n+1}(1.4)$ will provide the desired approximation. For $n = 0$, 1, 2, and 3,
we have the following table:

                                n          1.4^(2n+3) / (2n+3)!
                                0          0.4573333
                                1          0.0448187
                                2          0.0020915
                                3          0.0000569

Hence the smallest value of $n$ that will work is $n = 3$; thus to attain the desired level of
accuracy we would use

$$P_7(x) = x - \frac{x^3}{3!} + \frac{x^5}{5!} - \frac{x^7}{7!}.$$

Checking this to 7 decimal places, we find that

$$P_7(1.4) = 0.9853938$$

and

$$\sin(1.4) = 0.9854497,$$

an error of only 0.0000559.
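The search for the smallest suitable $n$ can be automated. The following is a minimal sketch, assuming the bound (5.2.25); the helper names `smallest_n` and `sin_taylor` are hypothetical and chosen only for this illustration.

```python
import math


def smallest_n(x, tol):
    """Smallest nonnegative n with |x|^(2n+3) / (2n+3)! < tol, as in (5.2.25)."""
    n = 0
    while abs(x) ** (2 * n + 3) / math.factorial(2 * n + 3) >= tol:
        n += 1
    return n


def sin_taylor(x, order):
    """Taylor polynomial for sin at 0, keeping odd powers up to `order`."""
    return sum(
        (-1) ** k * x ** (2 * k + 1) / math.factorial(2 * k + 1)
        for k in range(order // 2 + 1)
        if 2 * k + 1 <= order
    )


n = smallest_n(1.4, 0.001)
approx = sin_taylor(1.4, 2 * n + 1)
print(n, approx, abs(math.sin(1.4) - approx))
```

For $x = 1.4$ and a tolerance of 0.001 the loop stops at $n = 3$, so the polynomial used is $P_7$, matching the table.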

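Example Combining our work from Section 5.1 about the Taylor polynomial of order 4
for $f(x) = \sqrt{x}$ at 1 with our new results, we now have

$$\sqrt{x} = 1 + \frac{1}{2}(x - 1) - \frac{1}{8}(x - 1)^2 + \frac{1}{16}(x - 1)^3 - \frac{5}{128}(x - 1)^4 + O((x - 1)^5).$$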

﻿




If, for example, we wanted to bound the error involved in using $P_4(1.6)$ as an estimate for
$\sqrt{1.6}$, we would first note that

$$f^{(5)}(x) = \frac{105}{32 x^{9/2}},$$

which is a decreasing function and hence is maximized on the interval $[1, 1.6]$ at $x = 1$.
Thus

$$|f^{(5)}(x)| \le \frac{105}{32}$$

for all $x$ in $[1, 1.6]$. From (5.2.17) it follows that

$$|\sqrt{1.6} - P_4(1.6)| \le \frac{105}{32 \cdot 5!}|1.6 - 1|^5 = 0.0021263,$$

to 7 decimal places. Checking this on a calculator to 7 decimal places, we have

$$\sqrt{1.6} = 1.2649111$$

and

$$P_4(1.6) = 1.2634375,$$

showing that the error in this case is 0.0014736.
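The figures in this example can be checked with a few lines of Python. This is only a sketch: the function name `p4_sqrt` is illustrative, and the coefficients are those of the order 4 Taylor polynomial for $\sqrt{x}$ at 1.

```python
import math


def p4_sqrt(x):
    """Order 4 Taylor polynomial for sqrt(x) at c = 1."""
    h = x - 1
    return 1 + h / 2 - h**2 / 8 + h**3 / 16 - 5 * h**4 / 128


# Bound from (5.2.17) with n = 4 and M = 105/32 on [1, 1.6].
bound = (105 / 32) / math.factorial(5) * abs(1.6 - 1) ** 5
actual = abs(math.sqrt(1.6) - p4_sqrt(1.6))

print(f"P4(1.6) = {p4_sqrt(1.6):.7f}")
print(f"bound   = {bound:.7f}")
print(f"actual  = {actual:.7f}")
```

Here the bound is noticeably larger than the actual error, since $|f^{(5)}|$ is well below $105/32$ on most of the interval.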
    If $f$ is indefinitely differentiable on an interval about a point $c$ and $P_n$ is the Taylor
polynomial of order $n$ for $f$ at $c$, then it is frequently the case that $P_1, P_2, P_3, \ldots$ is a
sequence of increasingly accurate approximating polynomials for $f$ on some interval $I$
containing the point $c$. Of course, unless $f$ is itself a polynomial, there is no polynomial
$P_n$ in this sequence such that $f(x) = P_n(x)$ for all $x$ in $I$. Nevertheless, there are many
functions for which

$$f(x) = \lim_{n \to \infty} P_n(x) \tag{5.2.26}$$

for all $x$ in some interval $I$. Such functions are said to be analytic. Since a polynomial is
just the sum of a finite number of monomials, $\lim_{n \to \infty} P_n(x)$ may be regarded as an infinite
sum of monomials, an infinite polynomial. That is, if $f$ is an analytic function, then

$$\begin{aligned}
f(x) &= \lim_{n \to \infty} P_n(x) \\
     &= f(c) + f'(c)(x - c) + \frac{f''(c)}{2!}(x - c)^2 + \cdots + \frac{f^{(n)}(c)}{n!}(x - c)^n + \cdots \\
     &= \sum_{n=0}^{\infty} \frac{f^{(n)}(c)}{n!}(x - c)^n
\end{aligned} \tag{5.2.27}$$

for all $x$ in some interval $I$ containing $c$. An infinite series of this type is called a power
series. Since power series have many of the nice properties of polynomials, such as being
easy to integrate, a representation of a function $f$ by a power series in this manner can be
extremely useful. Although we considered infinite series in Section 1.3, we will need a far
more thorough discussion of them before we will be able fully to understand a series like
(5.2.27). We will do this in Sections 5.3 through 5.6.

Problems

1. For each of the following functions f, find the Taylor polynomial of order 5 at the point
    c, use it to approximate f (a), and find an upper bound for the absolute value of the
    error in the approximation.
    (a) f (x) = sin(x), c = 0, a = 0.8     (b) f (x) = cos(x), c = 0, a = 1.2
    (c) f (x) = sin(2x), c = 0, a = -0.5   (d) f (x) =x5, c = 1, a = 1.5
    (e) f (x) = 5'x, c = 9, a = 10          (f) f (x) = x 2, c = 1, a = 1.4
               1
    (g) f (x) = -, c = 1, a = 0.8             (h) f(x) = sin(x), c = 0, a = -1.2
 2. Use a Taylor polynomial to approximate sin(0.6) with an error of less than 0.0001.
 3. Use a Taylor polynomial to approximate sin(-1.3) with an error of less than 0.00001.
 4. Use a Taylor polynomial to approximate cos(1.2) with an error of less than 0.0001.
 5. Find the Taylor polynomial of smallest order that will approximate sin(x) with an
    error of less than 0.0005 for all x in [-2, 2].
 6. Suppose $L$ is a function defined on $(0, \infty)$ with $L(1) = 0$ and

$$L'(x) = \frac{1}{x}.$$

    (a) Find $P_{10}$, the Taylor polynomial of order 10 for $L$ at 1.
    (b) Use $P_{10}$ to approximate $L(1.5)$. Find an upper bound for the absolute value of the
       error of this approximation.
    (c) Find the Taylor polynomial of smallest degree that will approximate $L(x)$ with an
       error less than 0.0005 for all $x$ in $[1, 1.5]$.
 7. Suppose $E$ is a function defined on $(-\infty, \infty)$ with $E(0) = 1$ and $E'(x) = E(x)$ for all
    $x$.
    (a) Find $P_{10}$, the Taylor polynomial of order 10 for $E$ at 0.
    (b) Use $P_{10}$ to approximate $E(1)$.
    (c) Given that $|E(x)| \le 3^x$ for all $x > 0$, find an upper bound for the absolute value
       of the error in the approximation in part (b).
    (d) Find the Taylor polynomial of smallest degree that will approximate $E(x)$ with an
       error less than 0.0001 for all $x$ in $[0, 2]$.
 8. Let P9 be the 9th order Taylor polynomial for f(x) =sin(x) at 0. Use P9 to approxi-
    mate
                                      f3 sin(x) dz.


﻿



9. (a) Find the 6th order Taylor polynomial for f(x) = sin(x2) at 0. How is it related to
        the 3rd order Taylor polynomial for g(x) = sin(x) at 0?
    (b) Find the 7th order Taylor polynomial at 0 for

$$h(x) = \int_0^x \sin(t^2)\,dt.$$

        How is it related to your answer in (a)?


﻿


Section 5.3


       Differential Equations         Infinite Series Revisited


Recall from Section 1.3 that for a given sequence $\{a_n\}$, the sequence $\{s_n\}$ with $n$th term

$$s_n = a_1 + a_2 + a_3 + \cdots + a_n \tag{5.3.1}$$

is called an infinite series. An individual term $s_n$ is called a partial sum and we say the
series is convergent, or has a sum, if $\lim_{n \to \infty} s_n$ exists. If the series is not convergent, we say
it is divergent. We write

$$\lim_{n \to \infty} s_n = a_1 + a_2 + a_3 + \cdots + a_n + \cdots = \sum_{n=1}^{\infty} a_n. \tag{5.3.2}$$

Example     In Section 1.3 we saw that if $a_n = r^n$, $n = 0, 1, 2, \ldots$, then the associated
infinite series, called a geometric series, is convergent if and only if $-1 < r < 1$, in which
case

$$\sum_{n=0}^{\infty} r^n = \frac{1}{1 - r}. \tag{5.3.3}$$

For example,

$$\sum_{n=0}^{\infty} \frac{1}{3^n} = \frac{1}{1 - \frac{1}{3}} = \frac{3}{2}.$$
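To see the convergence in (5.3.3) concretely, one can compute partial sums exactly with rational arithmetic. The sketch below uses Python's standard `fractions` module; the function name `geometric_partial_sum` is an illustrative choice, not notation from the text.

```python
from fractions import Fraction


def geometric_partial_sum(r, n):
    """Return r^0 + r^1 + ... + r^n exactly."""
    return sum(r**k for k in range(n + 1))


r = Fraction(1, 3)
limit = 1 / (1 - r)  # closed form 1 / (1 - r) from (5.3.3)

# Distance between the limit and the partial sum for a few values of n.
gaps = [limit - geometric_partial_sum(r, n) for n in (5, 10, 20)]
for n, gap in zip((5, 10, 20), gaps):
    print(n, float(gap))
```

The gaps shrink by a factor of 3 with each additional term, which is the behavior the closed form $r^{n+1}/(1 - r)$ for the tail would lead one to expect.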

    Geometric series comprise one of the few classes of series for which we can evaluate sums
exactly. For most series we can only approximate the sum by computing the partial sums
sn for sufficiently large values of n. However, before this procedure becomes meaningful, we
must first know that the series converges. Hence, in this section, as well as in Sections 5.4,
5.5, and 5.6, one of our primary goals will be the development of methods for determining
whether a given series converges or diverges.
    We begin by considering several basic properties of infinite series. First, suppose we
know that both $\sum_{n=1}^{\infty} a_n$ and $\sum_{n=1}^{\infty} b_n$ are convergent series with

$$\sum_{n=1}^{\infty} a_n = L$$

and

$$\sum_{n=1}^{\infty} b_n = M.$$




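If $s_n$ is the $n$th partial sum of $\sum_{n=1}^{\infty} a_n$, $t_n$ is the $n$th partial sum of $\sum_{n=1}^{\infty} b_n$, and $u_n$ is
the $n$th partial sum of $\sum_{n=1}^{\infty} (a_n + b_n)$, then $u_n = s_n + t_n$. Thus

$$\lim_{n \to \infty} u_n = \lim_{n \to \infty} (s_n + t_n) = \lim_{n \to \infty} s_n + \lim_{n \to \infty} t_n = L + M. \tag{5.3.4}$$

That is,

$$\sum_{n=1}^{\infty} (a_n + b_n) = \sum_{n=1}^{\infty} a_n + \sum_{n=1}^{\infty} b_n. \tag{5.3.5}$$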


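Proposition     If $\sum_{n=1}^{\infty} a_n$ and $\sum_{n=1}^{\infty} b_n$ both converge, then $\sum_{n=1}^{\infty} (a_n + b_n)$ converges
and

$$\sum_{n=1}^{\infty} (a_n + b_n) = \sum_{n=1}^{\infty} a_n + \sum_{n=1}^{\infty} b_n. \tag{5.3.6}$$

Similarly, $\sum_{n=1}^{\infty} (a_n - b_n)$ converges and

$$\sum_{n=1}^{\infty} (a_n - b_n) = \sum_{n=1}^{\infty} a_n - \sum_{n=1}^{\infty} b_n. \tag{5.3.7}$$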


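Example From our results above, it follows that

$$\sum_{n=0}^{\infty} \left(\frac{1}{3^n} + \frac{1}{5^n}\right) = \sum_{n=0}^{\infty} \frac{1}{3^n} + \sum_{n=0}^{\infty} \frac{1}{5^n} = \frac{1}{1 - \frac{1}{3}} + \frac{1}{1 - \frac{1}{5}} = \frac{3}{2} + \frac{5}{4} = \frac{11}{4}.$$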


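    Now suppose $\sum_{n=1}^{\infty} a_n$ is a convergent series,

$$\sum_{n=1}^{\infty} a_n = L,$$

and $k$ is any constant. If $s_n$ is the $n$th partial sum of $\sum_{n=1}^{\infty} a_n$ and $t_n$ is the $n$th partial
sum of the series $\sum_{n=1}^{\infty} k a_n$, then $t_n = k s_n$. Hence

$$\lim_{n \to \infty} t_n = \lim_{n \to \infty} k s_n = k \lim_{n \to \infty} s_n = kL. \tag{5.3.8}$$

That is,

$$\sum_{n=1}^{\infty} k a_n = k \sum_{n=1}^{\infty} a_n. \tag{5.3.9}$$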


Proposition  If $\sum_{n=1}^{\infty} a_n$ converges and $k$ is any constant, then $\sum_{n=1}^{\infty} k a_n$ converges and

$$\sum_{n=1}^{\infty} k a_n = k \sum_{n=1}^{\infty} a_n. \tag{5.3.10}$$


﻿




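Example We have

$$\sum_{n=1}^{\infty} \frac{10}{2^n} = \frac{10}{2} \sum_{n=1}^{\infty} \frac{1}{2^{n-1}} = 5 \sum_{n=0}^{\infty} \frac{1}{2^n} = 5 \cdot \frac{1}{1 - \frac{1}{2}} = 10.$$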


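    Notice that if $\sum_{n=1}^{\infty} a_n$ diverges, then $\sum_{n=1}^{\infty} k a_n$ must also diverge for any constant
$k \ne 0$. This follows because, if, on the contrary, $\sum_{n=1}^{\infty} k a_n$ converged, then, by the previous
proposition, so would $\sum_{n=1}^{\infty} a_n$ since

$$\sum_{n=1}^{\infty} a_n = \sum_{n=1}^{\infty} \frac{1}{k}(k a_n). \tag{5.3.11}$$

Proposition    If $\sum_{n=1}^{\infty} a_n$ diverges, then $\sum_{n=1}^{\infty} k a_n$ diverges for any $k \ne 0$.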

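Example In Section 1.3 we saw that the harmonic series

$$\sum_{n=1}^{\infty} \frac{1}{n}$$

diverges. It follows that both

$$\sum_{n=1}^{\infty} \frac{3}{n} = 3 \sum_{n=1}^{\infty} \frac{1}{n}$$

and

$$\sum_{n=1}^{\infty} \frac{9}{20n} = \frac{9}{20} \sum_{n=1}^{\infty} \frac{1}{n}$$

are divergent series.

    It is also important to note that since

$$\sum_{n=1}^{\infty} a_n = \sum_{n=1}^{m-1} a_n + \sum_{n=m}^{\infty} a_n \tag{5.3.12}$$

for any positive integer $m$, the series $\sum_{n=1}^{\infty} a_n$ converges if and only if the series $\sum_{n=m}^{\infty} a_n$
converges. In other words, convergence or divergence of a series is never determined by
the behavior of any finite number of terms.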

Example It follows from the previous example that

$$\sum_{n=200}^{\infty} \frac{9}{20n}$$

diverges.


﻿




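Example The series

$$\sum_{n=4}^{\infty} \frac{3}{5^n}$$

converges. Moreover,

$$\sum_{n=4}^{\infty} \frac{3}{5^n} = \sum_{n=4}^{\infty} \frac{3}{5^4 \left(5^{n-4}\right)} = \frac{3}{625} \sum_{n=0}^{\infty} \frac{1}{5^n} = \frac{3}{625\left(1 - \frac{1}{5}\right)} = \frac{3}{500}.$$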

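    Now suppose the series $\sum_{n=1}^{\infty} a_n$ converges with

$$\sum_{n=1}^{\infty} a_n = L.$$

Let $s_n = a_1 + a_2 + a_3 + \cdots + a_n$ be the $n$th partial sum of $\sum_{n=1}^{\infty} a_n$. Now

$$a_n = s_n - s_{n-1},$$

$$\lim_{n \to \infty} s_n = \lim_{n \to \infty} \sum_{i=1}^{n} a_i = \sum_{i=1}^{\infty} a_i = L,$$

and

$$\lim_{n \to \infty} s_{n-1} = \lim_{n \to \infty} \sum_{i=1}^{n-1} a_i = \sum_{i=1}^{\infty} a_i = L.$$

Hence

$$\lim_{n \to \infty} a_n = \lim_{n \to \infty} (s_n - s_{n-1}) = \lim_{n \to \infty} s_n - \lim_{n \to \infty} s_{n-1} = L - L = 0. \tag{5.3.13}$$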


That is, the $n$th term of a convergent series must have a limit of 0.

Proposition     If $\sum_{n=1}^{\infty} a_n$ converges, then

$$\lim_{n \to \infty} a_n = 0. \tag{5.3.14}$$

    Note that this result only demonstrates a consequence of a series converging, and
so does not provide a criterion to determine convergence. However, it may be useful in
showing that certain series are divergent. Namely, if either the sequence $\{a_n\}$ does not
have a limit or

$$\lim_{n \to \infty} a_n \ne 0,$$

then the series $\sum_{n=1}^{\infty} a_n$ must diverge. This result is often called the $n$th term test for
divergence.


﻿




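Example The series

$$\sum_{n=1}^{\infty} \cos\left(\frac{1}{n}\right)$$

diverges since

$$\lim_{n \to \infty} \cos\left(\frac{1}{n}\right) = \cos(0) = 1.$$

Example     The series $\sum_{n=1}^{\infty} (-1)^n$ diverges since $\{(-1)^n\}$ does not have a limit.

Example     Note that

$$\lim_{n \to \infty} \frac{1}{n} = 0,$$

yet the harmonic series

$$\sum_{n=1}^{\infty} \frac{1}{n}$$

diverges.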

p-series
In the next section we will consider a method for determining the convergence or divergence
of a series by comparing a given series with a series which is already known to converge or
diverge. In order to make significant use of such a result it is necessary to have a supply of
series whose convergence or divergence is already known. So far geometric series are the
only series we have studied in any detail. Now we will consider the class of series of the
form

$$\sum_{n=1}^{\infty} \frac{1}{n^p}, \tag{5.3.15}$$

where $p$ is a fixed constant. Such series are called p-series. The following proposition
contains our main result.

Proposition The p-series

$$\sum_{n=1}^{\infty} \frac{1}{n^p} \tag{5.3.16}$$

converges for $p > 1$ and diverges for $p \le 1$.
    To demonstrate this result, we shall consider four cases. First, suppose $p \le 0$. Then

$$\lim_{n \to \infty} \frac{1}{n^p} = \begin{cases} \infty, & \text{if } p < 0, \\ 1, & \text{if } p = 0. \end{cases} \tag{5.3.17}$$

Thus the series diverges by the $n$th term test for divergence.
    Next, consider $0 < p < 1$. Note that for any $n > 0$, the partial sum

$$s_n = 1 + \frac{1}{2^p} + \frac{1}{3^p} + \cdots + \frac{1}{n^p} \tag{5.3.18}$$


﻿


Figure 5.3.1 Rectangles for left-hand rule approximation for $\int_1^{11} \frac{1}{x}\,dx$


is a left-hand rule approximation, using intervals of length 1, for the integral

$$\int_1^{n+1} \frac{1}{x^p}\,dx. \tag{5.3.19}$$

See Figure 5.3.1 for the case $p = 1$ and $n = 10$. Since

$$f(x) = \frac{1}{x^p}$$

is a decreasing function on the interval $[1, n + 1]$, $s_n$ is an upper sum for the integral
(5.3.19), and hence

$$s_n \ge \int_1^{n+1} \frac{1}{x^p}\,dx. \tag{5.3.20}$$

Now

$$\int_1^{n+1} \frac{1}{x^p}\,dx = \frac{x^{1-p}}{1 - p}\Big|_1^{n+1} = \frac{(n + 1)^{1-p} - 1}{1 - p}. \tag{5.3.21}$$

Thus

$$s_n \ge \frac{(n + 1)^{1-p} - 1}{1 - p}. \tag{5.3.22}$$

But, since $1 - p > 0$,

$$\lim_{n \to \infty} \frac{(n + 1)^{1-p} - 1}{1 - p} = \infty. \tag{5.3.23}$$

Hence $\{s_n\}$ is an unbounded, increasing sequence, and so, from a result in Section 1.2,

$$\lim_{n \to \infty} s_n = \infty. \tag{5.3.24}$$


﻿


Figure 5.3.2 Rectangles for right-hand rule approximation for $\int_1^{10} \frac{1}{x^{1.5}}\,dx$

In other words,

$$\sum_{n=1}^{\infty} \frac{1}{n^p}$$

is a divergent series.
    For $p = 1$, the p-series is the harmonic series and so diverges.
    Finally, consider $p > 1$. If for any integer $n > 1$ we let

$$t_n = \frac{1}{2^p} + \frac{1}{3^p} + \frac{1}{4^p} + \cdots + \frac{1}{n^p}, \tag{5.3.25}$$

then $t_n$ is a right-hand rule approximation, using intervals of length 1, for the integral

$$\int_1^n \frac{1}{x^p}\,dx. \tag{5.3.26}$$

See Figure 5.3.2 for the case $p = 1.5$ and $n = 10$. Since

$$f(x) = \frac{1}{x^p}$$

is a decreasing function on the interval $[1, n]$, $t_n$ is a lower sum for the integral (5.3.26),
and hence

$$t_n \le \int_1^n \frac{1}{x^p}\,dx. \tag{5.3.27}$$


﻿




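Now

$$\begin{aligned}
\int_1^n \frac{1}{x^p}\,dx \le \int_1^{\infty} \frac{1}{x^p}\,dx
  &= \lim_{n \to \infty} \int_1^n \frac{1}{x^p}\,dx \\
  &= \lim_{n \to \infty} \frac{x^{1-p}}{1 - p}\Big|_1^n \\
  &= \lim_{n \to \infty} \frac{n^{1-p} - 1}{1 - p} \\
  &= \frac{1}{p - 1},
\end{aligned} \tag{5.3.28}$$

where the final equality follows from the fact that, since $p > 1$,

$$\lim_{n \to \infty} n^{1-p} = \lim_{n \to \infty} \frac{1}{n^{p-1}} = 0. \tag{5.3.29}$$

Thus

$$t_n \le \frac{1}{p - 1}. \tag{5.3.30}$$

Now if $s_n$ is the $n$th partial sum of

$$\sum_{n=1}^{\infty} \frac{1}{n^p},$$

then $s_1 = 1$ and $s_n = 1 + t_n$, $n = 2, 3, 4, \ldots$. Hence

$$s_n \le 1 + \frac{1}{p - 1} = \frac{p}{p - 1} \tag{5.3.31}$$

for $n = 1, 2, 3, \ldots$. Thus $\{s_n\}$ is a bounded, increasing sequence, and so, from a result in
Section 1.2, must have a limit. That is, $\lim_{n \to \infty} s_n$ exists and

$$\sum_{n=1}^{\infty} \frac{1}{n^p}$$

is a convergent series.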
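The two bounds used in this argument, (5.3.22) for $0 < p < 1$ and (5.3.31) for $p > 1$, can be observed numerically. The following sketch picks $p = 0.5$ and $p = 2$ as sample values; those choices and the helper name `p_series_partial_sum` are assumptions made only for illustration.

```python
def p_series_partial_sum(p, n):
    """Partial sum s_n = 1 + 1/2^p + ... + 1/n^p."""
    return sum(1 / k**p for k in range(1, n + 1))


def lower_bound(p, n):
    """Right-hand side of (5.3.22), valid for 0 < p < 1."""
    return ((n + 1) ** (1 - p) - 1) / (1 - p)


for n in (10, 100, 1000):
    s_half = p_series_partial_sum(0.5, n)   # grows without bound
    s_two = p_series_partial_sum(2, n)      # stays below p / (p - 1) = 2
    print(n, s_half, lower_bound(0.5, n), s_two)
```

For $p = 0.5$ the partial sums track the growing lower bound, while for $p = 2$ they level off below 2, consistent with the two cases of the proposition.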

Example The series


00 1

n 1


diverges because it is a p-series with p
results that the series


       Moreover, it now follows from our earlier


   2n
n=12


﻿




and


00 1

L1  
1o


both diverge as well.
Example The series


    Zr2
  n=1
p= 2. Similar to our last example, it now follows


converges because it is a p-series with
from our earlier results that the series


35
    6n2
n=1


and


both converge as well. Moreover, from


   5n2O
 n=20
(5.3.31), we know that


                                      -1     2
                                      n2 - 2 - 1   2
                                 n=1

    In Problem 5 in Section 4.6, you were asked to show that the integral

$$\int_1^{\infty} \frac{1}{x^p}\,dx$$

diverges for $p < 1$ and converges for $p > 1$. In Section 6.2 we will see that

$$\int_1^{\infty} \frac{1}{x}\,dx$$

diverges as well (see also Problem 5 of this section). Combining these facts with our results
about p-series, it follows that

$$\sum_{n=1}^{\infty} \frac{1}{n^p}$$

converges if and only if

$$\int_1^{\infty} \frac{1}{x^p}\,dx$$

converges. This should not be surprising considering the intimate relationship we have seen
between the partial sums of the series and the left-hand and right-hand rule approximations
for the integral. The essential ingredient in making these connections was that the function

$$f(x) = \frac{1}{x^p}$$

is continuous, positive, and decreasing on the interval $[1, \infty)$ when $p > 0$. In fact, it can be
shown, using arguments similar to those given above, that if $g$ is a continuous, decreasing
function on $[1, \infty)$ with $g(x) > 0$ for all $x \ge 1$, then

$$\sum_{n=1}^{\infty} g(n)$$

converges if and only if

$$\int_1^{\infty} g(x)\,dx$$

converges. You are asked to verify this result, known as the integral test, in Problem 4.
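The upper-sum and lower-sum comparisons behind the integral test can be illustrated for a function other than $1/x^p$. The sketch below assumes the sample function $g(x) = 1/(1 + x^2)$, chosen because its integral has the closed form $\arctan(b) - \arctan(a)$; the names `g` and `integral_g` are hypothetical. It checks the sandwich $\int_1^{n+1} g(x)\,dx \le s_n \le g(1) + \int_1^n g(x)\,dx$, which mirrors (5.3.20) and (5.3.27).

```python
import math


def g(x):
    """A continuous, positive, decreasing function on [1, infinity)."""
    return 1 / (1 + x * x)


def integral_g(a, b):
    """Exact integral of g from a to b, using the antiderivative arctan."""
    return math.atan(b) - math.atan(a)


n = 1000
s_n = sum(g(k) for k in range(1, n + 1))   # partial sum g(1) + ... + g(n)
lower = integral_g(1, n + 1)               # s_n is an upper sum for this integral
upper = g(1) + integral_g(1, n)            # g(2) + ... + g(n) is a lower sum

print(lower, s_n, upper)
```

Because the improper integral of this $g$ converges, the upper estimate stays bounded as $n$ grows, and the partial sums are trapped between the two values.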


Problems


1. For each of the following infinite series, decide whether the series converges or diverges
    and explain your answer. If the series is a convergent geometric series, find its sum.


(a)ZQ


(c) ~
    ( n
  (e) 1


(b)4
    00o


2n~


5    2
)n n
     3


11
nj


     00
(d) (-3) n
    n=1

(f)     nsin


     (h    3

     n-2


      1000
n=1 O


2. For each of the following infinite series, decide whether the
   and explain your answer.


series converges or diverges


(a)     4
    n=1

(c)


24)


(b)
      15
    n=1
        °° 3
(d)

   (f)t

       n
    n=21

(h)  Z. 2
    =5


     00
(e)     n--
    n=1

(g)
    n=3 2349/


﻿




3. Give an example of divergent series $\sum_{n=1}^{\infty} a_n$ and $\sum_{n=1}^{\infty} b_n$ for which the series

$$\sum_{n=1}^{\infty} (a_n + b_n)$$

   converges.

4. Prove the integral test. That is, show that if $g$ is a continuous decreasing function on
   $[1, \infty)$ with $g(x) > 0$ for all $x \ge 1$, then

$$\sum_{n=1}^{\infty} g(n)$$

   converges if and only if

$$\int_1^{\infty} g(x)\,dx$$

   converges.


5. Use the integral test to determine the convergence or divergence of each of the following.


     00
(a)     2+3

      12
(c) - do


     00
(b)        5
  (n= 2g-1

     00    3n
(d)       n2  -
    n=2


6. Find three different examples of divergent series $\sum_{n=1}^{\infty} a_n$ with the property that
   $\lim_{n \to \infty} a_n = 0$.
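7. The following argument has been used to show that

$$\sum_{n=0}^{\infty} (-1)^n = \frac{1}{2}.$$

   Let

$$L = \sum_{n=0}^{\infty} (-1)^n.$$

   Then

$$L = \sum_{n=0}^{\infty} (-1)^n = 1 + \sum_{n=1}^{\infty} (-1)^n = 1 - \sum_{n=0}^{\infty} (-1)^n = 1 - L.$$

   Thus $L = 1 - L$, and so $L = \frac{1}{2}$. Where is the fallacy in this argument?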


﻿


Section 5.4


                to                     Infinite Series:
        Differential Equations         The Comparison Test


In this section we continue our discussion of the convergence properties of infinite series.
Now that we have two classes of series, namely, the geometric series and the p-series, for
which classification as either convergent or divergent is relatively easy, it is reasonable to
develop tests for convergence based on comparing a given series with a series of known
behavior. We will see this idea first in the comparison test, which we will later generalize
with the limit comparison test.
    To begin, suppose $\sum_{n=1}^{\infty} a_n$ is a convergent series with $a_n \ge 0$ for all $n$ and $\sum_{n=1}^{\infty} b_n$
is a series with $0 \le b_n \le a_n$ for all $n$. Let

$$L = \sum_{n=1}^{\infty} a_n.$$

If $s_n$ is the $n$th partial sum of $\sum_{n=1}^{\infty} a_n$ and $t_n$ is the $n$th partial sum of $\sum_{n=1}^{\infty} b_n$, then

$$t_n \le s_n \tag{5.4.1}$$

for $n = 1, 2, 3, \ldots$. Since $a_n \ge 0$ for all $n$, the sequence $\{s_n\}$ is increasing; hence

$$s_n \le L \tag{5.4.2}$$

for all $n$. Since $b_n \ge 0$, $n = 1, 2, 3, \ldots$, $\{t_n\}$ is also an increasing sequence which, by
(5.4.1) and (5.4.2), is bounded above by $L$. Hence $\lim_{n \to \infty} t_n$ exists, showing that $\sum_{n=1}^{\infty} b_n$
converges. Moreover, from (5.4.1),

$$\sum_{n=1}^{\infty} b_n \le \sum_{n=1}^{\infty} a_n. \tag{5.4.3}$$

    Now suppose that $\sum_{n=1}^{\infty} a_n$ is a divergent series with $a_n \ge 0$ for all $n$ and $\sum_{n=1}^{\infty} b_n$ is
a series with $a_n \le b_n$ for all $n$. If $s_n$ is the $n$th partial sum of $\sum_{n=1}^{\infty} a_n$ and $t_n$ is the $n$th
partial sum of $\sum_{n=1}^{\infty} b_n$, then

$$t_n \ge s_n \tag{5.4.4}$$

for $n = 1, 2, 3, \ldots$. Since $a_n \ge 0$ for all $n$, $\{s_n\}$ is an increasing sequence; thus, since
$\sum_{n=1}^{\infty} a_n$ diverges, it follows that

$$\lim_{n \to \infty} s_n = \infty.$$




Copyright © by Dan Sloughter 2000


﻿



Hence, from (5.4.4), and the fact that $\{t_n\}$ is also an increasing sequence, we have

$$\lim_{n \to \infty} t_n = \infty,$$

that is, $\sum_{n=1}^{\infty} b_n$ diverges.
    The preceding results are summarized in the comparison test.

Comparison Test       Suppose $a_n \ge 0$ for $n = 1, 2, 3, \ldots$ and $\sum_{n=1}^{\infty} a_n$ converges. If
$0 \le b_n \le a_n$ for $n = 1, 2, 3, \ldots$, then $\sum_{n=1}^{\infty} b_n$ converges and

$$\sum_{n=1}^{\infty} b_n \le \sum_{n=1}^{\infty} a_n. \tag{5.4.5}$$

Suppose $a_n \ge 0$ for $n = 1, 2, 3, \ldots$ and $\sum_{n=1}^{\infty} a_n$ diverges. If $a_n \le b_n$ for $n = 1, 2, 3, \ldots$,
then $\sum_{n=1}^{\infty} b_n$ diverges.

    In other words, if the terms of the series $\sum_{n=1}^{\infty} b_n$ are nonnegative and no larger than
the terms of a series which converges, then $\sum_{n=1}^{\infty} b_n$ must converge; if the terms of the
series $\sum_{n=1}^{\infty} b_n$ are at least as large as those of a divergent series with nonnegative terms, then
$\sum_{n=1}^{\infty} b_n$ must diverge.
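
The convergent half of the comparison test can be checked numerically on one pair of series. The sketch below is an illustration under assumed choices, $a_n = 1/n^2$ and $b_n = 1/(n^2 + 1)$, with helper names invented for this example. It confirms that the partial sums $t_n$ of $\sum b_n$ are increasing and never exceed the partial sums $s_n$ of $\sum a_n$.

```python
def partial_sums(term, N):
    """Return the list [s_1, ..., s_N] of partial sums of term(1) + term(2) + ..."""
    sums, total = [], 0.0
    for n in range(1, N + 1):
        total += term(n)
        sums.append(total)
    return sums


def a(n):
    """Terms of the convergent p-series with p = 2."""
    return 1.0 / n**2


def b(n):
    """Terms satisfying 0 <= b(n) <= a(n)."""
    return 1.0 / (n**2 + 1)


if __name__ == "__main__":
    s = partial_sums(a, 1000)
    t = partial_sums(b, 1000)
    # Each t_n is bounded above by the corresponding s_n, as in (5.4.1).
    print(t[-1], s[-1])
```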
Example The series

$$\sum_{n=1}^{\infty} \frac{1}{n^2 + 1}$$

converges since

$$0 < \frac{1}{n^2 + 1} < \frac{1}{n^2}$$

for $n = 1, 2, 3, \ldots$, and

$$\sum_{n=1}^{\infty} \frac{1}{n^2}$$

is a convergent series (namely, a p-series with $p = 2$).

Example The series
                                            1


diverges since
                                     1       1
                                     >n >0

for n =1, 2, 3, .. ., and

                                          >1


is a divergent series (since it is a multiple of the harmonic series).


﻿



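Example The series

$$\sum_{n=1}^{\infty} \frac{\sin^2(n)}{n^2}$$

converges since

$$0 \le \frac{\sin^2(n)}{n^2} \le \frac{1}{n^2}$$

for $n = 1, 2, 3, \ldots$, and

$$\sum_{n=1}^{\infty} \frac{1}{n^2}$$

is, as in a previous example, a convergent series.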

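Example The series

$$\sum_{n=1}^{\infty} \frac{n + 1}{n^2}$$

diverges since

$$\frac{n + 1}{n^2} = \left(\frac{n + 1}{n}\right) \frac{1}{n} > \frac{1}{n}$$

for $n = 1, 2, 3, \ldots$, and

$$\sum_{n=1}^{\infty} \frac{1}{n},$$

the harmonic series, is a divergent series.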

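Example The series

$$\sum_{n=1}^{\infty} \frac{1}{n 3^n}$$

converges since

$$\frac{1}{n 3^n} \le \frac{1}{3^n}$$

for $n = 1, 2, 3, \ldots$, and

$$\sum_{n=1}^{\infty} \frac{1}{3^n}$$

is a convergent series (namely, a geometric series with ratio $\frac{1}{3}$).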

    We know that if a series $\sum_{n=1}^{\infty} a_n$ converges, then $\lim_{n \to \infty} a_n = 0$; however, we have
seen numerous examples, such as the harmonic series, which show the latter condition is
not sufficient to guarantee convergence. To ensure that a series with $n$th term $a_n \ge 0$
satisfying $\lim_{n \to \infty} a_n = 0$ converges, we need additional information about the rate at which
the terms are approaching 0; namely, we need to know that $a_n$ approaches 0 fast enough to
guarantee that the sequence of partial sums, although increasing, is nevertheless bounded.


﻿



The problem lies in determining how to measure rates of convergence to 0 and how to decide
what rates are sufficient for convergence. We have already seen that the comparison test,
by comparing $a_n$ with the terms of a series of known behavior, provides one way to measure
whether or not $a_n$ is approaching 0 fast enough for the sum to converge. However, finding
a series to use for the comparison can be somewhat tricky, even when the behavior of the
series is relatively obvious. For example, the series

$$\sum_{n=2}^{\infty} \frac{1}{n^2 - 1}$$

should converge, since for large values of $n$ there is very little difference between

$$\frac{1}{n^2 - 1}$$

and

$$\frac{1}{n^2}.$$

But a direct comparison will not work, since

$$\frac{1}{n^2 - 1} > \frac{1}{n^2}$$

for $n = 2, 3, 4, \ldots$. Hence it would be useful to have a method for describing the rate at
which a sequence $\{a_n\}$ converges to 0 and a test which exploits this description. We will
supply the former with the next definition, a version of the "O" and "o" notation adapted
for sequences, and the latter with the limit comparison test.

Definition If $\{a_n\}$ and $\{b_n\}$ are sequences with

$$\lim_{n \to \infty} \frac{b_n}{a_n} = 0, \tag{5.4.6}$$

then we say $b_n$ is $o(a_n)$. If $\{a_n\}$ and $\{b_n\}$ are sequences and there exists an integer $N$ and
a constant $M$ such that

$$\left| \frac{b_n}{a_n} \right| \le M \tag{5.4.7}$$

for all $n > N$, then we say $b_n$ is $O(a_n)$.

    Similar to our earlier results, if

$$\lim_{n \to \infty} \frac{b_n}{a_n}$$

exists, then $b_n$ is $O(a_n)$ (see Problem 5). Analogous to our earlier use of this notation, if

$$\lim_{n \to \infty} a_n = 0,$$

$$\lim_{n \to \infty} b_n = 0,$$

and $b_n$ is $o(a_n)$, then $b_n$ is approaching 0 faster than $a_n$ as $n \to \infty$; if

$$\lim_{n \to \infty} a_n = 0,$$

$$\lim_{n \to \infty} b_n = 0,$$

and $b_n$ is $O(a_n)$, then $b_n$ is approaching 0 at least as fast as $a_n$ as $n \to \infty$. Of course, if
$b_n$ is $o(a_n)$, then $b_n$ is also $O(a_n)$.


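Example $\frac{1}{3n^2 + 4}$ is $O\left(\frac{1}{n^2}\right)$ since

$$\lim_{n \to \infty} \frac{\frac{1}{3n^2 + 4}}{\frac{1}{n^2}} = \lim_{n \to \infty} \frac{n^2}{3n^2 + 4} = \lim_{n \to \infty} \frac{1}{3 + \frac{4}{n^2}} = \frac{1}{3}.$$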


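Example $\frac{1}{n^4 + 6}$ is $o\left(\frac{1}{n^3}\right)$ since

$$\lim_{n \to \infty} \frac{\frac{1}{n^4 + 6}}{\frac{1}{n^3}} = \lim_{n \to \infty} \frac{n^3}{n^4 + 6} = \lim_{n \to \infty} \frac{\frac{1}{n}}{1 + \frac{6}{n^4}} = 0.$$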
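
The two limits in these examples can be sampled numerically. The following sketch is illustrative only; the helper name `ratio` is an assumption of this example. It evaluates $b_n / a_n$ at a large index for each pair of sequences.

```python
def ratio(b, a, n):
    """Return b(n) / a(n), the quantity whose behavior defines O and o."""
    return b(n) / a(n)


def b_big_o(n):
    return 1.0 / (3 * n**2 + 4)


def a_big_o(n):
    return 1.0 / n**2


def b_little_o(n):
    return 1.0 / (n**4 + 6)


def a_little_o(n):
    return 1.0 / n**3


if __name__ == "__main__":
    n = 10**4
    # First ratio settles near 1/3 (bounded, so big-O); second shrinks toward 0 (little-o).
    print(ratio(b_big_o, a_big_o, n), ratio(b_little_o, a_little_o, n))
```

A bounded ratio is all that $O$ requires, while $o$ requires the ratio to tend to 0; sampling one large $n$ suggests, but does not prove, either property.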


    Now consider two series, $\sum_{n=1}^{\infty} a_n$ and $\sum_{n=1}^{\infty} b_n$, where $a_n > 0$ and $b_n \ge 0$ for all $n$, $b_n$
is $O(a_n)$, and $\sum_{n=1}^{\infty} a_n$ converges. Then there is an integer $N$ and a constant $M$ such that

$$\frac{b_n}{a_n} \le M \tag{5.4.8}$$

for all $n > N$. Hence

$$b_n \le M a_n \tag{5.4.9}$$

for all $n > N$, so the series

$$\sum_{n=N+1}^{\infty} b_n$$

converges by comparison with the convergent series

$$\sum_{n=N+1}^{\infty} M a_n.$$

Thus $\sum_{n=1}^{\infty} b_n$ is also a convergent series.


﻿




    Next consider two series, $\sum_{n=1}^{\infty} a_n$ and $\sum_{n=1}^{\infty} b_n$, where $a_n \ge 0$ and $b_n > 0$ for all $n$,
$a_n$ is $O(b_n)$, and $\sum_{n=1}^{\infty} a_n$ diverges. Then there exists an integer $N$ and a constant $M > 0$
such that

$$\frac{a_n}{b_n} \le M \tag{5.4.10}$$

for all $n > N$. Hence

$$b_n \ge \frac{a_n}{M} \tag{5.4.11}$$

for all $n > N$, so the series

$$\sum_{n=N+1}^{\infty} b_n$$

diverges by comparison with the divergent series

$$\sum_{n=N+1}^{\infty} \frac{a_n}{M}.$$

Thus $\sum_{n=1}^{\infty} b_n$ is also a divergent series.
    The preceding results are summarized in the limit comparison test.

Limit Comparison Test        Suppose $a_n > 0$ for $n = 1, 2, 3, \ldots$ and $\sum_{n=1}^{\infty} a_n$ converges.
If $b_n \ge 0$ for $n = 1, 2, 3, \ldots$ and $b_n$ is $O(a_n)$, then $\sum_{n=1}^{\infty} b_n$ also converges. Suppose $a_n \ge 0$
for $n = 1, 2, 3, \ldots$ and $\sum_{n=1}^{\infty} a_n$ diverges. If $b_n > 0$ for $n = 1, 2, 3, \ldots$ and $a_n$ is $O(b_n)$, then
$\sum_{n=1}^{\infty} b_n$ also diverges.

    In other words, if the terms of the series $\sum_{n=1}^{\infty} b_n$ are nonnegative and approach 0
at least as fast as the terms of a convergent series with positive terms, then $\sum_{n=1}^{\infty} b_n$
converges; if the terms of a divergent series are nonnegative and approach 0 at least as fast
as the terms of the series $\sum_{n=1}^{\infty} b_n$, which are positive, then $\sum_{n=1}^{\infty} b_n$ diverges.
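Example Consider the series

$$\sum_{n=2}^{\infty} \frac{1}{n^2 - 1}.$$

As mentioned above, we would expect this series to behave very much like the convergent
series

$$\sum_{n=2}^{\infty} \frac{1}{n^2}.$$

Now

$$\lim_{n \to \infty} \frac{\frac{1}{n^2 - 1}}{\frac{1}{n^2}} = \lim_{n \to \infty} \frac{n^2}{n^2 - 1} = \lim_{n \to \infty} \frac{1}{1 - \frac{1}{n^2}} = 1,$$

so $\frac{1}{n^2 - 1}$ is $O\left(\frac{1}{n^2}\right)$. Hence

$$\sum_{n=2}^{\infty} \frac{1}{n^2 - 1}$$

converges by the limit comparison test.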


﻿




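Example Consider the series

$$\sum_{n=1}^{\infty} \frac{2n^2}{n^3 + 2}.$$

Since the $n$th term of this series is a rational function of $n$ with a numerator of degree
2 and a denominator of degree 3, we might expect this series to behave much like the
divergent series

$$\sum_{n=1}^{\infty} \frac{1}{n}.$$

Now

$$\lim_{n \to \infty} \frac{\frac{1}{n}}{\frac{2n^2}{n^3 + 2}} = \lim_{n \to \infty} \frac{n^3 + 2}{2n^3} = \lim_{n \to \infty} \left( \frac{1}{2} + \frac{1}{n^3} \right) = \frac{1}{2},$$

so $\frac{1}{n}$ is $O\left(\frac{2n^2}{n^3 + 2}\right)$. Hence

$$\sum_{n=1}^{\infty} \frac{2n^2}{n^3 + 2}$$

diverges by the limit comparison test.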

Problems

1. For each of the following infinite series, decide whether the series converges or diverges
    and explain your answer.


(a) Z
    n=1
    00     2
(c) Z     f-2
    n=3   n-2


     00
(b) Zft3+2


(d) 1


        cos2 (n)
           f4
    n=1
    00

(h)     n+ 2
    1


  oo1
(e) Zr--"
    n=1

(g)(


ft 3


2. For each of the following infinite series, decide whether the series converges or diverges
   and explain your answer.


     00     1
(a) Z
      n2~ sin2 ()
    (c 3nL-1
(C) n~   n+


       °On2+ 1
(b)
  (b)   n3 + 2n
      n-1
(d) 0
         n21
    n=1


﻿




              -n
(e) " (
            n
    n1i
    003n5 + 1
(g)   13n7-2
    n=23n


      2n
n=1
   °°n +1

n=2


3. (a) Give an example of a convergent series $\sum_{n=1}^{\infty} a_n$ and a divergent series $\sum_{n=1}^{\infty} b_n$
      with the property that $b_n \le a_n$ for all $n$.
   (b) Give an example of a divergent series $\sum_{n=1}^{\infty} a_n$ and a convergent series $\sum_{n=1}^{\infty} b_n$
      with the property that $a_n \le b_n$ for all $n$.
   (c) Comment on why the comparison test does not apply to the series in (a) and (b).

4. Explain why

                                        4x5 - 2
   converges.

5. Show that if

$$\lim_{n \to \infty} \frac{b_n}{a_n}$$

   exists, then $b_n$ is $O(a_n)$.


﻿


Section 5.5


                to                    Infinite Series:
       Differential Equations         The Ratio Test


In the last section we saw that we could demonstrate the convergence of a series $\sum_{n=1}^{\infty} a_n$,
where $a_n > 0$ for all $n$, by showing that $a_n$ approaches 0 as $n \to \infty$ as fast as the terms
of another series with nonnegative terms which is already known to converge. Both of the
techniques developed in Section 5.4, the comparison test and the limit comparison test,
proved to be very useful; however, they both suffer from the drawback of requiring that we
first find a series of known behavior which allows for the proper comparison with the series
under consideration. In this section we shall consider another test for convergence, the ratio
test, which determines whether or not the terms of a series are approaching 0 at a rate
sufficient for the series to converge without reference to any other series. Although this test
does not require knowledge of any other series, it has the limitation of being inconclusive
in certain circumstances. Unfortunately, there is no single test for convergence which is
useful under all conditions.
    The ratio test determines if the terms of a given series are approaching 0 at a rate
sufficient for convergence by considering the ratio between successive terms of the series.
Specifically, suppose $a_n > 0$ for $n = 1, 2, 3, \ldots$ and

$$\lim_{n \to \infty} \frac{a_{n+1}}{a_n} = \rho. \tag{5.5.1}$$

If $\rho < 1$, then there is an integer $N$ and a number $r$ with $\rho < r < 1$ such that

$$\frac{a_{n+1}}{a_n} < r \tag{5.5.2}$$

for all $n > N$. Then

$$a_{n+1} < r a_n \tag{5.5.3}$$

for all $n > N$, so

$$a_{N+2} < r a_{N+1},$$
$$a_{N+3} < r a_{N+2} < r^2 a_{N+1},$$
$$a_{N+4} < r a_{N+3} < r^3 a_{N+1},$$

and, in general, for any integer $m \ge 2$,

$$a_{N+m} < r^{m-1} a_{N+1}. \tag{5.5.4}$$

That is,

$$\frac{a_{N+m}}{r^{m-1}} < a_{N+1} \tag{5.5.5}$$



for $m = 2, 3, 4, \ldots$. Letting $n = N + m$, in which case $m = n - N$, we have

$$\frac{a_n}{r^{n-N-1}} < a_{N+1} \tag{5.5.6}$$

for all $n > N + 1$. Thus $a_n$ is $O(r^{n-N-1})$. Moreover,

$$\sum_{n=1}^{\infty} r^{n-N-1} = r^{-N} \sum_{n=1}^{\infty} r^{n-1} \tag{5.5.7}$$

converges since $\sum_{n=1}^{\infty} r^{n-1}$ is a geometric series and $0 < r < 1$. Thus $\sum_{n=1}^{\infty} a_n$ converges
by the limit comparison test.
    Now suppose $\rho > 1$, in which we include the possibility that $\rho = \infty$. Then there is an
integer $N$ such that

$$\frac{a_{n+1}}{a_n} > 1 \tag{5.5.8}$$

for all $n > N$. Hence $a_{n+1} > a_n$ for all $n > N$, and so

$$a_{N+1} < a_{N+2} < a_{N+3} < a_{N+4} < \cdots. \tag{5.5.9}$$

In particular, $a_n > a_{N+1}$ for $n = N + 2, N + 3, N + 4, \ldots$. It follows that either $\lim_{n \to \infty} a_n$
does not exist or

$$\lim_{n \to \infty} a_n > a_{N+1} > 0. \tag{5.5.10}$$

Thus $\sum_{n=1}^{\infty} a_n$ diverges by the $n$th term test for divergence.
    We now summarize the above results.

Ratio Test Suppose $a_n > 0$ for $n = 1, 2, 3, \ldots$ and

$$\rho = \lim_{n \to \infty} \frac{a_{n+1}}{a_n}. \tag{5.5.11}$$

Then $\sum_{n=1}^{\infty} a_n$ converges if $\rho < 1$ and diverges if $\rho > 1$.

    The examples below will show that the ratio test is inconclusive if $\rho = 1$. Namely,
the third example considers a divergent series for which $\rho = 1$ and the fourth example
considers a convergent series for which $\rho = 1$. Hence some other test will be necessary to
determine the behavior of any series for which the ratio test yields $\rho = 1$.

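Example For the series

$$\sum_{n=1}^{\infty} \frac{n}{3^n},$$

if we let

$$a_n = \frac{n}{3^n},$$

$n = 1, 2, 3, \ldots$, then

$$\rho = \lim_{n \to \infty} \frac{a_{n+1}}{a_n} = \lim_{n \to \infty} \frac{\frac{n+1}{3^{n+1}}}{\frac{n}{3^n}} = \lim_{n \to \infty} \left( \frac{n+1}{n} \right) \left( \frac{3^n}{3^{n+1}} \right) = \frac{1}{3} \lim_{n \to \infty} \left( 1 + \frac{1}{n} \right) = \frac{1}{3}.$$

Thus $\rho < 1$ and the series converges.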

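Example For the series

$$\sum_{n=1}^{\infty} \frac{5^n}{n + 2},$$

if we let

$$a_n = \frac{5^n}{n + 2},$$

$n = 1, 2, 3, \ldots$, then

$$\rho = \lim_{n \to \infty} \frac{a_{n+1}}{a_n} = \lim_{n \to \infty} \frac{\frac{5^{n+1}}{n+3}}{\frac{5^n}{n+2}} = \lim_{n \to \infty} \left( \frac{n+2}{n+3} \right) \left( \frac{5^{n+1}}{5^n} \right) = 5 \lim_{n \to \infty} \frac{1 + \frac{2}{n}}{1 + \frac{3}{n}} = 5.$$

Thus $\rho > 1$ and the series diverges.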

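Example For the harmonic series

$$\sum_{n=1}^{\infty} \frac{1}{n},$$

the ratio test yields

$$\rho = \lim_{n \to \infty} \frac{\frac{1}{n+1}}{\frac{1}{n}} = \lim_{n \to \infty} \frac{n}{n+1} = \lim_{n \to \infty} \frac{1}{1 + \frac{1}{n}} = 1.$$

This shows that it is possible for a series to diverge when $\rho = 1$.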


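Example For the convergent p-series

$$\sum_{n=1}^{\infty} \frac{1}{n^2},$$

the ratio test yields

$$\rho = \lim_{n \to \infty} \frac{\frac{1}{(n+1)^2}}{\frac{1}{n^2}} = \lim_{n \to \infty} \frac{n^2}{(n+1)^2} = \lim_{n \to \infty} \left( \frac{n}{n+1} \right)^2 = \lim_{n \to \infty} \left( \frac{1}{1 + \frac{1}{n}} \right)^2 = 1.$$

This shows that it is possible for a series to converge when $\rho = 1$.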


﻿




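Example For the series

$$\sum_{n=1}^{\infty} \frac{3^n}{n!},$$

if we let

$$a_n = \frac{3^n}{n!},$$

$n = 1, 2, 3, \ldots$, then

$$\rho = \lim_{n \to \infty} \frac{a_{n+1}}{a_n} = \lim_{n \to \infty} \frac{\frac{3^{n+1}}{(n+1)!}}{\frac{3^n}{n!}} = \lim_{n \to \infty} \left( \frac{n!}{(n+1)!} \right) \left( \frac{3^{n+1}}{3^n} \right) = \lim_{n \to \infty} \frac{3}{n+1} = 0.$$

Thus $\rho < 1$ and the series converges.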
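
The ratios of successive terms in the five examples above can be tabulated directly. The sketch below is a numerical illustration rather than a convergence proof; the names `successive_ratio` and `examples` are assumptions of this example, and exact rational arithmetic is used so that large powers and factorials do not overflow.

```python
from fractions import Fraction
from math import factorial


def successive_ratio(a, n):
    """Return a(n+1) / a(n) as an exact Fraction."""
    return a(n + 1) / a(n)


# Term formulas for the series discussed in the examples of this section.
examples = {
    "n/3^n": lambda n: Fraction(n, 3**n),
    "5^n/(n+2)": lambda n: Fraction(5**n, n + 2),
    "1/n": lambda n: Fraction(1, n),
    "1/n^2": lambda n: Fraction(1, n**2),
    "3^n/n!": lambda n: Fraction(3**n, factorial(n)),
}

if __name__ == "__main__":
    for name, a in examples.items():
        print(name, [float(successive_ratio(a, n)) for n in (10, 100, 1000)])
```

For the harmonic series and the p-series the sampled ratios both drift toward 1, which is exactly the situation in which the ratio test gives no information.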

Problems

1. For each of the following infinite series, decide whether the series converges or diverges
    and explain your answer.


     00
(a)     2


(c) Z   1
    n=1

    00 2
(e) Z~
  e)00n2
    n=1i

(g)
    n=1i


        21
(b) n
    n1
    00   71+2
(d)


   00f1
   n=1


2. For each of the following infinite series, decide whether the series converges or diverges
   and explain your answer.


     O     1
(a) Z    /4n- 2
    n2i1
    > ° 3n+5
(c) (2n)!

      °On!n!
(e)     (2n)!


(g)     3n+ 1
    (   2n - 1
    n=1i


(b)Z5
    n=1

  (d) 00 (2n)!
(d) n~n!
    n1


3n + 2
  72n


    n=1

(h)Z
    1


3
5n


3n
5n


﻿


3. Define


        0O 2n
f(x) =Zx
            n
       n=1


   (a) Find the domain of f. That is, find all values of x for which the series

                                            o° 2n

                                                n
                                           n=1
       converges.
   (b) Plot an approximation to the graph of f on the domain found in (a) using

                                               100 2n
                                       f(x)-Z
                                                   n
                                              n=1

4. Define

$$g(t) = \sum_{n=0}^{\infty} \frac{t^{2n}}{n!}.$$

   (a) Find the domain of $g$. That is, find all values of $t$ for which the series

$$\sum_{n=0}^{\infty} \frac{t^{2n}}{n!}$$

       converges.
   (b) Plot an approximation to the graph of $g$ on the interval $[-2, 2]$ using

$$g(t) \approx \sum_{n=0}^{50} \frac{t^{2n}}{n!}.$$

5. Suppose the terms of the series $\sum_{n=1}^{\infty} a_n$ satisfy the difference equation

$$a_{n+1} = \frac{(n + 1) a_n}{2n}$$

   with $a_1 = 10$. Does this series converge or diverge? Explain.

6. Suppose, for $n = 1, 2, 3, \ldots$, $a_n \ge 0$ and

$$\alpha = \lim_{n \to \infty} \sqrt[n]{a_n}.$$

   Show that $\sum_{n=1}^{\infty} a_n$ converges if $\alpha < 1$ and diverges if $\alpha > 1$. This result is known as
   the root test.


﻿


Section 5.6


               to                    Infinite Series:
       Differential Equations        Absolute Convergence


At this point we have limited our study of series primarily to those series having nonnega-
tive terms, the only exceptions being some geometric series and series which are multiples
of series with nonnegative terms. In this section we shall consider the more general question
of series with negative as well as positive terms.
    An important consideration when looking at the behavior of an arbitrary series

$$\sum_{n=1}^{\infty} a_n \tag{5.6.1}$$

is the behavior of the related series

$$\sum_{n=1}^{\infty} |a_n|. \tag{5.6.2}$$

Of course, if all the terms of (5.6.1) are nonnegative, then (5.6.1) and (5.6.2) are the same
series. In any case, (5.6.2) has all nonnegative terms, so we may use our results of the
last three sections to help determine whether or not it converges. Suppose that, by one
method or another, we have shown that (5.6.2) converges. Then, since

$$0 \le a_n + |a_n| \le 2|a_n| \tag{5.6.3}$$

for any $n$, we know, by the comparison test, that the series

$$\sum_{n=1}^{\infty} (a_n + |a_n|) \tag{5.6.4}$$

converges. Hence

$$\sum_{n=1}^{\infty} a_n = \sum_{n=1}^{\infty} (a_n + |a_n|) - \sum_{n=1}^{\infty} |a_n| \tag{5.6.5}$$

converges since it is the difference of two convergent series. That is, the convergence of
(5.6.2) implies the convergence of (5.6.1).

Proposition    If $\sum_{n=1}^{\infty} |a_n|$ converges, then $\sum_{n=1}^{\infty} a_n$ converges.

Definition   The series $\sum_{n=1}^{\infty} a_n$ is said to converge absolutely if the series $\sum_{n=1}^{\infty} |a_n|$
converges.
    With this terminology, the previous proposition says that any series which converges
absolutely also converges. We shall see later that the converse of this statement does not
hold; namely, there are series which converge, but do not converge absolutely.



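Example The series

$$\sum_{n=1}^{\infty} \frac{(-1)^n}{n^2} = -1 + \frac{1}{4} - \frac{1}{9} + \frac{1}{16} - \frac{1}{25} + \cdots$$

converges absolutely since the series

$$\sum_{n=1}^{\infty} \left| \frac{(-1)^n}{n^2} \right| = \sum_{n=1}^{\infty} \frac{1}{n^2}$$

converges. In particular, it follows that

$$\sum_{n=1}^{\infty} \frac{(-1)^n}{n^2}$$

converges.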

Example The series

$$\sum_{n=1}^{\infty} \frac{(-1)^{n+1}}{n} = 1 - \frac{1}{2} + \frac{1}{3} - \frac{1}{4} + \frac{1}{5} - \frac{1}{6} + \cdots,$$

known as the alternating harmonic series, is not absolutely convergent since

$$\sum_{n=1}^{\infty} \left| \frac{(-1)^{n+1}}{n} \right| = \sum_{n=1}^{\infty} \frac{1}{n}$$

is the harmonic series, which diverges. Hence the previous proposition does not provide any
information on the behavior of the alternating harmonic series itself. We shall see below
that in fact the alternating harmonic series converges even though it is not absolutely
convergent.
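
The contrast between the two series in this example shows up clearly in their partial sums. The sketch below is a numerical illustration only, with function names chosen for this example. It computes partial sums of the harmonic series, which keep growing, and of the alternating harmonic series, which stay between $s_2 = 1/2$ and $s_1 = 1$.

```python
def harmonic_partial_sum(N):
    """Nth partial sum of the harmonic series 1 + 1/2 + ... + 1/N."""
    return sum(1.0 / n for n in range(1, N + 1))


def alternating_partial_sum(N):
    """Nth partial sum of the alternating harmonic series 1 - 1/2 + 1/3 - ..."""
    return sum((-1) ** (n + 1) / n for n in range(1, N + 1))


if __name__ == "__main__":
    for N in (10, 100, 10000):
        print(N, harmonic_partial_sum(N), alternating_partial_sum(N))
```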

    In general, determining whether a series which is not absolutely convergent is conver-
gent or divergent is a difficult problem. However, there is one particular type of series for
which we have, under certain conditions, a simple test. These series are the alternating
series, the series which, like those in the previous examples, alternate in sign from one
term to the next.

Definition A series in which the terms are alternately positive and negative is called an
alternating series.

    Now suppose $\sum_{n=1}^{\infty} a_n$ is an alternating series which satisfies the following two conditions:

                           (1) $|a_{n+1}| \le |a_n|$ for $n = 1, 2, 3, \ldots$,
                           (2) $\lim_{n \to \infty} |a_n| = 0$.


﻿




For the sake of the discussion we will assume that $a_1 > 0$, although that will not affect our
conclusion. If $s_n$ is the $n$th partial sum of this series, then

$$s_1 = a_1, \tag{5.6.6}$$

and, since $a_2 < 0$,

$$s_2 = a_1 + a_2 = s_1 + a_2 < s_1. \tag{5.6.7}$$

Next, since $a_3 > 0$,

$$s_3 = a_1 + a_2 + a_3 = s_2 + a_3 > s_2. \tag{5.6.8}$$

Moreover, condition (1) implies $a_2 + a_3 \le 0$, from which it follows that

$$s_3 = a_1 + a_2 + a_3 = s_1 + a_2 + a_3 \le s_1. \tag{5.6.9}$$

Thus we have $s_2 < s_3 \le s_1$. Next,

$$s_4 = s_3 + a_4 < s_3 \tag{5.6.10}$$

since $a_4 < 0$ and

$$s_4 = s_2 + a_3 + a_4 \ge s_2 \tag{5.6.11}$$

since $a_3 + a_4 \ge 0$. Thus $s_2 \le s_4 < s_3 \le s_1$. For the next step,

$$s_5 = s_4 + a_5 > s_4 \tag{5.6.12}$$

since $a_5 > 0$ and

$$s_5 = s_3 + a_4 + a_5 \le s_3 \tag{5.6.13}$$

since $a_4 + a_5 \le 0$. Thus $s_2 \le s_4 < s_5 \le s_3 \le s_1$. Continuing in this way, we see that

$$s_2 \le s_4 \le s_6 < s_5 \le s_3 \le s_1 \tag{5.6.14}$$

and

$$s_2 \le s_4 \le s_6 < s_7 \le s_5 \le s_3 \le s_1. \tag{5.6.15}$$

In general, for any positive integer $n$,

$$s_2 \le s_4 \le \cdots \le s_{2n} < s_{2n-1} \le \cdots \le s_5 \le s_3 \le s_1. \tag{5.6.16}$$

That is, for $n = 1, 2, 3, \ldots$, $\{s_{2n}\}$ is a bounded increasing sequence and $\{s_{2n-1}\}$ is a
bounded decreasing sequence. Thus both sequences have limits, say

$$\lim_{n \to \infty} s_{2n} = L \tag{5.6.17}$$

and

$$\lim_{n \to \infty} s_{2n-1} = M. \tag{5.6.18}$$


﻿




But then

$$L - M = \lim_{n \to \infty} s_{2n} - \lim_{n \to \infty} s_{2n-1} = \lim_{n \to \infty} (s_{2n} - s_{2n-1}) = \lim_{n \to \infty} a_{2n} = 0,$$

where the final equality follows from condition (2). Hence $L = M$, so

$$\lim_{n \to \infty} s_n = L. \tag{5.6.19}$$

In other words, $\sum_{n=1}^{\infty} a_n$ converges. This conclusion, known as Leibniz's theorem, gives a
simple criterion for determining the convergence of some alternating series.

Leibniz's theorem    Suppose $\sum_{n=1}^{\infty} a_n$ is an alternating series for which $|a_{n+1}| \le |a_n|$
for $n = 1, 2, 3, \ldots$. If

$$\lim_{n \to \infty} |a_n| = 0, \tag{5.6.20}$$

then $\sum_{n=1}^{\infty} a_n$ converges.

Example The alternating harmonic series

$$\sum_{n=1}^{\infty} \frac{(-1)^{n+1}}{n}$$

satisfies the conditions of Leibniz's theorem: If we let

$$a_n = \frac{(-1)^{n+1}}{n},$$

$n = 1, 2, 3, \ldots$, then

$$|a_{n+1}| = \frac{1}{n + 1} < \frac{1}{n} = |a_n|$$

and

$$\lim_{n \to \infty} |a_n| = \lim_{n \to \infty} \frac{1}{n} = 0.$$

Thus, as we claimed earlier, the alternating harmonic series converges.

Definition A series which converges but does not converge absolutely is said to converge
conditionally.

    The previous example shows that the alternating harmonic series is an example of a
series which converges conditionally.
    From the discussion prior to Leibniz's theorem, we see that if $\sum_{n=1}^{\infty} a_n$ satisfies the
conditions of Leibniz's theorem, $a_1 > 0$, $s_n$ is its $n$th partial sum, and

$$s = \sum_{n=1}^{\infty} a_n, \tag{5.6.21}$$

then we must have

$$s_2 \le s_4 \le s_6 \le \cdots \le s \le \cdots \le s_5 \le s_3 \le s_1. \tag{5.6.22}$$

Note that for any positive integer $n$ we have $s_{n+1} \le s \le s_n$ if $n$ is odd and $s_n \le s \le s_{n+1}$
if $n$ is even. Thus, in either case,

$$|s - s_n| \le |s_{n+1} - s_n| = |a_{n+1}|, \tag{5.6.23}$$

a result which also holds if $a_1 < 0$.

Proposition    Suppose $\sum_{n=1}^{\infty} a_n$ is a convergent alternating series for which $|a_{n+1}| \le |a_n|$
for $n = 1, 2, 3, \ldots$. If

$$s = \sum_{n=1}^{\infty} a_n \tag{5.6.24}$$

and

$$s_n = \sum_{j=1}^{n} a_j, \tag{5.6.25}$$

then, for any $n = 1, 2, 3, \ldots$,

$$|s - s_n| \le |a_{n+1}|. \tag{5.6.26}$$

    Hence for those alternating series which satisfy the conditions of the proposition, the
error committed in approximating the sum of the series by a particular partial sum is no
greater in absolute value than the absolute value of the next term in the series.

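Example For the alternating harmonic series, if

    $$ s = \sum_{n=1}^{\infty} \frac{(-1)^{n+1}}{n} $$

and

    $$ s_n = 1 - \frac{1}{2} + \frac{1}{3} - \frac{1}{4} + \cdots + \frac{(-1)^{n+1}}{n} = \sum_{j=1}^{n} \frac{(-1)^{j+1}}{j}, $$

then

    $$ |s - s_n| < \frac{1}{n+1} $$

for n = 1, 2, 3, .... For example,

    $$ s_{100} = 1 - \frac{1}{2} + \frac{1}{3} - \frac{1}{4} + \cdots + \frac{1}{99} - \frac{1}{100} = 0.688172, $$

so

    $$ |s - s_{100}| < \frac{1}{101} = 0.009901, $$
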
where both results have been rounded to 6 decimal places. In other words, the sum of the
alternating harmonic series differs from 0.688172 by less than 0.009901. In fact, since the
next term in the series is positive, we know that s must lie between 0.688172 and

                            0.688172 + 0.009901 = 0.698073.

We will see in Section 6.2 that the sum of the alternating harmonic series is exactly the
natural logarithm of 2, which, to 6 decimal places, is 0.693147.
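
The numbers in this example can be checked with a short computation. The following is a minimal Python sketch, not part of the original development; the function name `alternating_harmonic_partial_sum` is our own, and the comparison with `math.log(2)` assumes the value of the sum quoted above.

```python
import math

def alternating_harmonic_partial_sum(n: int) -> float:
    """Return s_n = sum_{j=1}^{n} (-1)^(j+1) / j."""
    return sum((-1) ** (j + 1) / j for j in range(1, n + 1))

n = 100
s_n = alternating_harmonic_partial_sum(n)
bound = 1 / (n + 1)                     # |a_{n+1}|, the bound in (5.6.26)
actual_error = abs(math.log(2) - s_n)   # assumes s = ln 2, as stated above

print(f"s_{n} = {s_n:.6f}")
print(f"error bound = {bound:.6f}")
print(f"actual error = {actual_error:.6f}")
```

The actual error comes out at roughly half the bound, which is consistent with s lying between two consecutive partial sums.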

Problems

1. For each of the following infinite series, answer the questions: Does the series converge
    absolutely? Does the series converge conditionally? Does the series converge?


(a) 3 I~-
           f3


(c)     3n2 - 1
    n i


      (10 1) n

    n12


(d)       n!
       nt!


   (e) 5                                       (f)    (-j1)"7r
       n-i   n +3                                 n-3

     (8)     24   * n!(h)                          0    _3    -
       n=                                         n=13
2. For each of the following infinite series, answer the questions: Does the series converge
   absolutely? Does the series converge conditionally? Does the series converge?


           (-1)n(+1(2 + 1)
   (a)  3      3n5 - 2
       n i
       00 32n
   (c)     (2n)!

           (-1)2r2n
   (e)       (n)

         n= 0)   (- 1)"(n  + 1)

      (E) 2n - 1
      n=1
3. (a) Approximate


       using


     (b)       53
         15n
         00 ( i)Tn22nh+1
         d n   (2n + 1)!

         00 (-2)n
      (f)


 S/n +
     ()       2n2
         n=2


     n~


sis =n!
   n=0


﻿



    (b) Find an upper bound for the error in approximating s by s15.
    (c) Find the smallest n such that the absolute value of the error in approximating s
       by


                                            j=0
       is less than 0.000001. What is this approximation?
 4. (a) Approximate

            $$ s = \sum_{n=1}^{\infty} \frac{(-1)^{n+1}}{n 2^n} $$

        by

            $$ s_{50} = \sum_{n=1}^{50} \frac{(-1)^{n+1}}{n 2^n}. $$

    (b) Find an upper bound for the absolute value of the error in approximating s by $s_{50}$.
    (c) Find the smallest n such that the absolute value of the error in approximating s
        by

            $$ s_n = \sum_{j=1}^{n} \frac{(-1)^{j+1}}{j 2^j} $$

        is less than 0.0001. What is this approximation?
 5. In our development of Leibniz's theorem, we assumed that $a_1 > 0$. Discuss the changes
    which must be made in the discussion if $a_1 < 0$.


﻿


       Difference Equations           Section 5.7
                to
       Differential Equations         Power Series


We are now in a position to pick up the story we left off in Section 5.2: the extension of
Taylor polynomials to Taylor series. We shall see that a Taylor series is a type of infinite
series whose nth partial sum is a Taylor polynomial. Such series are examples of power
series, objects that we will study in this section before considering Taylor series in Section
5.8.
Definition An infinite series of the form

    $$ \sum_{n=0}^{\infty} a_n (x - c)^n = a_0 + a_1(x - c) + a_2(x - c)^2 + \cdots $$      (5.7.1)

is called a power series in x about c.
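Example The infinite series

    $$ \sum_{n=0}^{\infty} \frac{x^n}{n!} = 1 + x + \frac{x^2}{2!} + \frac{x^3}{3!} + \cdots $$

is a power series in x about 0. Note that if we let

    $$ b_n = \left| \frac{x^n}{n!} \right| = \frac{|x|^n}{n!} $$

for n = 0, 1, 2, ..., then

    $$ \lim_{n \to \infty} \frac{b_{n+1}}{b_n} = \lim_{n \to \infty} \frac{\dfrac{|x|^{n+1}}{(n+1)!}}{\dfrac{|x|^n}{n!}} = \lim_{n \to \infty} \frac{|x|}{n+1} = 0 $$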


for any value of x. That is, by the ratio test, the series is absolutely convergent, and
hence convergent, for any value of x. Thus if we define a function, called the exponential
function, by


    $$ \exp(x) = \sum_{n=0}^{\infty} \frac{x^n}{n!}, $$                          (5.7.2)


then this function is defined for all values of x. We shall have much more to say about this
function, which may be thought of as the simplest "infinite" polynomial which is defined
for all real numbers, in Chapter 6.
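
As an illustration of how quickly the series in (5.7.2) settles down, here is a small Python sketch. The names `exp_partial_sum` and `term_ratio` are our own, and the comparison with the library function `math.exp` rests on the assumption that it computes the same exponential function.

```python
import math

def exp_partial_sum(x: float, n_terms: int) -> float:
    """Sum the first n_terms terms of sum_{n>=0} x^n / n!."""
    total, term = 0.0, 1.0          # term starts at x^0 / 0! = 1
    for n in range(n_terms):
        total += term
        term *= x / (n + 1)         # x^(n+1)/(n+1)! = (x^n/n!) * x/(n+1)
    return total

def term_ratio(x: float, n: int) -> float:
    """The ratio b_{n+1} / b_n = |x| / (n + 1) from the ratio test."""
    return abs(x) / (n + 1)

for x in (1.0, -3.0, 10.0):
    print(x, exp_partial_sum(x, 60), math.exp(x), term_ratio(x, 59))
```

However large |x| is, the ratio |x|/(n + 1) eventually drops below 1, which is the ratio test argument in numerical form.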


Copyright © by Dan Sloughter 2000

    Notice that the convergence of (5.7.2) for all x implies, by the nth term test for
divergence, that

    $$ \lim_{n \to \infty} \frac{x^n}{n!} = 0 $$                                 (5.7.3)


for any value of x. We have seen particular cases of this limit in the past, but this is the
first time we have had a simple proof that it is always 0.

Example      Recall that the Taylor polynomial of order 2n + 1 for sin(x) at 0 is

    $$ P_{2n+1}(x) = \sum_{k=0}^{n} \frac{(-1)^k x^{2k+1}}{(2k+1)!}. $$

Hence $P_{2n+1}(x)$ is a partial sum of the power series

    $$ \sum_{k=0}^{\infty} \frac{(-1)^k x^{2k+1}}{(2k+1)!}. $$                   (5.7.4)

If, for k = 0, 1, 2, ..., we let

    $$ b_k = \left| \frac{(-1)^k x^{2k+1}}{(2k+1)!} \right| = \frac{|x|^{2k+1}}{(2k+1)!}, $$

then

    $$ \lim_{k \to \infty} \frac{b_{k+1}}{b_k} = \lim_{k \to \infty} \frac{\dfrac{|x|^{2k+3}}{(2k+3)!}}{\dfrac{|x|^{2k+1}}{(2k+1)!}} = \lim_{k \to \infty} \frac{|x|^2}{(2k+3)(2k+2)} = 0 $$

for all values of x. Thus, by the ratio test, (5.7.4) is absolutely convergent, and hence
convergent, for all values of x. Moreover, from our work in Section 5.2, we know that

    $$ |\sin(x) - P_{2n+1}(x)| \le \frac{|x|^{2n+3}}{(2n+3)!} $$

for all values of x. Since (using (5.7.3))

    $$ \lim_{n \to \infty} \frac{|x|^{2n+3}}{(2n+3)!} = 0, $$

it follows that

    $$ \lim_{n \to \infty} |\sin(x) - P_{2n+1}(x)| = 0 $$

for all values of x. Hence

    $$ \sin(x) = \lim_{n \to \infty} P_{2n+1}(x) $$

for all values of x. That is, for any value of x,

    $$ \sin(x) = \sum_{k=0}^{\infty} \frac{(-1)^k x^{2k+1}}{(2k+1)!}. $$         (5.7.5)
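
The remainder bound used in this example can be observed numerically. The sketch below is only illustrative: `sin_taylor` and `remainder_bound` are names we have chosen, and `math.sin` is assumed to be an accurate reference value.

```python
import math

def sin_taylor(x: float, n: int) -> float:
    """P_{2n+1}(x) = sum_{k=0}^{n} (-1)^k x^(2k+1) / (2k+1)!."""
    return sum((-1) ** k * x ** (2 * k + 1) / math.factorial(2 * k + 1)
               for k in range(n + 1))

def remainder_bound(x: float, n: int) -> float:
    """The bound |x|^(2n+3) / (2n+3)! on |sin(x) - P_{2n+1}(x)|."""
    return abs(x) ** (2 * n + 3) / math.factorial(2 * n + 3)

x = 2.0
for n in range(8):
    error = abs(math.sin(x) - sin_taylor(x, n))
    print(n, error, remainder_bound(x, n))
```

For each n the observed error stays below the bound, and both shrink rapidly once 2n + 3 exceeds |x|.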


Example     The series $\sum_{n=0}^{\infty} x^n$ is a power series in x about 0. From our work on geometric
series, we know that this series will converge absolutely when -1 < x < 1 and will diverge
otherwise. In fact, we have seen that

    $$ \frac{1}{1-x} = \sum_{n=0}^{\infty} x^n $$

for -1 < x < 1.
    These examples show that some functions may be expressed as power series. Such
functions are examples of analytic functions, which we now define.

Definition If f is a function for which there exist constants $a_0, a_1, a_2, \ldots$ such that

    $$ f(x) = \sum_{n=0}^{\infty} a_n (x - c)^n $$                               (5.7.6)

for all values of x in some open interval about c, then we say f is analytic at c. If for some
h > 0 the equality (5.7.6) holds for all x in the interval I = (c - h, c + h), then we say f
is analytic on I and we call

    $$ \sum_{n=0}^{\infty} a_n (x - c)^n $$

a power series representation of f on I.
Example From the previous examples we see that

                                     f (x) = exp(x)

and
                                     g(x) = sin(x)

are analytic on $(-\infty, \infty)$ and

    $$ h(x) = \frac{1}{1-x} $$

is analytic on (-1, 1).
    Before we can work effectively with power series we need to consider their convergence
behavior. First note that the power series


    $$ \sum_{n=0}^{\infty} a_n (x - c)^n $$                                      (5.7.7)

converges at x = c since in that case all terms after the first are 0. Next, suppose the
series converges at a point x = c + r, where r > 0. That is, suppose $\sum_{n=0}^{\infty} a_n r^n$ converges.
Then, by the nth term test for divergence,

    $$ \lim_{n \to \infty} a_n r^n = 0. $$                                       (5.7.8)

In particular, there exists an integer N such that

    $$ |a_n r^n| < 1 $$                                                          (5.7.9)

for all n > N. Hence for any x and n > N,

    $$ |a_n (x - c)^n| = |a_n r^n| \left| \frac{x - c}{r} \right|^n \le \left| \frac{x - c}{r} \right|^n. $$      (5.7.10)

In particular, $|a_n (x - c)^n|$ is

    $$ O\left( \left| \frac{x - c}{r} \right|^n \right). $$

Now if c - r < x < c + r, then

    $$ \left| \frac{x - c}{r} \right| < 1 $$

and

    $$ \sum_{n=0}^{\infty} \left| \frac{x - c}{r} \right|^n $$

is a convergent geometric series. Thus, by the limit comparison test,

    $$ \sum_{n=0}^{\infty} a_n (x - c)^n $$

converges absolutely. In other words, we have shown that if (5.7.7) converges at c + r with
r > 0, then it converges absolutely for all x in (c - r, c + r). The same argument works to
show that if (5.7.7) converges at c - r with r > 0, then it converges absolutely for all x in
(c - r, c + r). Letting R be the largest real number such that (5.7.7) converges absolutely
for all x for which |x - c| < R, where we allow $R = \infty$ if (5.7.7) converges for all x, it
follows that

    $$ \sum_{n=0}^{\infty} a_n (x - c)^n $$

converges absolutely on (c - R, c + R) ($(-\infty, \infty)$ if $R = \infty$) and diverges for all x with
|x - c| > R.

Proposition For a power series

    $$ \sum_{n=0}^{\infty} a_n (x - c)^n, $$

there exists an R, with R = 0, R > 0, or $R = \infty$, such that the series converges absolutely
for all x satisfying |x - c| < R and diverges for all x satisfying |x - c| > R.

Definition With the notation of the previous proposition, the interval (c - R, c + R)
($(-\infty, \infty)$ if $R = \infty$) is called the interval of convergence and R is called the radius of
convergence of the power series.
    Note that the proposition does not say anything about the behavior of the series at
x = c - R or x = c + R. In fact, any type of behavior is possible at the endpoints of the
interval of convergence; for a given series, these points must be checked individually for
convergence. Moreover, although the proposition does not provide a method for finding
the interval of convergence of a series, the next examples illustrate that the ratio test is
very useful in this regard.
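Example Consider the power series

    $$ \sum_{n=1}^{\infty} \frac{(-1)^{n+1} x^n}{n 2^n}. $$                      (5.7.11)

If we let

    $$ b_n = \left| \frac{(-1)^{n+1} x^n}{n 2^n} \right| = \frac{|x|^n}{n 2^n}, $$

then

    $$ \lim_{n \to \infty} \frac{b_{n+1}}{b_n} = \lim_{n \to \infty} \frac{\dfrac{|x|^{n+1}}{(n+1) 2^{n+1}}}{\dfrac{|x|^n}{n 2^n}} = \lim_{n \to \infty} \frac{|x|}{2} \cdot \frac{n}{n+1} = \frac{|x|}{2}. $$

Hence, by the ratio test, (5.7.11) is absolutely convergent when

    $$ \frac{|x|}{2} < 1, $$

that is, when -2 < x < 2. Thus the radius of convergence is R = 2 and the interval of
convergence is (-2, 2). Now at x = -2, (5.7.11) becomes

    $$ \sum_{n=1}^{\infty} \frac{(-1)^{n+1} (-2)^n}{n 2^n} = \sum_{n=1}^{\infty} \frac{(-1)^{2n+1}}{n} = -\sum_{n=1}^{\infty} \frac{1}{n}, $$

which is a multiple of the harmonic series and hence divergent. At x = 2, (5.7.11) becomes

    $$ \sum_{n=1}^{\infty} \frac{(-1)^{n+1} 2^n}{n 2^n} = \sum_{n=1}^{\infty} \frac{(-1)^{n+1}}{n}, $$

which is the alternating harmonic series and hence convergent, although not absolutely
convergent. Putting this together, we see that the power series

    $$ \sum_{n=1}^{\infty} \frac{(-1)^{n+1} x^n}{n 2^n} $$
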

converges absolutely for all x in (-2,2), converges conditionally at x = 2, and diverges for
all other x.
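
The behavior found in this example can be explored numerically. The following Python sketch uses our own helper names (`coeff`, `radius_estimate`, `partial_sum`); it relies on the fact that, when the limit of $|a_n / a_{n+1}|$ exists, the ratio test gives that limit as the radius of convergence.

```python
from fractions import Fraction

def coeff(n: int) -> Fraction:
    """Coefficient a_n = (-1)^(n+1) / (n 2^n) of x^n in (5.7.11)."""
    return Fraction((-1) ** (n + 1), n * 2 ** n)

def radius_estimate(n: int) -> Fraction:
    """|a_n / a_(n+1)|, which tends to R when the limit exists."""
    return abs(coeff(n) / coeff(n + 1))

def partial_sum(x: float, n_terms: int) -> float:
    """Sum of the first n_terms terms of (5.7.11) at x."""
    return sum((-1) ** (n + 1) * (x / 2) ** n / n
               for n in range(1, n_terms + 1))

print(float(radius_estimate(1000)))   # close to 2
print(partial_sum(-2.0, 1000))        # -(1 + 1/2 + ... + 1/1000), still growing in size
print(partial_sum(2.0, 1000))         # a partial sum of the alternating harmonic series
```

Partial sums can only suggest divergence at x = -2; the comparison with the harmonic series above is what actually establishes it.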

Example In the first example of this section we used the ratio test to show that the
power series

    $$ \sum_{n=0}^{\infty} \frac{x^n}{n!} $$

converges absolutely for all values of x. Hence in this case the interval of convergence is
$(-\infty, \infty)$ and the radius of convergence is $R = \infty$.

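Example     Consider the power series $\sum_{n=1}^{\infty} n! x^n$. If we let

    $$ b_n = |n! x^n| = n! |x|^n, $$

then

    $$ \lim_{n \to \infty} \frac{b_{n+1}}{b_n} = \lim_{n \to \infty} \frac{(n+1)! |x|^{n+1}}{n! |x|^n} = \lim_{n \to \infty} (n+1)|x| = \begin{cases} 0, & \text{if } x = 0, \\ \infty, & \text{if } x \ne 0. \end{cases} $$

Hence, by the ratio test, this power series converges only when x = 0. Accordingly, the
radius of convergence is R = 0.

Example Consider the power series

    $$ \sum_{n=1}^{\infty} \frac{(x - 1)^n}{n}. $$                               (5.7.12)

If we let

    $$ b_n = \left| \frac{(x - 1)^n}{n} \right| = \frac{|x - 1|^n}{n}, $$

then

    $$ \lim_{n \to \infty} \frac{b_{n+1}}{b_n} = \lim_{n \to \infty} \frac{\dfrac{|x - 1|^{n+1}}{n+1}}{\dfrac{|x - 1|^n}{n}} = \lim_{n \to \infty} |x - 1| \frac{n}{n+1} = |x - 1|. $$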


Using the ratio test, we see that (5.7.12) converges absolutely when |x - 1| < 1. Thus the
radius of convergence is R = 1 and the interval of convergence is (0, 2). At x = 0, (5.7.12)
becomes

    $$ \sum_{n=1}^{\infty} \frac{(-1)^n}{n} = -\sum_{n=1}^{\infty} \frac{(-1)^{n+1}}{n}, $$

which is a multiple of the alternating harmonic series and so converges conditionally. At
x = 2, (5.7.12) becomes

    $$ \sum_{n=1}^{\infty} \frac{1}{n}, $$

which is the harmonic series and so diverges. Hence the power series

    $$ \sum_{n=1}^{\infty} \frac{(x - 1)^n}{n} $$

converges absolutely for x in the interval (0, 2), converges conditionally at x = 0, and
diverges for all other x.
    A power series resembles a polynomial; in fact, often it is convenient to think of a power
series as a polynomial of infinite degree. Among the many nice properties of polynomials
is the ease with which they may be differentiated and integrated. Our next result states
that power series may be differentiated and integrated term by term in the same manner
as polynomials. Although we have the tools to provide justifications for these statements,
they are technical and perhaps best left to a more advanced text.

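Differentiation and integration of power series Suppose the radius of convergence
of the power series

    $$ \sum_{n=0}^{\infty} a_n (x - c)^n $$

is R > 0 and let

    $$ f(x) = \sum_{n=0}^{\infty} a_n (x - c)^n $$

for x in (c - R, c + R). Then

    $$ f'(x) = \sum_{n=0}^{\infty} \frac{d}{dx} a_n (x - c)^n = \sum_{n=1}^{\infty} n a_n (x - c)^{n-1} $$       (5.7.13)

for all x in (c - R, c + R) and

    $$ \int_c^b f(x)\,dx = \sum_{n=0}^{\infty} \int_c^b a_n (x - c)^n\,dx = \sum_{n=0}^{\infty} \frac{a_n}{n+1} (b - c)^{n+1} $$      (5.7.14)

for all b in (c - R, c + R).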

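Example Recall that the interval of convergence of

    $$ \exp(x) = \sum_{n=0}^{\infty} \frac{x^n}{n!} $$

is $(-\infty, \infty)$. Hence

    $$ \begin{aligned}
    \frac{d}{dx} \exp(x) &= \frac{d}{dx} \sum_{n=0}^{\infty} \frac{x^n}{n!} \\
                         &= \sum_{n=0}^{\infty} \frac{d}{dx} \left( \frac{x^n}{n!} \right) \\
                         &= \sum_{n=1}^{\infty} \frac{n x^{n-1}}{n!} \\
                         &= \sum_{n=1}^{\infty} \frac{x^{n-1}}{(n-1)!} \\
                         &= \sum_{n=0}^{\infty} \frac{x^n}{n!} \\
                         &= \exp(x)
    \end{aligned} $$

for all x in $(-\infty, \infty)$. That is, the function exp(x) is its own derivative. We will have much
more to say about this interesting property of the exponential function in Chapter 6.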
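
Term-by-term differentiation is easy to carry out on a list of coefficients. The sketch below is our own illustration, with a hypothetical helper `differentiate` acting on the coefficients of a truncated series about 0; exact rational arithmetic is used so that the comparison is not blurred by rounding.

```python
from fractions import Fraction
from math import factorial

def differentiate(coeffs):
    """Termwise derivative: the new coefficient of x^n is (n + 1) * a_(n+1)."""
    return [(n + 1) * coeffs[n + 1] for n in range(len(coeffs) - 1)]

N = 10
exp_coeffs = [Fraction(1, factorial(n)) for n in range(N + 1)]   # a_n = 1/n!
derivative = differentiate(exp_coeffs)

print(derivative == exp_coeffs[:N])
```

The derivative of the degree N truncation reproduces the first N coefficients of the same series, which is the coefficient-level version of the computation above.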


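Example From our work above we know that

    $$ \sin(x) = \sum_{n=0}^{\infty} \frac{(-1)^n x^{2n+1}}{(2n+1)!} $$

for all x in $(-\infty, \infty)$. Now

    $$ \int_0^x \sin(t)\,dt = -\cos(t) \Big|_0^x = -\cos(x) + 1 $$

for any x. However, we also know that

    $$ \begin{aligned}
    \int_0^x \sin(t)\,dt &= \int_0^x \sum_{n=0}^{\infty} \frac{(-1)^n t^{2n+1}}{(2n+1)!}\,dt \\
                         &= \sum_{n=0}^{\infty} \int_0^x \frac{(-1)^n t^{2n+1}}{(2n+1)!}\,dt \\
                         &= \sum_{n=0}^{\infty} \frac{(-1)^n t^{2n+2}}{(2n+2)(2n+1)!} \Big|_0^x \\
                         &= \sum_{n=0}^{\infty} \frac{(-1)^n x^{2n+2}}{(2n+2)!} \\
                         &= \sum_{n=1}^{\infty} \frac{(-1)^{n-1} x^{2n}}{(2n)!}.
    \end{aligned} $$

Hence

    $$ \begin{aligned}
    \cos(x) &= 1 - \int_0^x \sin(t)\,dt \\
            &= 1 - \sum_{n=1}^{\infty} \frac{(-1)^{n-1} x^{2n}}{(2n)!} \\
            &= 1 + \sum_{n=1}^{\infty} \frac{(-1)^n x^{2n}}{(2n)!} \\
            &= \sum_{n=0}^{\infty} \frac{(-1)^n x^{2n}}{(2n)!}
    \end{aligned} $$

for all x in $(-\infty, \infty)$. Thus we have found a power series representation of cos(x) on
$(-\infty, \infty)$. In particular, cos(x) is analytic on $(-\infty, \infty)$.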
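
The same derivation can be traced on coefficient lists. In this sketch the helpers `integrate`, `sin_coeffs` and `cos_coeffs` are our own names; each list holds the coefficients of $x^0, x^1, x^2, \ldots$ for a truncated series about 0.

```python
from fractions import Fraction
from math import factorial

def integrate(coeffs):
    """Termwise integral from 0 to x: a_n x^n becomes a_n/(n+1) x^(n+1)."""
    return [Fraction(0)] + [a / (n + 1) for n, a in enumerate(coeffs)]

def sin_coeffs(degree: int):
    """Coefficients of the sine series up to x^degree."""
    return [Fraction((-1) ** (n // 2), factorial(n)) if n % 2 == 1 else Fraction(0)
            for n in range(degree + 1)]

def cos_coeffs(degree: int):
    """Coefficients of the cosine series up to x^degree."""
    return [Fraction((-1) ** (n // 2), factorial(n)) if n % 2 == 0 else Fraction(0)
            for n in range(degree + 1)]

integral = integrate(sin_coeffs(9))                        # degree 10
one_minus_integral = [1 - integral[0]] + [-a for a in integral[1:]]

print(one_minus_integral == cos_coeffs(10))
```

Subtracting the integrated sine coefficients from 1 gives exactly the cosine coefficients, term for term.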

    To close this section we note that a power series representation of a function about a
specific point c is unique. To see this, suppose

    $$ f(x) = \sum_{n=0}^{\infty} a_n (x - c)^n $$                               (5.7.15)

on (c - R, c + R), where R > 0 is the radius of convergence of the power series. We need
to show that the coefficients $a_n$, n = 0, 1, 2, ..., are uniquely determined by f. To start,

    $$ \begin{aligned}
    f(c) &= \sum_{n=0}^{\infty} a_n (c - c)^n \\
         &= a_0 + a_1(c - c) + a_2(c - c)^2 + a_3(c - c)^3 + \cdots \\
         &= a_0,
    \end{aligned} $$                                                             (5.7.16)

so

    $$ a_0 = f(c). $$                                                            (5.7.17)

Next,

    $$ f'(c) = \sum_{n=1}^{\infty} n a_n (c - c)^{n-1} = a_1, $$                 (5.7.18)

so

    $$ a_1 = f'(c). $$                                                           (5.7.19)

For $a_2$ we have

    $$ f''(c) = \sum_{n=2}^{\infty} n(n - 1) a_n (c - c)^{n-2} = 2a_2, $$        (5.7.20)

so

    $$ a_2 = \frac{f''(c)}{2}. $$                                                (5.7.21)

In general, for k = 0, 1, 2, ...,

    $$ f^{(k)}(c) = \sum_{n=k}^{\infty} n(n - 1) \cdots (n - k + 1) a_n (c - c)^{n-k} = k! a_k, $$      (5.7.22)

from which it follows that

    $$ a_k = \frac{f^{(k)}(c)}{k!}. $$                                           (5.7.23)

As a consequence, the power series representation of f about c is uniquely determined by
the values of the derivatives of f at c.

Proposition Suppose

    $$ f(x) = \sum_{n=0}^{\infty} a_n (x - c)^n $$                               (5.7.24)

on (c - R, c + R), where R > 0 is the radius of convergence of the power series. Then

    $$ a_n = \frac{f^{(n)}(c)}{n!} $$                                            (5.7.25)

for n = 0, 1, 2, ....

    Note that the coefficients an as given by (5.7.25) are the same as the coefficients used
in the definition of the Taylor polynomial of f at c. This observation leads immediately
to the question of extending Taylor polynomials to Taylor series, the topic of Section 5.8.
    Our final example of this section illustrates how (5.7.25) may be used to find the
derivatives of f at c if we already know a power series representation for f about c.

Example In a previous example we saw that if

    $$ f(x) = \frac{1}{1 - x}, $$

then

    $$ f(x) = \sum_{n=0}^{\infty} x^n $$

for all x in (-1, 1). In this series, the coefficient of $x^n$ is 1 for all n, so, by the previous
proposition,

    $$ 1 = \frac{f^{(n)}(0)}{n!} $$

for n = 0, 1, 2, .... That is, $f^{(n)}(0) = n!$ for all n.
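
Reading derivatives off from coefficients is a one-line computation. The sketch below is only an illustration of (5.7.25) rearranged as $f^{(n)}(c) = n!\,a_n$; the function name `derivative_at_center` is our own.

```python
from fractions import Fraction
from math import factorial

def derivative_at_center(a_n: Fraction, n: int) -> Fraction:
    """Return f^(n)(c) = n! * a_n, a rearrangement of (5.7.25)."""
    return factorial(n) * a_n

# Geometric series for 1/(1 - x): every coefficient of x^n equals 1.
geometric_derivs = [derivative_at_center(Fraction(1), n) for n in range(6)]

# Sine series (5.7.5): the coefficient of x^3 is -1/3!.
third_derivative_of_sin = derivative_at_center(Fraction(-1, factorial(3)), 3)

print(geometric_derivs)
print(third_derivative_of_sin)
```

The second computation agrees with the familiar fact that the third derivative of sin(x) is -cos(x), which equals -1 at 0.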


﻿




Problems

1. For each of the following power series, find the interval of convergence and determine
    the behavior of the series at the endpoints of the interval. State clearly where the series
    converges absolutely, where it converges conditionally, and where it diverges. Also, for
    each series write out the first 5 nonzero terms.


(a)    x
        n
    n=1i
    00
(c) nx1


    (      -(e) ( 2 )n+ 1
    n 0o

(g) Z    X2
    n12


      0(-1)n+1xn
(b) ~(1T~c~
    n 1
    00ox2n

    n=0
(d) Z  r_1)n

           3n
    n=0
    00
(h)     3x2n
    n=1


2. For each of the following power series, find the interval of convergence and determine
   the behavior of the series at the endpoints of the interval. State clearly where the series
   converges absolutely, where it converges conditionally, and where it diverges. Also, for
   each series write out the first 5 nonzero terms.


       ()  (x - 3 )n
   (a)          )n!
       n=0
       00   2n+
   (c) 2n
       n=0
          °O 2n+1
   (e)       n
       n12i
       00
   (g)Z    3"nz"
       n=1
3. (a) Using the fact that


(b)  Z n
    n=1
    00 1_) n+1I(X - 6)n
    (d) n3n
    n=1
    oo)00 (_1)nh(x  -  )2nh
    n=0      (2n)!I
        °° (-1)xz2n+
(h)
  (h)0 (2n + 1)(2n + 1)!


        $$ \frac{1}{1 - x} = \sum_{n=0}^{\infty} x^n $$

for -1 < x < 1, find a power series representation about 0 for


                                          1+cc


    on (-1, 1).
(b) Use your result from (a) to find $f^{(35)}(0)$.


﻿



    (c) Use your result from (a) to find a power series representation about 0 for


                                                 1+dt
                                            / 1+ t


        on (-1, 1). Determine where the series converges absolutely, where it converges
        conditionally, and where it diverges.
    (d) Use your result from (c) to find an infinite series representation for


                                          Ji  j     dt.

        Use this series to estimate the integral with an error of no more than 0.001 in
        absolute value.
 4. Use the power series representations of sin(x) and cos(x) about 0 to prove the following
    identities.
    (a) $\sin(-x) = -\sin(x)$                      (b) $\cos(-x) = \cos(x)$

    (c) $\frac{d}{dx} \sin(x) = \cos(x)$           (d) $\frac{d}{dx} \cos(x) = -\sin(x)$
 5. Using the power series representation of cos(x) about 0, find an infinite series repre-
    sentation of cos(1). Use the infinite series to estimate cos(1) with an error of no more
    than 0.000001.
 6. Use the fact that

        $$ \frac{d}{dx} \left( \frac{1}{1 - x} \right) = \frac{1}{(1 - x)^2} $$

    to find a power series representation about 0 for

        $$ g(x) = \frac{1}{(1 - x)^2}. $$

    Find the interval of convergence for this power series and determine the behavior of
    the series at the endpoints.
 7. Use your result from Problem 6 to evaluate


                                         1

 8. (a) A fair coin is tossed repeatedly. In Section 1.3 we saw that the probability that a
        head appears for the first time on the nth toss is

            $$ P_n = \frac{1}{2^n} $$

        for n = 1, 2, 3, .... The average number of tosses before the first head appears is
        then given by

            $$ A = \sum_{n=1}^{\infty} n P_n. $$

    Use Problem 7 to find A.
    (b) A manufacturer of circuit boards tests every board as it comes off the assembly
        line. If the probability that a board passes the test is p and the probability that
        it fails is q = 1 - p, then the probability that n boards are tested before the first
        defective board is encountered is $P_n = p^{n-1} q$. The average number of boards
        tested before finding a defective one is then

            $$ A = \sum_{n=1}^{\infty} n P_n. $$

        Find A.


Section 5.8


       Differential Equations          Taylor Series


In this section we will put together much of the work of Sections 5.1-5.7 in the context of
a discussion of Taylor series. We begin with two definitions.

Definition   If f is a function such that $f^{(n)}$ is continuous on an open interval (a, b) for
n = 0, 1, 2, ..., then we say f is $C^\infty$ on (a, b).

Definition If f is $C^\infty$ on an interval (a, b) and c is a point in (a, b), then the power
series

    $$ \sum_{n=0}^{\infty} \frac{f^{(n)}(c)}{n!} (x - c)^n = f(c) + f'(c)(x - c) + \frac{f''(c)}{2!} (x - c)^2 + \frac{f'''(c)}{3!} (x - c)^3 + \cdots $$      (5.8.1)

is called the Taylor series for f about c.

    A Taylor series is a power series constructed from a given function in the same manner
as a Taylor polynomial. As with any power series about c, the Taylor series for a function
f about c converges at x = c, but does not necessarily converge at any other points. If it
does converge for other values of x, it will converge absolutely on an interval (c - R, c + R),
where R is the radius of convergence. However, even if the series converges at $x \ne c$, it
need not converge to f(x). That is, a function may be $C^\infty$ without being analytic. (See
Problem 12 of Section 6.1 for an example.) If the Taylor series does converge to f(x) for
all x in the interval of convergence, then it is the unique power series representation for f
on this interval.
    If Pn is the nth order Taylor polynomial for f at c, then Pn is a partial sum of the
Taylor series for f about c. Hence to show that the Taylor series converges to f at x, we
need to show that
    $$ f(x) = \lim_{n \to \infty} P_n(x). $$                                     (5.8.2)

Equivalently, we need to show that

    $$ \lim_{n \to \infty} r_n(x) = 0, $$                                        (5.8.3)

where

    $$ r_n(x) = f(x) - P_n(x). $$                                                (5.8.4)

In this regard, the error bounds for $r_n(x)$ developed in Section 5.2 can be very useful.

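Example     For any n = 0, 1, 2, ..., if $P_{2n+1}$ is the Taylor polynomial of order 2n + 1 for
f(x) = sin(x) at 0, then

    $$ P_{2n+1}(x) = \sum_{k=0}^{n} \frac{(-1)^k x^{2k+1}}{(2k+1)!}. $$

In Section 5.2 we saw that if

    $$ r_{2n+1}(x) = \sin(x) - P_{2n+1}(x), $$

then

    $$ |r_{2n+1}(x)| \le \frac{|x|^{2n+3}}{(2n+3)!} $$

for any value of x. In Section 5.7 we saw that, for any x,

    $$ \lim_{n \to \infty} \frac{|x|^{2n+3}}{(2n+3)!} = 0, $$

so

    $$ \lim_{n \to \infty} |r_{2n+1}(x)| = 0. $$

Hence

    $$ \sin(x) = \lim_{n \to \infty} P_{2n+1}(x) $$

for all x. That is,

    $$ \sin(x) = \sum_{n=0}^{\infty} \frac{(-1)^n x^{2n+1}}{(2n+1)!} = x - \frac{x^3}{3!} + \frac{x^5}{5!} - \frac{x^7}{7!} + \cdots $$      (5.8.5)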

for all x. Thus the Taylor series for sin(x) about 0 provides a power series representation
for sin(x) on the interval (-oc, oo). Note that this example is essentially a restatement of
our second example in Section 5.7.
    In many cases showing
    $$ \lim_{n \to \infty} r_n(x) = 0 $$                                         (5.8.6)

is difficult. However, since power series representations are unique, if we are able to find
a power series representation for a given function by manipulating some other known
representation, then we know that this series is the Taylor series for that function. This is
in fact the way many Taylor series representations are found in practice.

Example Since

    $$ \frac{1}{1 - x} = \sum_{n=0}^{\infty} x^n = 1 + x + x^2 + x^3 + \cdots $$

for -1 < x < 1, it follows that

    $$ \frac{1}{1 + x} = \frac{1}{1 - (-x)} = \sum_{n=0}^{\infty} (-x)^n = \sum_{n=0}^{\infty} (-1)^n x^n $$

for -1 < -x < 1, that is, -1 < x < 1. Hence we have found a Taylor series representation
for

    $$ f(x) = \frac{1}{1 + x} $$

on (-1, 1).
Example Similar to the previous example, we have

    $$ \frac{1}{1 + x^2} = \frac{1}{1 - (-x^2)} = \sum_{n=0}^{\infty} (-x^2)^n = \sum_{n=0}^{\infty} (-1)^n x^{2n} = 1 - x^2 + x^4 - x^6 + \cdots $$

for $-1 < -x^2 < 1$, that is, -1 < x < 1. Thus we have found a Taylor series representation
for

    $$ f(x) = \frac{1}{1 + x^2} $$

on (-1, 1).
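
A quick numerical experiment shows both the agreement inside (-1, 1) and the failure outside it. This Python sketch uses a helper of our own naming, `partial_sum`, and is meant only as an illustration.

```python
def partial_sum(x: float, n_terms: int) -> float:
    """Sum of (-1)^n x^(2n) for n = 0, 1, ..., n_terms - 1."""
    return sum((-1) ** n * x ** (2 * n) for n in range(n_terms))

for x in (0.5, 0.9, 1.5):
    print(x, partial_sum(x, 40), 1 / (1 + x ** 2))
```

For x = 0.5 and x = 0.9 the partial sums approach 1/(1 + x^2), while for x = 1.5 they grow without bound even though 1/(1 + x^2) is perfectly well defined there. The series represents f only on its interval of convergence.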
Example In Section 5.7 we saw how the relationship

    $$ \cos(x) = 1 - \int_0^x \sin(t)\,dt $$

combined with the Taylor series representation

    $$ \sin(x) = \sum_{n=0}^{\infty} \frac{(-1)^n x^{2n+1}}{(2n+1)!} $$

yields

    $$ \cos(x) = \sum_{n=0}^{\infty} \frac{(-1)^n x^{2n}}{(2n)!} = 1 - \frac{x^2}{2!} + \frac{x^4}{4!} - \frac{x^6}{6!} + \cdots $$      (5.8.7)

for all values of x. Thus (5.8.7) is the Taylor series representation for cos(x) about 0 on
$(-\infty, \infty)$.


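Example Since

    $$ \sin(x) = \sum_{n=0}^{\infty} \frac{(-1)^n x^{2n+1}}{(2n+1)!} $$

for all values of x, it follows that

    $$ \frac{\sin(x)}{x} = \sum_{n=0}^{\infty} \frac{(-1)^n x^{2n}}{(2n+1)!} = 1 - \frac{x^2}{3!} + \frac{x^4}{5!} - \frac{x^6}{7!} + \cdots $$

for all $x \ne 0$. In fact, if we define

    $$ f(x) = \begin{cases} \dfrac{\sin(x)}{x}, & \text{if } x \ne 0, \\ 1, & \text{if } x = 0, \end{cases} $$

then the Taylor series representation for f about 0 on $(-\infty, \infty)$ is given by

    $$ f(x) = \sum_{n=0}^{\infty} \frac{(-1)^n x^{2n}}{(2n+1)!} = 1 - \frac{x^2}{3!} + \frac{x^4}{5!} - \frac{x^6}{7!} + \cdots $$      (5.8.8)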


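Example Since

    $$ \frac{1}{1 - x} = \sum_{n=0}^{\infty} x^n $$

for -1 < x < 1,

    $$ \frac{d}{dx} \left( \frac{1}{1 - x} \right) = \frac{d}{dx} \sum_{n=0}^{\infty} x^n = \sum_{n=0}^{\infty} \frac{d}{dx} x^n = \sum_{n=1}^{\infty} n x^{n-1} $$

for -1 < x < 1. But

    $$ \frac{d}{dx} \left( \frac{1}{1 - x} \right) = \frac{1}{(1 - x)^2}, $$

so we have the Taylor series representation

    $$ \frac{1}{(1 - x)^2} = \sum_{n=1}^{\infty} n x^{n-1} = 1 + 2x + 3x^2 + 4x^3 + \cdots $$

for all x in (-1, 1).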
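
This representation is easy to test numerically at a few points of (-1, 1). In the sketch below, `derivative_series` and `closed_form` are names we have introduced for illustration.

```python
def derivative_series(x: float, n_terms: int) -> float:
    """Sum of n x^(n-1) for n = 1, 2, ..., n_terms."""
    return sum(n * x ** (n - 1) for n in range(1, n_terms + 1))

def closed_form(x: float) -> float:
    """The function 1 / (1 - x)^2."""
    return 1 / (1 - x) ** 2

for x in (-0.5, 0.25, 0.5):
    print(x, derivative_series(x, 200), closed_form(x))
```

With 200 terms the two columns agree to many decimal places at each of these points; closer to the endpoints of the interval, more terms would be needed.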

    The final two examples of this section will illustrate the use of Taylor series in solving
problems that we could not handle before.


Example Define


f(x) {


sin(x)
  x


if c#0,


Then. as we saw above.


1, if x = 0.


      x2 x4
         +
      3!    5!


00 (-1)nc2n

n=0 (2n + 1)!


x6
7!


is the Taylor series representation for f about 0 on (-oc, oc). Now f is continuous on
(-oc, oc) and so has an antiderivative on (-oc, oc), but, as we have mentioned before, this
antiderivative is not expressible in terms of the elementary functions of calculus. However,
by the Fundamental Theorem of Calculus, the function


                                 Si(x)   f  f(t)dt,                           (5.8.9)


﻿


Section 5.8                              Taylor Series                            5

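called the sine integral function, is an antiderivative of f. Moreover, even though we
cannot express this integral in terms of the elementary functions, we can find its Taylor
series representation. That is,
\[ \begin{aligned}
\operatorname{Si}(x) &= \int_0^x \frac{\sin(t)}{t}\,dt \\
&= \int_0^x \sum_{n=0}^{\infty} \frac{(-1)^n t^{2n}}{(2n+1)!}\,dt \\
&= \sum_{n=0}^{\infty} \left. \frac{(-1)^n t^{2n+1}}{(2n+1)(2n+1)!} \right|_0^x \\
&= \sum_{n=0}^{\infty} \frac{(-1)^n x^{2n+1}}{(2n+1)(2n+1)!} \\
&= x - \frac{x^3}{3 \cdot 3!} + \frac{x^5}{5 \cdot 5!} - \frac{x^7}{7 \cdot 7!} + \cdots
\end{aligned} \tag{5.8.10} \]
for all values of x. In particular,
\[ \operatorname{Si}(1) = \int_0^1 \frac{\sin(x)}{x}\,dx = \sum_{n=0}^{\infty} \frac{(-1)^n}{(2n+1)(2n+1)!} = 1 - \frac{1}{3 \cdot 3!} + \frac{1}{5 \cdot 5!} - \frac{1}{7 \cdot 7!} + \cdots. \]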


Since this is an alternating series which satisfies the conditions of Leibniz's theorem, if
\[ s_n = \sum_{k=0}^{n} \frac{(-1)^k}{(2k+1)(2k+1)!}, \]
then
\[ |\operatorname{Si}(1) - s_n| < \frac{1}{(2n+3)(2n+3)!}. \]
For example, if we want to approximate Si(1) with an error of no more than 0.0001, we
note that for n = 1 we have, to 6 decimal places,
\[ \frac{1}{(2n+3)(2n+3)!} = \frac{1}{5 \cdot 5!} = \frac{1}{600} = 0.001667, \]
while for n = 2 we have
\[ \frac{1}{(2n+3)(2n+3)!} = \frac{1}{7 \cdot 7!} = \frac{1}{35{,}280} = 0.000028. \]

Thus
\[ s_2 = 1 - \frac{1}{3 \cdot 3!} + \frac{1}{5 \cdot 5!} = 0.946111 \]
differs from Si(1) by no more than 0.000028. In fact, since the next term in the series is
negative, Si(1) must lie between 0.946111 and
\[ 0.946111 - 0.000028 = 0.946083. \]
In particular, we know that
\[ \operatorname{Si}(1) = 0.9461 \]
to 4 decimal places. Of course, this particular result could also be obtained using numerical
integration. However, the point is that (5.8.10) gives us much more; it not only gives us
an easy method to evaluate Si(x) for any value of x to any desired level of accuracy, but
it also gives us an algebraic representation of the sine integral function which can be used
in applications in much the same way that polynomials are used. In Figure 5.8.1 we have
used the Taylor polynomial
\[ P_{11}(x) = x - \frac{x^3}{3 \cdot 3!} + \frac{x^5}{5 \cdot 5!} - \frac{x^7}{7 \cdot 7!} + \frac{x^9}{9 \cdot 9!} - \frac{x^{11}}{11 \cdot 11!} \]
to approximate the graph of Si(x) on the interval [-5, 5]. Note that on this interval
\[ |\operatorname{Si}(x) - P_{11}(x)| \le \frac{5^{13}}{13 \cdot 13!} = 0.0151 \]
to 4 decimal places, certainly accurate enough for the purposes of our graph.

Figure 5.8.1 Taylor polynomial approximation to the graph of y = Si(x)
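
The estimate of Si(1) can be reproduced with a short Python sketch. It evaluates the partial sum s_2 of (5.8.10), the Leibniz error bound for n = 2, and, as an independent check, a midpoint-rule estimate of the integral. The helper names `si_partial_sum` and `si_midpoint` and the step count are assumptions made for this illustration; the midpoint rule is used only because its sample points avoid t = 0.

```python
from math import factorial, sin


def si_partial_sum(x, n):
    """Partial sum s_n of (5.8.10): terms k = 0, ..., n."""
    return sum(
        (-1) ** k * x ** (2 * k + 1) / ((2 * k + 1) * factorial(2 * k + 1))
        for k in range(n + 1)
    )


def si_midpoint(x, steps=20000):
    """Midpoint-rule estimate of the integral of sin(t)/t over [0, x]."""
    h = x / steps
    return h * sum(sin((i + 0.5) * h) / ((i + 0.5) * h) for i in range(steps))


s2 = si_partial_sum(1.0, 2)              # 1 - 1/(3*3!) + 1/(5*5!)
bound = 1.0 / (7 * factorial(7))         # Leibniz bound for n = 2
numeric = si_midpoint(1.0)               # independent numerical estimate

print(s2, bound, numeric)
```

The difference between `s2` and `numeric` should fall within `bound`, in line with the error estimate above.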


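Example Using
\[ \frac{1}{x} = \frac{1}{1-(1-x)} \]
and
\[ \frac{1}{1-x} = \sum_{n=0}^{\infty} x^n \]
for -1 < x < 1, we have
\[ \frac{1}{x} = \sum_{n=0}^{\infty} (1-x)^n = \sum_{n=0}^{\infty} (-1)^n (x-1)^n \tag{5.8.11} \]
for -1 < 1 - x < 1, that is, 0 < x < 2. Hence (5.8.11) gives the Taylor series representation
for
\[ f(x) = \frac{1}{x} \]
about 1. Similar to our work in the previous example, we may now find an antiderivative
for f on (0, 2) by integration. Namely,
\[ \begin{aligned}
\int_1^x \frac{1}{t}\,dt &= \int_1^x \sum_{n=0}^{\infty} (-1)^n (t-1)^n\,dt \\
&= \sum_{n=0}^{\infty} \int_1^x (-1)^n (t-1)^n\,dt \\
&= \sum_{n=0}^{\infty} \left. \frac{(-1)^n (t-1)^{n+1}}{n+1} \right|_1^x \\
&= \sum_{n=0}^{\infty} \frac{(-1)^n (x-1)^{n+1}}{n+1} \\
&= (x-1) - \frac{(x-1)^2}{2} + \frac{(x-1)^3}{3} - \frac{(x-1)^4}{4} + \cdots
\end{aligned} \]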
provides a Taylor series representation for an antiderivative of f on the interval (0, 2).
In Chapter 6 we will call this function the natural logarithm function, denoted log(x),
although there we will use other means in order to define it on the interval $(0, \infty)$. In
particular, note that this series converges at x = 2 as well, giving us, with this definition
of log(x),
\[ \log(2) = \sum_{n=0}^{\infty} \frac{(-1)^n}{n+1} = \sum_{n=1}^{\infty} \frac{(-1)^{n+1}}{n}. \]


Hence log(2) is the sum of the alternating harmonic series, a number for which we found
an approximation in Section 5.6.
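
As a rough numerical illustration, the Python sketch below sums the first N terms of the alternating harmonic series and compares the result with the library value of the natural logarithm of 2. The function name `alternating_harmonic` and the choice N = 10000 are assumptions for this example. By Leibniz's theorem the error is less than 1/(N + 1), so the convergence is slow.

```python
from math import log


def alternating_harmonic(terms):
    """Sum of (-1)**(n + 1) / n for n = 1, ..., terms."""
    return sum((-1) ** (n + 1) / n for n in range(1, terms + 1))


N = 10000
approx = alternating_harmonic(N)
error_bound = 1.0 / (N + 1)              # size of the first omitted term

print(approx, log(2), error_bound)
```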

Problems


1. Show directly that
   \[ \cos(x) = \sum_{n=0}^{\infty} \frac{(-1)^n x^{2n}}{(2n)!} \]
   for all x in $(-\infty, \infty)$.


2. Using any method, find Taylor series representations about 0 for the following
   functions. State the interval on which the representation is valid. Also, write out the first
   five nonzero terms of each series.
   (a) $\cos(x^2)$                          (b) $\sin(2x)$
   (c) $\dfrac{1}{1-x^2}$                   (d) $\dfrac{1}{2x-1}$
   (e) $\dfrac{1}{(1+x)^2}$                 (f) $\dfrac{1}{1+4x^2}$
   (g) $f(x) = \begin{cases} \dfrac{1-\cos(x)}{x}, & \text{if } x \neq 0, \\ 0, & \text{if } x = 0 \end{cases}$
3. (a) Use the identity
       \[ \cos^2(x) = \frac{1 + \cos(2x)}{2} \]
       to find the Taylor series representation for $\cos^2(x)$ about 0. On what interval is
       this representation valid?
   (b) What is the Taylor polynomial of order 8 for $\cos^2(x)$ at 0?
4. (a) Use Problem 3 and the identity
       \[ \sin^2(x) = 1 - \cos^2(x) \]
       to find the Taylor series representation for $\sin^2(x)$ about 0. On what interval is
       this representation valid?
   (b) What is the Taylor polynomial of order 8 for $\sin^2(x)$ at 0?
5. (a) Use the Taylor series representation about 0 for sin(x) to find the Taylor series
       representation for $\sin(x^2)$ about 0. On what interval is this representation valid?
   (b) What is the Taylor polynomial of order 10 for $\sin(x^2)$ at 0?
   (c) Find the Taylor series representation about 0 for
       \[ S(x) = \int_0^x \sin(t^2)\,dt. \]
       On what interval is this representation valid?
   (d) What is the Taylor polynomial of order 11 for S(x) at 0?
   (e) Approximate S(1) with an error of less than 0.00001.
6. Let $P_n$ be the Taylor polynomial of order n at 0 for
   \[ f(x) = \frac{1}{1+x^2}. \]
   Plot f, $P_2$, $P_4$, and $P_{10}$ together over the interval [-1.5, 1.5]. Why do the Taylor
   polynomials not give a good approximation to f(x) when |x| > 1?

7. Find
   \[ \left. \frac{d^9}{dx^9} \operatorname{Si}(x) \right|_{x=0}. \]


﻿


       Difference Equations          Section 5.9
               to
       Differential Equations        Some Limit Calculations


In this section we will discuss the use of Taylor polynomials in computing certain types of
limits. Although this material could have been treated directly after Section 5.2, we have
saved it until now so as not to break into the development of Taylor series. To illustrate the
ideas of this section, we begin with two examples, the first of which is already well-known
to us.

Example Consider the problem of evaluating
\[ \lim_{x \to 0} \frac{\sin(x)}{x}. \]
The reason this limit presents a problem is that, although the function in question is a
quotient of two continuous functions, both the numerator and the denominator approach
0 as x approaches 0. Now from our work on Taylor polynomials we know that
\[ \sin(x) = x + o(x), \]
so
\[ \frac{\sin(x)}{x} = \frac{x + o(x)}{x} = 1 + \frac{o(x)}{x}. \]
But, by definition,
\[ \lim_{x \to 0} \frac{o(x)}{x} = 0. \]
Thus
\[ \lim_{x \to 0} \frac{\sin(x)}{x} = \lim_{x \to 0} \frac{x + o(x)}{x} = \lim_{x \to 0} \left( 1 + \frac{o(x)}{x} \right) = 1. \]

Example The limit
\[ \lim_{x \to 0} \frac{1 - \cos(x)}{x^2} \]
presents the same type of problem. Using the Taylor polynomial of order 2 for cos(x), we
know that
\[ \cos(x) = 1 - \frac{x^2}{2} + o(x^2). \]


Copyright @ by Dan Sloughter 2000


﻿


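Hence
\[ \begin{aligned}
\lim_{x \to 0} \frac{1 - \cos(x)}{x^2} &= \lim_{x \to 0} \frac{1 - \left(1 - \frac{x^2}{2} + o(x^2)\right)}{x^2} \\
&= \lim_{x \to 0} \frac{\frac{x^2}{2} - o(x^2)}{x^2} \\
&= \lim_{x \to 0} \left( \frac{1}{2} - \frac{o(x^2)}{x^2} \right) \\
&= \frac{1}{2}.
\end{aligned} \]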
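
The value 1/2 obtained from the Taylor polynomial can be compared with direct evaluation of the quotient near 0. The Python sketch below is only an illustration: the function name `ratio` and the sample points are assumptions, and very small values of x are avoided because the subtraction 1 - cos(x) loses accuracy in floating-point arithmetic.

```python
from math import cos


def ratio(x):
    """The quotient (1 - cos(x)) / x**2 for x != 0."""
    return (1.0 - cos(x)) / x ** 2


for x in (0.1, 0.01, 0.001):
    # The Taylor expansion suggests ratio(x) is close to 1/2 - x**2/24.
    print(x, ratio(x))
```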


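    The point in both of these examples was to use the fact that if f is n + 1 times
continuously differentiable on an interval about the point c, then, as we saw in Section 5.2,
\[ f(x) = f(c) + f'(c)(x - c) + \frac{f''(c)}{2!}(x - c)^2 + \cdots + \frac{f^{(n)}(c)}{n!}(x - c)^n + o((x - c)^n). \tag{5.9.1} \]
Hence if f and g are both n + 1 times continuously differentiable on an interval about c,
with $f^{(i)}(c) = 0$ and $g^{(i)}(c) = 0$ for i = 0, 1, 2, ..., n - 1 and $f^{(n)}(c) \neq 0$, then
\[ f(x) = \frac{f^{(n)}(c)}{n!}(x - c)^n + o((x - c)^n) \tag{5.9.2} \]
and
\[ g(x) = \frac{g^{(n)}(c)}{n!}(x - c)^n + o((x - c)^n). \tag{5.9.3} \]
Hence
\[ \begin{aligned}
\lim_{x \to c} \frac{g(x)}{f(x)} &= \lim_{x \to c} \frac{\frac{g^{(n)}(c)}{n!}(x - c)^n + o((x - c)^n)}{\frac{f^{(n)}(c)}{n!}(x - c)^n + o((x - c)^n)} \\
&= \lim_{x \to c} \frac{\frac{g^{(n)}(c)}{n!} + \frac{o((x - c)^n)}{(x - c)^n}}{\frac{f^{(n)}(c)}{n!} + \frac{o((x - c)^n)}{(x - c)^n}} \\
&= \frac{g^{(n)}(c)/n!}{f^{(n)}(c)/n!} \\
&= \frac{g^{(n)}(c)}{f^{(n)}(c)}.
\end{aligned} \tag{5.9.4} \]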


That is, under the specified conditions, the value of the limit is equal to the ratio of the
nth derivatives of g and f evaluated at c. In addition, if it were the case that, for some
k < n, $g^{(i)}(c) = 0$ for i = 0, 1, 2, ..., k - 1 and $g^{(k)}(c) \neq 0$, then we would have
\[ \begin{aligned}
\frac{g(x)}{f(x)} &= \frac{\frac{g^{(k)}(c)}{k!}(x - c)^k + o((x - c)^k)}{\frac{f^{(n)}(c)}{n!}(x - c)^n + o((x - c)^n)} \\
&= \frac{(x - c)^k \left( \frac{g^{(k)}(c)}{k!} + \frac{o((x - c)^k)}{(x - c)^k} \right)}{(x - c)^n \left( \frac{f^{(n)}(c)}{n!} + \frac{o((x - c)^n)}{(x - c)^n} \right)} \\
&= \frac{\frac{1}{(x - c)^{n-k}} \left( \frac{g^{(k)}(c)}{k!} + \frac{o((x - c)^k)}{(x - c)^k} \right)}{\frac{f^{(n)}(c)}{n!} + \frac{o((x - c)^n)}{(x - c)^n}}.
\end{aligned} \]
Since the denominator of this last expression has a limit as x approaches c, but the
numerator does not, it follows that in this case g(x)/f(x) would not have a limit as x approaches c.
That is, in this case f(x) would approach 0 as x approaches c at a rate faster than g(x), implying
that the limit of the ratio would not exist.
    In practice, we do not use the conclusions of the preceding paragraph, but rather apply
the procedure outlined. That is, to evaluate
\[ \lim_{x \to c} \frac{g(x)}{f(x)}, \]
where both
\[ \lim_{x \to c} g(x) = 0 \]
and
\[ \lim_{x \to c} f(x) = 0, \]
we replace both f and g by their respective Taylor polynomial expansions about c, expanded
to the first nonzero term, and evaluate the limit as illustrated above.

Example To find
\[ \lim_{x \to 0} \frac{\sin(x^2)}{x \tan(x)}, \]
we note that
\[ \sin(x) = x - \frac{x^3}{3!} + \frac{x^5}{5!} - \frac{x^7}{7!} + \cdots \]
implies
\[ \sin(x^2) = x^2 - \frac{x^6}{3!} + \frac{x^{10}}{5!} - \frac{x^{14}}{7!} + \cdots \]

﻿




where the final equality follows after noting that
\[ \lim_{x \to 0} \left( 1 + \frac{o(x^2)}{x^2} \right) = 1, \]
while
\[ \lim_{x \to 0} \frac{1}{x^2} = \infty. \]


    The essence of (5.9.4) is also captured in the following statement, known as l'Hôpital's
rule.

l'Hôpital's rule If f and g are twice continuously differentiable on an interval about
the point c and both g(c) = 0 and f(c) = 0, then
\[ \lim_{x \to c} \frac{g(x)}{f(x)} = \lim_{x \to c} \frac{g'(x)}{f'(x)}. \tag{5.9.6} \]


    This is equivalent to our previous result, assuming the conditions specified at that
time, because repeated applications of l'Hôpital's rule yield
\[ \lim_{x \to c} \frac{g(x)}{f(x)} = \lim_{x \to c} \frac{g'(x)}{f'(x)} = \lim_{x \to c} \frac{g''(x)}{f''(x)} = \cdots = \lim_{x \to c} \frac{g^{(n)}(x)}{f^{(n)}(x)} = \frac{g^{(n)}(c)}{f^{(n)}(c)}, \tag{5.9.7} \]
which is (5.9.4). As before, if for some k < n, $g^{(i)}(c) = 0$ for i = 0, 1, 2, ..., k - 1 and
$g^{(k)}(c) \neq 0$, then
\[ \lim_{x \to c} \frac{g^{(k)}(x)}{f^{(k)}(x)} \]
does not exist and g(x)/f(x) does not have a limit as x approaches c.

Example We will illustrate l'Hôpital's rule first with another well-known limit. Namely,
\[ \lim_{x \to 0} \frac{1 - \cos(x)}{x} = \lim_{x \to 0} \frac{\frac{d}{dx}(1 - \cos(x))}{\frac{d}{dx}x} = \lim_{x \to 0} \frac{\sin(x)}{1} = 0. \]


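Example As an illustration of how it may be necessary to apply l'Hôpital's rule more
than once, we have
\[ \begin{aligned}
\lim_{x \to 0} \frac{\sin(x) - x}{x^3} &= \lim_{x \to 0} \frac{\frac{d}{dx}(\sin(x) - x)}{\frac{d}{dx}x^3} \\
&= \lim_{x \to 0} \frac{\cos(x) - 1}{3x^2} \\
&= \lim_{x \to 0} \frac{\frac{d}{dx}(\cos(x) - 1)}{\frac{d}{dx}3x^2} \\
&= \lim_{x \to 0} \frac{-\sin(x)}{6x} \\
&= -\frac{1}{6} \lim_{x \to 0} \frac{\sin(x)}{x} \\
&= -\frac{1}{6}.
\end{aligned} \]
Note that this particular problem could have been done more quickly using the fact that
\[ \sin(x) - x = -\frac{x^3}{6} + o(x^3). \]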

    Although we will not do so here, it is possible to demonstrate that l'Hôpital's rule is
more widely applicable than what we have indicated so far. In particular, we may also
apply l'Hôpital's rule to one-sided limits and to limits as x approaches $\infty$ or x approaches
$-\infty$, provided, of course, that both g(x) and f(x) are approaching 0 and are twice
continuously differentiable on the appropriate intervals. The following examples illustrate these
applications.

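Example Using l'Hôpital's rule, we have
\[ \begin{aligned}
\lim_{x \to \pi^+} \frac{\sin(x)}{\sqrt{x - \pi}} &= \lim_{x \to \pi^+} \frac{\frac{d}{dx}\sin(x)}{\frac{d}{dx}\sqrt{x - \pi}} \\
&= \lim_{x \to \pi^+} \frac{\cos(x)}{\frac{1}{2\sqrt{x - \pi}}} \\
&= \lim_{x \to \pi^+} 2\cos(x)\sqrt{x - \pi} \\
&= 0.
\end{aligned} \]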

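Example Using l'Hôpital's rule, we have
\[ \begin{aligned}
\lim_{x \to \infty} x \sin\left(\frac{1}{x}\right) &= \lim_{x \to \infty} \frac{\sin\left(\frac{1}{x}\right)}{\frac{1}{x}} \\
&= \lim_{x \to \infty} \frac{\frac{d}{dx}\sin\left(\frac{1}{x}\right)}{\frac{d}{dx}\frac{1}{x}} \\
&= \lim_{x \to \infty} \frac{-\frac{1}{x^2}\cos\left(\frac{1}{x}\right)}{-\frac{1}{x^2}} \\
&= \lim_{x \to \infty} \cos\left(\frac{1}{x}\right) \\
&= 1.
\end{aligned} \]
Notice we could have computed this limit by substituting $h = \frac{1}{x}$, thus obtaining
\[ \lim_{x \to \infty} x \sin\left(\frac{1}{x}\right) = \lim_{h \to 0^+} \frac{\sin(h)}{h} = 1. \]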

    Finally, it is also possible to demonstrate that l'Hôpital's rule applies when both the
numerator and the denominator are approaching $\infty$. That is, if f and g are twice
continuously differentiable at c and both
\[ \lim_{x \to c} f(x) = \infty \]
and
\[ \lim_{x \to c} g(x) = \infty, \]
then
\[ \lim_{x \to c} \frac{g(x)}{f(x)} = \lim_{x \to c} \frac{g'(x)}{f'(x)}. \tag{5.9.8} \]
As before, this also applies for one-sided limits and for limits as x approaches $\infty$ or $-\infty$.
Moreover, one or both of g(x) and f(x) may be approaching $-\infty$.


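Example Using l'Hôpital's rule,
\[ \begin{aligned}
\lim_{x \to \infty} \frac{3x + 1}{\sqrt{x^2 + 4}} &= \lim_{x \to \infty} \frac{\frac{d}{dx}(3x + 1)}{\frac{d}{dx}\sqrt{x^2 + 4}} \\
&= \lim_{x \to \infty} \frac{3}{\frac{x}{\sqrt{x^2 + 4}}} \\
&= \lim_{x \to \infty} \frac{3\sqrt{x^2 + 4}}{x} \\
&= \lim_{x \to \infty} 3\sqrt{\frac{x^2 + 4}{x^2}} \\
&= \lim_{x \to \infty} 3\sqrt{1 + \frac{4}{x^2}} \\
&= 3.
\end{aligned} \]
Of course, we could have also computed this limit by dividing both the numerator and
denominator by x to obtain
\[ \lim_{x \to \infty} \frac{3x + 1}{\sqrt{x^2 + 4}} = \lim_{x \to \infty} \frac{3 + \frac{1}{x}}{\frac{\sqrt{x^2 + 4}}{x}} = \lim_{x \to \infty} \frac{3 + \frac{1}{x}}{\sqrt{\frac{x^2 + 4}{x^2}}} = \lim_{x \to \infty} \frac{3 + \frac{1}{x}}{\sqrt{1 + \frac{4}{x^2}}} = 3. \]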
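
A quick numerical look at the same quotient for increasingly large x is consistent with the limit 3. The Python sketch below is illustrative only; the function name `quotient` and the sample values of x are assumptions.

```python
from math import sqrt


def quotient(x):
    """The quotient (3x + 1) / sqrt(x**2 + 4)."""
    return (3.0 * x + 1.0) / sqrt(x * x + 4.0)


for x in (10.0, 1.0e3, 1.0e6):
    # The printed values should settle toward 3 as x grows.
    print(x, quotient(x))
```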


Problems

1. Use Taylor polynomials to find the following limits.


(a)     sin(3x)
  ()lim
    x--O ox
                     cc2
        cos(x) - 1 + 2
(c) lim
    x-o x4
    . tan(u)
(e) him sn)
    u--o sin(u)
    . tan(3y)
    y- o tan(5y)


(i) lim     +    -
    x-O      3cc


        t - Sin(t)
(b) lim
    two     t2


        sinmc(x2)
(d) lim
    x-o sin12(

(f) lim sin(t) - t
    t-o     t3

(h) lim tan(x2)
    x-O sin(cx2)
                      t
           1+t-1--
 (j) lim
    two       3t2


﻿


2. Use l'Hôpital's rule to evaluate the following limits.


(a) lim sin(5c)
    x-0 3x
       .1 -sex)
(c) lim      sec()
    x-o ox

(e) lim sin(2x)
    x- O+     cc


        1 - cos(3t)
(b) lim

(d) lim 1 - tan(t)
    t-i cos(2t)
    .  1 - cos(c)
(f) hm     ~     x
    x-0    sin(x)
          3x2
(h) lim
    x-01 sin2(X)


(g) lim x sin
    x-00


( 1
X2}


3. Evaluate the following limits using any method you prefer.


  (a)    3x2 - 2x + 1
(a) lim
    x-oo  16x2+2

         1 - sin(x)
(c) lim        x
    x-0     3x2
         (1 +  )i-1 -c
(e) lim                 3
    x-0 x2

(g) lim c-
       x-Uc-i
                      t2
        cos(t) - 1 + 2
(i) lim
    two t2 sin(t2)


(b) l    tan(x2)
  ()lim
    x-0 sin2 (X)


(d) lim    +)3-1
    x-O       x

         1 - cos(t)
(f) lim
    two t sin(2t )

(h) lim cos(x) + 1
    x-7r     c - 7r


 (j) lim
    u- 264(1 - cOS(u))


4. Let
   \[ g(x) = \begin{cases} x^2 \sin\left(\dfrac{1}{x}\right), & \text{if } x \neq 0, \\ 0, & \text{if } x = 0. \end{cases} \]
   (a) Show that g'(0) = 0, and hence that g(x) = o(x).
   (b) Use the preceding result and the fact that tan(x) = x + o(x) to show that
       \[ \lim_{x \to 0} \frac{x^2 \sin\left(\frac{1}{x}\right)}{\tan(x)} = 0. \]
   (c) Letting f(x) = tan(x), show that
       \[ \lim_{x \to 0} \frac{g(x)}{f(x)} \neq \lim_{x \to 0} \frac{g'(x)}{f'(x)}. \]
       Which condition in the statement of l'Hôpital's rule does not hold for this example?


﻿


       Difference Equations          Section 6.1
               to
       Differential Equations        The Exponential Function


At this point we have seen all the major concepts of calculus: derivatives, integrals, and
power series. For the rest of the book we will be concerned with how these ideas apply
in various circumstances. In particular, in this chapter we will introduce the remaining
elementary functions of calculus: the exponential function, the natural logarithm function,
the inverse trigonometric functions, and the hyperbolic trigonometric functions. As they
are introduced, we will discuss related issues involving derivatives, integrals, and power
series, as well as applications to the physical world.
   We will begin by considering the exponential function. We first saw this function in
Section 5.7, but we will redefine it here for completeness.
Definition The exponential function, with value at x denoted by exp(x), is defined by
\[ \exp(x) = \sum_{n=0}^{\infty} \frac{x^n}{n!} = 1 + x + \frac{x^2}{2!} + \frac{x^3}{3!} + \cdots. \tag{6.1.1} \]

   We saw in Section 5.7 that this series converges absolutely for all values of x; hence
the domain of the exponential function is $(-\infty, \infty)$. We should also note that exp(0) = 1.
    Using the properties of power series, it is an easy matter to compute the derivative of
the exponential function:
\[ \frac{d}{dx}\exp(x) = \frac{d}{dx}\sum_{n=0}^{\infty} \frac{x^n}{n!} = \sum_{n=0}^{\infty} \frac{d}{dx}\frac{x^n}{n!} = \sum_{n=1}^{\infty} \frac{n x^{n-1}}{n!} = \sum_{n=1}^{\infty} \frac{x^{n-1}}{(n-1)!} = \sum_{n=0}^{\infty} \frac{x^n}{n!} = \exp(x). \]

Proposition
\[ \frac{d}{dx}\exp(x) = \exp(x). \tag{6.1.2} \]

Example Using the chain rule, we have
\[ \frac{d}{dx}\exp(4x) = 4\exp(4x). \]

Example Similarly,
\[ \frac{d}{dx}\exp(x^2) = 2x\exp(x^2). \]
    In fact, the exponential function is the only function f for which both f(0) = 1 and
f'(x) = f(x) for all x. To see this, we first demonstrate a more general property. Suppose
f is any function for which f(0) = c and f'(x) = kf(x) for all x, where c and k are
constants. Then it follows that
\[ f''(x) = \frac{d}{dx}(kf(x)) = kf'(x) = k^2 f(x), \]
\[ f'''(x) = \frac{d}{dx}(k^2 f(x)) = k^2 f'(x) = k^3 f(x), \]
and, in general,
\[ f^{(n)}(x) = k^n f(x) \tag{6.1.3} \]
for n = 0, 1, 2, .... Hence
\[ f^{(n)}(0) = k^n c \tag{6.1.4} \]
for n = 0, 1, 2, .... Thus the Taylor series for f about 0 is given by
\[ \sum_{n=0}^{\infty} \frac{f^{(n)}(0)}{n!} x^n = \sum_{n=0}^{\infty} \frac{c k^n}{n!} x^n = c \sum_{n=0}^{\infty} \frac{(kx)^n}{n!} = c\exp(kx), \tag{6.1.5} \]
where the final equality follows from the definition of the exponential function. Now, as a
consequence of Taylor's theorem, if $P_n$ is the nth order Taylor polynomial for f at 0, then
\[ |f(x) - P_n(x)| \le \frac{M}{(n+1)!}|x|^{n+1}, \tag{6.1.6} \]
where M is the maximum value of $|f^{(n+1)}|$ on the closed interval from 0 to x. But
\[ f^{(n+1)}(x) = k^{n+1} f(x), \]
so
\[ M = |k|^{n+1} L, \]
where L is the maximum value of |f| on the closed interval from 0 to x. Hence
\[ |f(x) - P_n(x)| \le \frac{|k|^{n+1} L}{(n+1)!}|x|^{n+1} = \frac{L|kx|^{n+1}}{(n+1)!}. \tag{6.1.7} \]
As we have seen before,
\[ \lim_{n \to \infty} \frac{L|kx|^{n+1}}{(n+1)!} = 0 \tag{6.1.8} \]
for any value of x, so it follows that
\[ f(x) = \sum_{n=0}^{\infty} \frac{f^{(n)}(0)}{n!} x^n \tag{6.1.9} \]
for all x. In other words, f has a Taylor series representation, and so, using (6.1.5), we
have
\[ f(x) = c\exp(kx). \]


﻿


Proposition    If f is a function for which f(0) = c and f'(x) = kf(x) for all x, where c
and k are constants, then
\[ f(x) = c\exp(kx) \tag{6.1.10} \]
for all x.
    In particular, if we let c = 1 and k = 1 in this proposition, then f(x) = exp(x).
In many ways, it is this property that makes the exponential function one of the most
important functions in mathematics.
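
The proposition can be illustrated numerically by stepping the equation f'(x) = kf(x) forward from f(0) = c with Euler's method and comparing the result with c exp(kx). The Python sketch below is an illustration under assumed values; the function name `euler_solution`, the constants c = 2 and k = 0.5, and the step count are not from the text.

```python
from math import exp


def euler_solution(c, k, x, steps):
    """Euler's method for f'(t) = k * f(t), f(0) = c, stepped from 0 to x."""
    h = x / steps
    value = c
    for _ in range(steps):
        value += h * k * value           # f(t + h) is roughly f(t) + h * f'(t)
    return value


c, k, x = 2.0, 0.5, 1.0                  # assumed sample constants
approx = euler_solution(c, k, x, 100000)
exact = c * exp(k * x)                   # value predicted by the proposition

print(approx, exact)
```

Increasing the number of steps should bring the Euler value closer to c exp(kx).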
    Now consider a function f defined by f(x) = exp(x + b) for some constant b. Then

                              f'(x) = exp(x + b) = f (x),

so, by the previous proposition, we must have f(x) = c exp(x) for all x, where c = f(0) =
exp(b). That is, for all values of x,

                          exp(x + b) = f (x) = exp(b) exp(x).

This demonstrates a fundamental algebraic property of the exponential function: For any
numbers a and b,
                              exp(a + b) = exp(a) exp(b).                     (6.1.11)

    It follows from (6.1.11) that for any number a,

                      exp(a) exp(-a) = exp(a - a) = exp(0) = 1.

That is,
\[ \exp(-a) = \frac{1}{\exp(a)}. \tag{6.1.12} \]
More generally, using both (6.1.11) and (6.1.12), we have
\[ \exp(a - b) = \exp(a)\exp(-b) = \frac{\exp(a)}{\exp(b)} \tag{6.1.13} \]

for any numbers a and b, another important algebraic property of the exponential function.

    We shall soon see that the number exp(1) plays a special role in this discussion.
Definition The value of the exponential function at 1 is denoted by e. That is,
\[ e = \exp(1) = 1 + 1 + \frac{1}{2} + \frac{1}{3!} + \cdots. \tag{6.1.14} \]

    It may be shown, although not easily, that e is an irrational number. Much more easily
(see Problem 5), it may be shown that, to 5 decimal places, e is given by 2.71828. The use
of the letter e to denote this number originates with Leonhard Euler (1707-1783), one of
the most prolific mathematicians of all time.
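
The defining series is easy to evaluate numerically. The Python sketch below sums the first 30 terms of (6.1.1) to approximate e and to spot-check the property (6.1.11) at one pair of sample points. The function name `exp_series`, the cutoff of 30 terms, and the sample values 1.5 and 0.75 are assumptions for this illustration.

```python
from math import e, factorial


def exp_series(x, terms=30):
    """Partial sum of the series (6.1.1): x**n / n! for n = 0, ..., terms - 1."""
    return sum(x ** n / factorial(n) for n in range(terms))


approx_e = exp_series(1.0)
print(approx_e, e)

# Spot check of exp(a + b) = exp(a) exp(b) at assumed sample points.
a, b = 1.5, 0.75
print(exp_series(a + b), exp_series(a) * exp_series(b))
```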


﻿


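Notice that for any positive integer n,
\[ \exp(n) = \exp(\underbrace{1 + 1 + \cdots + 1}_{n \text{ times}}) = \underbrace{\exp(1)\exp(1)\cdots\exp(1)}_{n \text{ times}} = (\exp(1))^n = e^n \tag{6.1.15} \]
and
\[ \exp(-n) = \frac{1}{\exp(n)} = \frac{1}{e^n} = e^{-n}. \tag{6.1.16} \]
Combining this with exp(0) = 1, we have
\[ \exp(n) = e^n \tag{6.1.17} \]
for all integers n. Moreover, for any integer n ≠ 0,
\[ \left( \exp\left(\frac{1}{n}\right) \right)^n = \underbrace{\exp\left(\frac{1}{n}\right)\exp\left(\frac{1}{n}\right)\cdots\exp\left(\frac{1}{n}\right)}_{n \text{ times}} = \exp\Big(\underbrace{\frac{1}{n} + \frac{1}{n} + \cdots + \frac{1}{n}}_{n \text{ times}}\Big) = \exp(1) = e, \]
showing that
\[ \exp\left(\frac{1}{n}\right) = e^{\frac{1}{n}}. \tag{6.1.18} \]
Hence if m and n are integers with n ≠ 0, then
\[ \exp\left(\frac{m}{n}\right) = \exp\Big(\underbrace{\frac{1}{n} + \frac{1}{n} + \cdots + \frac{1}{n}}_{m \text{ times}}\Big) = \left( \exp\left(\frac{1}{n}\right) \right)^m = \left( e^{\frac{1}{n}} \right)^m = e^{\frac{m}{n}}. \tag{6.1.19} \]
The next proposition summarizes these facts.

Proposition For any rational number r,
\[ \exp(r) = e^r. \tag{6.1.20} \]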


    That is, evaluating the exponential function at a rational number r is equivalent to
raising e to the rth power. A natural question at this point is to ask whether the same result
holds for irrational numbers. A little thought shows that this question is not meaningful;
although we know what it means to raise a number to a rational power (namely, for integers
m and n,
\[ a^{\frac{m}{n}} = \sqrt[n]{a^m}, \]
that is, $a^{m/n}$ is the nth root of the mth power of a), we have never defined what it means to
raise a number to an irrational power. For example, at this point we do not have a meaning
to associate with the symbol 2'. We will now take the first step toward remedying this
situation by defining $e^s$ for an irrational number s.
Definition If s is an irrational number, then we define
\[ e^s = \exp(s). \tag{6.1.21} \]
With this definition we can now say that
\[ \exp(x) = e^x \tag{6.1.22} \]
for any real number x. The properties of the exponential function stated in (6.1.11) and
(6.1.13) may be restated as
\[ e^{x+y} = e^x e^y \tag{6.1.23} \]
and
\[ e^{x-y} = \frac{e^x}{e^y} \tag{6.1.24} \]
for any real numbers x and y. Hence exponents behave in this new situation exactly the
way we should expect them to behave.
    From our previous result that
\[ \frac{d}{dx}\exp(x) = \exp(x), \]
it now follows that
\[ \frac{d}{dx}e^x = e^x. \tag{6.1.25} \]
From this differentiation rule we obtain the indefinite integral
\[ \int e^x\,dx = e^x + c. \tag{6.1.26} \]


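Example Using the chain rule, we have

\[ \frac{d}{dx}e^{2x} = 2e^{2x}. \]

Example Using the product and chain rules,

\[ \begin{aligned}
\frac{d}{dx}\left(3xe^{4x^2}\right) &= 3x\frac{d}{dx}\left(e^{4x^2}\right) + e^{4x^2}\frac{d}{dx}(3x) \\
&= (3x)\left(8xe^{4x^2}\right) + \left(e^{4x^2}\right)(3) \\
&= (3 + 24x^2)e^{4x^2}.
\end{aligned} \]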


﻿




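Example Since

\[ \frac{d}{dx}e^{-4x} = -4e^{-4x}, \]

it follows that

\[ \int e^{-4x}\,dx = -\frac{1}{4}e^{-4x} + c. \]

    Notice the similarity between the evaluation of the integral in the last example and
the evaluation of the integral

\[ \int \cos(-4x)\,dx = -\frac{1}{4}\sin(-4x) + c. \]

In fact, just as, for any \(a \neq 0\),

\[ \int \cos(ax)\,dx = \frac{1}{a}\sin(ax) + c, \]

we have

\[ \int e^{ax}\,dx = \frac{1}{a}e^{ax} + c. \tag{6.1.27} \]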


Example To evaluate \(\int 3xe^{2x^2}\,dx\), we use the substitution

\[ u = 2x^2, \qquad du = 4x\,dx. \]

Then \(\frac{1}{4}\,du = x\,dx\), so

\[ \int 3xe^{2x^2}\,dx = \frac{3}{4}\int e^u\,du = \frac{3}{4}e^u + c = \frac{3}{4}e^{2x^2} + c. \]


Example To evaluate \(\int 2xe^x\,dx\), we use integration by parts with

\[ \begin{aligned} u &= 2x & dv &= e^x\,dx \\ du &= 2\,dx & v &= e^x. \end{aligned} \]

Then

\[ \int 2xe^x\,dx = 2xe^x - \int 2e^x\,dx = 2xe^x - 2e^x + c. \]

    Notice the similarity between the technique for evaluating the integral in the last
example and the technique for evaluating \(\int 2x\sin(x)\,dx\).


﻿


Figure 6.1.1 Graph of \(y = e^x\)


Example The integral \(\int e^x\sin(x)\,dx\) may also be handled by integration by parts, although
with a little more work than in the previous example. Here we will let

\[ \begin{aligned} u &= \sin(x) & dv &= e^x\,dx \\ du &= \cos(x)\,dx & v &= e^x. \end{aligned} \]

Then

\[ \int e^x\sin(x)\,dx = e^x\sin(x) - \int e^x\cos(x)\,dx. \]

We now perform another integration by parts by choosing

\[ \begin{aligned} u &= \cos(x) & dv &= e^x\,dx \\ du &= -\sin(x)\,dx & v &= e^x. \end{aligned} \]

Then

\[ \begin{aligned}
\int e^x\sin(x)\,dx &= e^x\sin(x) - \left(e^x\cos(x) + \int e^x\sin(x)\,dx\right) \\
&= e^x\sin(x) - e^x\cos(x) - \int e^x\sin(x)\,dx.
\end{aligned} \]

At first glance it may seem that we are back to where we started; however, all we need to
do now is solve for \(\int e^x\sin(x)\,dx\). That is, we have

\[ 2\int e^x\sin(x)\,dx = e^x\sin(x) - e^x\cos(x) = e^x(\sin(x) - \cos(x)), \]

so

\[ \int e^x\sin(x)\,dx = \frac{1}{2}e^x(\sin(x) - \cos(x)) + c. \]

Note that we have added an arbitrary constant \(c\) since we are seeking the general antiderivative.
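One way to gain confidence in an antiderivative found by repeated integration by parts is to differentiate it numerically and compare with the integrand. The sketch below does this for the result of the last example; the function names and the choice of a central difference with step `h` are illustrative assumptions, not part of the text.

```python
import math


def antiderivative(x):
    """Candidate antiderivative (1/2) e^x (sin x - cos x), with c = 0."""
    return 0.5 * math.exp(x) * (math.sin(x) - math.cos(x))


def integrand(x):
    """The integrand e^x sin x."""
    return math.exp(x) * math.sin(x)


def central_difference(f, x, h=1e-5):
    """Symmetric difference quotient approximating f'(x)."""
    return (f(x + h) - f(x - h)) / (2 * h)


sample_points = (-2.0, 0.0, 0.5, 1.0, 3.0)
max_error = max(
    abs(central_difference(antiderivative, x) - integrand(x)) for x in sample_points
)
print(max_error)
```

A small value of `max_error` indicates that the derivative of the candidate agrees with the integrand at the sample points, up to the accuracy of the difference quotient.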


﻿


Figure 6.1.2 Graph of \(y = e^{-x}\)


    We now have sufficient information about the exponential function to understand the
geometry of its graph. Since \(e > 0\), we know that \(e^x > 0\) for all rational values of \(x\), and
hence, by continuity, for all values of \(x\). Since \(e > 1\), it follows that

\[ \lim_{x\to\infty} e^x = \infty \tag{6.1.28} \]

and

\[ \lim_{x\to-\infty} e^x = \lim_{u\to\infty} e^{-u} = \lim_{u\to\infty}\frac{1}{e^u} = 0. \tag{6.1.29} \]

Moreover, since

\[ \frac{d}{dx}e^x = e^x > 0 \tag{6.1.30} \]

and

\[ \frac{d^2}{dx^2}e^x = e^x > 0 \tag{6.1.31} \]

for all \(x\), the graph of \(y = e^x\) is always increasing and always concave up. Moreover,
(6.1.30) and (6.1.31) indicate that as \(x\) increases, the graph is not only increasing, but its
slope is increasing at the same rate that \(y\) is increasing. Thus we should expect \(y\) to grow
at a very rapid rate, as we see in Figure 6.1.1. This rate of growth is characterized as
exponential growth. Figure 6.1.2 shows the graph of \(y = e^{-x}\), which is the graph of \(y = e^x\)
reflected about the \(y\)-axis. In this case \(y\) decreases asymptotically toward 0 as \(x\) increases;
this is known as exponential decay.
    We will close this section with an application to the problem of uninhibited population
growth, a problem we first considered in Section 1.4.

Uninhibited population growth
Recall from Section 1.4 that if \(x_n\) represents the size of a population after \(n\) units of time
and the population grows at a constant rate of \(100\alpha\%\) per unit of time, then the sequence
\(\{x_n\}\) must satisfy the linear difference equation

\[ x_{n+1} - x_n = \alpha x_n \tag{6.1.32} \]

for \(n = 0, 1, 2, \ldots\). At that time we saw that the solution of this equation is given by

\[ x_n = (1 + \alpha)^n x_0. \tag{6.1.33} \]

The crucial aspect of (6.1.32) is the statement that the amount of change in the size of the
population over any unit of time is proportional to the current size of the population.
Hence if \(x(t)\) represents the size of a population at time \(t\), where the population can
change continuously over time, then the continuous time analog of (6.1.32) is the differential
equation

\[ \dot{x}(t) = \alpha x(t) \tag{6.1.34} \]

for all time \(t\). If \(x_0\) is the size of the population at time \(t = 0\), then we know from our
work in this section that the only solution to this equation is the function

\[ x(t) = x_0 e^{\alpha t}. \tag{6.1.35} \]

Hence if the size of a population is growing at a rate which is proportional to itself, an
assumption which, as we noted in Section 1.4, is often reasonable over short periods of
time, then the population will grow exponentially. As in Section 1.4, we refer to such
growth as uninhibited population growth.
Example In 1970 the population of the United States was 203.3 million and in 1980
the population was 226.5 million. Assuming an uninhibited growth model and letting \(x(t)\)
represent the population \(t\) years after 1970, by (6.1.35) we should have

\[ x(t) = 203.3e^{\alpha t} \]

for some constant \(\alpha\). Since \(x(10) = 226.5\), we can find \(\alpha\) by solving

\[ 226.5 = 203.3e^{10\alpha}. \]

That is, we need to find a value for \(\alpha\) such that

\[ e^{10\alpha} = \frac{226.5}{203.3}. \]

Unfortunately, solving this equation exactly requires being able to reverse the process of
applying the exponential function. In other words, we need an inverse for the exponential
function. We shall take up that problem in the next section; for now we may use a
numerical approximation. You should verify that \(\alpha \approx 0.0108\) satisfies the equation. Thus
this model would predict the population of the United States \(t\) years after 1970 to be

\[ x(t) = 203.3e^{0.0108t}. \]

For example, this model would predict a 1990 population of

\[ x(20) = 203.3e^{(0.0108)(20)} \approx 252.3 \]

and a population in the year 2000 of

\[ x(30) = 203.3e^{(0.0108)(30)} \approx 281.1. \]

Figure 6.1.3 Uninhibited growth model for the United States (1970-2120)

While the prediction for 1990 is fairly accurate (the actual population was approximately
249.6 million), the second prediction differs significantly from the Census Bureau's own
prediction of a population of 268.3 million for the year 2000. As we discussed in Sections
1.4 and 1.5, an uninhibited growth model is a simple model which cannot be expected to
be accurate for predictions too far into the future.
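The numerical approximation of the growth constant in this example can be carried out without an inverse for the exponential function. The sketch below uses bisection, which is only one of several possible root-finding methods and is chosen here as an assumption; the names `solve_growth_rate` and `predict` are hypothetical. It relies only on the fact that 203.3e^(10a) increases with a.

```python
import math


def solve_growth_rate(x0, x1, years, lo=0.0, hi=1.0, tol=1e-10):
    """Bisection for the rate a with x0 * exp(a * years) = x1.

    Assumes the solution lies between lo and hi; the left side increases with a.
    """
    def g(a):
        return x0 * math.exp(a * years) - x1

    while hi - lo > tol:
        mid = (lo + hi) / 2
        if g(mid) > 0:
            hi = mid  # rate too large, keep the lower half
        else:
            lo = mid  # rate too small, keep the upper half
    return (lo + hi) / 2


alpha = solve_growth_rate(203.3, 226.5, 10)
rate = round(alpha, 4)  # the four-decimal value used in the example


def predict(t, a=rate):
    """Population in millions t years after 1970 under the uninhibited model."""
    return 203.3 * math.exp(a * t)


print(rate, predict(20), predict(30))
```

Rounding the computed rate to four decimal places before predicting mirrors what the example does by hand.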
    We shall have more to say about population models in Section 6.3, where we will also
see another example of a differential equation. We will have a much fuller discussion of
differential equations in Chapter 8.

Problems

1. Find the derivative of each of the following functions.

    (a) \(f(x) = 3e^{2x}\)                        (b) \(g(t) = 4t^2e^{3t}\)
    (c) \(h(z) = (3z^2 - 6)e^{5z^3}\)             (d) \(f(x) = e^{3x}\sin(2x)\)

    (e) g(x)-=                                 (f) h(t) = e-'t cos(4t)


       (g ()=e-2s + 2                          (h) g(O) =5Oe6O sin(20)
 2. Evaluate each of the following integrals.

    (a) f32xdz                                 (b) f4xe3x


    (c f4testdt                                (d) f5ye--dy


﻿


    (e) \(\int z^2e^z\,dz\)                       (f) \(\int x^3e^{-2x}\,dx\)

    (g) \(\int e^x\cos(x)\,dx\)                   (h) \(\int e^{-2t}\sin(3t)\,dt\)


3. Find the maximum value of \(f(x) = x^2e^{-x}\) on the interval \((0, \infty)\).
4. (a) Use the Taylor series for \(e^{-x}\) to show that \(e^{-1} > \frac{1}{3}\). Hence conclude that \(e < 3\).
   (b) Show that if \(P_n\) is the \(n\)th order Taylor polynomial for \(e^x\) at 0, then

       \[ |e^x - P_n(x)| \le \frac{3^{|x|}}{(n+1)!}|x|^{n+1} \]

       for any value of \(x\).
   (c) Use (b) to find an approximation for \(e\) with an error of less than 0.000005.
5. Find the following limits.

    (a) \(\lim_{x\to\infty} xe^{-x}\)                 (b) \(\lim_{x\to 0} \frac{e^x - 1}{x}\)

    (c) \(\lim_{t\to 0} \frac{e^{-t} - 1}{t}\)            (d) \(\lim_{x\to\infty} x^2e^{-2x}\)
 6. (a) Show that \(\lim_{x\to\infty} x^ne^{-x} = 0\) for any positive integer \(n\).

    (b) Use (a) to show that if \(p\) is any polynomial, then \(\lim_{x\to\infty} p(x)e^{-x} = 0\). This shows
        that \(e^x\) grows faster as \(x \to \infty\) than any polynomial function.
 7. Graph the following functions on the specified intervals.
    (a) \(f(t) = 3e^{2t}\) on [-3, 3]            (b) \(g(x) = 4x^2e^{-2x}\) on [0, 5]

    (c) g(t)     - e-on [-10, 10]              (d) f(t) =e-' sin(3t) on [0, 10]

 8. Evaluate the following improper integrals.

    (a) J   e-  dx                             (b) f   3e-2xd
         0                                          0
    (c) f   xe-xdx                             (d) f   3xe-2xdx
         0                                          0
    (e)     x2-xxdx                            (f) f     e-d

 9. Use the integral test to show that the infinite series \(\sum_{n=0}^{\infty} e^{-n}\) converges.
10. (a) Find the Taylor series for \(f(x) = e^{-x^2}\).
    (b) Use (a) to find the Taylor series for

        \[ \operatorname{erf}(x) = \frac{2}{\sqrt{\pi}}\int_0^x e^{-t^2}\,dt, \]

        known as the error function.
    (c) Use your result in (b) to approximate erf(1) with an error less than 0.0001.
11. Suppose \(x(t)\) is the population of a certain country \(t\) years after 1985, \(x(0) = 23.4\)
    million, and

    \[ \dot{x}(t) = 0.008x(t). \]

    (a) What will the population of the country be in the year 2000?
    (b) In what year will the population be twice what it was in 1985?
12. Let

    \[ f(x) = \begin{cases} e^{-\frac{1}{x^2}}, & \text{if } x \neq 0, \\ 0, & \text{if } x = 0. \end{cases} \]

    (a) Graph \(f\) on the interval [-5, 5].
    (b) Show that \(f'(0) = 0\).
    (c) Show that \(f^{(n)}(0) = 0\) for \(n = 0, 1, 2, \ldots\).
    (d) Show that \(f\) is \(C^{\infty}\) on \((-\infty, \infty)\).
    (e) Note that the Taylor series for \(f\) about 0 converges for all \(x\) in \((-\infty, \infty)\), but does
        not converge to \(f(x)\) except at 0. Thus \(f\) is \(C^{\infty}\) on \((-\infty, \infty)\), but not analytic at
        0.


﻿


Section 6.2


The Natural Logarithm Function


In the last example of Section 6.1 we saw the need for solving an equation of the form

\[ e^x = b \]

for \(x\) in terms of \(b\). In general, for a given function \(f\), a function \(g\) defined on the range of
\(f\) is called the inverse of \(f\) if

\[ g(f(x)) = x \tag{6.2.1} \]

for all \(x\) in the domain of \(f\) and

\[ f(g(x)) = x \tag{6.2.2} \]

for all \(x\) in the domain of \(g\). That is, if \(f(x) = y\), then \(g(y) = x\) and if \(g(x) = y\), then
\(f(y) = x\). In order for a function \(f\) to have an inverse function \(g\), for every point \(y\) in the
range of \(f\) there must exist a unique point \(x\) in the domain of \(f\) such that \(f(x) = y\), in
which case \(g(y) = x\). In other words, for any two distinct points \(x_1\) and \(x_2\) in the domain of \(f\), we
must have \(f(x_1) \neq f(x_2)\). Now this will be the case if \(f\) is increasing on its domain, since,
for such an \(f\), \(x_1 < x_2\) implies \(f(x_1) < f(x_2)\). In particular, a function \(f\) with domain
\((a, b)\) will have an inverse if \(f'(x) > 0\) for all \(x\) in \((a, b)\). Hence, since

\[ \frac{d}{dx}e^x = e^x > 0 \]

for all \(x\) in \((-\infty, \infty)\), the function \(f(x) = e^x\) must have an inverse defined for every point
in its range, namely, \((0, \infty)\). We call this inverse function the natural logarithm function.
Definition   The inverse of the exponential function is called the natural logarithm function.
The value of the natural logarithm function at a point \(x\) is denoted \(\log(x)\).
    Thus, by definition,

\[ \log(e^x) = x \tag{6.2.3} \]

and

\[ e^{\log(x)} = x. \tag{6.2.4} \]

Equivalently,

\[ y = e^x \text{ if and only if } \log(y) = x. \tag{6.2.5} \]
    Another common notation for log(x) is ln(x). In fact, most calculators use ln(x) for
the natural logarithm of x and log(x) for the base 10 logarithm of x. However, since the
natural logarithm function is the fundamental logarithm function of interest to us, we will
denote it by log(x) and often refer to it simply as the logarithm function.


Copyright © by Dan Sloughter 2000


﻿




Example In the last example of Section 6.1 we needed to solve the equation

\[ e^{10\alpha} = 1.114. \]

Taking the natural logarithm of both sides of this equation gives us

\[ \log(e^{10\alpha}) = \log(1.114), \]

from which we obtain

\[ 10\alpha = \log(1.114), \]

and hence

\[ \alpha = \frac{\log(1.114)}{10}. \]

Using a calculator and rounding to 4 decimal places, we find that \(\alpha = 0.0108\).
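The calculator step can be reproduced in a few lines. This is a minimal sketch using Python's `math.log`, which is the natural logarithm (what this text writes as log); the variable names are chosen here for illustration.

```python
import math

ratio = 1.114                    # right-hand side of e^(10*alpha) = 1.114
alpha = math.log(ratio) / 10     # natural logarithm divided by 10
check = math.exp(10 * alpha)     # applying exp should recover the ratio

print(round(alpha, 4), check)
```

Applying `math.exp` to `10 * alpha` recovers the ratio, which illustrates that the logarithm undoes the exponential as in (6.2.3) and (6.2.4).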
    Being the inverse of the exponential function, the logarithm function has domain \((0, \infty)\)
(which is the range of the exponential function) and range \((-\infty, \infty)\) (which is the domain
of the exponential function). Also, since \(e^0 = 1\) and \(e^1 = e\), it follows that \(\log(1) = 0\) and
\(\log(e) = 1\).
    Several basic algebraic properties of the logarithm function follow immediately from
the algebraic properties of the exponential function. For example, since, for any positive
numbers \(a\) and \(b\),

\[ e^{\log(a) + \log(b)} = e^{\log(a)}e^{\log(b)} = ab, \]

it follows, after taking the logarithm of both sides, that

\[ \log(ab) = \log(a) + \log(b). \tag{6.2.6} \]

Similarly, since, for any positive numbers \(a\) and \(b\),

\[ e^{\log(a) - \log(b)} = e^{\log(a)}e^{-\log(b)} = \frac{e^{\log(a)}}{e^{\log(b)}} = \frac{a}{b}, \]

we have

\[ \log\left(\frac{a}{b}\right) = \log(a) - \log(b). \tag{6.2.7} \]

In particular,

\[ \log\left(\frac{1}{b}\right) = \log(1) - \log(b) = -\log(b) \]

for any \(b > 0\). Finally, if \(a > 0\) and \(b\) is a rational number, then

\[ e^{b\log(a)} = \left(e^{\log(a)}\right)^b = a^b \]

implies that

\[ \log(a^b) = b\log(a). \tag{6.2.8} \]


﻿




    We have restricted \(b\) to rational values here because we have not defined \(a^b\) for irrational
values of \(b\), except in the single case when \(a = e\). However, the expression

\[ e^{b\log(a)} \]

is defined for any value of \(b\), rational or irrational. Hence the following definition provides
a natural extension to the meaning of raising a number \(a > 0\) to a power.

Definition If \(a > 0\) and \(b\) is an irrational number, then we define

\[ a^b = e^{b\log(a)}. \tag{6.2.9} \]

With this definition, we have

\[ \log(a^b) = b\log(a) \tag{6.2.10} \]

for \(a > 0\) and any value of \(b\).

Example We now see that

\[ 2^{\pi} = e^{\pi\log(2)}, \]

which, using a calculator, is 8.8250 to 4 decimal places.
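Definition (6.2.9) translates directly into a short function. The sketch below is an illustration under the assumption that `math.exp` and `math.log` stand in for the exponential and natural logarithm functions of this chapter; the name `real_power` is introduced here and is not from the text.

```python
import math


def real_power(a, b):
    """Compute a**b as exp(b * log(a)), following definition (6.2.9); requires a > 0."""
    if a <= 0:
        raise ValueError("the base must be positive")
    return math.exp(b * math.log(a))


value = real_power(2, math.pi)
print(round(value, 4))
```

The same function also reproduces rational powers, for example `real_power(9, 0.5)` is 3 up to rounding, consistent with (6.2.8).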

    The derivative of the logarithm function may be found using our knowledge of the
derivative of the exponential function. Specifically, if \(y = \log(x)\), then \(e^y = x\). Thus,
differentiating both sides of this expression with respect to \(x\),

\[ \frac{d}{dx}e^y = \frac{d}{dx}x, \]

from which we obtain

\[ e^y\frac{dy}{dx} = 1. \]

Hence

\[ \frac{dy}{dx} = \frac{1}{e^y} = \frac{1}{x}. \]

Since we started with \(y = \log(x)\), this gives us the following proposition.

Proposition

\[ \frac{d}{dx}\log(x) = \frac{1}{x}. \tag{6.2.11} \]

Example Combining the chain rule with the previous proposition, we have

\[ \frac{d}{dx}\log(x^2 + 1) = \left(\frac{1}{x^2 + 1}\right)(2x) = \frac{2x}{x^2 + 1}. \]

﻿




Example It is worth noting that, in general, for any differentiable function \(f\),

\[ \frac{d}{dx}\log(f(x)) = \left(\frac{1}{f(x)}\right)f'(x) = \frac{f'(x)}{f(x)}. \tag{6.2.12} \]

Thus, for another example,

\[ \frac{d}{dx}\log(3x^4 + 8) = \frac{12x^3}{3x^4 + 8}. \]


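Example In some circumstances it is useful to use the properties of the logarithm function
before attempting to differentiate. For example,

\[ \begin{aligned}
\frac{d}{dx}\log\left(3x\sqrt{4x^2 + 2}\right) &= \frac{d}{dx}\left(\log(3x) + \log\left(\sqrt{4x^2 + 2}\right)\right) \\
&= \frac{d}{dx}\left(\log(3) + \log(x) + \frac{1}{2}\log(4x^2 + 2)\right) \\
&= \frac{1}{x} + \frac{1}{2}\cdot\frac{8x}{4x^2 + 2} \\
&= \frac{1}{x} + \frac{2x}{2x^2 + 1}.
\end{aligned} \]

Turning to integrals, we note that

\[ \frac{d}{dx}\log(x) = \frac{1}{x} \]

implies that

\[ \int \frac{1}{x}\,dx = \log(x) + c, \]

provided \(x\) is in the domain of the logarithm function, that is, \(x > 0\). For \(x < 0\), we have

\[ \frac{d}{dx}\log|x| = \frac{d}{dx}\log(-x) = \frac{1}{-x}(-1) = \frac{1}{x}, \]

showing that

\[ \int \frac{1}{x}\,dx = \log|x| + c \]

for \(x < 0\). Since \(|x| = x\) when \(x > 0\), we can combine the above results into one statement.
Proposition

\[ \int \frac{1}{x}\,dx = \log|x| + c. \tag{6.2.13} \]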


Example To evaluate \(\int \frac{x}{x^2 + 1}\,dx\) we make the substitution

\[ u = x^2 + 1, \qquad du = 2x\,dx. \]

Thus \(\frac{1}{2}\,du = x\,dx\), so

\[ \int \frac{x}{x^2 + 1}\,dx = \frac{1}{2}\int \frac{1}{u}\,du = \frac{1}{2}\log|u| + c = \frac{1}{2}\log(x^2 + 1) + c, \]

where we have removed the absolute value sign since \(x^2 + 1 > 0\) for all \(x\).


Example To evaluate

\[ \int \tan(x)\,dx = \int \frac{\sin(x)}{\cos(x)}\,dx, \]

we make the substitution

\[ u = \cos(x), \qquad du = -\sin(x)\,dx. \]

Thus \(-du = \sin(x)\,dx\), so

\[ \int \tan(x)\,dx = -\int \frac{1}{u}\,du = -\log|u| + c = -\log|\cos(x)| + c. \]


Example To evaluate \(\int \sec(x)\,dx\), we first multiply \(\sec(x)\) by

\[ \frac{\sec(x) + \tan(x)}{\sec(x) + \tan(x)} \]

to obtain

\[ \int \sec(x)\,dx = \int \frac{\sec^2(x) + \sec(x)\tan(x)}{\sec(x) + \tan(x)}\,dx. \]

Then we make the substitution

\[ u = \sec(x) + \tan(x), \qquad du = (\sec(x)\tan(x) + \sec^2(x))\,dx. \]

Hence

\[ \int \sec(x)\,dx = \int \frac{1}{u}\,du = \log|u| + c = \log|\sec(x) + \tan(x)| + c. \]

Note that this example does not illustrate a general technique for evaluating integrals, but
rather a nice trick that works in this specific case. In fact, it is just as easy to remember
the value of the integral as it is to remember the trick that was used to find it.

Example We may evaluate \(\int \log(x)\,dx\) using integration by parts. To do so, we choose

\[ \begin{aligned} u &= \log(x) & dv &= dx \\ du &= \frac{1}{x}\,dx & v &= x. \end{aligned} \]

Then

\[ \int \log(x)\,dx = x\log(x) - \int dx = x\log(x) - x + c. \]

Figure 6.2.1 Graph of \(y = \log(x)\)

    We can now put together enough information to obtain a geometric understanding of
the graph of \(y = \log(x)\). Since

\[ \frac{d}{dx}\log(x) = \frac{1}{x} > 0 \]

for all \(x\) in \((0, \infty)\), the graph of \(y = \log(x)\) is increasing on \((0, \infty)\). Note however that
the slope of the graph decreases toward 0 as \(x\) increases; although the graph is always
increasing, the rate of increase diminishes as \(x\) increases. This is also seen in the fact that

\[ \frac{d^2}{dx^2}\log(x) = -\frac{1}{x^2} < 0 \]

for all \(x > 0\). As a consequence, the graph is concave down on \((0, \infty)\). Since the logarithm
function is the inverse of the exponential function, it follows from

\[ \lim_{x\to\infty} e^x = \infty \]

that

\[ \lim_{x\to\infty} \log(x) = \infty \tag{6.2.14} \]

and from

\[ \lim_{x\to-\infty} e^x = 0 \]

that

\[ \lim_{x\to 0^+} \log(x) = -\infty. \tag{6.2.15} \]

From (6.2.14) we see that, even though the slope of \(y = \log(x)\) decreases toward 0 as \(x\)
increases, \(y\) will continue to grow without any bound. From (6.2.15) we see that the \(y\)-axis
is a vertical asymptote for the graph. Using this geometric information, we can understand
why the graph of \(y = \log(x)\) looks like it does in Figure 6.2.1. You should compare this
graph with the graph of \(y = e^x\) in Figure 6.1.1.


﻿


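    We will use the relationship

\[ \log(x) = \int_1^x \frac{1}{t}\,dt \tag{6.2.16} \]

to find the Taylor series representation for the logarithm function. Since, as we saw in
Section 5.8,

\[ \frac{1}{t} = \frac{1}{1 - (1 - t)} = \sum_{n=0}^{\infty} (1 - t)^n = \sum_{n=0}^{\infty} (-1)^n(t - 1)^n \]

for \(0 < t < 2\), it follows that

\[ \begin{aligned}
\log(x) &= \int_1^x \frac{1}{t}\,dt \\
&= \sum_{n=0}^{\infty} (-1)^n\int_1^x (t - 1)^n\,dt \\
&= \sum_{n=0}^{\infty} \frac{(-1)^n(x - 1)^{n+1}}{n + 1} \\
&= \sum_{n=1}^{\infty} \frac{(-1)^{n+1}(x - 1)^n}{n} \\
&= (x - 1) - \frac{(x - 1)^2}{2} + \frac{(x - 1)^3}{3} - \frac{(x - 1)^4}{4} + \cdots
\end{aligned} \]

for \(0 < x < 2\). Hence

\[ \log(x) = \sum_{n=1}^{\infty} \frac{(-1)^{n+1}(x - 1)^n}{n} = (x - 1) - \frac{(x - 1)^2}{2} + \frac{(x - 1)^3}{3} - \frac{(x - 1)^4}{4} + \cdots \]

is the Taylor series representation of \(\log(x)\) on \((0, 2)\). Notice that at \(x = 0\), this series
becomes a multiple of the harmonic series, and so does not converge, while at \(x = 2\), it is
the alternating harmonic series, which does converge. Thus we would suspect that

\[ \log(2) = 1 - \frac{1}{2} + \frac{1}{3} - \frac{1}{4} + \cdots. \]

This is in fact true, and may be verified using Taylor's theorem (see Problem 6).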
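The behavior of this series can be explored numerically. The following sketch sums the first terms of the series for log(x) about 1 and compares the result with `math.log`; the function name `log_series` and the particular numbers of terms are choices made here for illustration. For an alternating series with decreasing terms, the error is at most the size of the first omitted term, which is why convergence at x = 2 is so slow.

```python
import math


def log_series(x, terms):
    """Partial sum of sum_{n>=1} (-1)**(n+1) * (x-1)**n / n, for 0 < x <= 2."""
    if not 0 < x <= 2:
        raise ValueError("the series is only used here for 0 < x <= 2")
    return sum((-1) ** (n + 1) * (x - 1) ** n / n for n in range(1, terms + 1))


# At x = 2 this is the alternating harmonic series, which converges slowly.
print(log_series(2, 1000), math.log(2))

# Closer to the center x = 1 far fewer terms are needed.
print(log_series(1.5, 30), math.log(1.5))
```

With 1000 terms at x = 2 the partial sum is still only accurate to about three decimal places, while 30 terms at x = 1.5 already agree with `math.log` to many more.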
    We will end this section by extending an old result. In Chapter 3 we saw that for any
rational number \(n \neq 0\),

\[ \frac{d}{dx}x^n = nx^{n-1}. \]

Now that we have defined \(x^n\) for irrational \(n\) (provided \(x > 0\)), we see that

\[ \frac{d}{dx}x^n = \frac{d}{dx}e^{n\log(x)} = e^{n\log(x)}\cdot\frac{n}{x} = \frac{nx^n}{x} = nx^{n-1} \]

for any real number \(n \neq 0\).

Proposition For any real number \(n \neq 0\),

\[ \frac{d}{dx}x^n = nx^{n-1}. \tag{6.2.17} \]

Example Note that

\[ \frac{d}{dx}x^{\pi} = \pi x^{\pi - 1}, \]

while

\[ \frac{d}{dx}\pi^x = \frac{d}{dx}e^{x\log(\pi)} = \log(\pi)e^{x\log(\pi)} = \log(\pi)\pi^x. \]
                     dx     dx

    In Section 6.3 we will consider some applications of the exponential and logarithm
functions.

Problems

1. Let a = log(2) and b = log(3). Find the following in terms of a and b.
    (a) log(6)                                (b) log(1.5)
    (c) log(9)                                (d) log(12)
 2. Find the derivative of each of the following functions.
    (a) \(f(x) = \log(3x^2)\)                      (b) \(g(t) = t^3\log(3t + 4)\)

    (c) g(x) =log (4x2v/2 + 5)                (d) h(t) =log (13t2 + 1

    (e) \(f(x) = e^{2x}\log(5x)\)                  (f) \(g(z) = 3z\log(4z + 5)\)
    (g) h(x) = log(log(x))                    (h) f (x) = 2x
                                                              4t2 + 3
    (i) f       e(x) =e(j) f(t) =_log                          t+1

 3. Evaluate each of the following integrals.

    (a)       dcc                             (b) f     2   dcc


    (c)   3x2 + 1 dx                          (d)   4x3 + 15 dx

    (e) \(\int \tan(3x)\,dx\)                     (f) \(\int \cot(x)\,dx\)


    (g)   csc(x)d                             (h)i       :    dx

 4. Evaluate each of the following integrals.

    (a) \(\int \log(3x)\,dx\)                     (b) \(\int x\log(x)\,dx\)


    (c) Jlox) dx                              (d) f3x2log(x)dx

    (e) \(\int \log(x + 1)\,dx\)                  (f) \(\int (x^2 + 3)\log(x)\,dx\)

    (g) Jlg        dx                         (h) f2xdx
          x log (X)
 5. (a) Show that

        \[ \lim_{x\to\infty} \frac{\log(x)}{x} = 0. \]

       What does this say about the rate of growth of \(\log(x)\) as \(x\) increases?
    (b) Show that for any real number \(p > 0\),

        \[ \lim_{x\to\infty} \frac{\log(x)}{x^p} = 0. \]

       What does this say about the rate of growth of \(\log(x)\) as \(x\) increases?
 6. (a) Use Taylor's theorem to show that the alternating harmonic series converges to
       log(2). That is, if \(r_n(x)\) is the error in approximating \(\log(x)\) by the \(n\)th order
       Taylor polynomial at \(x\), show that \(\lim_{n\to\infty} r_n(2) = 0\).
    (b) Use the Taylor series for log(x) about 1 to approximate log(2) with an error of less
       than 0.005.
    (c) Use the Taylor series for log(x) about 1 to approximate log(1.5) with an error of
       less than 0.001.
 7. Graph each of the following on the given interval.
    (a) \(y = \log(3x)\) on (0, 20]                (b) \(y = \log|x|\) on [-20, 20]
    (c) \(x = \frac{\log(t)}{t}\) on (0, 10]             (d) \(x = t^2\log(t)\) on (0, 3]
    (e) \(y = \sin(\log(\theta))\) on (0, 2]          (f) \(y = \log(x^2)\) on (0, 50]

 8. Compare the graphs of y = 2x and y  ().


 9. Use the integral test to show that the harmonic series \(\sum_{n=1}^{\infty} \frac{1}{n}\) diverges.
10. (a) Find \(\lim_{x\to\infty} \log(\log(x))\).
    (b) Show that

        \[ \lim_{x\to\infty} \frac{\log(\log(x))}{\log(x)} = 0. \]

       What does this say about the rate of growth of \(\log(\log(x))\) as \(x\) increases?
    (c) Graph \(y = \log(\log(x))\).


﻿



    (d) Find the value of x such that log(x) = 20.
    (e) Find the value of x such that log(log(x)) = 20.
11. Find the length of the curve y = log(x) over the interval [1, 10].
12. Suppose \(x\) is a function with \(\dot{x}(t) = \alpha x(t)\), \(x(0) = 100\), and \(x(5) = 200\). Find \(x(t)\).
13. Given that \(g\) is the inverse function of \(f\), show that

    \[ g'(x) = \frac{1}{f'(g(x))}. \]

    Use this result to show that \(\frac{d}{dx}\log(x) = \frac{1}{x}\).


﻿


Section 6.3


Models of Growth and Decay


In this section we will look at several applications of the exponential and logarithm func-
tions to problems involving growth and decay, including compound interest, radioactive
decay, and population growth.

Compound interest
Suppose a principal of \(P\) dollars is deposited in a bank which pays \(100i\%\) interest compounded
\(n\) times a year. That is, each year is divided into \(n\) units and after each unit of
time the bank pays \(\frac{100i}{n}\%\) interest on all money currently in the account, including money
that was earned as interest at an earlier time. Thus if \(x_m\) represents the amount of money
in the account after \(m\) units of time, \(x_m\) must satisfy the difference equation

\[ x_{m+1} - x_m = \frac{i}{n}x_m, \tag{6.3.1} \]

\(m = 0, 1, 2, \ldots\), with initial condition \(x_0 = P\). Hence the sequence \(\{x_m\}\) satisfies the linear
difference equation

\[ x_{m+1} = \left(1 + \frac{i}{n}\right)x_m, \tag{6.3.2} \]

and so, from our work in Section 1.4, we know that

\[ x_m = \left(1 + \frac{i}{n}\right)^m x_0 = \left(1 + \frac{i}{n}\right)^m P \tag{6.3.3} \]

for \(m = 0, 1, 2, \ldots\). If we let \(A(t)\) be the amount in the account after \(t\) years, then, since
there are \(nt\) compounding periods in \(t\) years,

\[ A(t) = x_{nt} = \left(1 + \frac{i}{n}\right)^{nt} P. \tag{6.3.4} \]

Example Suppose $1,000 is deposited at 5% interest which is compounded quarterly. If
\(A(t)\) is the amount in the account after \(t\) years, then, for example,

\[ A(5) = 1000\left(1 + \frac{0.05}{4}\right)^{20} = 1{,}282.04, \]

rounded to the nearest cent. If the interest were compounded monthly instead, then we
would have

\[ A(5) = 1000\left(1 + \frac{0.05}{12}\right)^{60} = 1{,}283.36. \]



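    Of course, the more frequent the compounding, the faster the amount in the account
will grow. At the same time, there is no limit to how often the bank could compound.
However, is there some limit to how fast the account can grow? That is, for a fixed value
of \(t\), is \(A(t)\) bounded as \(n\) grows? To answer this question, we need to consider

\[ \lim_{n\to\infty}\left(1 + \frac{i}{n}\right)^{nt}. \]

To evaluate this limit, first consider the limit

\[ \lim_{x\to\infty}\left(1 + \frac{k}{x}\right)^x, \]

where \(k\) is a constant. If we let

\[ y = \left(1 + \frac{k}{x}\right)^x, \]

then

\[ \log(y) = x\log\left(1 + \frac{k}{x}\right). \]

Using l'Hôpital's rule, we have

\[ \begin{aligned}
\lim_{x\to\infty} \log(y) &= \lim_{x\to\infty} x\log\left(1 + \frac{k}{x}\right) \\
&= \lim_{x\to\infty} \frac{\log\left(1 + \frac{k}{x}\right)}{\frac{1}{x}} \\
&= \lim_{x\to\infty} \frac{\left(\frac{1}{1 + \frac{k}{x}}\right)\left(-\frac{k}{x^2}\right)}{-\frac{1}{x^2}} \\
&= \lim_{x\to\infty} \frac{k}{1 + \frac{k}{x}} \\
&= k.
\end{aligned} \]

It now follows that

\[ \lim_{x\to\infty} y = \lim_{x\to\infty} e^{\log(y)} = e^k. \]

That is, we have the following proposition.

Proposition For any constant \(k\),

\[ \lim_{x\to\infty}\left(1 + \frac{k}{x}\right)^x = e^k. \tag{6.3.5} \]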
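The limit in (6.3.5) is easy to observe numerically. The sketch below evaluates the expression for increasing values of x and compares it with e^k; the function name `compound_factor` and the sample values of k and x are assumptions made for illustration.

```python
import math


def compound_factor(k, x):
    """Evaluate (1 + k/x)**x for a fixed constant k and a large x."""
    return (1 + k / x) ** x


for k in (0.05, 1.0, -2.0):
    approximations = [compound_factor(k, x) for x in (10, 1000, 1_000_000)]
    print(k, approximations, math.exp(k))
```

For each k the printed approximations move toward the final value, `math.exp(k)`, as x grows.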


﻿


Figure 6.3.1 Compounding quarterly versus compounding continuously (5% interest)


    It now follows that
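\[ \lim_{n \to \infty} \left(1 + \frac{i}{n}\right)^{nt} = \lim_{n \to \infty} \left(\left(1 + \frac{i}{n}\right)^n\right)^t = \left(e^i\right)^t = e^{it}. \tag{6.3.6} \]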

Hence
\[ \lim_{n \to \infty} A(t) = \lim_{n \to \infty} P\left(1 + \frac{i}{n}\right)^{nt} = Pe^{it}. \]
Thus no matter how many times interest is compounded per year, the amount after t years
will never exceed \(Pe^{it}\). We think of \(Pe^{it}\) as the amount that would be in the account if
interest were compounded continuously.
Example In the previous example, with P = $1,000 and i = 0.05, the amount after five
years of interest compounded continuously would be

\[ 1000e^{(0.05)(5)} = 1284.03. \]

In other words, assuming a 5% interest rate, no matter how many times per year the
bank compounds the interest, the amount in the account after five years can never exceed
$1,284.03. As Figure 6.3.1 shows, in this case there is only a slight difference between
compounding quarterly and compounding continuously over a period of 50 years.
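
The bound can be checked numerically. The following is a minimal Python sketch, assuming the formula A(t) = P(1 + i/n)^(nt) for interest compounded n times per year; the function names `compound` and `continuous` are illustrative and do not come from the text.

```python
import math

def compound(P, i, n, t):
    """Amount after t years at annual rate i, compounded n times per year."""
    return P * (1 + i / n) ** (n * t)

def continuous(P, i, t):
    """Limiting amount P*e^(i*t) as the number of compoundings per year grows."""
    return P * math.exp(i * t)

P, i, t = 1000.0, 0.05, 5
for n in (1, 4, 12, 52, 365):
    # Each amount is larger than the previous one but stays below the limit.
    print(f"n = {n:>3}: {compound(P, i, n, t):.2f}")
print(f"continuous: {continuous(P, i, t):.2f}")
```

The printed amounts increase with n while remaining below the continuous value, which is the behavior described above.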

Growth and decay
We saw in Chapter 1 that the linear difference equation

\[ x_{n+1} - x_n = a x_n \tag{6.3.7} \]

may be used as a simple model for the growth of a population when a > 0 or as a model
for radioactive decay when a < 0. As we discussed in Section 6.1, the continuous time
version of this model is the differential equation


\[ \dot{x}(t) = a x(t). \tag{6.3.8} \]


﻿




At that time we saw that the solution of this equation is given by

\[ x(t) = x_0 e^{at}, \tag{6.3.9} \]

where x0 = x(0). As before, when a > 0 this is a model for uninhibited, also called natural,
population growth, while when a < 0 it is a model for radioactive decay. More generally,
this model is applicable whenever a quantity is known to change at a rate proportional to
itself, as expressed by (6.3.8).

Example Suppose the population of a certain country was 23 million in 1990 and 27
million in 1995. Assuming an uninhibited population growth model, if x(t) represents the
size of the population, in millions, t years after 1990, then

\[ x(t) = 23e^{at} \]

for some value of a. To find a, we note that

\[ 27 = x(5) = 23e^{5a}. \]

Hence

\[ e^{5a} = \frac{27}{23}, \]

from which we obtain

\[ 5a = \log\left(\frac{27}{23}\right). \]

Thus

\[ a = \frac{1}{5} \log\left(\frac{27}{23}\right) = 0.0321, \]

where we have rounded to four decimal places. Hence

\[ x(t) = 23e^{0.0321t}. \]

For example, this model would predict a population in 2000 of

\[ x(10) = 23e^{(0.0321)(10)} = 31.7 \text{ million}. \]

Also, assuming this model continues to be valid, we could compute how many years it
would take for the population to reach any given size. For example, if T is the number of
years until the population doubles, then we would have

\[ 46 = x(T) = 23e^{0.0321T}. \]

Thus

\[ e^{0.0321T} = 2, \]

so

\[ 0.0321T = \log(2) \]

and

\[ T = \frac{\log(2)}{0.0321} = 21.6 \]
to one decimal place. Hence a population growing at this rate will double in size in less
than 22 years.
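
The computations in this example can be reproduced with a short Python sketch. The helper names below (`growth_rate`, `population`, `doubling_time`) are illustrative only; the sketch keeps the unrounded value of a, so intermediate digits may differ slightly from the rounded values used above.

```python
import math

def growth_rate(x0, x1, dt):
    """Rate a in x(t) = x0*e^(a*t), given x(0) = x0 and x(dt) = x1."""
    return math.log(x1 / x0) / dt

def population(x0, a, t):
    """Size predicted by the uninhibited growth model at time t."""
    return x0 * math.exp(a * t)

def doubling_time(a):
    """Time T with e^(a*T) = 2."""
    return math.log(2) / a

a = growth_rate(23.0, 27.0, 5.0)            # data for 1990 and 1995, in millions
print(round(a, 4))                          # growth rate per year
print(round(population(23.0, a, 10), 1))    # predicted population in 2000
print(round(doubling_time(a), 1))           # years until the population doubles
```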
Example A common method for dating fossilized remains of animal and plant life is to
compare the amount of carbon-14 to the amount of carbon-12 in the fossil. While an animal is
alive, its bones contain these two isotopes of carbon in an essentially constant ratio, but
after death the carbon-14 begins to decay, whereas the carbon-12, not being radioactive,
remains at a constant level. Hence it is possible to determine the age of the fossil from the
amount of carbon-14 that remains. In particular, if x(t) is the amount of carbon-14 in the
fossil t years after the animal died, then

\[ \dot{x}(t) = a x(t) \]

for some constant a, and so

\[ x(t) = x_0 e^{at}, \]

where x0 is the initial amount of carbon-14. Since it is known that the half-life of carbon-14
is 5,730 years (that is, one-half of any initial amount of carbon-14 will decay over a period
of 5,730 years), we can find the value of a. Namely, we know that

\[ \frac{1}{2} x_0 = x(5730) = x_0 e^{5730a}, \]

so

\[ e^{5730a} = \frac{1}{2}. \]

Hence

\[ 5730a = -\log(2), \]

so

\[ a = -\frac{\log(2)}{5730}. \]
For example, suppose a fossilized bone is found which has 10% of its original carbon-14.
If T is the time since the death of the animal, we must have


\[ \frac{1}{10} x_0 = x(T) = x_0 e^{aT}. \]

Thus

\[ e^{aT} = \frac{1}{10}, \]

so

\[ aT = -\log(10) \]

and

\[ T = -\frac{\log(10)}{a} = 5730\,\frac{\log(10)}{\log(2)} = 19{,}035 \text{ years}, \]
rounding to the nearest year. Hence the fossil is from an animal that died more than
19,000 years ago.
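
The same calculation works for any remaining fraction of carbon-14. The sketch below assumes the 5,730-year half-life used in this example; the names `decay_constant` and `age_from_fraction` are hypothetical helpers introduced here for illustration.

```python
import math

HALF_LIFE_C14 = 5730.0  # years, the value used in the text

def decay_constant(half_life):
    """Constant a < 0 in x(t) = x0*e^(a*t) for the given half-life."""
    return -math.log(2) / half_life

def age_from_fraction(fraction, half_life=HALF_LIFE_C14):
    """Years elapsed when `fraction` of the original amount remains."""
    return math.log(fraction) / decay_constant(half_life)

print(round(age_from_fraction(0.10)))  # age when 10% of the carbon-14 remains
```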

Inhibited growth models
In Section 1.5 we discussed a modification of the uninhibited growth model which took into
account the limits placed on growth by environmental factors. In this model, which we
called the inhibited growth model, if xn is the size of the population after n units of time,
a is the natural growth rate of the population (that is, the rate of growth the population
would experience if it were not for the limiting factors), and M is the maximum population
which is sustainable in the given environment, then

\[ x_{n+1} - x_n = a x_n \left(\frac{M - x_n}{M}\right) \tag{6.3.10} \]


for n = 0, 1, 2,.... Hence this model modifies the natural rate of growth by the factor


\[ \frac{M - x_n}{M}, \tag{6.3.11} \]

representing the proportion of room which is left for future growth. As a result, when xn
is small, (6.3.11) is close to 1 and the population grows at a rate close to its natural rate;
however, as xn increases toward M, (6.3.11) decreases, causing the rate of the growth of
the population to decrease toward 0.
    Now (6.3.10) says that the amount of increase in the population during one unit of
time is jointly proportional to the size of the population and the proportion of room left
for growth. Thus for a continuous time model, if x(t) is the size of the population at time
t, then the rate of change of x(t) should be jointly proportional to x(t) and (M - x(t))/M.
That is, x(t) should satisfy the differential equation

\[ \dot{x}(t) = a x(t) \left(\frac{M - x(t)}{M}\right) = \frac{a}{M} x(t)(M - x(t)). \tag{6.3.12} \]


This equation is called the logistic differential equation. It has many applications other
than population growth; for example, it is frequently used as a model for the spread of
an infectious disease, where x(t) represents the number of people who have contracted the
disease by time t, M is the total size of the population that could potentially be infected,
and a is a parameter controlling the rate at which the disease spreads.
    To solve the logistic differential equation, we begin by rewriting (6.3.12) as

\[ \frac{\dot{x}(t)}{x(t)(M - x(t))} = \frac{a}{M}. \tag{6.3.13} \]


﻿




Since this is an equation involving the derivative of the function we are trying to find, we
might try integrating as a step toward finding x(t). That is, if we replace t by s in (6.3.13)
and then integrate from 0 to t, we obtain


\[ \int_0^t \frac{\dot{x}(s)}{x(s)(M - x(s))}\,ds = \int_0^t \frac{a}{M}\,ds = \frac{a}{M} t. \tag{6.3.14} \]


To evaluate the remaining integral in (6.3.14), we first make the substitution

\[
\begin{aligned}
u &= x(s) \\
du &= \dot{x}(s)\,ds.
\end{aligned}
\]

Then, letting x0 = x(0), which we assume to be less than M, we have

\[ \int_0^t \frac{\dot{x}(s)}{x(s)(M - x(s))}\,ds = \int_{x_0}^{x(t)} \frac{1}{u(M - u)}\,du. \tag{6.3.15} \]


To evaluate this integral, we use the algebraic fact, known as partial fraction decomposition,
that there exist constants A and B such that


\[ \frac{1}{u(M - u)} = \frac{A}{u} + \frac{B}{M - u}. \tag{6.3.16} \]


Once we find the values for A and B, the integration will follow easily. Now (6.3.16) implies
that


\[ \frac{1}{u(M - u)} = \frac{A(M - u) + Bu}{u(M - u)}. \]

Since two rational functions with equal denominators are equal only if their numerators
are also equal, it follows that
                                 1 = A(M - u) + Bu.

This final equality must hold for all values of u, so, in particular, when u = 0 we obtain

                                       1 = AM


and when u = M we have

\[ 1 = BM. \]

It follows that

\[ A = \frac{1}{M} \]

and

\[ B = \frac{1}{M}. \]

Hence

\[ \frac{1}{u(M - u)} = \frac{1}{M} \cdot \frac{1}{u} + \frac{1}{M} \cdot \frac{1}{M - u}, \]

﻿




and so


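\[
\begin{aligned}
\int_{x_0}^{x(t)} \frac{1}{u(M - u)}\,du
  &= \frac{1}{M} \int_{x_0}^{x(t)} \frac{1}{u}\,du + \frac{1}{M} \int_{x_0}^{x(t)} \frac{1}{M - u}\,du \\
  &= \frac{1}{M} \log(u) \Big|_{x_0}^{x(t)} - \frac{1}{M} \log(M - u) \Big|_{x_0}^{x(t)} \\
  &= \frac{1}{M} \log\left(\frac{x(t)}{x_0}\right) - \frac{1}{M} \log\left(\frac{M - x(t)}{M - x_0}\right) \\
  &= \frac{1}{M} \log\left(\frac{x(t)}{M - x(t)} \cdot \frac{M - x_0}{x_0}\right).
\end{aligned}
\]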


Here we have used the fact that x(t) > 0 for all t and the assumption that we are working
with values of t for which x(t) < M to avoid the need for absolute values. We will see
below that in fact the latter assumption holds for all t. Combining with (6.3.14), we have

\[ \frac{1}{M} \log\left(\frac{x(t)}{M - x(t)} \cdot \frac{M - x_0}{x_0}\right) = \frac{a}{M} t. \]

Multiplying both sides by M and then applying the exponential function gives us

\[ \frac{x(t)}{M - x(t)} \cdot \frac{M - x_0}{x_0} = e^{at}. \]


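Letting

\[ \beta = \frac{M - x_0}{x_0}, \tag{6.3.17} \]

we have

\[ \beta x(t) = e^{at}(M - x(t)). \]

Hence

\[ \beta x(t) + x(t) e^{at} = M e^{at}, \]

so

\[ (\beta + e^{at}) x(t) = M e^{at}. \]

This gives us

\[ x(t) = \frac{M e^{at}}{\beta + e^{at}}, \]

or, after dividing through by \(e^{at}\),

\[ x(t) = \frac{M}{1 + \beta e^{-at}}. \tag{6.3.18} \]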


﻿




Note that since \(1 + \beta e^{-at} > 1\) for all t, we have, as we assumed above, x(t) < M for all t.
If we substitute back in the value for \(\beta\), we have, finally,

\[ x(t) = \frac{x_0 M}{x_0 + (M - x_0)e^{-at}}. \tag{6.3.19} \]

Note that
\[ \lim_{t \to \infty} x(t) = \lim_{t \to \infty} \frac{x_0 M}{x_0 + (M - x_0)e^{-at}} = \frac{x_0 M}{x_0} = M, \tag{6.3.20} \]
showing that the population, although never exceeding M, will nevertheless approach M
asymptotically.
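
As a sanity check on the derivation, the closed form (6.3.19) can be compared with a direct numerical integration of (6.3.12). The sketch below uses the classical fourth-order Runge-Kutta method, which is a choice made here and not a technique from the text; the parameter values x0 = 10, a = 0.5, M = 100 are arbitrary illustrative assumptions.

```python
import math

def logistic(t, x0, a, M):
    """Closed-form solution (6.3.19) of the logistic differential equation."""
    return x0 * M / (x0 + (M - x0) * math.exp(-a * t))

def rhs(x, a, M):
    """Right-hand side of (6.3.12): (a/M) * x * (M - x)."""
    return (a / M) * x * (M - x)

def rk4(x0, a, M, t_end, steps):
    """Integrate (6.3.12) from 0 to t_end with the classical Runge-Kutta method."""
    h = t_end / steps
    x = x0
    for _ in range(steps):
        k1 = rhs(x, a, M)
        k2 = rhs(x + 0.5 * h * k1, a, M)
        k3 = rhs(x + 0.5 * h * k2, a, M)
        k4 = rhs(x + h * k3, a, M)
        x += (h / 6) * (k1 + 2 * k2 + 2 * k3 + k4)
    return x

x0, a, M = 10.0, 0.5, 100.0  # illustrative values, not from the text
for t in (1, 5, 10, 40):
    # The two columns should agree closely, and both approach M from below.
    print(t, logistic(t, x0, a, M), rk4(x0, a, M, t, 1000))
```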
Example The population of the United States was 179.3 million in 1960, 203.3 million
in 1970, and 226.5 million in 1980. Let x(t) represent the population, in millions, of the
United States t years after 1960. To fit the logistic model to this data, we need to find
constants a and M so that

\[ x(t) = \frac{179.3M}{179.3 + (M - 179.3)e^{-at}} \]

for t = 10 and t = 20 (note that we already have x(0) = 179.3). That is, we need to solve
the equations
\[
\begin{aligned}
203.3 &= x(10) = \frac{179.3M}{179.3 + (M - 179.3)e^{-10a}} \\
226.5 &= x(20) = \frac{179.3M}{179.3 + (M - 179.3)e^{-20a}}
\end{aligned}
\]
for a and M. Working with the first equation, we have

\[ (203.3)(179.3) + (203.3)(M - 179.3)e^{-10a} = 179.3M, \]

which gives us

\[ (203.3)(M - 179.3)e^{-10a} = 179.3(M - 203.3). \]

Thus

\[ e^{-10a} = \frac{179.3(M - 203.3)}{203.3(M - 179.3)}. \tag{6.3.21} \]

Similarly, the second equation gives us

\[ e^{-20a} = \frac{179.3(M - 226.5)}{226.5(M - 179.3)}. \]

Now

\[ e^{-20a} = \left(e^{-10a}\right)^2, \]

so we have

\[ \frac{179.3(M - 226.5)}{226.5(M - 179.3)} = \left(\frac{179.3(M - 203.3)}{203.3(M - 179.3)}\right)^2. \]


﻿


         Figure 6.3.2 Inhibited growth model for the United States (1960-2110)

Thus
\[ (203.3)^2 (M - 226.5)(M - 179.3) = (179.3)(226.5)(M - 203.3)^2, \]

which, when expanded, gives us

\[ (203.3)^2 \left(M^2 - 405.8M + (179.3)(226.5)\right) = (179.3)(226.5)\left(M^2 - 406.6M + (203.3)^2\right). \]

Hence
\[ 719.44M^2 - 259{,}459.59M = 0. \]

Since M ≠ 0, the desired solution must be

\[ M = \frac{259{,}459.59}{719.44} = 360.6, \]
rounded to the first decimal place. Substituting this value for M into (6.3.21), we have

\[ e^{-10a} = \frac{179.3(360.6 - 203.3)}{203.3(360.6 - 179.3)}, \]

and so

\[ a = -\frac{1}{10} \log\left(\frac{179.3(360.6 - 203.3)}{203.3(360.6 - 179.3)}\right) = 0.02676, \]
rounded to five decimal places. Thus we have

\[ x(t) = \frac{(179.3)(360.6)}{179.3 + (360.6 - 179.3)e^{-0.02676t}} = \frac{64{,}655.6}{179.3 + 181.3e^{-0.02676t}}. \]

For example, this model would predict a population in 1990 of

\[ x(30) = \frac{64{,}655.6}{179.3 + 181.3e^{-(0.02676)(30)}} = 248.2 \text{ million} \]

and a population in 2000 of

\[ x(40) = \frac{64{,}655.6}{179.3 + 181.3e^{-(0.02676)(40)}} = 267.8 \text{ million}. \]


﻿




    The 1990 prediction is very close to the actual population in 1990, which was approx-
imately 249.6 million, and the prediction for the year 2000 is very close to the Census
Bureau's prediction of 268.3 million. Recall that the uninhibited growth model for the
United States, based on population data for 1970 and 1980, predicted a population of
281.1 million for the year 2000. To see how different the two models are, you should com-
pare the graph for the uninhibited growth model, shown in Figure 6.1.3, with the graph
of the inhibited growth model, shown in Figure 6.3.2.
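
The fitting procedure of this example generalizes to any three equally spaced observations. Carrying out the same algebra with data values x0, x1, x2 at times 0, h, 2h gives M as the nonzero root of a quadratic and then a from (6.3.21). The Python sketch below is one way to organize that computation; the function names are illustrative, and it assumes the data satisfy x1^2 ≠ x0*x2 and yield a value of M larger than the observations.

```python
import math

def fit_logistic(x0, x1, x2, h):
    """Fit x(t) = x0*M / (x0 + (M - x0)*e^(-a*t)) through the points
    (0, x0), (h, x1), (2h, x2). Returns the pair (a, M)."""
    # Nonzero root of (x1^2 - x0*x2) M^2 - (x1^2 (x0 + x2) - 2 x0 x1 x2) M = 0.
    M = (x1**2 * (x0 + x2) - 2 * x0 * x1 * x2) / (x1**2 - x0 * x2)
    # As in (6.3.21): e^(-a*h) = x0 (M - x1) / (x1 (M - x0)).
    a = -math.log(x0 * (M - x1) / (x1 * (M - x0))) / h
    return a, M

def logistic(t, x0, a, M):
    """Closed-form solution (6.3.19)."""
    return x0 * M / (x0 + (M - x0) * math.exp(-a * t))

# United States population (millions) in 1960, 1970, 1980.
a, M = fit_logistic(179.3, 203.3, 226.5, 10.0)
print(f"M = {M:.1f}, a = {a:.5f}")
for year in (1990, 2000):
    print(year, round(logistic(year - 1960, 179.3, a, M), 1))
```

Because the sketch keeps unrounded values of M and a, its output may differ from the hand computation in the last displayed digit.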

Problems

1. Evaluate the following limits.
                  1 n                                         5 n
    (a) lim   1 + -                            (b) lim    1 + -

    (c) lim (1 -                               (d) lim (1-i-
       n oo       n                                n oo       n
                    2                                         4    1I_
    (e) lim   1 +                              (f) lim (1-- +2
       n o        n2                               n o       n     2
 2. Suppose $1500 is deposited in a bank account paying 5.5% interest. Find the amount
    in the account after 5 years if the interest is compounded (a) quarterly, (b) monthly,
    (c) weekly, (d) daily, and (e) continuously.
 3. Suppose $4500 is deposited in a bank account paying 6.25% interest. Find the amount
    in the account after 7 years if the interest is compounded (a) quarterly, (b) monthly,
    (c) weekly, (d) daily, and (e) continuously.
 4. A customer deposits P dollars in a bank account. Which is more advantageous to the
    bank customer: 5% interest compounded continuously, 5.25% interest compounded
    monthly, or 5.5% interest compounded quarterly?
 5. Let A(x) be the amount in a bank account after one year if $1000 is deposited at 5%
    interest compounded x times per year.
    (a) Plot A(x) on the interval [1, 100].
    (b) Show that A(x) is an increasing function on (1, \(\infty\)).
 6. A bone fossil is determined to have 5% of its original carbon-14 remaining. How old
    is the fossil?
 7. Suppose an analysis of a bone fossil shows that it has between 4% and 6% of its original
    carbon-14. Find upper and lower bounds for the age of the fossil.
 8. Carbon-11 has a half-life of 20 minutes. Given an initial amount x0, find x(t), the
    amount of carbon-11 remaining after t minutes. How long will it take before there is
    only 10% left? How long until only 5% remains?
 9. Plutonium-239, the fuel for nuclear reactors, has a half-life of 24,000 years. Given an
    initial amount x0, find x(t), the amount of plutonium-239 remaining after t years. How


﻿




    many years will it take before there is only 10% left? How many years until only 5%
    remains?
10. If 1% of a certain radioactive element decays in one year, what is the half-life of the
    element?
11. (a) In 1960 the population of the United States was 179.3 million and in 1970 it was
        203.3 million. If y(t) represents the size of the population of the United States t
        years after 1960, find an expression for y(t) using an uninhibited growth model.
    (b) Use y(t) from part (a) to predict the population of the United States in 1980, 1990,
        and 2000. How accurate are these predictions?
    (c) Let x(t) be the population of the United States t years after 1960 as given by the
        inhibited growth model used in the last example in the section. Compare y(t) to
        x(t) by graphing them together over the interval [0, 200].
12. The population of the United States was 3,929,214 in 1790, 5,308,483 in 1800, and
    7,239,881 in 1810.
    (a) Let y(t) be the population of the United States t years after 1790 as predicted by
        an uninhibited growth model using the data from 1790 and 1800. Graph y(t) over
        the interval [0, 100] and find the predicted population for 1810, 1820, 1840, 1870,
        1900, and 1990. How accurate are these predictions?
    (b) Let x(t) be the population of the United States t years after 1790 as predicted by
        an inhibited growth model using the data from 1790, 1800, and 1810. Graph x(t)
        over the interval [0, 200] and find the predicted population for 1820, 1840, 1870,
        1900, and 1990. How accurate are these predictions? How do they compare with
        your results in part (a)? What does this model predict for the eventual limiting
        population of the United States?
13. The population of the United States was 75,994,575 in 1900, 91,972,266 in 1910, and
    105,710,620 in 1920.
    (a) Let y(t) be the population of the United States t years after 1900 as predicted by
        an uninhibited growth model using the data from 1900 and 1910. Graph y(t) over
        the interval [0, 100] and find the predicted population for 1920, 1930, 1950, 1970,
        1990, and 2000. How accurate are these predictions? Using this model, in what
        year will the population be twice what it was in 1900?
    (b) Let x(t) be the population of the United States t years after 1900 as predicted
        by an inhibited growth model using the data from 1900, 1910, and 1920. Graph
        x(t) over the interval [0, 200] and find the predicted population for 1930, 1950,
        1970, 1990, and 2000. How accurate are these predictions? How do they compare
        with your results in part (a)? What does this model predict for the eventual
        limiting population of the United States? Using this model, in what year will the
        population be twice what it was in 1900?
14. Show that the graph of a solution to the logistic differential equation

\[ \dot{x}(t) = \frac{a}{M} x(t)(M - x(t)), \]

   with 0 < x(0) < M, is concave up when

\[ x(t) < \frac{M}{2} \]

   and concave down when

\[ x(t) > \frac{M}{2}. \]

   What does this say about the rate of growth of the population?


﻿


Section 6.4


Integration of Rational Functions


In this section we will take a more detailed look at the use of partial fraction decomposi-
tions in evaluating integrals of rational functions, a technique we first encountered in the
inhibited growth model example in the previous section. However, we will not be able to
complete the story until after the introduction of the inverse tangent function in Section
6.5.
    We begin with a few examples to illustrate how some integration problems involving
rational functions may be simplified either by a long division or by a simple substitution.


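Example To evaluate

\[ \int \frac{x^2}{x + 1}\,dx, \]

we first perform a long division of \(x + 1\) into \(x^2\) to obtain

\[ \frac{x^2}{x + 1} = x - 1 + \frac{1}{x + 1}. \]

Then

\[ \int \frac{x^2}{x + 1}\,dx = \int \left(x - 1 + \frac{1}{x + 1}\right) dx = \frac{1}{2}x^2 - x + \log|x + 1| + c. \]

Example To evaluate

\[ \int \frac{2x + 1}{x^2 + x}\,dx, \]

we make the substitution

\[
\begin{aligned}
u &= x^2 + x \\
du &= (2x + 1)\,dx.
\end{aligned}
\]

Then

\[ \int \frac{2x + 1}{x^2 + x}\,dx = \int \frac{1}{u}\,du = \log|u| + c = \log|x^2 + x| + c. \]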


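Example To evaluate

\[ \int \frac{x}{x + 1}\,dx, \]

we perform a long division of \(x + 1\) into \(x\) to obtain

\[ \frac{x}{x + 1} = 1 - \frac{1}{x + 1}. \]

Then

\[ \int \frac{x}{x + 1}\,dx = \int \left(1 - \frac{1}{x + 1}\right) dx = x - \log|x + 1| + c. \]

Alternatively, we could evaluate this integral with the substitution

\[
\begin{aligned}
u &= x + 1 \\
du &= dx.
\end{aligned}
\]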




Copyright © by Dan Sloughter 2000


﻿




With this substitution, x = u - 1, so we have


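\[
\begin{aligned}
\int \frac{x}{x + 1}\,dx &= \int \frac{u - 1}{u}\,du \\
  &= \int \left(1 - \frac{1}{u}\right) du \\
  &= u - \log|u| + c \\
  &= x + 1 - \log|x + 1| + c.
\end{aligned}
\]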

Note that this is the same answer we obtained above, although with a different constant
of integration.
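
A quick numerical check makes the point about the constant of integration concrete. The Python sketch below, with illustrative function names `F1`, `F2`, and `derivative`, approximates the derivative of the first antiderivative by a central difference and confirms that the two answers differ by the constant 1.

```python
import math

def F1(x):
    """Antiderivative of x/(x+1) found by long division (taking c = 0)."""
    return x - math.log(abs(x + 1))

def F2(x):
    """Antiderivative of x/(x+1) found by the substitution u = x + 1 (c = 0)."""
    return x + 1 - math.log(abs(x + 1))

def derivative(F, x, h=1e-6):
    """Central-difference approximation to F'(x)."""
    return (F(x + h) - F(x - h)) / (2 * h)

for x in (0.0, 1.5, 4.0):
    # Columns: x, approximate F1'(x), the integrand x/(x+1), and F2(x) - F1(x).
    print(x, derivative(F1, x), x / (x + 1), F2(x) - F1(x))
```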

Partial fraction decomposition: Distinct linear factors
Now we consider the general problem of evaluating

\[ \int \frac{f(x)}{g(x)}\,dx, \]

where both f and g are polynomials. We will assume that the degree of f is less than the
degree of g. As illustrated in the first and third examples above, if this is not the case,
we can first perform a long division to simplify the quotient into the form of a polynomial
plus a remainder term which is a rational function with numerator of degree less than the
denominator. To begin we will suppose that g factors completely into n distinct linear
factors. That is, suppose there are constants a1, a2, ..., an and b1, b2, ..., bn such that

\[ g(x) = (a_1 x + b_1)(a_2 x + b_2) \cdots (a_n x + b_n), \tag{6.4.1} \]

where the factors on the right are all distinct. From a theorem of linear algebra, which we
will not attempt to prove here, there exist constants A1, A2, ... , An such that

\[ \frac{f(x)}{g(x)} = \frac{A_1}{a_1 x + b_1} + \frac{A_2}{a_2 x + b_2} + \cdots + \frac{A_n}{a_n x + b_n}. \tag{6.4.2} \]

The expression on the right of (6.4.2) is called the partial fraction decomposition of f(x)/g(x).
Once the constants A1, A2, ... , An are determined, the evaluation of

\[ \int \frac{f(x)}{g(x)}\,dx \]

becomes a routine problem. The next examples will illustrate one method for finding these
constants.

Example To evaluate

\[ \int \frac{1}{(x - 2)(x - 3)}\,dx, \]

we need to find constants A and B such that

\[ \frac{1}{(x - 2)(x - 3)} = \frac{A}{x - 2} + \frac{B}{x - 3}. \]


﻿



Combining the terms on the right, we have


\[ \frac{1}{(x - 2)(x - 3)} = \frac{A(x - 3) + B(x - 2)}{(x - 2)(x - 3)}. \]


Now two rational functions with equal denominators are equal only if their numerators are
also equal; hence we must have


\[ 1 = A(x - 3) + B(x - 2) \]


for all values of x. In particular, for x = 2 we obtain

\[ 1 = -A, \]

from which it follows that A = -1, and for x = 3 we have

\[ 1 = B. \]


Thus


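\[ \frac{1}{(x - 2)(x - 3)} = -\frac{1}{x - 2} + \frac{1}{x - 3}, \]

so

\[ \int \frac{1}{(x - 2)(x - 3)}\,dx = -\int \frac{1}{x - 2}\,dx + \int \frac{1}{x - 3}\,dx = -\log|x - 2| + \log|x - 3| + c. \]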


Example To evaluate

\[ \int \frac{3x}{(x + 5)(2x - 1)}\,dx, \]

we need to find constants A and B such that

\[ \frac{3x}{(x + 5)(2x - 1)} = \frac{A}{x + 5} + \frac{B}{2x - 1}. \]

Combining the terms on the right, we have

\[ \frac{3x}{(x + 5)(2x - 1)} = \frac{A(2x - 1) + B(x + 5)}{(x + 5)(2x - 1)}. \]

As before, it follows that

\[ 3x = A(2x - 1) + B(x + 5) \]

for all values of x. In particular, for x = -5 we obtain

\[ -15 = -11A, \]

from which it follows that

\[ A = \frac{15}{11}, \]

and for x = 1/2 we have

\[ \frac{3}{2} = \frac{11}{2} B, \]

from which it follows that

\[ B = \frac{3}{11}. \]

Hence

\[ \frac{3x}{(x + 5)(2x - 1)} = \frac{15}{11} \cdot \frac{1}{x + 5} + \frac{3}{11} \cdot \frac{1}{2x - 1}, \]

so

\[
\begin{aligned}
\int \frac{3x}{(x + 5)(2x - 1)}\,dx &= \frac{15}{11} \int \frac{1}{x + 5}\,dx + \frac{3}{11} \int \frac{1}{2x - 1}\,dx \\
  &= \frac{15}{11} \log|x + 5| + \frac{3}{22} \log|2x - 1| + c.
\end{aligned}
\]
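
The substitution technique used in the last two examples, choosing the value of x that makes one factor vanish, can be automated when all factors are distinct and linear. The following Python sketch is one possible implementation under that assumption; `partial_fractions` is a hypothetical helper name, the numerator is passed as a function `f`, and each factor \(a_k x + b_k\) is given as an integer pair `(a_k, b_k)`. Exact rational arithmetic from the standard `fractions` module avoids rounding.

```python
from fractions import Fraction

def partial_fractions(f, factors):
    """Constants A_k with f(x)/g(x) = sum of A_k/(a_k x + b_k), where g is the
    product of the distinct linear factors (a_k, b_k) and deg f < deg g."""
    constants = []
    for k, (a_k, b_k) in enumerate(factors):
        root = Fraction(-b_k, a_k)            # value of x at which factor k vanishes
        rest = Fraction(1)
        for j, (a_j, b_j) in enumerate(factors):
            if j != k:
                rest *= a_j * root + b_j      # remaining factors evaluated at the root
        constants.append(f(root) / rest)
    return constants

# 3x / ((x + 5)(2x - 1)) = A/(x + 5) + B/(2x - 1)
A, B = partial_fractions(lambda x: 3 * x, [(1, 5), (2, -1)])
print(A, B)
```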


Partial fraction decomposition: Repeated linear factors
Returning to the general problem of evaluating

\[ \int \frac{f(x)}{g(x)}\,dx, \]

where f and g are both polynomials and the degree of f is less than the degree of g,
we will now consider the case where g factors completely into linear factors, allowing for
the possibility that one or more of these factors may be repeated. Specifically, suppose
the factor ax + b occurs n times in the factorization of g. Then the partial fraction
decomposition of f(x)/g(x) must contain a sum of terms of the form

\[ \frac{A_1}{ax + b} + \frac{A_2}{(ax + b)^2} + \cdots + \frac{A_n}{(ax + b)^n} \tag{6.4.3} \]

for some constants A1, A2,... , An, in addition to similar terms for every other factor of g.
This is best illustrated in an example.


Example To evaluate

\[ \int \frac{x + 1}{(x - 1)^3 (x - 2)}\,dx, \]

we need to find constants A, B, C, and D such that

\[ \frac{x + 1}{(x - 1)^3 (x - 2)} = \frac{A}{x - 1} + \frac{B}{(x - 1)^2} + \frac{C}{(x - 1)^3} + \frac{D}{x - 2}. \tag{6.4.4} \]


That is, this partial fraction decomposition contains three terms corresponding to the
factor x -1, since it is repeated three times, and only one term corresponding to the factor
x - 2, since it occurs only once. Moreover, the degrees of the denominators of the terms
for x - 1 increase from 1 to 3. Now combining the terms on the right of (6.4.4), we have


\[ \frac{x + 1}{(x - 1)^3 (x - 2)} = \frac{A(x - 1)^2 (x - 2) + B(x - 1)(x - 2) + C(x - 2) + D(x - 1)^3}{(x - 1)^3 (x - 2)}. \]


﻿




Again, it follows that

\[ x + 1 = A(x - 1)^2 (x - 2) + B(x - 1)(x - 2) + C(x - 2) + D(x - 1)^3 \tag{6.4.5} \]

for all values of x. However, because of the repeated factors, we cannot choose values for x
which will isolate each of the constants one at a time as we did in the previous examples.
Instead, we will illustrate another technique for finding the constants. By multiplying out
(6.4.5) and collecting terms, we obtain

\[
\begin{aligned}
x + 1 &= A(x^3 - 4x^2 + 5x - 2) + B(x^2 - 3x + 2) + C(x - 2) + D(x^3 - 3x^2 + 3x - 1) \\
  &= (A + D)x^3 + (-4A + B - 3D)x^2 + (5A - 3B + C + 3D)x - 2A + 2B - 2C - D
\end{aligned}
\]

for all values of x. Since two polynomials are equal only if they have equal coefficients, we
can equate the coefficients of x + 1 with the coefficients of the polynomial on the right to
obtain the four equations
\[
\begin{aligned}
A + D &= 0 \\
-4A + B - 3D &= 0 \\
5A - 3B + C + 3D &= 1 \\
-2A + 2B - 2C - D &= 1.
\end{aligned}
\tag{6.4.6}
\]

From the first equation we learn that

                                      D =-A.

Substituting this into the second equation gives us

                                       B=A.

Substituting both of these values into the third equation results in

                                      C= A+1.

Finally, substituting for D, B, and C in the fourth equation gives us

                            -2A + 2A - 2(A + 1) + A = 1,

which gives us A = -3. Hence B = -3, C = -2, and D = 3. Thus
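\[
\begin{aligned}
\int \frac{x + 1}{(x - 1)^3 (x - 2)}\,dx
  &= -3 \int \frac{1}{x - 1}\,dx - 3 \int \frac{1}{(x - 1)^2}\,dx - 2 \int \frac{1}{(x - 1)^3}\,dx + 3 \int \frac{1}{x - 2}\,dx \\
  &= -3 \log|x - 1| + \frac{3}{x - 1} + \frac{1}{(x - 1)^2} + 3 \log|x - 2| + c.
\end{aligned}
\]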


﻿




Note that in solving for A, B, C, and D, we could have first substituted x = 1 and x = 2
into (6.4.5) to obtain values for C and D, respectively. These values could have then been
used to simplify (6.4.6) before solving for A and B.
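
The system (6.4.6) can also be solved mechanically. The sketch below uses Gauss-Jordan elimination with exact fractions, which is a technique chosen here for illustration and not the substitution argument used in the example; the function name `solve` is hypothetical, and the routine assumes the system has a unique solution.

```python
from fractions import Fraction

def solve(matrix, rhs):
    """Solve a square linear system exactly by Gauss-Jordan elimination.
    Assumes the coefficient matrix is nonsingular."""
    n = len(rhs)
    rows = [[Fraction(v) for v in row] + [Fraction(b)]
            for row, b in zip(matrix, rhs)]
    for col in range(n):
        # Find a row with a nonzero entry in this column and move it into place.
        pivot = next(r for r in range(col, n) if rows[r][col] != 0)
        rows[col], rows[pivot] = rows[pivot], rows[col]
        rows[col] = [v / rows[col][col] for v in rows[col]]
        # Eliminate this column from every other row.
        for r in range(n):
            if r != col:
                factor = rows[r][col]
                rows[r] = [v - factor * p for v, p in zip(rows[r], rows[col])]
    return [row[-1] for row in rows]

# Each row holds the coefficients of A, B, C, D in one equation of (6.4.6).
coefficients = [
    [1, 0, 0, 1],      # x^3 terms:       A + D             = 0
    [-4, 1, 0, -3],    # x^2 terms:      -4A + B - 3D       = 0
    [5, -3, 1, 3],     # x terms:         5A - 3B + C + 3D  = 1
    [-2, 2, -2, -1],   # constant terms: -2A + 2B - 2C - D  = 1
]
A, B, C, D = solve(coefficients, [0, 0, 1, 1])
print(A, B, C, D)
```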
    It follows from the Fundamental Theorem of Algebra that every polynomial with real
coefficients factors into a product of linear factors and irreducible quadratic factors;
hence, to complete the story of
integrating rational functions, we need to consider the case where the factorization of the
denominator includes irreducible quadratic factors. However, we will learn in Section 6.5
that for an irreducible quadratic polynomial g,

\[ \int \frac{1}{g(x)}\,dx \]

involves the inverse tangent function. Thus we need to discuss the inverse trigonometric
functions before continuing the story of integrating rational functions.

Problems

1. Evaluate each of the following integrals.


   (a)          d x
          J c2
   (c)   x-2dx
        jx-2
   (e) I    x+1       dx
         2x2+z - 3
2. Evaluate each of the following integrals.
        /       1
        (x + 2)(x - 4) dx

   (c)   (2x + 3)(x + 1) d


(b)          dx
       x-1

(d)           dx
       x+2

(f)        +       dx
       x2+4x+1


(e)


    x      dx
x2 + x -6


(g) fx2x+6 dx


             3
(b)                  dx
   () (x - 3)(x - 7)
          3x+ 1
(d) f(  2)(   3)dx

                3x
(f) I (x + 2)(x - 3)(x + 1) dx
           3x+2
(h)    (x2 - 4)(x2 - 9)


           1
(b)        1        dx
  ()(x - 1)2(X +2)
           3x+1
(d) f(x+2)3j-l) dx

           4
(f) I (x2 4)2dx


3. Evaluate each of the following integrals.

   (a)            dx
        1(x-1)2
          J   x       dx
   (c)   x2 + 2x + 1

       /     5
   (e)   (+) do


﻿



   (g)                  do3x
      / I(x+1)2(x-3)
4. Evaluate each of the following integrals.

   (a) (3x + 2)2 dx

         /9x2 - 4x
   (c) 13x3-2x2+5 dx

   (e)  1214 dx

     (g/     4x+5
     I(x-2)2(x+5)
           5x -1
(h) I(2x+1)2(x+2) dx


        f3
(b)   x2 + 7x + 10 dx

         f2x
(d) J (x2 - 1)(x2 - 4) dx
         1
(f) L       1     dx
      ex2-x-6
         x3
(h) f x2 1 dx

5. Solve the differential equation

       \[ \dot{x}(t) = (x(t) - 1)(x(t) + 1) \]

   using the method used to solve the logistic differential equation in Section 6.3. Assume
   $x(0) = 0$ and $-1 < x(t) < 1$ for all t.


﻿


Section 6.5


Inverse Trigonometric Functions


In this section we will introduce the inverse trigonometric functions. We will begin with
the inverse tangent function since, as indicated in Section 6.4, we need it to complete the
story of the integration of rational functions.
    Strictly speaking, the tangent function does not have an inverse. Recall that in order
for a function f to have an inverse function, for every y in the range of f there must be
exactly one x in the domain of f such that f(x) = y. This is false for the tangent function
since, for example, both $\tan(0) = 0$ and $\tan(\pi) = 0$. In fact, since the tangent function is
periodic with period $\pi$, if $\tan(x) = y$, then $\tan(x + n\pi) = y$ for any integer n. However,
the tangent function is increasing on the interval $(-\frac{\pi}{2}, \frac{\pi}{2})$, taking on every value in its
range $(-\infty, \infty)$ exactly once. Hence we may define an inverse for the tangent function
if we consider it with the restricted domain $(-\frac{\pi}{2}, \frac{\pi}{2})$. That is, we will define an inverse
tangent function so that it takes on only values in $(-\frac{\pi}{2}, \frac{\pi}{2})$.

Definition The arc tangent function, with value at x denoted by either arctan(x) or
$\tan^{-1}(x)$, is the inverse of the tangent function with restricted domain $(-\frac{\pi}{2}, \frac{\pi}{2})$.

    In other words, for $-\frac{\pi}{2} < y < \frac{\pi}{2}$,

       \[ y = \tan^{-1}(x) \text{ if and only if } \tan(y) = x. \]                 (6.5.1)

For example, $\tan^{-1}(0) = 0$, $\tan^{-1}(1) = \frac{\pi}{4}$, and $\tan^{-1}(-1) = -\frac{\pi}{4}$. In particular, note that
even though $\tan(\pi) = 0$, $\tan^{-1}(0) = 0$ since 0 is between $-\frac{\pi}{2}$ and $\frac{\pi}{2}$, but $\pi$ is not between
$-\frac{\pi}{2}$ and $\frac{\pi}{2}$.
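    The domain of the arc tangent function is $(-\infty, \infty)$, the range of the tangent function,
and the range of the arc tangent function is $(-\frac{\pi}{2}, \frac{\pi}{2})$, the domain of the restricted tangent
function. Moreover, since
       \[ \lim_{x \to \frac{\pi}{2}^-} \tan(x) = \infty \]
and
       \[ \lim_{x \to -\frac{\pi}{2}^+} \tan(x) = -\infty, \]
we have
       \[ \lim_{x \to \infty} \tan^{-1}(x) = \frac{\pi}{2} \]                      (6.5.2)
and
       \[ \lim_{x \to -\infty} \tan^{-1}(x) = -\frac{\pi}{2}. \]                   (6.5.3)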


Copyright © by Dan Sloughter 2000


﻿


                          Figure 6.5.1 Graph of $y = \tan^{-1}(x)$


Hence $y = \frac{\pi}{2}$ and $y = -\frac{\pi}{2}$ are horizontal asymptotes for the graph of $y = \tan^{-1}(x)$, as
shown in Figure 6.5.1.
    To differentiate the arc tangent function we imitate the method we used to differentiate
the logarithm function. Namely, if $y = \tan^{-1}(x)$, then $\tan(y) = x$, so

       \[ \frac{d}{dx}\tan(y) = \frac{d}{dx}x. \]

Hence
       \[ \sec^2(y)\frac{dy}{dx} = 1, \]

from which it follows that
       \[ \frac{dy}{dx} = \frac{1}{\sec^2(y)}. \]
Now
       \[ \sec^2(y) = 1 + \tan^2(y) = 1 + x^2, \]

so we have
       \[ \frac{dy}{dx} = \frac{1}{1 + x^2}. \]
Hence we have demonstrated the following proposition.

Proposition
       \[ \frac{d}{dx}\tan^{-1}(x) = \frac{1}{1 + x^2}. \]                         (6.5.4)

    As a consequence of the proposition, we also have


       \[ \int \frac{1}{1 + x^2}\,dx = \tan^{-1}(x) + c. \]                        (6.5.5)
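
    As a quick numerical sanity check of (6.5.4), the sketch below compares a symmetric
difference quotient of the arc tangent with 1/(1 + x^2). The helper names
central_difference and arctan_derivative, the step size, and the sample points are
illustrative choices, not part of the text.

```python
import math


def central_difference(f, x, h=1e-5):
    """Approximate f'(x) with a symmetric difference quotient."""
    return (f(x + h) - f(x - h)) / (2 * h)


def arctan_derivative(x):
    """Closed form from (6.5.4): 1 / (1 + x^2)."""
    return 1.0 / (1.0 + x * x)


# Compare the numerical slope of atan with the formula at a few points.
for x in (-3.0, -0.5, 0.0, 1.0, 10.0):
    approx = central_difference(math.atan, x)
    exact = arctan_derivative(x)
    print(f"x = {x:6.2f}  numeric = {approx:.8f}  formula = {exact:.8f}")
```

The two columns should agree to many decimal places; the agreement degrades only if h is
made so small that rounding error dominates.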


﻿



    Note that $1 + x^2$ is an irreducible quadratic polynomial. We will see more integrals of
this type in the examples below.
Example Using the chain rule, we have

       \[ \frac{d}{dx}\tan^{-1}(4x^2) = \frac{8x}{1 + 16x^4}. \]


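Example    Evaluating $\int \tan^{-1}(x)\,dx$ is similar to evaluating $\int \log(x)\,dx$. That is, we will
use integration by parts with

       \[ u = \tan^{-1}(x), \qquad dv = dx, \]
       \[ du = \frac{1}{1 + x^2}\,dx, \qquad v = x. \]

Then
       \[ \int \tan^{-1}(x)\,dx = x\tan^{-1}(x) - \int \frac{x}{1 + x^2}\,dx. \]

Using the substitution
       \[ u = 1 + x^2, \qquad du = 2x\,dx, \]

we have $\frac{1}{2}\,du = x\,dx$, from which it follows that

       \[ \int \frac{x}{1 + x^2}\,dx = \frac{1}{2}\int \frac{1}{u}\,du = \frac{1}{2}\log|u| + c = \frac{1}{2}\log(1 + x^2) + c. \]

Thus
       \[ \int \tan^{-1}(x)\,dx = x\tan^{-1}(x) - \frac{1}{2}\log(1 + x^2) + c. \]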
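
    One way to check the antiderivative just obtained is to compare it against a numerical
integral. The following is a minimal sketch using a hand-written composite Simpson's rule;
the function names, the interval [0, 2], and the number of subintervals are assumptions
made for illustration.

```python
import math


def simpson(f, a, b, n=1000):
    """Composite Simpson's rule on [a, b] with an even number n of subintervals."""
    if n % 2:
        n += 1
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3


def antiderivative(x):
    """x * atan(x) - (1/2) * log(1 + x^2), the result of the example."""
    return x * math.atan(x) - 0.5 * math.log(1 + x * x)


a, b = 0.0, 2.0
numeric = simpson(math.atan, a, b)
exact = antiderivative(b) - antiderivative(a)
print(numeric, exact)
```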


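Example    To evaluate $\int \frac{1}{1 + 4x^2}\,dx$, we make the substitution

       \[ u = 2x, \qquad du = 2\,dx. \]

Then $\frac{1}{2}\,du = dx$, so

       \[ \int \frac{1}{1 + 4x^2}\,dx = \frac{1}{2}\int \frac{1}{1 + u^2}\,du = \frac{1}{2}\tan^{-1}(u) + c = \frac{1}{2}\tan^{-1}(2x) + c. \]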


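Example    To evaluate $\int \frac{1}{x^2 + x + 1}\,dx$, we first note that $x^2 + x + 1$ does not factor,
that is, is irreducible, and so we cannot use a partial fraction decomposition. In general,
a quadratic polynomial $ax^2 + bx + c$ is irreducible if $b^2 - 4ac < 0$ since, in that case, the
quadratic formula yields complex solutions for the equation $ax^2 + bx + c = 0$. For $x^2 + x + 1$
we have $b^2 - 4ac = -3$. In this case it is helpful to simplify the function algebraically by
completing the square of the denominator, thus making the problem similar to the previous
example. That is, since

       \[ x^2 + x + 1 = \left(x + \frac{1}{2}\right)^2 + \frac{3}{4}, \]

we have

       \[ \int \frac{1}{x^2 + x + 1}\,dx = \int \frac{1}{\left(x + \frac{1}{2}\right)^2 + \frac{3}{4}}\,dx
          = \frac{4}{3}\int \frac{1}{\frac{4}{3}\left(x + \frac{1}{2}\right)^2 + 1}\,dx. \]

Now we can make the substitution

       \[ u = \frac{2}{\sqrt{3}}\left(x + \frac{1}{2}\right), \qquad du = \frac{2}{\sqrt{3}}\,dx. \]

Then $\frac{\sqrt{3}}{2}\,du = dx$, so

       \[ \int \frac{1}{x^2 + x + 1}\,dx = \frac{4}{3}\cdot\frac{\sqrt{3}}{2}\int \frac{1}{u^2 + 1}\,du
          = \frac{2}{\sqrt{3}}\tan^{-1}(u) + c
          = \frac{2}{\sqrt{3}}\tan^{-1}\left(\frac{2}{\sqrt{3}}\left(x + \frac{1}{2}\right)\right) + c. \]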
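
    The completed-square computation can be checked by differentiating the result
numerically and comparing with the integrand. This is only a sketch; the names integrand,
antiderivative, and slope, as well as the sample points, are chosen here for illustration.

```python
import math

SQRT3 = math.sqrt(3.0)


def integrand(x):
    """1 / (x^2 + x + 1)."""
    return 1.0 / (x * x + x + 1.0)


def antiderivative(x):
    """(2/sqrt(3)) * atan((2/sqrt(3)) * (x + 1/2)), obtained by completing the square."""
    return (2.0 / SQRT3) * math.atan((2.0 / SQRT3) * (x + 0.5))


def slope(f, x, h=1e-5):
    """Symmetric difference quotient approximating f'(x)."""
    return (f(x + h) - f(x - h)) / (2 * h)


points = (-4.0, -0.5, 0.0, 2.0)
errors = [abs(slope(antiderivative, x) - integrand(x)) for x in points]
print(max(errors))
```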


Partial fraction decomposition: Irreducible quadratic factors
The last two examples illustrate techniques that we may use to evaluate the integral of
a rational function with an irreducible quadratic polynomial in the denominator. With
this we are now in a position to consider the final case of partial fraction decomposition.
Specifically, suppose we want to evaluate

       \[ \int \frac{f(x)}{g(x)}\,dx, \]

where f and g are both polynomials and the degree of f is less than the degree of g.
Moreover, suppose that $(ax^2 + bx + c)^n$ is a factor of g, where n is a positive integer and
$ax^2 + bx + c$ is irreducible. Then the partial fraction decomposition of $\frac{f(x)}{g(x)}$ must contain
a sum of terms of the form

       \[ \frac{A_1 x + B_1}{ax^2 + bx + c} + \frac{A_2 x + B_2}{(ax^2 + bx + c)^2} + \cdots + \frac{A_n x + B_n}{(ax^2 + bx + c)^n}, \]      (6.5.6)

where $A_1, A_2, \ldots, A_n$ and $B_1, B_2, \ldots, B_n$ are constants. Note that the terms in the partial
fraction decomposition corresponding to an irreducible quadratic factor differ from the
terms for a linear factor in that the numerators of the terms in (6.5.6) need not be constants,
but may be first degree polynomials themselves. As before, this is best illustrated with an
example.

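Example     To evaluate $\int \frac{1 + x}{x(1 + x^2)}\,dx$ we need to find constants A, B, and C such that

       \[ \frac{1 + x}{x(1 + x^2)} = \frac{A}{x} + \frac{Bx + C}{1 + x^2}. \]

Combining the terms on the right, we have

       \[ \frac{1 + x}{x(1 + x^2)} = \frac{A(1 + x^2) + (Bx + C)x}{x(1 + x^2)}. \]

Hence
       \[ 1 + x = A(1 + x^2) + (Bx + C)x = (A + B)x^2 + Cx + A. \]
Equating the coefficients of the polynomials on the left and right gives us the system of
equations
       \[ A + B = 0, \qquad C = 1, \qquad A = 1. \]
Thus B = -1 and
       \[ \frac{1 + x}{x(1 + x^2)} = \frac{1}{x} + \frac{1 - x}{1 + x^2} = \frac{1}{x} + \frac{1}{1 + x^2} - \frac{x}{1 + x^2}. \]

Hence
       \[ \int \frac{1 + x}{x(1 + x^2)}\,dx = \int \frac{1}{x}\,dx + \int \frac{1}{1 + x^2}\,dx - \int \frac{x}{1 + x^2}\,dx
          = \log|x| + \tan^{-1}(x) - \frac{1}{2}\log(1 + x^2) + c, \]

where the final integral follows from the substitution $u = 1 + x^2$ as in an earlier example.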
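
    Both the decomposition and the resulting antiderivative in this example can be spot
checked numerically. The sketch below evaluates the two sides of the partial fraction
identity at a few nonzero points and compares a difference quotient of the antiderivative
with the integrand; all function names and sample points are illustrative assumptions.

```python
import math


def integrand(x):
    """(1 + x) / (x (1 + x^2))."""
    return (1.0 + x) / (x * (1.0 + x * x))


def decomposition(x):
    """1/x + 1/(1 + x^2) - x/(1 + x^2), using A = 1, B = -1, C = 1."""
    return 1.0 / x + 1.0 / (1.0 + x * x) - x / (1.0 + x * x)


def antiderivative(x):
    """log|x| + atan(x) - (1/2) log(1 + x^2)."""
    return math.log(abs(x)) + math.atan(x) - 0.5 * math.log(1.0 + x * x)


def slope(f, x, h=1e-5):
    """Symmetric difference quotient approximating f'(x)."""
    return (f(x + h) - f(x - h)) / (2 * h)


points = (-2.0, -0.5, 0.5, 3.0)
decomposition_error = max(abs(integrand(x) - decomposition(x)) for x in points)
derivative_error = max(abs(slope(antiderivative, x) - integrand(x)) for x in points)
print(decomposition_error, derivative_error)
```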
    If, unlike this example, the partial fraction decomposition of $\frac{f(x)}{g(x)}$ results in a term of
the form
       \[ \frac{Ax + B}{(ax^2 + bx + c)^n}, \]
where n > 1 and $ax^2 + bx + c$ is irreducible, then the integration may still be difficult to
carry out, perhaps even requiring some of the ideas of trigonometric substitutions that we
will discuss in the next section. However, there is a limit to what should be done without
the aid of a computer, or at least a table of integrals. There is a point after which some
integrations become so complicated and time-consuming that in practice they should be
given to a computer algebra system.

                           Figure 6.5.2 Graph of $y = \sin^{-1}(x)$

The inverse sine function
The remaining trigonometric functions all have inverses when their domains are restricted
to appropriate intervals. Since the sine function is increasing on the interval $[-\frac{\pi}{2}, \frac{\pi}{2}]$,
taking on every value in its range $[-1, 1]$ exactly once, we obtain an inverse for the sine
function by restricting its domain to $[-\frac{\pi}{2}, \frac{\pi}{2}]$.

Definition    The arc sine function, with value at x denoted by either arcsin(x) or $\sin^{-1}(x)$,
is the inverse of the sine function with restricted domain $[-\frac{\pi}{2}, \frac{\pi}{2}]$.
    In other words, for $-\frac{\pi}{2} \le y \le \frac{\pi}{2}$,

       \[ y = \sin^{-1}(x) \text{ if and only if } \sin(y) = x. \]                  (6.5.7)

For example, $\sin^{-1}(0) = 0$, $\sin^{-1}(\frac{1}{2}) = \frac{\pi}{6}$, $\sin^{-1}(1) = \frac{\pi}{2}$, and $\sin^{-1}(-1) = -\frac{\pi}{2}$. Note that
the domain of the arc sine function is $[-1, 1]$, the range of the sine function, and the range
of the arc sine function is $[-\frac{\pi}{2}, \frac{\pi}{2}]$, the domain of the restricted sine function. The graph
of $y = \sin^{-1}(x)$ is shown in Figure 6.5.2.
    To find the derivative of the arc sine function, let $y = \sin^{-1}(x)$. Then $\sin(y) = x$, so

       \[ \frac{d}{dx}\sin(y) = \frac{d}{dx}x. \]

Hence
       \[ \cos(y)\frac{dy}{dx} = 1, \]

and so
       \[ \frac{dy}{dx} = \frac{1}{\cos(y)}. \]
Now
       \[ \cos^2(y) = 1 - \sin^2(y) = 1 - x^2, \]

so $\cos(y) = \pm\sqrt{1 - x^2}$. Since $-\frac{\pi}{2} \le y \le \frac{\pi}{2}$, $\cos(y) \ge 0$. Thus $\cos(y) = \sqrt{1 - x^2}$, and

       \[ \frac{dy}{dx} = \frac{1}{\sqrt{1 - x^2}}. \]

Proposition
       \[ \frac{d}{dx}\sin^{-1}(x) = \frac{1}{\sqrt{1 - x^2}}. \]                  (6.5.8)

As a consequence of this proposition, we also have

       \[ \int \frac{1}{\sqrt{1 - x^2}}\,dx = \sin^{-1}(x) + c. \]                 (6.5.9)


Example Using the product and chain rules,

       \[ \frac{d}{dx}\left(x\sin^{-1}(2x)\right) = \frac{2x}{\sqrt{1 - 4x^2}} + \sin^{-1}(2x). \]


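Example     To evaluate $\int \frac{1}{\sqrt{4 - x^2}}\,dx$, we first note that

       \[ \frac{1}{\sqrt{4 - x^2}} = \frac{1}{\sqrt{4\left(1 - \frac{x^2}{4}\right)}} = \frac{1}{2}\cdot\frac{1}{\sqrt{1 - \left(\frac{x}{2}\right)^2}}. \]

Then the substitution

       \[ u = \frac{x}{2}, \qquad du = \frac{1}{2}\,dx \]

gives us

       \[ \int \frac{1}{\sqrt{4 - x^2}}\,dx = \int \frac{1}{\sqrt{1 - u^2}}\,du = \sin^{-1}(u) + c = \sin^{-1}\left(\frac{x}{2}\right) + c. \]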


The inverse secant function
Defining an inverse for the secant function is slightly more complicated than defining the
arc tangent or arc sine functions. On the interval $[0, \frac{\pi}{2})$, the secant function is increasing
and takes on all values in the interval $[1, \infty)$; on the interval $(\frac{\pi}{2}, \pi]$, the secant function
is also increasing, taking on all values in the interval $(-\infty, -1]$. Hence between these two
intervals the secant function takes on every value in its range exactly once. From these
considerations we obtain the following definition.
Definition    The arc secant function, with value at x denoted by either arcsec(x) or
$\sec^{-1}(x)$, is the inverse of the secant function with domain restricted to the intervals
$[0, \frac{\pi}{2})$ and $(\frac{\pi}{2}, \pi]$.
    Thus for $0 \le y < \frac{\pi}{2}$ or $\frac{\pi}{2} < y \le \pi$,

       \[ y = \sec^{-1}(x) \text{ if and only if } \sec(y) = x. \]                 (6.5.10)

For example, $\sec^{-1}(2) = \frac{\pi}{3}$, $\sec^{-1}(1) = 0$, $\sec^{-1}(-2) = \frac{2\pi}{3}$, and $\sec^{-1}(-1) = \pi$. Note that
the domain of the arc secant function consists of the two intervals $(-\infty, -1]$ and $[1, \infty)$,
the range of the secant function, and the range is composed of the two intervals $[0, \frac{\pi}{2})$ and
$(\frac{\pi}{2}, \pi]$, the domain of the restricted secant function.
    Since
       \[ \lim_{x \to \frac{\pi}{2}^-} \sec(x) = \infty \]
and
       \[ \lim_{x \to \frac{\pi}{2}^+} \sec(x) = -\infty, \]
it follows that
       \[ \lim_{x \to \infty} \sec^{-1}(x) = \frac{\pi}{2} \]                      (6.5.11)
and
       \[ \lim_{x \to -\infty} \sec^{-1}(x) = \frac{\pi}{2}. \]                    (6.5.12)
Thus the line $y = \frac{\pi}{2}$ is a horizontal asymptote for the graph of $y = \sec^{-1}(x)$ both as x
goes to $\infty$ and as x goes to $-\infty$, as shown in Figure 6.5.3.

                           Figure 6.5.3 Graph of $y = \sec^{-1}(x)$


﻿


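    To find the derivative of the arc secant function, let $y = \sec^{-1}(x)$. Then $\sec(y) = x$, so

       \[ \frac{d}{dx}\sec(y) = \frac{d}{dx}x. \]

Hence
       \[ \sec(y)\tan(y)\frac{dy}{dx} = 1, \]

and so
       \[ \frac{dy}{dx} = \frac{1}{\sec(y)\tan(y)}. \]

Now $\sec(y) = x$ and

       \[ \tan^2(y) = \sec^2(y) - 1 = x^2 - 1. \]

Hence $\tan(y) = \pm\sqrt{x^2 - 1}$. If x is in $[1, \infty)$, then $0 \le y < \frac{\pi}{2}$ and $\tan(y) \ge 0$; if x is in
$(-\infty, -1]$, then $\frac{\pi}{2} < y \le \pi$ and $\tan(y) \le 0$. Thus

       \[ \sec(y)\tan(y) = \begin{cases} x\sqrt{x^2 - 1}, & \text{if } x \ge 1, \\ -x\sqrt{x^2 - 1}, & \text{if } x \le -1. \end{cases} \]

Since $|x| = x$ when $x \ge 1$ and $|x| = -x$ when $x \le -1$, it follows that

       \[ \sec(y)\tan(y) = |x|\sqrt{x^2 - 1}. \]

Hence
       \[ \frac{dy}{dx} = \frac{1}{|x|\sqrt{x^2 - 1}}. \]

Proposition
       \[ \frac{d}{dx}\sec^{-1}(x) = \frac{1}{|x|\sqrt{x^2 - 1}}. \]               (6.5.13)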


Example Using the chain rule, we have

       \[ \frac{d}{dx}\sec^{-1}(3x) = \frac{3}{|3x|\sqrt{9x^2 - 1}} = \frac{1}{|x|\sqrt{9x^2 - 1}}. \]
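
    Python's math module has no arc secant, so the sketch below builds one from acos using
the identity sec^{-1}(x) = cos^{-1}(1/x), which matches the range [0, pi/2) and (pi/2, pi]
used in this section. It then compares difference quotients with (6.5.13) and with the
chain rule example. The function names and sample points are assumptions for illustration.

```python
import math


def arcsec(x):
    """Arc secant via acos(1/x); defined only for |x| >= 1."""
    if abs(x) < 1.0:
        raise ValueError("arcsec is defined only for |x| >= 1")
    return math.acos(1.0 / x)


def arcsec_derivative(x):
    """Closed form from (6.5.13): 1 / (|x| sqrt(x^2 - 1))."""
    return 1.0 / (abs(x) * math.sqrt(x * x - 1.0))


def slope(f, x, h=1e-6):
    """Symmetric difference quotient approximating f'(x)."""
    return (f(x + h) - f(x - h)) / (2 * h)


# Check (6.5.13) on both branches of the domain.
errors = [abs(slope(arcsec, x) - arcsec_derivative(x)) for x in (-4.0, -1.5, 1.5, 4.0)]

# Check the chain rule example: d/dx arcsec(3x) = 1 / (|x| sqrt(9x^2 - 1)).
chain_errors = [
    abs(slope(lambda t: arcsec(3.0 * t), x) - 1.0 / (abs(x) * math.sqrt(9.0 * x * x - 1.0)))
    for x in (-2.0, -0.5, 0.5, 2.0)
]
print(max(errors), max(chain_errors))
```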

    We will leave the definition of inverse functions for the cotangent, cosine, and cosecant
functions for the problems at the end of the section. In the next section we will see how
the arc tangent, arc sine, and arc secant functions are useful in evaluating certain integrals;
the arc cotangent, arc cosine, and arc cosecant functions could be used in similar roles,
but, wherever they are used, we could just as well use arc tangent, arc sine, or arc secant.
Hence the former, although useful in other situations, will not be as important for our
present study as the latter.


﻿


Problems

1. Find the derivatives of each of the following functions.

   (a) $f(x) = x\tan^{-1}(x)$                    (b) $g(t) = \tan^{-1}(3t^2)$

   (c) $g(x) = \frac{\sin^{-1}(3x)}{x}$          (d) $f(x) = 3x\sec^{-1}(5x)$


2. Evaluate each of the following.

   (a) tan-


   (c) sin-


   (e) sin-sin (v)

3. Evaluate the following integrals.

   (a)    1 +2x2 dcc

   (c)    X2+4 dcc

   (e) f   2+4x+5 dcc


   (g) 11+x2 dcc

4. Evaluate the following integrals.

   (a)           dcc

   (c)    x2(x2 + 1) dc

   (e) fsin-c(x)dc


(b) tan-1(- 3)


(d) sec1


23


(f) sin (sin(


2)


(b)    2 cx2 dc


(d)    X2 + 2x +3d

(f)    cc x    +6 dc


(h) j


1
1+c2 dcc


(b) Ix(4x2-+1) dc

(d)    (x+1)     2)    dcc

(f) ftan-(3x)d

(h) J  4-82 dc


(j)   -2v16-2 dc
(g)J


        dcc
1 - 9cc2


       1 3x        do
       ()J 1-cc2dc
5. The cosine function has an inverse, called the arc cosine function, if its domain is
   restricted to $[0, \pi]$. That is, for $0 \le y \le \pi$,

       \[ y = \cos^{-1}(x) \text{ if and only if } \cos(y) = x. \]


﻿


    (a) Show that $\frac{d}{dx}\cos^{-1}(x) = -\frac{1}{\sqrt{1 - x^2}}$.

    (b) Show that $\sec^{-1}(x) = \cos^{-1}\left(\frac{1}{x}\right)$.

    (c) Use the result from (b) to find $\frac{d}{dx}\sec^{-1}(x)$.

    (d) Use the fact that
           \[ \frac{d}{dx}\sin^{-1}(x) = \frac{d}{dx}\left(-\cos^{-1}(x)\right) \]
        to show that
           \[ \sin^{-1}(x) + \cos^{-1}(x) = \frac{\pi}{2} \]
        for all x in $[-1, 1]$.
 6. The cotangent function has an inverse, called the arc cotangent function, if its domain
    is restricted to $(0, \pi)$. That is, for $0 < y < \pi$,
           \[ y = \cot^{-1}(x) \text{ if and only if } \cot(y) = x. \]
    Show that $\frac{d}{dx}\cot^{-1}(x) = -\frac{1}{1 + x^2}$.
 7. The cosecant function has an inverse, called the arc cosecant function, if its domain is
    restricted to the intervals $[-\frac{\pi}{2}, 0)$ and $(0, \frac{\pi}{2}]$. That is, for $-\frac{\pi}{2} \le y < 0$ or $0 < y \le \frac{\pi}{2}$,
           \[ y = \csc^{-1}(x) \text{ if and only if } \csc(y) = x. \]
    Show that $\frac{d}{dx}\csc^{-1}(x) = -\frac{1}{|x|\sqrt{x^2 - 1}}$.

 8. Evaluate 1112 dx.

 9. (a) Use the fact that
           \[ \tan^{-1}(x) = \int_0^x \frac{1}{1 + t^2}\,dt \]
        to find the Taylor series expansion for $\tan^{-1}(x)$ about 0. On what interval does
        this series converge?
    (b) Use your result in (a) and the fact that $\pi = 4\tan^{-1}(1)$ to approximate $\pi$ with an
        error of no more than 0.001.
10. (a) Show that
           \[ \frac{d}{dx}\tan^{-1}\left(\frac{1}{x}\right) = -\frac{1}{1 + x^2}. \]

    (b) Use the result from (a) to show that

           \[ \tan^{-1}(x) + \tan^{-1}\left(\frac{1}{x}\right) = \frac{\pi}{2} \]

        for all x > 0.
    (c) Find a result similar to (b) for x < 0.


﻿


Section 6.6


Trigonometric Substitutions


In the last section we saw that


       \[ \int \frac{1}{\sqrt{1 - x^2}}\,dx = \sin^{-1}(x) + c. \]


However, we arrived at this result as a consequence of our differentiation of the arc sine
function, not as the outcome of the application of some systematic approach to the evalua-
tion of integrals of this type. In this section we will explore how substitutions based on the
arc sine, arc tangent, and arc secant functions provide a systematic method for evaluating
integrals similar to this one.


Sine substitutions
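To begin, consider evaluating

       \[ \int \frac{dx}{\sqrt{1 - x^2}} \]

by using the substitution $u = \sin^{-1}(x)$. The motivation for such a substitution stems from
the fact that, for $-\frac{\pi}{2} \le u \le \frac{\pi}{2}$, $u = \sin^{-1}(x)$ if and only if $x = \sin(u)$. In the latter form,
we see that

       \[ dx = \cos(u)\,du \]

and

       \[ \sqrt{1 - x^2} = \sqrt{1 - \sin^2(u)} = \sqrt{\cos^2(u)} = |\cos(u)|. \]       (6.6.1)

Since $\cos(u) \ge 0$ when $-\frac{\pi}{2} \le u \le \frac{\pi}{2}$, (6.6.1) becomes

       \[ \sqrt{1 - x^2} = \cos(u). \]

Thus

       \[ \int \frac{dx}{\sqrt{1 - x^2}} = \int \frac{\cos(u)}{\cos(u)}\,du = \int du = u + c = \sin^{-1}(x) + c. \]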


    Of course, there is nothing new in the result itself; it is the technique, which we may
generalize to other integrals of a similar type, which is of interest. Specifically, for an
integral with a factor of the form
       \[ \sqrt{a^2 - x^2} \]
or
       \[ \frac{1}{\sqrt{a^2 - x^2}}, \]
where a > 0, the substitution
       \[ x = a\sin(u), \quad -\frac{\pi}{2} \le u \le \frac{\pi}{2}, \]               (6.6.2)
may prove to be useful because of the simplification

       \[ \sqrt{a^2 - x^2} = \sqrt{a^2 - a^2\sin^2(u)} = \sqrt{a^2(1 - \sin^2(u))} = a\sqrt{\cos^2(u)} = a\cos(u). \]      (6.6.3)

Although this substitution is equivalent to the substitution $u = \sin^{-1}\left(\frac{x}{a}\right)$, we will see that
it is more convenient to work with it in the form $x = a\sin(u)$.

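Example     To evaluate the integral $\int \frac{1}{\sqrt{9 - x^2}}\,dx$, we make the substitution

       \[ x = 3\sin(u), \quad -\frac{\pi}{2} < u < \frac{\pi}{2}, \]
       \[ dx = 3\cos(u)\,du. \]

Note that we omit both $u = -\frac{\pi}{2}$ and $u = \frac{\pi}{2}$ since the function being integrated is not
defined at either x = -3 or x = 3. Then

       \[ \int \frac{1}{\sqrt{9 - x^2}}\,dx = \int \frac{3\cos(u)}{\sqrt{9 - 9\sin^2(u)}}\,du \]
       \[ = \int \frac{3\cos(u)}{3\sqrt{1 - \sin^2(u)}}\,du \]
       \[ = \int \frac{\cos(u)}{\sqrt{\cos^2(u)}}\,du \]
       \[ = \int \frac{\cos(u)}{\cos(u)}\,du \]
       \[ = \int du \]
       \[ = u + c. \]

Now $x = 3\sin(u)$ implies that $u = \sin^{-1}\left(\frac{x}{3}\right)$, so we have

       \[ \int \frac{1}{\sqrt{9 - x^2}}\,dx = \sin^{-1}\left(\frac{x}{3}\right) + c. \]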


Example     To evaluate $\int \sqrt{4 - x^2}\,dx$, we make the substitution

       \[ x = 2\sin(u), \quad -\frac{\pi}{2} \le u \le \frac{\pi}{2}, \]
       \[ dx = 2\cos(u)\,du. \]

Then

       \[ \int \sqrt{4 - x^2}\,dx = 2\int \sqrt{4 - 4\sin^2(u)}\,\cos(u)\,du \]
       \[ = 4\int \sqrt{1 - \sin^2(u)}\,\cos(u)\,du \]
       \[ = 4\int \sqrt{\cos^2(u)}\,\cos(u)\,du \]
       \[ = 4\int \cos^2(u)\,du \]
       \[ = 4\int \frac{1 + \cos(2u)}{2}\,du \]
       \[ = 2\int (1 + \cos(2u))\,du \]
       \[ = 2u + \sin(2u) + c \]
       \[ = 2u + 2\sin(u)\cos(u) + c. \]

Since $x = 2\sin(u)$, $\sin(u) = \frac{x}{2}$ and $u = \sin^{-1}\left(\frac{x}{2}\right)$. Moreover,

       \[ \cos^2(u) = 1 - \sin^2(u) = 1 - \frac{x^2}{4} = \frac{4 - x^2}{4}, \]

so
       \[ \cos(u) = \frac{1}{2}\sqrt{4 - x^2}, \]
where we have, once again, used the fact that $\cos(u) \ge 0$ since $-\frac{\pi}{2} \le u \le \frac{\pi}{2}$. Note that
this expression for cos(u) may also be deduced from Figure 6.6.1, where we have a right
triangle with an acute angle of size u such that $\sin(u) = \frac{x}{2}$. Putting everything together,
we have

       \[ \int \sqrt{4 - x^2}\,dx = 2\sin^{-1}\left(\frac{x}{2}\right) + \frac{1}{2}x\sqrt{4 - x^2} + c. \]

                     Figure 6.6.1 Right triangle with $\sin(u) = \frac{x}{2}$
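
    The antiderivative of sqrt(4 - x^2) found above can be checked in two ways: its slope
should match the integrand, and its change from x = -2 to x = 2 should equal the area of a
half disk of radius 2, namely 2 pi. The following sketch does both; the names and sample
points are illustrative assumptions.

```python
import math


def integrand(x):
    """sqrt(4 - x^2), the upper half of the circle of radius 2."""
    return math.sqrt(4.0 - x * x)


def antiderivative(x):
    """2 asin(x/2) + (1/2) x sqrt(4 - x^2), the result of the example."""
    return 2.0 * math.asin(x / 2.0) + 0.5 * x * math.sqrt(4.0 - x * x)


def slope(f, x, h=1e-5):
    """Symmetric difference quotient approximating f'(x)."""
    return (f(x + h) - f(x - h)) / (2 * h)


slope_error = max(abs(slope(antiderivative, x) - integrand(x)) for x in (-1.5, 0.0, 1.0))
half_disk = antiderivative(2.0) - antiderivative(-2.0)
print(slope_error, half_disk, 2.0 * math.pi)
```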


    Notice that a considerable amount of the work in the previous example involved
expressing the answer in terms of x once it had been found in terms of u. The next example
illustrates how this work is unnecessary when evaluating definite integrals since we can
change the limits of integration and, from that point on, do all our work in terms of u.
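Example     Recall that, for r > 0, the graph of $y = \sqrt{r^2 - x^2}$ is the upper half of a circle
of radius r centered at the origin. Hence we should have

       \[ 2\int_{-r}^{r} \sqrt{r^2 - x^2}\,dx = \pi r^2. \]

We are now able to verify this. Let
       \[ x = r\sin(u), \quad -\frac{\pi}{2} \le u \le \frac{\pi}{2}, \]
       \[ dx = r\cos(u)\,du. \]

Then $u = \sin^{-1}\left(\frac{x}{r}\right)$, so when x = -r,

       \[ u = \sin^{-1}(-1) = -\frac{\pi}{2}, \]
and when x = r,
       \[ u = \sin^{-1}(1) = \frac{\pi}{2}. \]

Thus
       \[ 2\int_{-r}^{r} \sqrt{r^2 - x^2}\,dx = 2\int_{-\pi/2}^{\pi/2} \sqrt{r^2 - r^2\sin^2(u)}\; r\cos(u)\,du \]
       \[ = 2\int_{-\pi/2}^{\pi/2} r^2\sqrt{\cos^2(u)}\,\cos(u)\,du \]
       \[ = 2r^2\int_{-\pi/2}^{\pi/2} \cos^2(u)\,du \]
       \[ = r^2\int_{-\pi/2}^{\pi/2} (1 + \cos(2u))\,du \]
       \[ = r^2\left(u + \frac{1}{2}\sin(2u)\right)\Big|_{-\pi/2}^{\pi/2} \]
       \[ = r^2\left(\left(\frac{\pi}{2} + 0\right) - \left(-\frac{\pi}{2} + 0\right)\right) \]
       \[ = \pi r^2. \]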
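
    To see the change of limits at work numerically, the sketch below approximates the
area integral twice with a simple midpoint rule: once in the original variable x on
[-r, r], and once in the variable u on [-pi/2, pi/2] after the substitution x = r sin(u).
The radius r = 3, the helper midpoint_rule, and the numbers of subintervals are arbitrary
choices for illustration.

```python
import math


def midpoint_rule(f, a, b, n):
    """Approximate the integral of f over [a, b] using n midpoint rectangles."""
    h = (b - a) / n
    return h * sum(f(a + (i + 0.5) * h) for i in range(n))


r = 3.0

# Original form: 2 * integral of sqrt(r^2 - x^2) for x from -r to r.
in_x = 2.0 * midpoint_rule(lambda x: math.sqrt(r * r - x * x), -r, r, 100_000)

# After x = r sin(u): 2 r^2 * integral of cos^2(u) for u from -pi/2 to pi/2.
in_u = 2.0 * r * r * midpoint_rule(
    lambda u: math.cos(u) ** 2, -math.pi / 2, math.pi / 2, 1_000
)

area = math.pi * r * r
print(in_x, in_u, area)
```

The integrand in u is smooth on the whole interval, so far fewer rectangles are needed
there than in x, where the integrand has a vertical tangent at each endpoint.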


Tangent substitutions
In a similar fashion, the substitution
       \[ x = a\tan(u), \quad -\frac{\pi}{2} < u < \frac{\pi}{2}, \]                   (6.6.4)
may be useful for integrals which have a factor of the form

       \[ \sqrt{a^2 + x^2}, \]

       \[ \frac{1}{a^2 + x^2}, \]
or
       \[ \frac{1}{\sqrt{a^2 + x^2}} \]
because of the simplification

       \[ a^2 + x^2 = a^2 + a^2\tan^2(u) = a^2(1 + \tan^2(u)) = a^2\sec^2(u). \]      (6.6.5)

Note that with our restriction on u, this substitution is equivalent to the substitution
$u = \tan^{-1}\left(\frac{x}{a}\right)$.


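Example     To evaluate the integral $\int \frac{1}{4 + x^2}\,dx$, we make the substitution

       \[ x = 2\tan(u), \quad -\frac{\pi}{2} < u < \frac{\pi}{2}, \]
       \[ dx = 2\sec^2(u)\,du. \]

Then

       \[ \int \frac{1}{4 + x^2}\,dx = \int \frac{2\sec^2(u)}{4 + 4\tan^2(u)}\,du \]
       \[ = \frac{1}{2}\int \frac{\sec^2(u)}{1 + \tan^2(u)}\,du \]
       \[ = \frac{1}{2}\int \frac{\sec^2(u)}{\sec^2(u)}\,du \]
       \[ = \frac{1}{2}\int du \]
       \[ = \frac{1}{2}u + c. \]

Since $x = 2\tan(u)$, $u = \tan^{-1}\left(\frac{x}{2}\right)$. Thus

       \[ \int \frac{1}{4 + x^2}\,dx = \frac{1}{2}\tan^{-1}\left(\frac{x}{2}\right) + c. \]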
22


Example     To evaluate $\int \frac{1}{\sqrt{1 + x^2}}\,dx$, we make the substitution

       \[ x = \tan(u), \quad -\frac{\pi}{2} < u < \frac{\pi}{2}, \]
       \[ dx = \sec^2(u)\,du. \]

Then

       \[ \sqrt{1 + x^2} = \sqrt{1 + \tan^2(u)} = \sqrt{\sec^2(u)} = |\sec(u)|. \]

Since $-\frac{\pi}{2} < u < \frac{\pi}{2}$, $\sec(u) > 0$, so $|\sec(u)| = \sec(u)$. Hence

       \[ \int \frac{1}{\sqrt{1 + x^2}}\,dx = \int \frac{\sec^2(u)}{\sec(u)}\,du \]
       \[ = \int \sec(u)\,du \]
       \[ = \log|\sec(u) + \tan(u)| + c, \]

where the final integral follows from an example in Section 6.2. Now $\tan(u) = x$, so

       \[ \sec^2(u) = 1 + \tan^2(u) = 1 + x^2. \]

Since $\sec(u) > 0$, it follows that

       \[ \sec(u) = \sqrt{1 + x^2}. \]

Note that this expression for sec(u) may also be deduced from Figure 6.6.2, where we have
a right triangle with an acute angle of size u such that $\tan(u) = x$. Thus

       \[ \int \frac{1}{\sqrt{1 + x^2}}\,dx = \log\left|\sqrt{1 + x^2} + x\right| + c. \]

                     Figure 6.6.2 Right triangle with $\tan(u) = x$


Secant substitutions
For integrals involving a factor of the form

       \[ \sqrt{x^2 - a^2} \]
or
       \[ \frac{1}{\sqrt{x^2 - a^2}}, \]
where a > 0, the substitution
       \[ x = a\sec(u) \]                                                          (6.6.6)
may be useful. With this substitution,

       \[ \sqrt{x^2 - a^2} = \sqrt{a^2\sec^2(u) - a^2} = a\sqrt{\sec^2(u) - 1} = a\sqrt{\tan^2(u)} = a|\tan(u)|. \]      (6.6.7)

Now $\sqrt{x^2 - a^2}$ is meaningful only if either $x \ge a$ or $x \le -a$. Since $x = a\sec(u)$, the former
case corresponds to u in the interval $[0, \frac{\pi}{2})$ and the latter to u in the interval $(\frac{\pi}{2}, \pi]$. For
$0 \le u < \frac{\pi}{2}$, $\tan(u) \ge 0$, so
       \[ \sqrt{x^2 - a^2} = a\tan(u); \]
for $\frac{\pi}{2} < u \le \pi$, $\tan(u) \le 0$, so

       \[ \sqrt{x^2 - a^2} = -a\tan(u). \]

Hence it is important when evaluating integrals of this type to be careful about which
values of x are of interest.

Example     To evaluate $\int \frac{1}{\sqrt{x^2 - 9}}\,dx$ for x > 3, we make the substitution

       \[ x = 3\sec(u), \quad 0 < u < \frac{\pi}{2}, \]
       \[ dx = 3\sec(u)\tan(u)\,du. \]

Then
       \[ \int \frac{1}{\sqrt{x^2 - 9}}\,dx = \int \frac{3\sec(u)\tan(u)}{\sqrt{9\sec^2(u) - 9}}\,du \]
       \[ = \int \frac{3\sec(u)\tan(u)}{3\sqrt{\sec^2(u) - 1}}\,du \]
       \[ = \int \frac{\sec(u)\tan(u)}{\sqrt{\tan^2(u)}}\,du \]
       \[ = \int \frac{\sec(u)\tan(u)}{\tan(u)}\,du \]
       \[ = \int \sec(u)\,du \]
       \[ = \log|\sec(u) + \tan(u)| + c. \]
Now $\sec(u) = \frac{x}{3}$, so

       \[ \tan^2(u) = \sec^2(u) - 1 = \frac{x^2}{9} - 1 = \frac{x^2 - 9}{9}. \]
Hence
       \[ \tan(u) = \frac{1}{3}\sqrt{x^2 - 9}, \]
where we have used the fact that $\tan(u) > 0$ since $0 < u < \frac{\pi}{2}$. Note that this expression
for tan(u) may also be deduced from Figure 6.6.3, where we have a right triangle with an
acute angle of size u such that $\sec(u) = \frac{x}{3}$. Thus

       \[ \int \frac{1}{\sqrt{x^2 - 9}}\,dx = \log\left|\frac{x}{3} + \frac{1}{3}\sqrt{x^2 - 9}\right| + c = \log\left|x + \sqrt{x^2 - 9}\right| - \log(3) + c. \]

Since log(3) is a constant, we may combine it with the arbitrary constant of integration.
Moreover, since we are assuming x > 3, we may remove the absolute value and write

       \[ \int \frac{1}{\sqrt{x^2 - 9}}\,dx = \log\left(x + \sqrt{x^2 - 9}\right) + c. \]

                     Figure 6.6.3 Right triangle with $\sec(u) = \frac{x}{3}$
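
    As with the earlier examples, the result of the secant substitution can be spot
checked by differentiating numerically. The sketch below assumes x > 3, as in the example;
the function names and sample points are illustrative.

```python
import math


def integrand(x):
    """1 / sqrt(x^2 - 9), defined here for x > 3."""
    return 1.0 / math.sqrt(x * x - 9.0)


def antiderivative(x):
    """log(x + sqrt(x^2 - 9)), the result of the example for x > 3."""
    return math.log(x + math.sqrt(x * x - 9.0))


def slope(f, x, h=1e-6):
    """Symmetric difference quotient approximating f'(x)."""
    return (f(x + h) - f(x - h)) / (2 * h)


errors = [abs(slope(antiderivative, x) - integrand(x)) for x in (3.5, 5.0, 10.0)]
print(max(errors))
```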


Problems

1. Evaluate the following integrals.

    (a) f     -     d
         16 -\/26

    (c) f   5-z2dz

          /1
    (e) J  +d4+cx2

 2. Evaluate the following integrals.

    (a) fz 1-z2dz


    (c)fx2_4 dx, x > 2

       J      4    d
    (e)             do/_2x


(b)  4-cc2 dx

(d)   6 +  dc

         4x
    S1 + x2 d


(b)      1   )  dc
     J(1 - x2

(d) f    4 dx,x < -2

         3
   M 5+2x2 d


﻿




   (g) f  4-t2 dt

3. Evaluate the following integrals.

   (a)     9-t2dt
       /3 1
   (c)             dx
      Jo (1+x2)2
      e    10

   (e) ]    v/ 2-25 dx
(h) f1-4x2 dx


(b)      1 + x2 dx


(d)   x2 1 - x2 dx

      2    3
(f)    X2         dx
        S x2-1

4. Evaluate

       \[ \int \frac{1}{1 - x^2}\,dx \]

   using (a) partial fractions and (b) the substitution $x = \sin(u)$. How do the two methods
   compare?

5. Evaluate

       \[ \int \frac{1}{\sqrt{1 - x^2}}\,dx \]

   using the substitution $x = \cos(u)$ with $0 < u < \pi$.

6. (a) Evaluate

       \[ \int \frac{1}{\sqrt{x^2 - 9}}\,dx \]

       for x < -3.
   (b) Show that

       \[ \int \frac{1}{\sqrt{x^2 - 9}}\,dx = \log\left|x + \sqrt{x^2 - 9}\right| + c, \]

       for both x > 3 and x < -3.


﻿


Section 6.7


Hyperbolic Functions


The final class of functions we will consider are the hyperbolic functions. In a sense these
functions are not new to us since they may all be expressed in terms of the exponential
function and its inverse, the natural logarithm function. However, we will see that they
have many interesting and useful properties.
Definition For any real number x, the hyperbolic sine of x, denoted sinh(x), is defined
by
       \[ \sinh(x) = \frac{1}{2}\left(e^{x} - e^{-x}\right) \]                         (6.7.1)
and the hyperbolic cosine of x, denoted cosh(x), is defined by
       \[ \cosh(x) = \frac{1}{2}\left(e^{x} + e^{-x}\right). \]                        (6.7.2)

    Note that, for any real number t,

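$$\begin{aligned}
\cosh^2(t) - \sinh^2(t) &= \frac{1}{4}\left(e^{t} + e^{-t}\right)^2 - \frac{1}{4}\left(e^{t} - e^{-t}\right)^2 \\
&= \frac{1}{4}\left(e^{2t} + 2e^{t}e^{-t} + e^{-2t}\right) - \frac{1}{4}\left(e^{2t} - 2e^{t}e^{-t} + e^{-2t}\right) \\
&= \frac{1}{4}(2 + 2) \\
&= 1.
\end{aligned}$$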

Thus we have the useful identity

$$\cosh^2(t) - \sinh^2(t) = 1 \tag{6.7.3}$$

for any real number t. Put another way, (cosh(t), sinh(t)) is a point on the hyperbola
$x^2 - y^2 = 1$. Hence we see an analogy between the hyperbolic cosine and sine functions and
the cosine and sine functions: Whereas (cos(t), sin(t)) is a point on the circle $x^2 + y^2 = 1$,
(cosh(t), sinh(t)) is a point on the hyperbola $x^2 - y^2 = 1$. In fact, the cosine and sine
functions are sometimes referred to as the circular cosine and sine functions. We shall see
many more similarities between the hyperbolic trigonometric functions and their circular
counterparts as we proceed with our discussion.
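
The definitions (6.7.1) and (6.7.2) and the identity (6.7.3) can be checked numerically. The following is a minimal sketch in Python, assuming only the standard `math` module; the function names `sinh`, `cosh`, and `identity_residual` are illustrative choices and not part of the text.

```python
import math

def sinh(x):
    """Hyperbolic sine computed directly from definition (6.7.1)."""
    return 0.5 * (math.exp(x) - math.exp(-x))

def cosh(x):
    """Hyperbolic cosine computed directly from definition (6.7.2)."""
    return 0.5 * (math.exp(x) + math.exp(-x))

def identity_residual(t):
    """Return cosh^2(t) - sinh^2(t) - 1, which should be 0 up to rounding."""
    return cosh(t) ** 2 - sinh(t) ** 2 - 1.0

# The point (cosh(t), sinh(t)) should lie on the hyperbola x^2 - y^2 = 1.
for t in (-2.0, -0.5, 0.0, 1.0, 3.0):
    print(t, cosh(t), sinh(t), identity_residual(t))
```

The residual is only approximately zero because of floating-point rounding, and the rounding error grows with |t| since both squares become large.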
    To understand the graphs of the hyperbolic sine and cosine functions, we first note
that, for any value of x,
$$\sinh(-x) = \frac{1}{2}\left(e^{-x} - e^{x}\right) = -\sinh(x), \tag{6.7.4}$$




Copyright @ by Dan Sloughter 2000


﻿


and
$$\cosh(-x) = \frac{1}{2}\left(e^{-x} + e^{x}\right) = \cosh(x). \tag{6.7.5}$$

[Figure 6.7.1: Graph of y = sinh(x)]


Now for large values of x, $e^{-x} \approx 0$, in which case
$$\sinh(x) = \frac{1}{2}\left(e^{x} - e^{-x}\right) \approx \frac{1}{2}e^{x}$$
and
$$\sinh(-x) = -\sinh(x) \approx -\frac{1}{2}e^{x}.$$
Thus the graph of y = sinh(x) appears as in Figure 6.7.1. Similarly, for large values of x,
$$\cosh(x) = \frac{1}{2}\left(e^{x} + e^{-x}\right) \approx \frac{1}{2}e^{x}$$
and
$$\cosh(-x) = \cosh(x) \approx \frac{1}{2}e^{x}.$$
The graph of y = cosh(x) is shown in Figure 6.7.2.
    The derivatives of the hyperbolic sine and cosine functions follow immediately from
their definitions. Namely,


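$$\frac{d}{dx}\sinh(x) = \frac{d}{dx}\,\frac{1}{2}\left(e^{x} - e^{-x}\right) = \frac{1}{2}\left(e^{x} + e^{-x}\right) = \cosh(x)$$
and
$$\frac{d}{dx}\cosh(x) = \frac{d}{dx}\,\frac{1}{2}\left(e^{x} + e^{-x}\right) = \frac{1}{2}\left(e^{x} - e^{-x}\right) = \sinh(x).$$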


Here again we see similarities between the circular and hyperbolic sine and cosine functions.


﻿


[Figure 6.7.2: Graph of y = cosh(x)]


Proposition
$$\frac{d}{dx}\sinh(x) = \cosh(x). \tag{6.7.6}$$
$$\frac{d}{dx}\cosh(x) = \sinh(x). \tag{6.7.7}$$

    As a consequence of this proposition, we also have
$$\int \sinh(x)\,dx = \cosh(x) + c \tag{6.7.8}$$
and
$$\int \cosh(x)\,dx = \sinh(x) + c. \tag{6.7.9}$$

Example Using the chain rule, we have
$$\frac{d}{dx}\sinh^2(3x) = 2\sinh(3x)\,\frac{d}{dx}\sinh(3x) = 6\sinh(3x)\cosh(3x).$$

Example Using the chain and product rules, we have
$$\begin{aligned}
\frac{d}{dx}\sinh(2x)\cosh(2x) &= \sinh(2x)(2\sinh(2x)) + \cosh(2x)(2\cosh(2x)) \\
&= 2\sinh^2(2x) + 2\cosh^2(2x).
\end{aligned}$$


Example Analogous to
$$\int \sin(3x)\,dx = -\frac{1}{3}\cos(3x) + c,$$
we have
$$\int \sinh(3x)\,dx = \frac{1}{3}\cosh(3x) + c.$$

[Figure 6.7.3: Graph of y = sinh^{-1}(x)]


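Example It is tempting to evaluate
$$\int e^{-x}\sinh(x)\,dx$$
using integration by parts in the same manner that we would evaluate
$$\int e^{-x}\sin(x)\,dx.$$
However, this integral is much easier if we notice that
$$e^{-x}\sinh(x) = e^{-x}\cdot\frac{1}{2}\left(e^{x} - e^{-x}\right) = \frac{1}{2}\left(1 - e^{-2x}\right).$$
Hence
$$\int e^{-x}\sinh(x)\,dx = \frac{1}{2}\int \left(1 - e^{-2x}\right)dx = \frac{x}{2} + \frac{e^{-2x}}{4} + c.$$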
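
One way to sanity-check an antiderivative such as the one in this example is to differentiate it numerically and compare with the integrand. The sketch below assumes Python's standard `math` module; the helper names `integrand`, `antiderivative`, and `central_difference`, and the step size `h`, are illustrative assumptions rather than anything prescribed by the text.

```python
import math

def integrand(x):
    """The integrand e^{-x} sinh(x) from the example."""
    return math.exp(-x) * math.sinh(x)

def antiderivative(x):
    """The antiderivative x/2 + e^{-2x}/4 found in the example (taking c = 0)."""
    return 0.5 * x + 0.25 * math.exp(-2.0 * x)

def central_difference(f, x, h=1e-5):
    """Approximate f'(x) with a symmetric difference quotient."""
    return (f(x + h) - f(x - h)) / (2.0 * h)

# The two columns should agree to several decimal places.
for x in (-1.0, 0.0, 0.5, 2.0):
    print(x, central_difference(antiderivative, x), integrand(x))
```

Agreement here is evidence, not proof, that the antiderivative is correct; the difference quotient carries its own truncation and rounding error.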


    Since $\frac{d}{dx}\sinh(x) = \cosh(x) > 0$ for all x, the hyperbolic sine function is increasing
on the interval $(-\infty, \infty)$. Thus it has an inverse function, called the inverse hyperbolic
sine function, with value at x denoted by $\sinh^{-1}(x)$. Since the domain and range of the
hyperbolic sine function are both $(-\infty, \infty)$, the domain and range of the inverse hyperbolic
sine function are also both $(-\infty, \infty)$. As usual with inverse functions,
$$y = \sinh^{-1}(x) \text{ if and only if } \sinh(y) = x. \tag{6.7.10}$$

The graph of $y = \sinh^{-1}(x)$ is shown in Figure 6.7.3.


﻿


Example The hyperbolic sine function and its inverse provide an alternative method
for evaluating
$$\int \frac{1}{\sqrt{1 + x^2}}\,dx.$$
Namely, if we make the substitution
$$x = \sinh(u),\quad -\infty < u < \infty,$$
$$dx = \cosh(u)\,du,$$
then
$$\sqrt{1 + x^2} = \sqrt{1 + \sinh^2(u)} = \sqrt{\cosh^2(u)} = \cosh(u),$$
where the second equality follows from the identity $\cosh^2(u) - \sinh^2(u) = 1$ and the last
equality from the fact that cosh(u) > 0 for all u. Hence
$$\int \frac{1}{\sqrt{1 + x^2}}\,dx = \int \frac{\cosh(u)}{\cosh(u)}\,du = \int du = u + c = \sinh^{-1}(x) + c.$$

    The following proposition is a consequence of the previous example.
Proposition
$$\frac{d}{dx}\sinh^{-1}(x) = \frac{1}{\sqrt{1 + x^2}}. \tag{6.7.11}$$

    In Section 6.6 we saw, using the substitution x = tan(u), $-\frac{\pi}{2} < u < \frac{\pi}{2}$, that
$$\int \frac{1}{\sqrt{1 + x^2}}\,dx = \log\left(x + \sqrt{1 + x^2}\right) + c.$$
Since two antiderivatives of a function can differ at most by a constant, there must exist
a constant k such that
$$\sinh^{-1}(x) = \log\left(x + \sqrt{1 + x^2}\right) + k$$
for all x. Evaluating both sides of this equality at x = 0, we have
$$0 = \sinh^{-1}(0) = \log(1) + k = k.$$
Thus k = 0 and
$$\sinh^{-1}(x) = \log\left(x + \sqrt{1 + x^2}\right) \tag{6.7.12}$$

for all x. Since the hyperbolic sine function is defined in terms of the exponential function,
we should not find it surprising that the inverse hyperbolic sine function may be expressed
in terms of the natural logarithm function.
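
Formula (6.7.12) can be compared against a library implementation of the inverse hyperbolic sine. This is a minimal sketch assuming Python's standard `math.asinh`; the name `asinh_via_log` is a hypothetical helper introduced only for illustration.

```python
import math

def asinh_via_log(x):
    """Inverse hyperbolic sine computed from the logarithm formula (6.7.12)."""
    return math.log(x + math.sqrt(1.0 + x * x))

# Compare with the library function at a few sample points.
for x in (-3.0, -0.5, 0.0, 2.0, 10.0):
    print(x, asinh_via_log(x), math.asinh(x))
```

For large negative x the sum inside the logarithm subtracts two nearly equal numbers, so a direct evaluation of the formula loses accuracy there; the sample points above stay in a range where that effect is negligible.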
    Similarly, since $\frac{d}{dx}\cosh(x) = \sinh(x) > 0$ for all x > 0, the hyperbolic cosine function
is increasing on the interval $[0, \infty)$, and so has an inverse if we restrict its domain to $[0, \infty)$.
That is, we define the inverse hyperbolic cosine function by the relationship
$$y = \cosh^{-1}(x) \text{ if and only if } x = \cosh(y), \tag{6.7.13}$$


﻿


where we require $y \ge 0$. Note that since $\cosh(x) \ge 1$ for all x, the domain of the inverse
hyperbolic cosine function is $[1, \infty)$. The graph of $y = \cosh^{-1}(x)$ is shown in Figure 6.7.4.

[Figure 6.7.4: Graph of y = cosh^{-1}(x)]

    In Problem 3 at the end of this section you are asked to show that
$$\int \frac{1}{\sqrt{x^2 - 1}}\,dx = \cosh^{-1}(x) + c$$

for x > 1, from which the following proposition follows.

Proposition
$$\frac{d}{dx}\cosh^{-1}(x) = \frac{1}{\sqrt{x^2 - 1}}. \tag{6.7.14}$$

In the same problem you are asked to show that, for x > 1,
$$\cosh^{-1}(x) = \log\left(x + \sqrt{x^2 - 1}\right). \tag{6.7.15}$$


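Example In Section 6.6 we evaluated the integral
$$\int \frac{1}{\sqrt{x^2 - 9}}\,dx$$
for x > 3, using the substitution x = 3 sec(u), $0 < u < \frac{\pi}{2}$. The substitution
$$x = 3\cosh(u),\quad u > 0,$$
$$dx = 3\sinh(u)\,du$$
provides a somewhat simpler approach. Namely,
$$\begin{aligned}
\int \frac{1}{\sqrt{x^2 - 9}}\,dx &= \int \frac{3\sinh(u)}{\sqrt{9\cosh^2(u) - 9}}\,du \\
&= \int \frac{3\sinh(u)}{3\sqrt{\cosh^2(u) - 1}}\,du \\
&= \int \frac{\sinh(u)}{\sqrt{\sinh^2(u)}}\,du \\
&= \int \frac{\sinh(u)}{\sinh(u)}\,du \\
&= \int du \\
&= u + c \\
&= \cosh^{-1}\left(\frac{x}{3}\right) + c,
\end{aligned}$$
where we have used the fact that sinh(u) > 0 when u > 0.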
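
The result of this example and the logarithmic answer obtained in Section 6.6 are both antiderivatives of the same function, so by (6.7.15) they should differ only by the constant log(3). The sketch below checks this numerically; it assumes Python's standard `math` module, and the two function names are hypothetical labels for the two forms.

```python
import math

def antiderivative_cosh_form(x):
    """cosh^{-1}(x/3), the form obtained with x = 3 cosh(u); valid for x > 3."""
    return math.acosh(x / 3.0)

def antiderivative_log_form(x):
    """log(x + sqrt(x^2 - 9)), the form obtained with x = 3 sec(u); valid for x > 3."""
    return math.log(x + math.sqrt(x * x - 9.0))

# The difference of the two forms should be the same constant, log(3), at every x > 3.
for x in (3.5, 5.0, 20.0):
    print(x, antiderivative_log_form(x) - antiderivative_cosh_form(x), math.log(3.0))
```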
    Having defined the hyperbolic sine and cosine functions, it is possible to define four
more hyperbolic trigonometric functions in analogy with the circular trigonometric functions.
Namely, the hyperbolic tangent function is given by
$$\tanh(x) = \frac{\sinh(x)}{\cosh(x)}, \tag{6.7.16}$$
where $-\infty < x < \infty$; the hyperbolic cotangent function by
$$\coth(x) = \frac{\cosh(x)}{\sinh(x)}, \tag{6.7.17}$$
where $x \ne 0$; the hyperbolic secant function by
$$\operatorname{sech}(x) = \frac{1}{\cosh(x)}, \tag{6.7.18}$$
where $-\infty < x < \infty$; and the hyperbolic cosecant function by
$$\operatorname{csch}(x) = \frac{1}{\sinh(x)}, \tag{6.7.19}$$
where $x \ne 0$. In Problem 5 at the end of this section you are asked to verify the following
results.
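Proposition
$$\frac{d}{dx}\tanh(x) = \operatorname{sech}^2(x). \tag{6.7.20}$$

Proposition
$$\frac{d}{dx}\coth(x) = -\operatorname{csch}^2(x). \tag{6.7.21}$$

Proposition
$$\frac{d}{dx}\operatorname{sech}(x) = -\operatorname{sech}(x)\tanh(x). \tag{6.7.22}$$

Proposition
$$\frac{d}{dx}\operatorname{csch}(x) = -\operatorname{csch}(x)\coth(x). \tag{6.7.23}$$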


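Since
$$\tanh(x) = \frac{\sinh(x)}{\cosh(x)} = \frac{e^{x} - e^{-x}}{e^{x} + e^{-x}},$$
we have
$$\begin{aligned}
\lim_{x\to\infty}\tanh(x) &= \lim_{x\to\infty}\frac{e^{x} - e^{-x}}{e^{x} + e^{-x}} \\
&= \lim_{x\to\infty}\frac{e^{x}\left(1 - e^{-2x}\right)}{e^{x}\left(1 + e^{-2x}\right)} \\
&= \lim_{x\to\infty}\frac{1 - e^{-2x}}{1 + e^{-2x}} \\
&= 1
\end{aligned}$$
and
$$\begin{aligned}
\lim_{x\to-\infty}\tanh(x) &= \lim_{x\to-\infty}\frac{e^{x} - e^{-x}}{e^{x} + e^{-x}} \\
&= \lim_{x\to-\infty}\frac{e^{-x}\left(e^{2x} - 1\right)}{e^{-x}\left(e^{2x} + 1\right)} \\
&= \lim_{x\to-\infty}\frac{e^{2x} - 1}{e^{2x} + 1} \\
&= -1.
\end{aligned}$$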

Hence y = 1 and y = -1 are both horizontal asymptotes for the graph of y = tanh(x).
Combining this information with tanh(0) = 0 and
$$\frac{d}{dx}\tanh(x) = \operatorname{sech}^2(x) > 0$$
for all x, we can see why the graph of y = tanh(x) looks as it does in Figure 6.7.5.

[Figure 6.7.5: Graph of y = tanh(x)]

    Since the hyperbolic tangent function is increasing on $(-\infty, \infty)$, it has an inverse, called
the inverse hyperbolic tangent function, with value at x denoted by $\tanh^{-1}(x)$. That is, as
usual,
$$y = \tanh^{-1}(x) \text{ if and only if } \tanh(y) = x. \tag{6.7.24}$$
The domain of the inverse hyperbolic tangent function is (-1, 1), the range of the hyperbolic
tangent function, and its range is $(-\infty, \infty)$, the domain of the hyperbolic tangent
function. Corresponding to the horizontal asymptotes of the graph of the hyperbolic tangent
function, the graph of the inverse hyperbolic tangent function has vertical asymptotes
x = -1 and x = 1, as shown in Figure 6.7.6.

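Example As an alternative to using partial fractions, we may evaluate the integral
$$\int \frac{1}{1 - x^2}\,dx$$
for -1 < x < 1 using the substitution
$$x = \tanh(u),\quad -\infty < u < \infty,$$
$$dx = \operatorname{sech}^2(u)\,du.$$
Then
$$\int \frac{1}{1 - x^2}\,dx = \int \frac{\operatorname{sech}^2(u)}{1 - \tanh^2(u)}\,du.$$
Now from the identity
$$\cosh^2(x) - \sinh^2(x) = 1$$
we obtain
$$\frac{\cosh^2(x)}{\cosh^2(x)} - \frac{\sinh^2(x)}{\cosh^2(x)} = \frac{1}{\cosh^2(x)}.$$
In other words,
$$1 - \tanh^2(x) = \operatorname{sech}^2(x). \tag{6.7.25}$$
Hence
$$\int \frac{1}{1 - x^2}\,dx = \int \frac{\operatorname{sech}^2(u)}{\operatorname{sech}^2(u)}\,du = \int du = u + c = \tanh^{-1}(x) + c.$$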


﻿


[Figure 6.7.6: Graph of y = tanh^{-1}(x)]


Note that (6.7.25) gives us the useful identity
$$\tanh^2(x) + \operatorname{sech}^2(x) = 1 \tag{6.7.26}$$


for all x. Moreover, we have the following proposition as a consequence of this example.


Proposition
$$\frac{d}{dx}\tanh^{-1}(x) = \frac{1}{1 - x^2}. \tag{6.7.27}$$


    If we were to use partial fractions to evaluate the integral of the previous example, we
would obtain, for -1 < x < 1,
$$\int \frac{1}{1 - x^2}\,dx = \frac{1}{2}\log(1 + x) - \frac{1}{2}\log(1 - x) + c = \frac{1}{2}\log\left(\frac{1 + x}{1 - x}\right) + c.$$
It follows that
$$\tanh^{-1}(x) = \frac{1}{2}\log\left(\frac{1 + x}{1 - x}\right) + k$$
for some constant k. Evaluating at 0, we have
$$0 = 0 + k.$$
Thus k = 0 and we have
$$\tanh^{-1}(x) = \frac{1}{2}\log\left(\frac{1 + x}{1 - x}\right) \tag{6.7.28}$$
for -1 < x < 1.
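
Formula (6.7.28) can be checked against a library implementation in the same way as (6.7.12). The following sketch assumes Python's standard `math.atanh`; `atanh_via_log` is a hypothetical helper name, and the domain check reflects the restriction -1 < x < 1 stated above.

```python
import math

def atanh_via_log(x):
    """Inverse hyperbolic tangent from the logarithm formula (6.7.28)."""
    if not -1.0 < x < 1.0:
        raise ValueError("tanh^{-1}(x) is defined only for -1 < x < 1")
    return 0.5 * math.log((1.0 + x) / (1.0 - x))

# Values grow without bound as x approaches the vertical asymptotes x = -1 and x = 1.
for x in (-0.9, -0.2, 0.0, 0.5, 0.99):
    print(x, atanh_via_log(x), math.atanh(x))
```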


﻿


Problems

1. Differentiate each of the following functions.
    (a) $f(x) = \sinh(3x)$                      (b) $g(t) = 3t\cosh(4t)$
    (c) $f(t) = 3t\sinh(t)\cosh(2t)$            (d) $g(x) = 4x\sinh(3x^2 - 1)$
    (e) $y(t) = 5t^2\cosh^2(4t)$                (f) $f(t) = 3\cosh^2(2t) - 13\sinh(3t^2)$

2. Evaluate each of the following integrals.
    (a) $\int \sinh(3x)\,dx$                    (b) $\int \cosh(4t - 3)\,dt$
    (c) $\int \sinh(z)\cosh(z)\,dz$             (d) $\int 3x\sinh(2x)\,dx$
    (e) $\int e^{-2t}\cosh(2t)\,dt$             (f) $\int \cosh^2(x)\sinh(x)\,dx$
    (g) $\int 5t^2\cosh(2t)\,dt$                (h) $\int \frac{\sinh(t)}{\cosh^2(t)}\,dt$

3. (a) Use the substitution x = cosh(u), u > 0, to show that
$$\int \frac{1}{\sqrt{x^2 - 1}}\,dx = \cosh^{-1}(x) + c$$
       for x > 1.
   (b) Use the substitution x = sec(u), $0 < u < \frac{\pi}{2}$, to show that
$$\int \frac{1}{\sqrt{x^2 - 1}}\,dx = \log\left(x + \sqrt{x^2 - 1}\right) + c$$
       for x > 1.
   (c) Using (a) and (b), show that
$$\cosh^{-1}(x) = \log\left(x + \sqrt{x^2 - 1}\right)$$
       for x > 1.
4. Evaluate the following integrals.

       /     1
   (a) J    +cd4+x2d

      /'     3
   (c) J   9 + 3t2 dt

5. Verify the following derivatives.

   (a)    tanh(x) = sech2(x)


(b)            dx, x > 2
         2_4
(d)f      2    dx, x<


(b)    coth(x) =_-csch2(


1


(I)


﻿




   (c) Jsech(x) = -sech(x) tanh(x)

6. Differentiate each of the following functions.
   (a) f (x) = 3x tanh(4x)
   (c) h(O) = 4tanh2(O)sech(O)
7. Evaluate each of the following integrals.


(d)  csch(x)


(b) g(t) =sech
(d) f (x) =5


.csch(x) coth(x)


h2(3t)
sech(4x) - 21 tanh3(4x)


   (a) ftanh(x)dx                             (b) ftanh(2x)sech(2x)dx

   (c)   4 1x2 dx                             (d) f     5-3t2 dt

8. Graph each of the following functions on an appropriate interval.
   (a) y = sech(x)                            (b) y = coth(x)
   (c) y = csch(x)                            (d) y = 3tanh(4x)


﻿


Section 7.1


The Algebra of Complex Numbers


At this point we have considered only real-valued functions of a real variable. That is,
all of our work has centered on functions of the form $f : \mathbb{R} \to \mathbb{R}$, functions which take a
real number to a real number. In this chapter we will discuss complex numbers and the
calculus of associated functions. In particular, if we let $\mathbb{C}$ represent the set of all complex
numbers, then we will be interested in functions of the form $f : \mathbb{R} \to \mathbb{C}$ and $f : \mathbb{C} \to \mathbb{C}$.
We will begin the story in this section with a discussion of what complex numbers are and
how we work with them.
    Perhaps because of their name, it is sometimes thought that complex numbers are in
some way more mysterious than real numbers, that a number such as $i = \sqrt{-1}$ is not as
"real" as a number like 2 or -351.127 or even $\pi$. However, all of these numbers are equally
meaningful; they are all useful mathematical abstractions. Although complex numbers
are a relatively recent invention of mathematics, dating back just over 200 years in their
current form, it is also the case that negative numbers, which were once called fictitious
numbers to indicate that they were less "real" than positive numbers, have only been
accepted for about the same period of time, and we have only started to understand the
nature of real numbers during the past 150 years or so. In fact, if you think about their
underlying meaning, $\pi$ is a far more "complex" number than i.
    Although complex numbers originate with attempts to solve certain algebraic equa-
tions, such as
$$x^2 + 1 = 0,$$

we will give a geometric definition which identifies complex numbers with points in the
plane. This definition not only gives complex numbers a concrete geometrical meaning,
but also provides us with a powerful algebraic tool for working with points in the plane.

Definition   A complex number is an ordered pair of real numbers with addition defined
by
$$(a, b) + (c, d) = (a + c, b + d) \tag{7.1.1}$$
and multiplication defined by
$$(a, b) \times (c, d) = (ac - bd, ad + bc), \tag{7.1.2}$$
where a, b, c, and d are any real numbers.
    We will let i denote the complex number (0, 1). Then, by our definition of multiplication,
$$i^2 = (0, 1) \times (0, 1) = (0 - 1, 0 + 0) = (-1, 0). \tag{7.1.3}$$


[Figure 7.1.1: Geometric representation of a complex number z = a + bi]


If we identify the real number a with the complex number (a, 0), then we have
$$ai = (a, 0) \times (0, 1) = (0 - 0, a + 0) = (0, a).$$
Then for any two real numbers a and b, we have
$$(a, b) = (a, 0) + (0, b) = a + bi. \tag{7.1.4}$$
That is, a + bi is another way to write the complex number (a, b). In particular, with this
convention, (7.1.3) becomes
$$i^2 = -1, \tag{7.1.5}$$
that is,
$$i = \sqrt{-1}. \tag{7.1.6}$$
Moreover, we may write (7.1.1) as
$$(a + bi) + (c + di) = (a + c) + (b + d)i \tag{7.1.7}$$
and (7.1.2) as
$$(a + bi) \times (c + di) = (ac - bd) + (ad + bc)i. \tag{7.1.8}$$
In fact, we may view the latter as a consequence of the ordinary algebraic expansion of
the product
$$(a + bi)(c + di)$$
combined with the equality $i^2 = -1$. That is,
$$(a + bi)(c + di) = ac + adi + bci + bdi^2 = (ac - bd) + (ad + bc)i.$$
It also follows from this formulation that if r is a real number, which we identify with
r + 0i, and z = a + bi is a complex number, then
$$rz = r(a + bi) = (r + 0i)(a + bi) = ra + rbi. \tag{7.1.9}$$


﻿




    As indicated above, we let C denote the set of all complex numbers. Because of our
identification of C with the plane, we usually refer to C as the complex plane. Since the
description of complex numbers as points in the plane is often associated with the work of
Carl Friedrich Gauss (1777-1855) (although appearing first in the work of Caspar Wessel
(1745-1818)), C is also referred to as the Gaussian plane.
Example (3 + 4i) + (5 - 6i) = 8 - 2i.
Example $(2 + i)(3 - 2i) = 6 - 4i + 3i - 2i^2 = 8 - i.$
Example -3(4 + 2i) = -12 - 6i.
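
The ordered-pair definitions (7.1.1) and (7.1.2) translate directly into code. The following is a minimal sketch in Python, assuming complex numbers are stored as tuples `(a, b)`; the function names `add` and `mul` are illustrative and not from the text. It reproduces the three examples above.

```python
def add(z, w):
    """Addition of ordered pairs, as in (7.1.1)."""
    a, b = z
    c, d = w
    return (a + c, b + d)

def mul(z, w):
    """Multiplication of ordered pairs, as in (7.1.2)."""
    a, b = z
    c, d = w
    return (a * c - b * d, a * d + b * c)

i = (0, 1)  # the complex number i as an ordered pair

print(mul(i, i))              # i^2, compare with (7.1.3)
print(add((3, 4), (5, -6)))   # (3 + 4i) + (5 - 6i)
print(mul((2, 1), (3, -2)))   # (2 + i)(3 - 2i)
print(mul((-3, 0), (4, 2)))   # -3(4 + 2i), with -3 identified with (-3, 0)
```

Python also has a built-in `complex` type that implements the same arithmetic; the tuple version is used here only to mirror the definition.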
    We have yet to define subtraction and division for complex numbers. If z and w are
complex numbers, we may define

                                 z - w = z + (-1)w.                           (7.1.10)

It follows that if z = a + bi and w = c + di, then

                    z - w = a + bi + (-c - di) = (a - c) + (b - d)i.      (7.1.11)

As a first step toward defining division, note that if z = a + bi with either $a \ne 0$ or $b \ne 0$,
then
$$(a + bi)\left(\frac{a - bi}{a^2 + b^2}\right) = \frac{a^2 - b^2i^2}{a^2 + b^2} = \frac{a^2 + b^2}{a^2 + b^2} = 1.$$
In other words,
$$\frac{a - bi}{a^2 + b^2}$$
is the multiplicative inverse, or reciprocal, of z = a + bi. Hence we will write
$$z^{-1} = \frac{1}{z} = \frac{a - bi}{a^2 + b^2}. \tag{7.1.12}$$
Given another complex number w, we may define w divided by z by
$$\frac{w}{z} = wz^{-1}. \tag{7.1.13}$$

Definition Given a complex number z = a + bi, the number a - bi is called the conjugate
of z and is denoted $\bar{z}$.
    Note that if z = a + bi, then
$$z\bar{z} = (a + bi)(a - bi) = a^2 + b^2. \tag{7.1.14}$$
Hence
$$\sqrt{z\bar{z}} = \sqrt{a^2 + b^2}, \tag{7.1.15}$$
which is the distance in the complex plane from z to the origin.


﻿


Definition   Given a complex number z, the magnitude of z, denoted |z|, is defined by
$$|z| = \sqrt{z\bar{z}}. \tag{7.1.16}$$

    The magnitude of a complex number generalizes the idea of the absolute value of a
real number, and in fact reduces to the absolute value when z is a real number. Moreover,
note that if z = a + bi, with either $a \ne 0$ or $b \ne 0$, we may now write
$$z^{-1} = \frac{\bar{z}}{|z|^2}. \tag{7.1.17}$$

    Although (7.1.12) and (7.1.17) are useful expressions, in most situations the easiest way
to simplify a quotient of two complex numbers is to multiply numerator and denominator
by the conjugate of the denominator.
Example
$$\frac{2 + i}{1 + i} = \frac{2 + i}{1 + i}\cdot\frac{1 - i}{1 - i} = \frac{2 - 2i + i + 1}{1 + 1} = \frac{3 - i}{2} = \frac{3}{2} - \frac{1}{2}i.$$
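
The conjugate technique can be sketched in code as follows. This is an illustrative Python sketch, assuming complex numbers with integer real and imaginary parts stored as tuples `(a, b)` and using the standard `fractions.Fraction` type to keep the quotient exact; the names `conjugate`, `mul`, and `divide` are hypothetical helpers.

```python
from fractions import Fraction

def conjugate(z):
    """Conjugate of z = (a, b), namely (a, -b)."""
    a, b = z
    return (a, -b)

def mul(z, w):
    """Multiplication of ordered pairs, as in (7.1.2)."""
    a, b = z
    c, d = w
    return (a * c - b * d, a * d + b * c)

def divide(w, z):
    """Compute w / z by multiplying numerator and denominator by the conjugate of z."""
    a, b = z
    denominator = a * a + b * b  # z times its conjugate, a real number by (7.1.14)
    if denominator == 0:
        raise ZeroDivisionError("division by the complex number 0")
    re, im = mul(w, conjugate(z))
    return (Fraction(re, denominator), Fraction(im, denominator))

print(divide((2, 1), (1, 1)))  # (2 + i)/(1 + i), compare with the example above
```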

Definition Given a complex number z = a + bi, we call a the real part of z, denoted
$\Re(z)$, and b the imaginary part of z, denoted $\Im(z)$.

    Because of this definition, we call the horizontal axis of the complex plane the real axis
and the vertical axis the imaginary axis. In this way we may identify the real number line
with the real axis of the complex plane. A complex number of the form bi, where b
is a real number, lies on the imaginary axis of the complex plane and is said to be purely
imaginary. However, we should be careful with our interpretation of this terminology: a
purely imaginary number is just as "real", in the ordinary sense of real, as a real number, in
the same way that an irrational number is just as "rational", in the sense of reasonable, as
a rational number. Note that with this terminology, the conjugate $\bar{z}$ of a complex number
z is the point in the complex plane obtained by reflecting z about the real axis.

Example     If z = 3 + 6i, then $\Re(z) = 3$, $\Im(z) = 6$, $\bar{z} = 3 - 6i$, and
$$|z| = \sqrt{9 + 36} = \sqrt{45} = 3\sqrt{5}.$$

    With the above definitions, we may work with the arithmetic and algebra of complex
numbers in the same way we work with real numbers. For example, for any complex
numbers z and w,
$$z + w = w + z$$
and
$$zw = wz.$$

You will be asked to verify these and other standard properties of the complex numbers
in Problem 7 at the end of this section.


﻿


[Figure 7.1.2: Polar coordinates for a complex number z = x + yi]


Polar notation
When we write a complex number z in the form z = x + yi, we refer to x and y as the
rectangular or Cartesian coordinates of z. We now consider another method of representing
complex numbers. Let us begin with a complex number z = x + yi written in rectangular
form. Assume for the moment that x and y are not both 0. If we let $\theta$ be the angle between
the real axis and the line segment from (0, 0) to (x, y), measured in the counterclockwise
direction, then z is completely determined by the two numbers |z| and $\theta$. We call $\theta$ the
argument of z and denote it by arg(z). Geometrically, if we are given |z| and $\theta$, we can
locate z in the complex plane by taking the line segment of length |z| lying on the positive
real axis, with a fixed endpoint at the origin, and rotating it counterclockwise through an
angle $\theta$; the final resting point of the rotating endpoint is the location of z. Algebraically,
if z = x + yi is a complex number with r = |z| and $\theta$ = arg(z), then
$$x = r\cos(\theta) \tag{7.1.18}$$
and
$$y = r\sin(\theta). \tag{7.1.19}$$
Together, r and $\theta$ are called the polar coordinates of z. See Figure 7.1.2.
Example     If |z| = 2 and $\arg(z) = \frac{\pi}{6}$, then
$$z = 2\cos\left(\frac{\pi}{6}\right) + 2\sin\left(\frac{\pi}{6}\right)i = \sqrt{3} + i.$$

Example     If z = 1 - i, then $|z| = \sqrt{2}$ and $\arg(z) = -\frac{\pi}{4}$.
    Note that in the last example we could have taken $\arg(z) = \frac{7\pi}{4}$, or, in fact,
$$\arg(z) = -\frac{\pi}{4} + 2n\pi$$
for any integer n. In particular, there are an infinite number of possible values for arg(z)
and we will let arg(z) stand for any one of these values. At the same time, it is often


﻿



important to choose arg(z) in a consistent fashion; to this end, we call the value of arg(z)
which lies in the interval $(-\pi, \pi]$ the principal value of arg(z) and denote it by Arg(z). For
our example, $\operatorname{Arg}(z) = -\frac{\pi}{4}$.
    In general, if we are given a complex number in rectangular coordinates, say z = x + yi,
then, as we can see from Figure 7.1.2, the polar coordinates r = |z| and $\theta$ = Arg(z) are
determined by
$$r = \sqrt{x^2 + y^2} \tag{7.1.20}$$
and
$$\tan(\theta) = \frac{y}{x}, \tag{7.1.21}$$
where the latter holds only if $x \ne 0$. If x = 0 and $y \ne 0$, then z is purely imaginary and
hence lies on the imaginary axis of the complex plane. In that case, $\theta = \frac{\pi}{2}$ if y > 0 and
$\theta = -\frac{\pi}{2}$ if y < 0. If both x = 0 and y = 0, then z is completely specified by the condition
r = 0 and $\theta$ may take on any value.
    Note that, since the range of the arc tangent function is $\left(-\frac{\pi}{2}, \frac{\pi}{2}\right)$, the condition
$$\tan(\theta) = \frac{y}{x}$$
only implies that
$$\theta = \tan^{-1}\left(\frac{y}{x}\right)$$
if x > 0, that is, if $\theta$ is between $-\frac{\pi}{2}$ and $\frac{\pi}{2}$.

Example     Suppose $z = -1 - \sqrt{3}i$. Then
$$|z| = \sqrt{1 + 3} = 2$$
and, if $\theta$ = Arg(z),
$$\tan(\theta) = \frac{-\sqrt{3}}{-1} = \sqrt{3}.$$
Since z lies in the third quadrant, we have
$$\operatorname{Arg}(z) = -\frac{2\pi}{3}.$$
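
The quadrant bookkeeping in this example is what a two-argument arc tangent does automatically. The sketch below assumes Python's standard `math.atan2`, which returns an angle in $[-\pi, \pi]$; the wrapper name `principal_arg` is a hypothetical helper. It also shows that the one-argument arc tangent alone gives the wrong angle for a point in the third quadrant.

```python
import math

def principal_arg(x, y):
    """Principal value Arg(z) for z = x + yi, with z nonzero, using atan2."""
    if x == 0 and y == 0:
        raise ValueError("Arg(z) is not determined when z = 0")
    return math.atan2(y, x)

x, y = -1.0, -math.sqrt(3.0)
print(principal_arg(x, y), -2.0 * math.pi / 3.0)  # these should agree
print(math.atan(y / x), math.pi / 3.0)            # arc tangent alone lands in the first quadrant
print(principal_arg(1.0, -1.0), -math.pi / 4.0)   # the earlier example z = 1 - i
```

One caveat under this assumption: `math.atan2` distinguishes signed zeros, so a point on the negative real axis entered with y = -0.0 yields $-\pi$ rather than $\pi$.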

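    Now suppose $z_1$ and $z_2$ are two nonzero complex numbers with $|z_1| = r_1$, $|z_2| = r_2$,
$\arg(z_1) = \theta_1$, and $\arg(z_2) = \theta_2$. Then
$$z_1 = r_1\cos(\theta_1) + r_1\sin(\theta_1)i = r_1(\cos(\theta_1) + \sin(\theta_1)i)$$
and
$$z_2 = r_2\cos(\theta_2) + r_2\sin(\theta_2)i = r_2(\cos(\theta_2) + \sin(\theta_2)i).$$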


﻿


[Figure 7.1.3: Geometry of z and z^2 in the complex plane]


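Hence
$$\begin{aligned}
z_1z_2 &= (r_1\cos(\theta_1) + r_1\sin(\theta_1)i)(r_2\cos(\theta_2) + r_2\sin(\theta_2)i) \\
&= r_1r_2(\cos(\theta_1)\cos(\theta_2) + \cos(\theta_1)\sin(\theta_2)i + \sin(\theta_1)\cos(\theta_2)i - \sin(\theta_1)\sin(\theta_2)) \\
&= r_1r_2(\cos(\theta_1)\cos(\theta_2) - \sin(\theta_1)\sin(\theta_2) + (\sin(\theta_1)\cos(\theta_2) + \cos(\theta_1)\sin(\theta_2))i) \\
&= r_1r_2(\cos(\theta_1 + \theta_2) + \sin(\theta_1 + \theta_2)i).
\end{aligned}$$
It follows that
$$|z_1z_2| = r_1r_2 = |z_1||z_2| \tag{7.1.22}$$
and
$$\arg(z_1z_2) = \theta_1 + \theta_2 = \arg(z_1) + \arg(z_2). \tag{7.1.23}$$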
In other words, the magnitude of the product of two complex numbers is the product of
their respective magnitudes and the argument of the product of two complex numbers is
the sum of their respective arguments.
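
The product rule for magnitudes and arguments can be illustrated with Python's built-in `complex` type. This is a minimal sketch assuming the standard `math` and `cmath` modules; the helper `from_polar` and the sample values 2, 0.4, 1.5, and 1.1 are arbitrary choices made for illustration.

```python
import cmath
import math

def from_polar(r, theta):
    """Build r(cos(theta) + sin(theta) i) as a Python complex number."""
    return complex(r * math.cos(theta), r * math.sin(theta))

z1 = from_polar(2.0, 0.4)
z2 = from_polar(1.5, 1.1)
product = z1 * z2

print(abs(product), abs(z1) * abs(z2))  # magnitudes multiply
print(cmath.phase(product), 0.4 + 1.1)  # arguments add
```

Since `cmath.phase` returns the principal value, the printed argument equals the plain sum only when that sum lies in $(-\pi, \pi]$, as it does here; otherwise the two differ by a multiple of $2\pi$.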
    In particular, for any complex number z, $|z^2| = |z|^2$ and $\arg(z^2) = 2\arg(z)$. More
generally, for any positive integer n,
$$|z^n| = |z|^n \tag{7.1.24}$$
and
$$\arg(z^n) = n\arg(z). \tag{7.1.25}$$

See Figure 7.1.3.
    If z is a complex number with |z| = r and arg(z) = $\theta$, then
$$z = r(\cos(\theta) + \sin(\theta)i)$$
and
$$\bar{z} = r(\cos(\theta) - \sin(\theta)i) = r(\cos(-\theta) + \sin(-\theta)i). \tag{7.1.26}$$
Hence
$$|\bar{z}| = |z| \tag{7.1.27}$$
and
$$\arg(\bar{z}) = -\arg(z), \tag{7.1.28}$$
in agreement with our previous observation that $\bar{z}$ is obtained from z by reflection about
the real axis.
    If $z_1$ and $z_2$ are two nonzero complex numbers with $|z_1| = r_1$, $|z_2| = r_2$, $\arg(z_1) = \theta_1$,
and $\arg(z_2) = \theta_2$, then
$$\begin{aligned}
\frac{z_1}{z_2} &= \frac{z_1\bar{z}_2}{z_2\bar{z}_2} \\
&= \frac{r_1r_2(\cos(\theta_1 - \theta_2) + \sin(\theta_1 - \theta_2)i)}{r_2^2} \\
&= \frac{r_1}{r_2}(\cos(\theta_1 - \theta_2) + \sin(\theta_1 - \theta_2)i).
\end{aligned}$$
Hence
$$\left|\frac{z_1}{z_2}\right| = \frac{r_1}{r_2} = \frac{|z_1|}{|z_2|} \tag{7.1.29}$$
and
$$\arg\left(\frac{z_1}{z_2}\right) = \theta_1 - \theta_2. \tag{7.1.30}$$
In other words, the magnitude of the quotient of two complex numbers is the quotient of
their respective magnitudes and the argument of the quotient of two complex numbers is
the difference of their respective arguments.


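Example     Let $z = 2\left(\cos\left(\frac{\pi}{12}\right) + \sin\left(\frac{\pi}{12}\right)i\right)$ and $w = 3\left(\cos\left(\frac{\pi}{6}\right) + \sin\left(\frac{\pi}{6}\right)i\right)$. Then
$$\begin{aligned}
zw &= 6\left(\cos\left(\frac{\pi}{12} + \frac{\pi}{6}\right) + \sin\left(\frac{\pi}{12} + \frac{\pi}{6}\right)i\right) \\
&= 6\left(\cos\left(\frac{\pi}{4}\right) + \sin\left(\frac{\pi}{4}\right)i\right) \\
&= 3\sqrt{2} + 3\sqrt{2}\,i.
\end{aligned}$$
Also,
$$\begin{aligned}
\frac{z}{w} &= \frac{2}{3}\left(\cos\left(\frac{\pi}{12} - \frac{\pi}{6}\right) + \sin\left(\frac{\pi}{12} - \frac{\pi}{6}\right)i\right) \\
&= \frac{2}{3}\left(\cos\left(-\frac{\pi}{12}\right) + \sin\left(-\frac{\pi}{12}\right)i\right) \\
&= 0.6440 - 0.1725i,
\end{aligned}$$
where we have rounded the real and imaginary parts to four decimal places.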


﻿


Section 7.1


The Algebra of Complex Numbers


9


Figure 7.1.4 Powers of z


cos(}) + sin(})i


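Example Let

            $z = \cos\left(\frac{\pi}{4}\right) + \sin\left(\frac{\pi}{4}\right) i = \frac{1}{\sqrt{2}} + \frac{1}{\sqrt{2}} i$.

Since $|z| = 1$ and $\arg(z) = \frac{\pi}{4}$, $z$ is a point on the unit circle centered at the origin,
one-eighth of the way around the circle from $(1, 0)$ (see Figure 7.1.4). Then

    $z^2 = \cos\left(2 \cdot \frac{\pi}{4}\right) + \sin\left(2 \cdot \frac{\pi}{4}\right) i = \cos\left(\frac{\pi}{2}\right) + \sin\left(\frac{\pi}{2}\right) i = i$,

    $z^3 = \cos\left(3 \cdot \frac{\pi}{4}\right) + \sin\left(3 \cdot \frac{\pi}{4}\right) i = \cos\left(\frac{3\pi}{4}\right) + \sin\left(\frac{3\pi}{4}\right) i = -\frac{1}{\sqrt{2}} + \frac{1}{\sqrt{2}} i$,

    $z^4 = \cos\left(4 \cdot \frac{\pi}{4}\right) + \sin\left(4 \cdot \frac{\pi}{4}\right) i = \cos(\pi) + \sin(\pi) i = -1$,

    $z^5 = \cos\left(5 \cdot \frac{\pi}{4}\right) + \sin\left(5 \cdot \frac{\pi}{4}\right) i = \cos\left(\frac{5\pi}{4}\right) + \sin\left(\frac{5\pi}{4}\right) i = -\frac{1}{\sqrt{2}} - \frac{1}{\sqrt{2}} i$,

    $z^6 = \cos\left(6 \cdot \frac{\pi}{4}\right) + \sin\left(6 \cdot \frac{\pi}{4}\right) i = \cos\left(\frac{3\pi}{2}\right) + \sin\left(\frac{3\pi}{2}\right) i = -i$,

    $z^7 = \cos\left(7 \cdot \frac{\pi}{4}\right) + \sin\left(7 \cdot \frac{\pi}{4}\right) i = \cos\left(\frac{7\pi}{4}\right) + \sin\left(\frac{7\pi}{4}\right) i = \frac{1}{\sqrt{2}} - \frac{1}{\sqrt{2}} i$,

    $z^8 = \cos\left(8 \cdot \frac{\pi}{4}\right) + \sin\left(8 \cdot \frac{\pi}{4}\right) i = \cos(2\pi) + \sin(2\pi) i = 1$,

and

    $z^9 = z^8 z = (1)(z) = z$.

Hence each successive power of $z$ is obtained by rotating the previous power through an
angle of $\frac{\pi}{4}$ on the unit circle centered at the origin; after eight rotations, the point has
returned to where it started. See Figure 7.1.4. Notice in particular that $z$ is a root of the
polynomial

                                  $P(w) = w^8 - 1$.

In fact, $z^n$ is a solution of $w^8 - 1 = 0$ for any positive integer $n$ since

            $(z^n)^8 - 1 = (z^8)^n - 1 = 1^n - 1 = 1 - 1 = 0$.

Thus there are eight distinct roots of $P(w)$, namely, $z$, $z^2$, $z^3$, $z^4$, $z^5$, $z^6$, $z^7$, and $z^8$, only
two of which, $z^4 = -1$ and $z^8 = 1$, are real numbers.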
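
The claim that the powers of $z = \cos(\frac{\pi}{4}) + \sin(\frac{\pi}{4}) i$ are eight distinct roots of $w^8 - 1$ can be illustrated with a short computation. This is only a sketch in floating-point arithmetic, so "root" and "real" are tested up to a small tolerance that we choose ourselves.

```python
import cmath
import math

# z lies on the unit circle, one-eighth of the way around from (1, 0).
z = cmath.rect(1.0, math.pi / 4)

# The powers z, z^2, ..., z^8.
powers = [z ** k for k in range(1, 9)]

# Each power should satisfy w^8 - 1 = 0 up to rounding error.
residuals = [abs(p ** 8 - 1) for p in powers]

# Exponents k for which z^k is (numerically) a real number.
TOLERANCE = 1e-12  # assumed tolerance for floating-point comparisons
real_roots = [k for k, p in enumerate(powers, start=1) if abs(p.imag) < TOLERANCE]

print(max(residuals))
print(real_roots)
```

Only the exponents 4 and 8 are reported as giving real numbers, matching $z^4 = -1$ and $z^8 = 1$.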


Problems


1. Evaluate the following if w = 3 - 4i and z
   (a) w + z
   (c) 3w - 2z
   (e) zw

   (g) z|I
   (i) !R(z - w)
2. Find the real and imaginary parts of each
       1
   (a) -
        3- 4i
   (c) -2+3
       -2 + 3z
3. For each of the following, write the given
   the complex plane.
                        wr


-2 + 7i.


  (b) w - z
  (d) w
  (f)1
      w
      z2_
  (h)

  (j) 9(3z + )
of the following.
        3
  (b) .+2
      1+2z
  (d) (1+i)3

z in rectangular coordinates and plot it in

                       27r
  (b) z|= 5, Arg(z) =2
                        3
  (d) |z| 2, Arg(z) = w


(a) z| = 3, Arg(z) = -

(c) |z= 0.5, Arg(z)


37
4


4. For each of the following, find $|z|$ and $\text{Arg}(z)$ and plot $z$ in the complex plane.


(a) z = -i
(c) z = 1+ i
(e) z = 2+ 2v 3i


(b) z
(d) z


-5
-1 - i


(f) z=v-i


5. Suppose w and z are complex numbers with |wl = 3, Arg(w) = 6, Iz| = 2, and
   Arg(z) =-. Find both polar and rectangular coordinates for each of the following.

   (a) w2                                     (b) z3


(c) wz


(d)
    z


﻿




   (e) w2                                   (f) W5

6. Find all the roots of the polynomial $P(z) = z^6 - 1$ and plot them in the complex plane.
7. Let $v = a_1 + b_1 i$, $w = a_2 + b_2 i$, and $z = a_3 + b_3 i$ be complex numbers. Verify each of
   the following.
   (a) $v + w = w + v$                        (b) $vw = wv$
   (c) $v(w + z) = vw + vz$                   (d) $(v + w) + z = v + (w + z)$
   (e) $v(wz) = (vw)z$                        (f) $(w + z)^2 = w^2 + 2wz + z^2$
8. Suppose $z$ is a complex number with $|z| = r$ and $\arg(z) = \theta$.
   (a) Let $w$ be a complex number with $|w| = \sqrt{r}$ and $\arg(w) = \frac{\theta}{2}$. Show that $w^2 = z$.
   (b) Let $v$ be a complex number with $|v| = \sqrt{r}$ and $\arg(v) = \frac{\theta}{2} + \pi$. Show that $v^2 = z$.
   (c) From (a) and (b) we see that every nonzero complex number has two distinct
      square roots. Find the square roots, in rectangular form, of $1 + \sqrt{3} i$ and $-9$.


﻿


Section 7.2


                to                    The Calculus of Complex
       Differential Equations         Functions


In this section we will discuss limits, continuity, differentiation, and Taylor series in the
context of functions which take on complex values. Moreover, we will introduce complex
extensions of a number of familiar functions. Since complex numbers behave algebraically
like real numbers, most of our results and definitions will look like the analogous results
for real-valued functions. We will avoid going into much detail; the complete story of the
calculus of complex-valued functions is best left to a course in complex analysis. However,
we will see enough of the story to enable us to make effective use of complex numbers in
elementary calculations.
    We begin with a definition of the limit of a sequence of complex numbers.

Definition We say that the limit of a sequence of complex numbers $\{z_n\}$ is $L$, and write

                                     $\lim_{n \to \infty} z_n = L$,

if for every $\epsilon > 0$ there exists an integer $N$ such that

                                     $|z_n - L| < \epsilon$

whenever $n > N$.

    Notice that the only difference between this definition and the definition of the limit of
a sequence given in Section 1.2 is the use of the magnitude of a complex number in place of
the absolute value of a real number. Even here, the notation is the same. The point is the
same as it was in Chapter 1: the limit of the sequence {zn} is L if we can always ensure
that the values of the sequence are within a desired distance of L by going far enough out
in the sequence.
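    Now if $z_n = x_n + y_n i$ and $L = a + bi$, then $\lim_{n \to \infty} z_n = L$ if and only if

          $\lim_{n \to \infty} |z_n - L| = \lim_{n \to \infty} \sqrt{(x_n - a)^2 + (y_n - b)^2} = 0$,

the latter of which occurs if and only if $\lim_{n \to \infty} x_n = a$ and $\lim_{n \to \infty} y_n = b$. Hence we have the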
following useful result.

Proposition Let $z_n = x_n + y_n i$ and $L = a + bi$. Then

                                     $\lim_{n \to \infty} z_n = L$

if and only if
                      $\lim_{n \to \infty} x_n = a$ and $\lim_{n \to \infty} y_n = b$.

    Thus to determine the limiting behavior of a sequence $\{z_n\}$ of complex numbers, we
need only consider the behavior of the two sequences of real numbers, $\{\Re(z_n)\}$ and $\{\Im(z_n)\}$.

Copyright © by Dan Sloughter 2000


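Example Suppose

                      $z_n = \frac{3n - 1}{2n + 2} + \frac{n + 1}{n - 1} i$

for $n = 2, 3, 4, \ldots$ (the imaginary part is undefined at $n = 1$). Then

    $\lim_{n \to \infty} \Re(z_n) = \lim_{n \to \infty} \frac{3n - 1}{2n + 2} = \lim_{n \to \infty} \frac{3 - \frac{1}{n}}{2 + \frac{2}{n}} = \frac{3}{2}$

and

    $\lim_{n \to \infty} \Im(z_n) = \lim_{n \to \infty} \frac{n + 1}{n - 1} = \lim_{n \to \infty} \frac{1 + \frac{1}{n}}{1 - \frac{1}{n}} = 1$,

so

    $\lim_{n \to \infty} z_n = \frac{3}{2} + i$.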
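
To see this convergence numerically, one can tabulate the distance $|z_n - L|$ for a few large values of $n$. The sketch below assumes the sequence $z_n = \frac{3n-1}{2n+2} + \frac{n+1}{n-1} i$ from the example and the limit $L = \frac{3}{2} + i$; the sample indices are arbitrary choices of ours.

```python
def z(n):
    """Return the n-th term (3n - 1)/(2n + 2) + ((n + 1)/(n - 1)) i, for n >= 2."""
    return complex((3 * n - 1) / (2 * n + 2), (n + 1) / (n - 1))

limit = complex(1.5, 1.0)

# Distance from the limit at a few sample indices.
sample_indices = (10, 100, 1000, 10000)
distances = [abs(z(n) - limit) for n in sample_indices]

for n, d in zip(sample_indices, distances):
    print(n, d)
```

The printed distances shrink as $n$ grows, as the definition of the limit requires.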


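Example Suppose

                $z_n = \frac{1}{n}\left(\cos\left(\frac{n\pi}{3}\right) + \sin\left(\frac{n\pi}{3}\right) i\right)$

for $n = 1, 2, 3, \ldots$. Then

    $\lim_{n \to \infty} \Re(z_n) = \lim_{n \to \infty} \frac{\cos\left(\frac{n\pi}{3}\right)}{n} = 0$

and

    $\lim_{n \to \infty} \Im(z_n) = \lim_{n \to \infty} \frac{\sin\left(\frac{n\pi}{3}\right)}{n} = 0$,

so
                                       $\lim_{n \to \infty} z_n = 0$.

Geometrically, since $|z_n| = \frac{1}{n}$ and $\arg(z_n) = \frac{n\pi}{3}$, the points in this sequence are converging
to 0 along a spiral path, as seen in Figure 7.2.1.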
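
The points plotted in Figure 7.2.1 can be generated directly from the polar description. The following sketch assumes the sequence $z_n = \frac{1}{n}(\cos(\frac{n\pi}{3}) + \sin(\frac{n\pi}{3}) i)$ and simply lists the first twenty terms; plotting is left to whatever graphics tool is at hand.

```python
import cmath
import math

def z(n):
    """Return the point with magnitude 1/n and argument n*pi/3."""
    return cmath.rect(1.0 / n, n * math.pi / 3)

# The first twenty points of the sequence, as in Figure 7.2.1.
points = [z(n) for n in range(1, 21)]
moduli = [abs(p) for p in points]

for n, p in enumerate(points, start=1):
    print(n, round(p.real, 4), round(p.imag, 4))
```

Since the magnitudes are $1, \frac{1}{2}, \frac{1}{3}, \ldots$ while the argument advances by $\frac{\pi}{3}$ at each step, the points wind around the origin as they approach it.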
    Having defined the limit of a sequence of complex numbers, we may define the limit of
a complex-valued function, as in Section 2.3, and then define continuity, as in Section 2.4.
Definition Suppose $f : \mathbb{C} \to \mathbb{C}$, that is, $f$ is a complex-valued function of a complex
variable. We say the limit of $f(z)$ as $z$ approaches $a$ is $L$, written

                                     $\lim_{z \to a} f(z) = L$,

if whenever $\{z_n\}$ is a sequence of points with $z_n \neq a$ for all $n$ and $\lim_{n \to \infty} z_n = a$, then

                                     $\lim_{n \to \infty} f(z_n) = L$.


﻿


Figure 7.2.1 Plot of the points $z_n = \frac{1}{n}\left(\cos\left(\frac{n\pi}{3}\right) + \sin\left(\frac{n\pi}{3}\right) i\right)$, $n = 1, 2, 3, \ldots, 20$

Definition We say the function $f : \mathbb{C} \to \mathbb{C}$ is continuous at $a$ if $\lim_{z \to a} f(z) = f(a)$.


    As with real-valued functions of a real variable, it is easy to show that algebraic
functions of a complex variable are continuous wherever they are defined. In particular,
complex polynomials, that is, functions P of the form

                  $P(z) = a_n z^n + a_{n-1} z^{n-1} + \cdots + a_1 z + a_0$,

where $n$ is a nonnegative integer and the coefficients $a_0, a_1, \ldots, a_n$ are complex numbers,
are continuous at all points in the complex plane. Complex rational functions, that is,
functions $R$ of the form
                                     $R(z) = \frac{P(z)}{Q(z)}$,
where both P and Q are polynomials, are continuous at all points where they are defined.

Example Since $f(z) = 3z^2 - iz + 4 - 5i$ is a polynomial, it is continuous at all points in
the complex plane. In particular,

      $\lim_{z \to i} f(z) = \lim_{z \to i} (3z^2 - iz + 4 - 5i) = 3i^2 - (i)(i) + 4 - 5i = 2 - 5i$.


Example Algebraic simplification may be useful in evaluating limits here as it was in
Section 2.3. For example,

    $\lim_{z \to i} \frac{z - i}{z^2 + 1} = \lim_{z \to i} \frac{z - i}{(z - i)(z + i)} = \lim_{z \to i} \frac{1}{z + i} = \frac{1}{2i} = \frac{1}{2i} \cdot \frac{i}{i} = -\frac{1}{2} i$.
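
Because $z$ may approach $i$ from any direction in the plane, a numerical check of this limit should sample several directions. The sketch below evaluates $\frac{z - i}{z^2 + 1}$ at points a distance of about $10^{-6}$ from $i$; the four directions and the step size are arbitrary choices of ours, not part of the text.

```python
def f(z):
    """The function (z - i)/(z^2 + 1), undefined at z = i and z = -i."""
    return (z - 1j) / (z * z + 1)

# Approach i along the real direction, the imaginary direction, and a diagonal.
directions = [1, -1, 1j, 1 + 1j]
values = [f(1j + d / 10**6) for d in directions]

expected = -0.5j
errors = [abs(v - expected) for v in values]

for d, v in zip(directions, values):
    print(d, v)
```

Every sampled value lies close to $-\frac{1}{2} i$, regardless of the direction of approach.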


    Although this is not the time to go into any detail about the geometric meaning of the
derivative of a function f : C - C, the algebraic definition and manipulation of derivatives
follows the pattern of the results for real-valued functions in Chapter 3.


﻿


Definition  If $f : \mathbb{C} \to \mathbb{C}$, then the derivative of $f$ at $a$, denoted $f'(a)$, is given by

                        $f'(a) = \lim_{h \to 0} \frac{f(a + h) - f(a)}{h}$,                  (7.2.1)

provided the limit exists.
    Note that $h$ in this definition is, in general, a complex number, not just a real number.
Since the algebraic properties of the complex numbers are very similar to the algebraic
properties of the real numbers, much of what we learned about differentiation in Chapter
3 still holds true in our new situation. For example, if $n$ is a nonzero rational number,
then
                                 $\frac{d}{dz} z^n = n z^{n-1}$.                             (7.2.2)
Moreover, all the techniques we learned for computing derivatives in Sections 3.3 and 3.4,
including the quotient, product, and chain rules, still hold.

Example If $f(z) = 3z^5 + iz^3 - (3 + 2i)z$, then

                              $f'(z) = 15z^4 + 3iz^2 - 3 - 2i$.


Example If
                                   $g(w) = \frac{(3 + i)w^2}{2w - 1}$,

then, using the quotient rule,

      $g'(w) = \frac{(2w - 1)(6 + 2i)w - (3 + i)w^2(2)}{(2w - 1)^2} = \frac{(6 + 2i)(w^2 - w)}{(2w - 1)^2}$.
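
A derivative computed with the quotient rule can be compared against the difference quotient in the definition, taking $h$ to be a small complex number in several directions. This is a rough numerical sketch only; the sample point $a = 1 + 2i$ and the step sizes are assumptions made for illustration.

```python
def g(w):
    """g(w) = (3 + i) w^2 / (2w - 1)."""
    return (3 + 1j) * w * w / (2 * w - 1)

def g_prime(w):
    """The derivative obtained from the quotient rule."""
    return (6 + 2j) * (w * w - w) / (2 * w - 1) ** 2

def difference_quotient(f, a, h):
    """(f(a + h) - f(a)) / h for a complex increment h."""
    return (f(a + h) - f(a)) / h

a = 1 + 2j
steps = [1e-6, 1e-6j, (1 + 1j) * 1e-6]  # real, imaginary, and diagonal increments
errors = [abs(difference_quotient(g, a, h) - g_prime(a)) for h in steps]

print(errors)
```

The difference quotients agree with the formula to several decimal places for each direction of $h$, as the existence of the complex derivative requires.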

    From this point it is possible to follow the pattern of Chapter 5 and develop the theory
of polynomial approximations using Taylor polynomials, defined in a manner analogous to
the definition in Section 5.1, as well as the theory of power series and Taylor series. In
particular, a power series

                              $\sum_{n=0}^{\infty} a_n (z - a)^n$,                           (7.2.3)

where $a_0, a_1, a_2, \ldots$ and $a$ are complex numbers, is said to converge absolutely at those
points $z$ for which the series

                              $\sum_{n=0}^{\infty} |a_n| |z - a|^n$                          (7.2.4)

converges. Since the latter series involves only real numbers, its convergence may be
determined using the tests developed in Chapter 5. As before, absolute convergence implies
convergence. Moreover, if the series (7.2.3) converges at points other than $a$, then there
exists an $R$, either a positive real number or $\infty$, such that the series converges absolutely
for all $z$ such that $|z - a| < R$ and diverges for all $z$ such that $|z - a| > R$. However, note
that in this case the set of all points in the complex plane such that $|z - a| < R$ is a disk
of radius $R$ centered at $a$, not an interval as it was in the real number case.

Example Consider the power series

                                  $\sum_{n=0}^{\infty} z^n$.                                (7.2.5)

Since the series
                                  $\sum_{n=0}^{\infty} |z|^n$

is a geometric series, it converges for all values of $z$ for which $|z| < 1$. Hence

                                  $\sum_{n=0}^{\infty} z^n$

converges for all $z$ for which $|z| < 1$, that is, for all $z$ inside the unit circle centered at the
origin of the complex plane. Thus the radius of convergence of (7.2.5) is $R = 1$. Using the
same argument as we used in Section 1.3, we can show that

                                  $\sum_{n=0}^{\infty} z^n = \frac{1}{1 - z}$

for all $z$ with $|z| < 1$. For example,

    $\sum_{n=0}^{\infty} \left(\frac{i}{2}\right)^n = \frac{1}{1 - \frac{i}{2}} = \frac{2}{2 - i} = \frac{2(2 + i)}{(2 - i)(2 + i)} = \frac{4}{5} + \frac{2}{5} i$.
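
The value $\frac{4}{5} + \frac{2}{5} i$ can be confirmed by adding up the first several terms of the geometric series with $z = \frac{i}{2}$. The sketch below uses 60 terms, an arbitrary cutoff chosen so that the neglected tail is far below floating-point precision.

```python
def geometric_partial_sum(z, terms):
    """Return 1 + z + z^2 + ... + z^(terms - 1)."""
    total = 0j
    power = 1 + 0j
    for _ in range(terms):
        total += power
        power *= z
    return total

z = 0.5j
approx = geometric_partial_sum(z, 60)
exact = 1 / (1 - z)

print(approx)
print(exact)
```

Both printed values agree with $0.8 + 0.4i$ up to rounding error.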


Example Consider the power series

                                  $\sum_{n=0}^{\infty} \frac{z^n}{n!}$.                     (7.2.6)

To determine its radius of convergence, we apply the ratio test to the series

                                  $\sum_{n=0}^{\infty} \frac{|z|^n}{n!}$,                   (7.2.7)

obtaining

      $\rho = \lim_{n \to \infty} \frac{\frac{|z|^{n+1}}{(n+1)!}}{\frac{|z|^n}{n!}} = \lim_{n \to \infty} \frac{|z|}{n + 1} = 0$

for all values of $z$. Since $\rho = 0$ for any value of $z$, (7.2.7) converges for all $z$ in the complex
plane. That is, the radius of convergence of (7.2.6) is $R = \infty$. Of course, we also know
that (7.2.7) converges for all $z$ because, from our work in Section 6.1, it is equal to $e^{|z|}$.
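
Since the series (7.2.6) converges for every $z$, its partial sums can be computed at any point of the plane. The following sketch sums the first 40 terms and compares the result with `cmath.exp` from Python's standard library, which we assume evaluates the same complex exponential; the sample points and the number of terms are our own choices.

```python
import cmath

def exp_series(z, terms=40):
    """Partial sum of the series sum of z^n / n! using the first `terms` terms."""
    total = 0j
    term = 1 + 0j            # z^0 / 0!
    for n in range(terms):
        total += term
        term *= z / (n + 1)  # turns z^n / n! into z^(n+1) / (n+1)!
    return total

samples = [1 + 1j, -2 + 3j, 3j]
errors = [abs(exp_series(z) - cmath.exp(z)) for z in samples]

print(errors)
```

The rapid decay of $\frac{|z|^n}{n!}$ seen in the ratio test is what makes so few terms sufficient here.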

    The power series in the last example is the extension to complex numbers of the series
we used to define the exponential function in Section 6.1. With it, we can define the
complex exponential function.

Definition The complex exponential function, with value at $z$ denoted by $\exp(z)$, is
defined for all points in the complex plane by

                              $\exp(z) = \sum_{n=0}^{\infty} \frac{z^n}{n!}$.                (7.2.8)

    Of course, this definition agrees with our old definition when z is real.
    In Chapter 6 we used the exponential function to give meaning to exponents which
were not rational numbers. Similarly, the complex exponential function may be used to
define complex exponents. However, we will only consider the case of raising e to a complex
power.

Definition If $z$ is a complex number with $\Im(z) \neq 0$, then we define $e^z = \exp(z)$.

    With this definition we now have $e^z = \exp(z)$ for all $z$ in the complex plane, the case
when $\Im(z) = 0$, that is, when $z$ is real, having been treated in Section 6.1. Although
we will not repeat them here, the arguments from Section 6.1 come over to establish the
following proposition.

Proposition    For any complex numbers $w$ and $z$,

                                     $e^{w+z} = e^w e^z$                              (7.2.9)

and
                                     $e^{w-z} = \frac{e^w}{e^z}$.                     (7.2.10)

    Also, as in Section 6.1, direct differentiation of (7.2.8) yields the following result.

Proposition
                                     $\frac{d}{dz} e^z = e^z$.                        (7.2.11)

Example Using the product and chain rules,

      $\frac{d}{dz}\left(z^2 e^{-z^2}\right) = z^2(-2z)e^{-z^2} + 2ze^{-z^2} = 2z(1 - z^2)e^{-z^2}$.


﻿


                Figure 7.2.2 Plot of the point $z = re^{i\theta}$ in the complex plane


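    The exponential of a pure imaginary number is particularly interesting. To see why,
let $\theta$ be a real number and consider

    $e^{i\theta} = \sum_{n=0}^{\infty} \frac{(i\theta)^n}{n!}$
         $= 1 + i\theta + \frac{(i\theta)^2}{2!} + \frac{(i\theta)^3}{3!} + \frac{(i\theta)^4}{4!} + \frac{(i\theta)^5}{5!} + \cdots$
         $= 1 + i\theta - \frac{\theta^2}{2!} - \frac{i\theta^3}{3!} + \frac{\theta^4}{4!} + \frac{i\theta^5}{5!} - \cdots$
         $= \left(1 - \frac{\theta^2}{2!} + \frac{\theta^4}{4!} - \cdots\right) + i\left(\theta - \frac{\theta^3}{3!} + \frac{\theta^5}{5!} - \cdots\right)$
         $= \cos(\theta) + i\sin(\theta)$.

Proposition For any real number $\theta$,

                              $e^{i\theta} = \cos(\theta) + i\sin(\theta)$.                  (7.2.12)

    As a consequence, if $\theta$ is a real number, then $|e^{i\theta}| = 1$ and $\arg(e^{i\theta}) = \theta$. That is, $e^{i\theta}$
is a point in the complex plane on the unit circle centered at the origin, a distance of $\theta$
radians away, in a counterclockwise direction, along the circle from $(1, 0)$. Moreover, if $z$
is a nonzero complex number with $|z| = r$ and $\arg(z) = \theta$, then

                        $z = r(\cos(\theta) + i\sin(\theta)) = re^{i\theta}$.                 (7.2.13)

This exponential notation provides a compact way to display any nonzero complex number
in polar form. See Figure 7.2.2.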
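
Both (7.2.12) and (7.2.13) are easy to test numerically. The sketch below uses Python's `cmath.polar`, which returns the pair $(|z|, \text{Arg}(z))$ with the argument in $(-\pi, \pi]$; the angle $0.75$ and the point $1 - i$ are sample values chosen for illustration.

```python
import cmath
import math

# Check e^{i theta} = cos(theta) + i sin(theta) at a sample angle.
theta = 0.75
euler_lhs = cmath.exp(1j * theta)
euler_rhs = complex(math.cos(theta), math.sin(theta))

# Write 1 - i in the form r e^{i theta} and rebuild it.
r, angle = cmath.polar(1 - 1j)
rebuilt = r * cmath.exp(1j * angle)

print(euler_lhs, euler_rhs)
print(r, angle, rebuilt)
```

For $1 - i$ the computed magnitude and argument are $\sqrt{2}$ and $-\frac{\pi}{4}$, and multiplying $r$ by $e^{i\theta}$ recovers the original number.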


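Example If $z = 1 - i$, then $|z| = \sqrt{2}$ and $\text{Arg}(z) = -\frac{\pi}{4}$, so

                                 $z = \sqrt{2} e^{-\frac{\pi}{4} i}$.

Moreover,
                                 $\bar{z} = \sqrt{2} e^{\frac{\pi}{4} i}$

and
                     $z^2 = (\sqrt{2})^2 e^{-\frac{2\pi}{4} i} = 2e^{-\frac{\pi}{2} i} = -2i$.

Example     If $w = 3e^{\frac{\pi}{3} i}$ and $z = 5e^{\frac{\pi}{8} i}$, then

          $wz = \left(3e^{\frac{\pi}{3} i}\right)\left(5e^{\frac{\pi}{8} i}\right) = 15e^{\left(\frac{\pi}{3} + \frac{\pi}{8}\right) i} = 15e^{\frac{11\pi}{24} i}$

and
          $\frac{w}{z} = \frac{3e^{\frac{\pi}{3} i}}{5e^{\frac{\pi}{8} i}} = \frac{3}{5} e^{\left(\frac{\pi}{3} - \frac{\pi}{8}\right) i} = \frac{3}{5} e^{\frac{5\pi}{24} i}$.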
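    Since for any real number $\theta$,

                                $e^{i\theta} = \cos(\theta) + i\sin(\theta)$,

it follows that
              $e^{-i\theta} = \cos(-\theta) + i\sin(-\theta) = \cos(\theta) - i\sin(\theta)$.

Hence
      $e^{i\theta} - e^{-i\theta} = \cos(\theta) + i\sin(\theta) - (\cos(\theta) - i\sin(\theta)) = 2i\sin(\theta)$.      (7.2.14)

Solving (7.2.14) for $\sin(\theta)$, we have

                          $\sin(\theta) = \frac{1}{2i}\left(e^{i\theta} - e^{-i\theta}\right)$.

Similarly,
      $e^{i\theta} + e^{-i\theta} = \cos(\theta) + i\sin(\theta) + \cos(\theta) - i\sin(\theta) = 2\cos(\theta)$,         (7.2.15)

from which we obtain

                          $\cos(\theta) = \frac{1}{2}\left(e^{i\theta} + e^{-i\theta}\right)$.

Proposition    For any real number $\theta$,

                          $\sin(\theta) = \frac{1}{2i}\left(e^{i\theta} - e^{-i\theta}\right)$                     (7.2.16)

and
                          $\cos(\theta) = \frac{1}{2}\left(e^{i\theta} + e^{-i\theta}\right)$.                     (7.2.17)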
    These formulas are very similar to the formulas we used to define the hyperbolic sine
and cosine functions in Section 6.7. We will now use these formulas to define the complex
sine and cosine functions; at the same time, we will extend the definitions of the hyperbolic
sine and cosine functions. In doing so, we will see just how closely related the circular and
hyperbolic trigonometric functions really are.


﻿


Definition  The complex sine function, with value at $z$ denoted by $\sin(z)$, and the
complex cosine function, with value at $z$ denoted by $\cos(z)$, are defined for all $z$ in the
complex plane by
                          $\sin(z) = \frac{1}{2i}\left(e^{iz} - e^{-iz}\right)$                        (7.2.18)
and
                          $\cos(z) = \frac{1}{2}\left(e^{iz} + e^{-iz}\right)$.                        (7.2.19)
The complex hyperbolic sine function, with value at $z$ denoted by $\sinh(z)$, and the complex
hyperbolic cosine function, with value at $z$ denoted by $\cosh(z)$, are defined for all $z$ in the
complex plane by
                          $\sinh(z) = \frac{1}{2}\left(e^z - e^{-z}\right)$                            (7.2.20)
and
                          $\cosh(z) = \frac{1}{2}\left(e^z + e^{-z}\right)$.                           (7.2.21)
    Note that these functions are defined so that they agree with their original versions
when evaluated at real numbers.
    With these definitions it is a simple matter to prove that

                          $\frac{d}{dz} \sin(z) = \cos(z)$,                            (7.2.22)

                          $\frac{d}{dz} \cos(z) = -\sin(z)$,                           (7.2.23)

                          $\frac{d}{dz} \sinh(z) = \cosh(z)$,                          (7.2.24)

and
                          $\frac{d}{dz} \cosh(z) = \sinh(z)$.                          (7.2.25)

For example,

      $\frac{d}{dz} \cos(z) = \frac{d}{dz} \frac{1}{2}\left(e^{iz} + e^{-iz}\right)$
                  $= \frac{1}{2}\left(ie^{iz} - ie^{-iz}\right)$
                  $= -\frac{1}{2i}\left(e^{iz} - e^{-iz}\right)$
                  $= -\sin(z)$.


﻿


Example Note that

      $\sin(i) = \frac{1}{2i}\left(e^{i \cdot i} - e^{-i \cdot i}\right)$
            $= \frac{1}{2i}\left(e^{-1} - e^{1}\right)$
            $= -\frac{1}{i} \sinh(1)$
            $= i\sinh(1)$.
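
The exponential definitions of the complex sine and cosine translate directly into code. The sketch below implements them with `cmath.exp` and compares the results with $i\sinh(1)$ and with the library functions `cmath.sin` and `cmath.cos`, which we assume follow the same definitions; the function names `complex_sin` and `complex_cos` are ours.

```python
import cmath
import math

def complex_sin(z):
    """sin(z) = (e^{iz} - e^{-iz}) / (2i)."""
    return (cmath.exp(1j * z) - cmath.exp(-1j * z)) / 2j

def complex_cos(z):
    """cos(z) = (e^{iz} + e^{-iz}) / 2."""
    return (cmath.exp(1j * z) + cmath.exp(-1j * z)) / 2

sin_i = complex_sin(1j)
print(sin_i, 1j * math.sinh(1))

w = 0.3 + 0.4j
print(complex_sin(w), cmath.sin(w))
print(complex_cos(w), cmath.cos(w))
```

The first line of output shows a purely imaginary value for $\sin(i)$, equal to $i\sinh(1)$ up to rounding.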

Example Using the product and chain rules, we have

      $\frac{d}{dz} \sin(2z)\cos(3z) = \sin(2z)(-\sin(3z))(3) + \cos(3z)\cos(2z)(2)$
                               $= -3\sin(2z)\sin(3z) + 2\cos(2z)\cos(3z)$.

    The final complex-valued function we will define is the complex logarithm function.
Analogous to our other definitions in this section, we would like this function to share the
basic characteristic properties of the ordinary logarithm function and to agree with that
function when evaluated at a positive real number. In particular, if we let Log(z) denote
the complex logarithm of a complex number z and log(r) denote the real logarithm of a
positive real number $r$, then for a nonzero complex number $z$ with $|z| = r$ and $\text{Arg}(z) = \theta$
we would like to have

      $\text{Log}(z) = \text{Log}(re^{i\theta}) = \text{Log}(r) + \text{Log}(e^{i\theta}) = \log(r) + i\theta$.        (7.2.26)

Moreover, using (7.2.26) to define the complex logarithm function will guarantee that
our new function agrees with the ordinary logarithm function when evaluated at positive
real numbers, for if $z$ is a positive real number, then $|z| = z$ and $\text{Arg}(z) = 0$, giving us
$\text{Log}(z) = \log(z)$.
Definition The complex logarithm function, with value at $z$ denoted by $\text{Log}(z)$, is defined
for all nonzero complex numbers $z$ with $|z| = r$ and $\text{Arg}(z) = \theta$ by

                                 $\text{Log}(z) = \log(r) + i\theta$,                         (7.2.27)

where log(r) is the ordinary real-valued logarithm of r.
    Note that we have used the principal value of arg(z), that is, Arg(z), in the definition
of Log(z) in order to give Log(z) a unique value. Moreover, note that this definition gives
meaning to the logarithm of a negative real number, although it still does not define the
logarithm of 0.
Example     Since $|2 - 2i| = \sqrt{8}$ and $\text{Arg}(2 - 2i) = -\frac{\pi}{4}$, we have

      $\text{Log}(2 - 2i) = \log(\sqrt{8}) - \frac{\pi}{4} i = \frac{1}{2}\log(8) - \frac{\pi}{4} i = \frac{3}{2}\log(2) - \frac{\pi}{4} i$.

Example     Since $|-4| = 4$ and $\text{Arg}(-4) = \pi$, we have

      $\text{Log}(-4) = \log(4) + \pi i = 2\log(2) + \pi i$.
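
Definition (7.2.27) can be implemented in a few lines. The sketch below builds $\text{Log}(z)$ from `cmath.polar`, which returns the magnitude and the principal argument, and compares it with the library function `cmath.log`; the name `principal_log` is our own.

```python
import cmath
import math

def principal_log(z):
    """Log(z) = log|z| + i Arg(z) for nonzero z; math.log raises ValueError at z = 0."""
    r, theta = cmath.polar(z)
    return complex(math.log(r), theta)

print(principal_log(2 - 2j))   # compare with (3/2) log(2) - (pi/4) i
print(principal_log(-4))       # compare with 2 log(2) + pi i
print(cmath.log(-4))
```

As the definition notes, the logarithm of a negative real number now has a value, while the logarithm of 0 remains undefined.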


Problems


1. For each of the following, find lim zn.
   plane.


Also, plot z1, z2, z3,..., zi5 in the complex


         3-n
(a) z =
           n

(c) zn = 3e n


   n+1.
+
  2n + 3


(b)   -


4+
  4 +-


           " 7r(n-
(dl zn = e2  n


2. Evaluate each of the following limits.

   (a) lim(4z3 - 6z + 3)
       z-2  

   (c) lim w2+9
       w-34w - 3i


(b) lim (z2 -
    z-1-2
        z--1
(d) lim z 1
    z2 + 1


3z)


3. Find the derivative of each of the following functions.


   (a) f(z) = 3z2 - 6z5 + 18i

   (c) f (z) = (z - 4i)e-z2

4. (a) Show that


(b) g(w) 13w - 6i + 3
               w+i
(d) h(s)   (s2 + 1) exp(3s2


si)


                                    1      0
                                      1  =    (-1)"z2n
                                 1+z2

    for all z with Iz| < 1.
(b) How does (a) help explain why, for real values of x, the Taylor series

                                    1      0
                                      1  =  (-1)"z2n
                                 1+ x2

    converges only on the interval (-1, 1)?


5. (a) If $z = x + yi$, show that

                                   $\Re(e^z) = e^x \cos(y)$

       and
                                   $\Im(e^z) = e^x \sin(y)$.

   (b) If $z = x + yi$, find $|e^z|$ and $\arg(e^z)$.

6. Show that $e^{\pi i} + 1 = 0$.


﻿




7. Verify the differentiation formulas for sin(z), sinh(z), and cosh(z).
8. (a) Show that
                                          f dx=log(2).


    (b) Some computer algebra systems evaluate the integral in (a) as


                                 f - dx =Log(-1) -Log(-2).


        Reconcile this answer with the answer in (a).
 9. Let z and w be complex numbers with -2 < Arg(z) < 2 and -2 < Arg(w) < 2
    Verify the following two properties of the complex logarithm.
    (a) Log(wz) = Log(w) + Log(z)
    (b) Log ()= Log(w) - Log(z)
10. For a positive integer $n$, an $n$th root of unity is a complex number $z$ with the property
    that $z^n = 1$. Show that for $m = 0, 1, \ldots, n - 1$,

                                        $z_m = e^{\frac{2\pi m}{n} i}$

    is an $n$th root of unity. Plot these points in the complex plane for $n = 10$.
11. (a) Use the fact that
                                    $\sin(z) = \frac{1}{2i}\left(e^{iz} - e^{-iz}\right)$

        to find the complex power series representation for $\sin(z)$.
    (b) Use the fact that
                                    $\cos(z) = \frac{1}{2}\left(e^{iz} + e^{-iz}\right)$

        to find the complex power series representation for $\cos(z)$.
12. Define a complex version of the tangent function and show that

                                 $\tan(z) = \frac{1}{i} \cdot \frac{e^{iz} - e^{-iz}}{e^{iz} + e^{-iz}}$.

13. (a) Show that $\sin(ix) = i\sinh(x)$ for every real number $x$.
    (b) Show that $\cos(ix) = \cosh(x)$ for every real number $x$.
14. Let $z = x + yi$.
    (a) Show that
                                   $\Re(\sin(z)) = \sin(x)\cosh(y)$

        and
                                   $\Im(\sin(z)) = \cos(x)\sinh(y)$.

    (b) Show that
                                   $\Re(\cos(z)) = \cos(x)\cosh(y)$

        and
                                   $\Im(\cos(z)) = -\sin(x)\sinh(y)$.

15. (a) Show that for any nonzero complex number $z$, $e^{\text{Log}(z)} = z$.
    (b) If $z$ is a nonzero complex number, does it necessarily follow that $\text{Log}(e^z) = z$?

﻿


                                      Section 7.3
        Difference Equations
                to                    Complex-Valued Functions:
       Differential Equations         Motion in the Plane


In Section 7.2 we considered the problem of extending the elementary functions of calculus
to complex-valued functions of a complex variable, while at the same time extending many
of the concepts of the first six chapters to these new functions. In this section we will
consider complex-valued functions of a real variable, that is, functions of the form
$f : \mathbb{R} \to \mathbb{C}$. Such functions are often used to describe the motion of an object in the plane; if we
think of the real variable t as measuring time, then we may interpret f(t) as the location
of an object in the complex plane at time t.
    Since limits are at the foundation of most concepts in calculus, we begin with a defi-
nition of limit in this setting.
Definition   Suppose $f : \mathbb{R} \to \mathbb{C}$ and $f$ is defined for all $t$ in an interval about the point
$a$. We say that the limit of $f(t)$ as $t$ approaches $a$ is $L$, denoted

                                    $\lim_{t \to a} f(t) = L$,

if whenever $\{t_n\}$ is a sequence of real numbers with $t_n \neq a$ for all $n$ and

                                    $\lim_{n \to \infty} t_n = a$,

then
                                    $\lim_{n \to \infty} f(t_n) = L$.
    Suppose $f : \mathbb{R} \to \mathbb{C}$. If we let $x(t) = \Re(f(t))$ and $y(t) = \Im(f(t))$, then

                                  $f(t) = x(t) + iy(t)$.

Hence, from our work in Section 7.2,

                  $\lim_{t \to a} f(t) = \lim_{t \to a} x(t) + i \lim_{t \to a} y(t)$.                  (7.3.1)

This result also holds if we modify our definition of limit to include one-sided limits and
limits to $\infty$ or $-\infty$.
Example Suppose a particle moves in the plane so that its position at time $t$ is given
by
                     $f(t) = \cos(2\pi t) + i\sin(2\pi t) = e^{2\pi t i}$.

If we let $C$ denote the unit circle centered at the origin, then $f(t)$ is a point in the complex
plane on $C$, $2\pi t$ units from $(1, 0)$ in the counterclockwise direction along the circumference
of $C$. For example, at time $t = 0$ the particle is at $f(0) = 1$, at time $t = \frac{1}{4}$ the particle
is at $f(\frac{1}{4}) = i$, at time $t = \frac{1}{2}$ the particle is at $f(\frac{1}{2}) = -1$, at $t = \frac{3}{4}$ the particle is at
$f(\frac{3}{4}) = -i$, and at time $t = 1$ the particle is at $f(1) = 1$. Note that $f$ has period 1,
so the particle traverses $C$ once in the counterclockwise direction as $t$ goes from 0 to 1,
after which the particle will repeat this motion over every interval of time of length 1. See
Figure 7.3.1. As an example of a limit, we note that

   $\lim_{t \to \frac{3}{4}} f(t) = \lim_{t \to \frac{3}{4}} \cos(2\pi t) + i \lim_{t \to \frac{3}{4}} \sin(2\pi t) = \cos\left(\frac{3\pi}{2}\right) + i\sin\left(\frac{3\pi}{2}\right) = -i$.

        Figure 7.3.1 Motion of a particle on the unit circle centered at the origin


    Note that the path of the particle shown in Figure 7.3.1 is not the graph of the function
f, but rather a plot of f(t) for values of t from 0 to 1. In general, if f(t) = x(t) + iy(t)
represents the position of a particle moving in the complex plane at time t, we can obtain
a good representation of the path of the particle over an interval of time [a, b] by plotting
the points (x(t), y(t)) for a large number of points in [a, b] and connecting these points
with straight lines, similar to the procedure we used for plotting the graph of a function
in Section 2.1.
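
The plotting procedure just described amounts to sampling $f$ at many values of $t$ and joining consecutive points. The sketch below produces the list of vertices for the circular motion $f(t) = e^{2\pi t i}$; the helper name `sample_path` and the choice of 101 sample points are ours, and the vertices may be handed to any plotting tool.

```python
import cmath
import math

def f(t):
    """Position on the unit circle at time t."""
    return cmath.exp(2j * math.pi * t)

def sample_path(func, a, b, count):
    """Evaluate func at `count` equally spaced points of [a, b]."""
    step = (b - a) / (count - 1)
    return [func(a + k * step) for k in range(count)]

path = sample_path(f, 0.0, 1.0, 101)

# Vertices (x(t), y(t)) to be connected by straight line segments.
vertices = [(p.real, p.imag) for p in path]

print(vertices[0], vertices[25], vertices[50], vertices[75])
```

The four printed vertices correspond to $t = 0, \frac{1}{4}, \frac{1}{2}, \frac{3}{4}$, the positions marked in Figure 7.3.1.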
Example Suppose a particle moves in the plane so that its position at time $t$ is given
by
                          $z(t) = \tanh(t) + i \operatorname{sech}(t)$.
Then $\Re(z(t)) = \tanh(t)$, $\Im(z(t)) = \operatorname{sech}(t)$, and

                  $|z(t)| = \sqrt{\tanh^2(t) + \operatorname{sech}^2(t)} = 1$.

Thus the particle is moving along the unit circle $C$ as in the previous example. However,
since $\operatorname{sech}(t) > 0$ for all $t$, the particle is always on the upper half of $C$. Moreover, $z(0) = i$,

        $\lim_{t \to \infty} z(t) = \lim_{t \to \infty} \tanh(t) + i \lim_{t \to \infty} \operatorname{sech}(t) = 1$,

and
        $\lim_{t \to -\infty} z(t) = \lim_{t \to -\infty} \tanh(t) + i \lim_{t \to -\infty} \operatorname{sech}(t) = -1$.

Combining these results with the fact that $\Re(z(t)) = \tanh(t)$ is an increasing function, we
see that as time flows from $-\infty$ to $\infty$, the particle moves from left to right on the upper
half of $C$, coming from the point $(-1, 0)$ as $t$ increases from $-\infty$ and approaching the point
$(1, 0)$ as $t$ increases toward $\infty$. See Figure 7.3.2.

                 Figure 7.3.2 Motion on the upper half of the unit circle
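
The qualitative description of this motion can be checked at a handful of times. The sketch below evaluates $z(t) = \tanh(t) + i \operatorname{sech}(t)$ using $\operatorname{sech}(t) = \frac{1}{\cosh(t)}$; the sample times are arbitrary choices, with $\pm 20$ standing in for large negative and positive $t$.

```python
import math

def z(t):
    """Position tanh(t) + i sech(t), with sech(t) computed as 1 / cosh(t)."""
    return complex(math.tanh(t), 1.0 / math.cosh(t))

times = [-20.0, -2.0, 0.0, 2.0, 20.0]
positions = [z(t) for t in times]

for t, p in zip(times, positions):
    print(t, p, abs(p))
```

Each position has magnitude 1 and positive imaginary part, and the real parts increase from near $-1$ to near $1$, in agreement with Figure 7.3.2.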

    We may now define continuity and differentiability in analogy with our previous
definitions.

Definition    We say a function $f : \mathbb{R} \to \mathbb{C}$ is continuous at $a$ if $\lim_{t \to a} f(t) = f(a)$.

    If $x(t) = \Re(f(t))$ and $y(t) = \Im(f(t))$, then

                                     $\lim_{t \to a} f(t) = f(a)$

if and only if
                                     $\lim_{t \to a} x(t) = x(a)$
and
                                     $\lim_{t \to a} y(t) = y(a)$.

Hence $f$ is continuous at $a$ if and only if both $x$ and $y$ are continuous at $a$. For example,
the functions in the previous examples are both continuous for every $t$ in $(-\infty, \infty)$ since
the functions $\cos(t)$, $\sin(t)$, $\tanh(t)$, and $\operatorname{sech}(t)$ are continuous for all $t$ in $(-\infty, \infty)$.


﻿


Definition    If $f : \mathbb{R} \to \mathbb{C}$, then the derivative of $f$ at $a$, denoted $f'(a)$, is given by

                     $f'(a) = \lim_{h \to 0} \frac{f(a + h) - f(a)}{h}$,                     (7.3.2)

provided the limit exists.
    Now if $f(t) = x(t) + iy(t)$, where $x$ and $y$ are both differentiable, then

    $f'(t) = \lim_{h \to 0} \frac{f(t + h) - f(t)}{h}$
          $= \lim_{h \to 0} \frac{x(t + h) + iy(t + h) - (x(t) + iy(t))}{h}$
          $= \lim_{h \to 0} \frac{x(t + h) - x(t)}{h} + i \lim_{h \to 0} \frac{y(t + h) - y(t)}{h}$
          $= x'(t) + iy'(t)$.

Hence differentiating a function $f : \mathbb{R} \to \mathbb{C}$ reduces to differentiating the real and imaginary
parts of $f$.
Proposition     If $x : \mathbb{R} \to \mathbb{R}$ and $y : \mathbb{R} \to \mathbb{R}$ are differentiable and $f(t) = x(t) + iy(t)$, then

                                  $f'(t) = x'(t) + iy'(t)$.                          (7.3.3)

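Example If $f(t) = \cos(t) + i\sin(t)$, then $f'(t) = -\sin(t) + i\cos(t)$.
    Of course, it is also possible to express a function $f : \mathbb{R} \to \mathbb{C}$ in polar form. If we let
$r(t) = |f(t)|$ and $\theta(t) = \arg(f(t))$, then $r$ and $\theta$ are real-valued functions and

\[ f(t) = r(t)e^{i\theta(t)}. \tag{7.3.4} \]

If $r$ and $\theta$ are differentiable, then

\begin{align*}
f'(t) &= \frac{d}{dt}\left(r(t)e^{i\theta(t)}\right) \\
 &= \frac{d}{dt}\bigl(r(t)\cos(\theta(t)) + ir(t)\sin(\theta(t))\bigr) \\
 &= -r(t)\sin(\theta(t))\theta'(t) + r'(t)\cos(\theta(t)) + ir(t)\cos(\theta(t))\theta'(t) + ir'(t)\sin(\theta(t)) \\
 &= r(t)\theta'(t)\bigl(-\sin(\theta(t)) + i\cos(\theta(t))\bigr) + r'(t)\bigl(\cos(\theta(t)) + i\sin(\theta(t))\bigr) \\
 &= r(t)\theta'(t)\bigl(i^2\sin(\theta(t)) + i\cos(\theta(t))\bigr) + r'(t)e^{i\theta(t)} \\
 &= ir(t)\theta'(t)\bigl(i\sin(\theta(t)) + \cos(\theta(t))\bigr) + r'(t)e^{i\theta(t)} \\
 &= ir(t)\theta'(t)e^{i\theta(t)} + r'(t)e^{i\theta(t)}.
\end{align*}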

    Note that this result is exactly what we would obtain if we treated i as a real constant
and differentiated (7.3.4) using the product and chain rules. Hence instead of remembering
the formula, we need only remember that we should differentiate a function given in polar
form using the product rule and treating i as we would any constant. In particular, taking
r(t) = 1 for all t, we have
\[ \frac{d}{dt}e^{i\theta(t)} = i\theta'(t)e^{i\theta(t)}. \tag{7.3.5} \]

Example      If $f(t) = 4te^{it^2}$, then

\[ f'(t) = 4t(2it)e^{it^2} + 4e^{it^2} = (4 + 8it^2)e^{it^2}. \]
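
    The derivative in this example can be checked numerically. The following is a minimal sketch, not part of the text's argument: it compares the closed form $(4 + 8it^2)e^{it^2}$ with a symmetric difference quotient standing in for the limit in (7.3.2). The function names, the step size h, and the sample times are assumptions chosen for illustration.

```python
import cmath

def f(t: float) -> complex:
    """f(t) = 4t * exp(i t^2), the function from the example."""
    return 4 * t * cmath.exp(1j * t * t)

def f_prime_formula(t: float) -> complex:
    """Closed form from the example: (4 + 8 i t^2) * exp(i t^2)."""
    return (4 + 8j * t * t) * cmath.exp(1j * t * t)

def central_difference(func, t: float, h: float = 1e-6) -> complex:
    """Symmetric difference quotient, a numerical stand-in for the limit (7.3.2)."""
    return (func(t + h) - func(t - h)) / (2 * h)

# Largest disagreement between the two over a few arbitrary sample times.
sample_times = (-1.5, 0.0, 0.7, 2.0)
max_error = max(abs(central_difference(f, t) - f_prime_formula(t)) for t in sample_times)
print(max_error)
```

Because Python's complex type carries the real and imaginary parts together, the same difference quotient used for real-valued functions applies unchanged, which mirrors the proposition (7.3.3).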

    To understand the derivative geometrically, consider the setting where z(t) = x(t) +
iy(t) represents the position at time t of a particle moving in the plane. Then x'(t)
represents the velocity of the particle in the x direction and y'(t) represents the velocity
of the particle in the y direction. In other words, if all forces acting on the particle were
to cease at time $t_0$, then, according to Newton's first law, during the next unit of time
the particle would move in a straight line, $x'(t_0)$ units in the x direction and $y'(t_0)$ units
in the y direction. That is, in one unit of time the particle would travel along a straight
line from $(x(t_0), y(t_0))$ to $(x(t_0) + x'(t_0), y(t_0) + y'(t_0))$. Hence the particle would move,
in a straight line, from $z(t_0)$ to $z(t_0) + z'(t_0)$. Moreover, since the distance traveled in this
unit of time is $|z'(t_0)|$, the speed of the particle at time $t_0$ is given by $|z'(t_0)|$. In short,
$\arg(z'(t_0))$ tells us the direction in which the particle is moving at time $t_0$ and $|z'(t_0)|$ tells
us the speed at which the particle is traveling at that instant. These considerations make
the next definition reasonable.
Definition If z(t) gives the position at time t of a particle moving in the plane, then we
call z'(t) the velocity of the particle and we call $|z'(t)|$ the speed of the particle.
    Notice that this definition is directly analogous to our treatment of motion along a
straight line in earlier chapters. In that case, if f(t) represented the position of a particle
moving along a straight line, then we called f'(t) the velocity of the particle and $|f'(t)|$
the speed of the particle at time t.
    Also notice that if for a given time $t_0$ we draw an arrow from $z(t_0)$ to $z(t_0) + z'(t_0)$,
then this arrow points in the direction of motion of the particle at time $t_0$. Moreover, if
$x'(t_0) \neq 0$, then this arrow has slope
\[ \frac{y'(t_0)}{x'(t_0)}. \]
Now, from the chain rule, we have

\[ \frac{dy}{dt} = \frac{dy}{dx}\frac{dx}{dt}, \]

from which we obtain
\[ \frac{dy}{dx} = \frac{dy/dt}{dx/dt} = \frac{y'(t)}{x'(t)}. \]


﻿




Figure 7.3.3 Motion in the complex plane


Hence the arrow from $z(t_0)$ to $z(t_0) + z'(t_0)$ points along the line tangent to the curve
of motion at $z(t_0)$. Moreover, the length of this arrow is the speed of the particle at the
instant $t_0$.
Example If the position of a particle moving in the plane is given by $z(t) = 2e^{it/2}$ at
time t, then the particle is traveling counterclockwise on the circle C of radius 2 centered
at the origin. The velocity of the particle is given by

\[ z'(t) = ie^{it/2} = e^{i\pi/2}e^{it/2} = e^{i\left(\frac{t}{2} + \frac{\pi}{2}\right)} \]

and $|z'(t)| = 1$. Hence at any time t, the particle is moving at unit speed with velocity
pointing in the direction of z(t) rotated counterclockwise through an angle of $\frac{\pi}{2}$. See Figure
7.3.4.
    Also in analogy with the one-dimensional case, if z(t) is the position of a particle
moving in the plane at time t, then the acceleration of the particle is given by z"(t), the
derivative of the velocity. Newton's second law of motion applies in this setting, telling us
that if the particle has mass m, then the force acting on the particle at time t is

                                     F(t) = mz"(t).

Hence the magnitude of the force acting on the particle is

\[ |F(t)| = m|z''(t)| \]

and the force acts in the direction of an arrow pointing from z(t) to z(t) + z''(t).

Example In our previous example, position was given by

\[ z(t) = 2e^{it/2} \]


         Figure 7.3.4 Arrows indicating velocity and acceleration at time t = 2


and velocity by
\[ z'(t) = ie^{it/2}. \]
Thus the acceleration of the particle is
\[ z''(t) = -\frac{1}{2}e^{it/2}. \]
Note that
\[ z''(t) = -\frac{1}{4}z(t), \]
showing that the force acting on the particle is directed toward the center of the circle C.
See Figure 7.3.4.
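
    As a small illustration of the definitions of velocity, speed, and acceleration, the quantities in this example can be evaluated at a particular time. This is only a sketch under stated assumptions: the sample time t0 and the function names are made up for illustration, and the formulas for z, z', and z'' are the ones derived above for the position $2e^{it/2}$.

```python
import cmath

def position(t: float) -> complex:
    """Position z(t) = 2 e^{it/2} from the example."""
    return 2 * cmath.exp(0.5j * t)

def velocity(t: float) -> complex:
    """Velocity z'(t) = i e^{it/2}."""
    return 1j * cmath.exp(0.5j * t)

def acceleration(t: float) -> complex:
    """Acceleration z''(t) = -(1/2) e^{it/2}."""
    return -0.5 * cmath.exp(0.5j * t)

t0 = 1.3  # sample time, chosen arbitrarily for illustration

# Speed is the modulus of the velocity.
speed = abs(velocity(t0))

# Unit vector along z(t0), turned counterclockwise through pi/2.
quarter_turn = cmath.exp(1j * cmath.pi / 2) * position(t0) / abs(position(t0))

# Distance between z''(t0) and -z(t0)/4; zero means the acceleration points at the origin.
gap_to_center_formula = abs(acceleration(t0) + position(t0) / 4)

print(speed, velocity(t0), quarter_turn, gap_to_center_formula)
```

The modulus and argument of a Python complex number (abs and cmath.phase) correspond directly to the speed and direction of motion described in the definition.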
    Since we may compute the derivative of a complex-valued function by differentiating
its real and imaginary parts separately, it is reasonable to define the definite integral of such
a function in terms of the integrals of its real and imaginary parts.
Definition   Suppose $f : \mathbb{R} \to \mathbb{C}$ with f(t) = x(t) + iy(t). If x and y are both integrable
on the interval [a, b], then we define the definite integral of f over the interval [a, b] by

\[ \int_a^b f(t)\,dt = \int_a^b x(t)\,dt + i\int_a^b y(t)\,dt. \tag{7.3.6} \]

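Example If
\[ f(t) = \sin(t) + i\cos\left(\frac{t}{2}\right), \]
then
\begin{align*}
\int_0^{\pi} f(t)\,dt &= \int_0^{\pi} \sin(t)\,dt + i\int_0^{\pi} \cos\left(\frac{t}{2}\right)dt \\
 &= -\cos(t)\Big|_0^{\pi} + 2i\sin\left(\frac{t}{2}\right)\Big|_0^{\pi} \\
 &= (1 + 1) + i(2 - 0) \\
 &= 2 + 2i.
\end{align*}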


    If $F : \mathbb{R} \to \mathbb{C}$ and $f : \mathbb{R} \to \mathbb{C}$ are continuous on [a, b], with F(t) = X(t) + iY(t),
f(t) = x(t) + iy(t), and f(t) = F'(t) for all t in (a, b), then
\begin{align*}
\int_a^b f(t)\,dt &= \int_a^b x(t)\,dt + i\int_a^b y(t)\,dt \\
 &= X(t)\Big|_a^b + iY(t)\Big|_a^b \\
 &= (X(b) - X(a)) + i(Y(b) - Y(a)) \\
 &= (X(b) + iY(b)) - (X(a) + iY(a)) \\
 &= F(b) - F(a).
\end{align*}


Hence we have a version of the Fundamental Theorem of Integral Calculus which may be
applied directly to complex-valued functions of a real variable.

Proposition    If $F : \mathbb{R} \to \mathbb{C}$ and $f : \mathbb{R} \to \mathbb{C}$ are continuous on [a, b] with f(t) = F'(t) for
all t in [a, b], then

\[ \int_a^b f(t)\,dt = F(b) - F(a). \]


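Example     If $f(t) = e^{3it}$, then
\begin{align*}
\int_0^{\pi} f(t)\,dt &= \int_0^{\pi} e^{3it}\,dt = \frac{1}{3i}e^{3it}\Big|_0^{\pi} \\
 &= \frac{1}{3i}\left(e^{3\pi i} - e^{0}\right) = \frac{1}{3i}(-1 - 1) = -\frac{2}{3i} = \frac{2}{3}i.
\end{align*}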


Example Integration by parts, which we derived in Section 4.5 from the product rule,
is applicable in our current situation. For example, to evaluate

\[ \int_0^{\pi} 3te^{it}\,dt, \]

we let

\[ u = 3t, \qquad dv = e^{it}\,dt, \]
\[ du = 3\,dt, \qquad v = \frac{1}{i}e^{it} = -ie^{it}. \]


﻿




Figure 7.3.5 Motion of a projectile


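Then
\begin{align*}
\int_0^{\pi} 3te^{it}\,dt &= -3tie^{it}\Big|_0^{\pi} + \int_0^{\pi} 3ie^{it}\,dt \\
 &= -3\pi ie^{\pi i} + 0 + 3e^{it}\Big|_0^{\pi} \\
 &= -3\pi i(-1) + 3e^{\pi i} - 3e^{0} \\
 &= 3\pi i - 3 - 3 \\
 &= -6 + 3\pi i.
\end{align*}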
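
    A value such as $-6 + 3\pi i$ can be sanity-checked by approximating the integral directly. The sketch below uses composite Simpson's rule, which is an assumption of this illustration and not the method of the text; because the rule is a weighted sum of function values, it applies to complex-valued integrands without change, in keeping with definition (7.3.6). The function name simpson and the number of subintervals are hypothetical choices.

```python
import cmath
import math

def simpson(func, a: float, b: float, n: int = 1000) -> complex:
    """Composite Simpson's rule on [a, b] with n subintervals (n is made even)."""
    if n % 2:
        n += 1
    h = (b - a) / n
    total = func(a) + func(b)
    for k in range(1, n):
        # Interior points alternate weights 4 (odd k) and 2 (even k).
        total += (4 if k % 2 else 2) * func(a + k * h)
    return total * h / 3

def integrand(t: float) -> complex:
    """The integrand 3t e^{it} from the integration by parts example."""
    return 3 * t * cmath.exp(1j * t)

approx = simpson(integrand, 0.0, math.pi)
exact = -6 + 3j * math.pi  # value obtained above by integration by parts
print(approx, exact, abs(approx - exact))
```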
    In our final example for this section, we consider the problem of finding the motion
of a projectile moving close to the surface of the earth. This problem will not only tie
together many of the concepts of this section, but it will also provide a preview of Section
7.4 and our discussion of differential equations in Chapter 8.
Example Suppose a projectile of mass m is fired from the surface of the earth at an
angle $\alpha$, where $0 < \alpha < \frac{\pi}{2}$. We will consider the motion of the projectile as a path in
the complex plane with its position at time t given by z(t). Further, we assume that its
initial position is z(0) = 0 and its initial velocity is $z'(0) = v_0$. Ignoring the effects of air
resistance, the only force acting on the projectile during its flight is the force of gravity,
acting vertically downward. Hence at any time t the force is given by F = -mgi, where
g is the acceleration due to gravity (32 feet/second$^2$ or 9.8 meters/second$^2$). See Figure
7.3.5. Thus Newton's second law of motion gives us

\[ -mgi = mz''(t), \]

that is,
\[ -gi = z''(t), \]
at any time t. If we let v(t) be the velocity of the projectile at time t, then

\[ v'(t) = z''(t) = -gi. \]


﻿


Hence, by the previous proposition,

\[ v(t) - v(0) = \int_0^t v'(s)\,ds = -\int_0^t gi\,ds = -gis\Big|_0^t = -gti. \]

Thus
\[ v(t) = -gti + v_0. \]

Integrating again, we have, since z'(t) = v(t),

\[ z(t) - z(0) = \int_0^t (-gsi + v_0)\,ds = \left(-\frac{1}{2}gis^2 + v_0 s\right)\Big|_0^t = -\frac{1}{2}gt^2 i + v_0 t. \]

Now if $s_0 = |v_0|$, that is, if $s_0$ is the initial speed of the projectile, then

\[ v_0 = s_0 e^{i\alpha} = s_0\cos(\alpha) + s_0\sin(\alpha)i. \]

Hence, since z(0) = 0,

\[ z(t) = (s_0\cos(\alpha) + s_0\sin(\alpha)i)t - \frac{1}{2}gt^2 i = s_0\cos(\alpha)t + \left(s_0\sin(\alpha)t - \frac{1}{2}gt^2\right)i. \]

Thus
\[ \Re(z(t)) = s_0\cos(\alpha)t \]
and
\[ \Im(z(t)) = s_0\sin(\alpha)t - \frac{1}{2}gt^2. \]

That is, if we write z(t) = x(t) + iy(t), where $x : \mathbb{R} \to \mathbb{R}$ and $y : \mathbb{R} \to \mathbb{R}$, then

\[ x(t) = s_0\cos(\alpha)t \tag{7.3.7} \]

and

\[ y(t) = s_0\sin(\alpha)t - \frac{1}{2}gt^2. \tag{7.3.8} \]

Note that x(t) gives the horizontal distance traveled at time t and y(t) gives the height
of the projectile above the ground at time t. For example, if the projectile is fired at an
angle of $\alpha = \frac{\pi}{6}$ with an initial speed of $s_0 = 50$ feet per second, then its position at time t
is specified by
\[ x(t) = 25\sqrt{3}\,t \text{ feet} \]

and
\[ y(t) = 25t - 16t^2 \text{ feet}. \]
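
    The complex form of the solution translates directly into a few lines of code. The following is a minimal sketch, assuming g = 32 feet per second squared as in the text; the function name projectile_position and the sample time t = 1 are hypothetical choices made for illustration. It evaluates $z(t) = v_0 t - \frac{1}{2}gt^2 i$ with $v_0 = s_0 e^{i\alpha}$ and compares the real and imaginary parts with (7.3.7) and (7.3.8) for $\alpha = \frac{\pi}{6}$ and $s_0 = 50$.

```python
import cmath
import math

G_FEET = 32.0  # feet per second squared, the value used in the text

def projectile_position(t: float, speed: float, angle: float, g: float = G_FEET) -> complex:
    """Position z(t) = v0*t - (1/2)*g*t^2*i, with initial velocity v0 = speed * e^{i*angle}."""
    v0 = speed * cmath.exp(1j * angle)
    return v0 * t - 0.5j * g * t * t

s0, alpha = 50.0, math.pi / 6

# Position one second after firing, as a single complex number.
z1 = projectile_position(1.0, s0, alpha)

# The same position from the component formulas x(t) = 25*sqrt(3)*t and y(t) = 25t - 16t^2.
x1 = 25 * math.sqrt(3) * 1.0
y1 = 25 * 1.0 - 16 * 1.0 ** 2

print(z1, x1, y1)
```

Keeping the position as one complex number means the horizontal and vertical motions are updated together, while z1.real and z1.imag recover x(t) and y(t) whenever they are needed separately.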


﻿


Figure 7.3.6 Motion of a projectile


A plot of this motion is shown in Figure 7.3.6.
    In the next section we will consider a more complicated motion problem, namely, the
two-body problem, the problem of determining the orbit of a planet about its sun.

Problems

1. For each of the following, suppose the given function specifies the position of a particle
    moving in the complex plane. Plot the path of the motion over the given time interval,
    indicating the direction of motion with arrows on the curve.
    (a) f(t) = cos(2t) + i sin(2t), $0 < t < \pi$
    (b) z(t) = 4 cos(t) + i sin(t), $0 < t < 2\pi$


(c) g(t) = sin(2I + i cos (  , 0 < t < 37r

(d) z(t) = sech(2t) + i tanh(2t), $-\infty < t < \infty$
(e) f(t) = 2t +it2, -4 < t < 4
(f) g(t) = t2 + it4, -2 < t < 2
(g) z(t)   3t, -7r <t < r
(h) h(t) = 3tet, 0 < t < 67


(i) z(t)


32t
e 2i, 1 <t<20
t


2. Differentiate each of the functions in the previous problem.
3. For each of the following, suppose the given function specifies the position of a particle
   moving in the complex plane. Find the velocity, speed, and the acceleration for each
   at the specified time.

   (a) z(t) = cos(2t) + i sin(2t), t =-
                                    6


﻿




   (b) f (t) = 3 sin (t) + i cos(2t), t = 7
   (c) z(t) = tanh(t) + isech(t), t = 3
   (d) h(t) = 4t2 + i(4t - 1), t = 1
   (e) z(t) = 5e", t = -

   (f) f(t) = 4t2 2, t ,5

4. Evaluate the following integrals.

   (a) f(2t + it)dt                             (b) f  (sin(t) + i cos(3t))dt
                   0 0

   (c) f   (-3sin(2t)+ it)dt                    (d) f   5etdt

   (e) f   2te3itdt                             (f) ft2etdt
        0                                            0
5. Suppose z(t) specifies the position at time t of a particle moving in the complex plane.
   If we know z(0) = 1 + i and z'(t) = cos(t) + i sin(t), find z(t) and plot the path of the
   object for $0 < t < 2\pi$.

6. In the last example of this section we saw that if a projectile is fired from the surface
   of the earth at an angle $\alpha$, $0 < \alpha < \frac{\pi}{2}$, with an initial speed of $s_0$ feet per second, then
   the x and y coordinates of its position after t seconds are given by

\[ x = s_0\cos(\alpha)t \]

   and
\[ y = s_0\sin(\alpha)t - 16t^2. \]

   (a) Find the time t at which the projectile strikes the ground.
   (b) The range R of the projectile is the value of x when the projectile strikes the
       ground. Use your result from (a) to find R.
   (c) Show that R is maximized when $\alpha = \frac{\pi}{4}$.
   (d) Solve the first equation for t in terms of x and substitute this result into the second
       equation to show that the path of the projectile is a parabola.
7. A projectile is fired from the surface of the earth at an angle $\alpha$, $0 < \alpha < \frac{\pi}{2}$, with an
   initial speed of 150 feet per second.

   (a) Using the results of the previous problem, find the maximum range for the projec-
       tile.
   (b) What is the range of the projectile if a =  ?If a =  ?In each case, when does
       the projectile strike the ground?
   (c) Plot the path of motion for a =   ,a =  , and a =}


﻿


8. Suppose a particle moves in the complex plane so that its position at time t is given
    by z(t) =cx(t) + iy(t), where

                                  xc(t) =fcos (-2 dS

    and
                                  y(t) f sin      -   dS.

    (a) Plot the path of motion for -5 < t < 5, indicating the direction of motion with
        arrows on the curve.
    (b) Find the velocity and acceleration of the particle.
 9. If f :IR -~ R is such that
                                       Jf (t)|dt <oo,

    then the function
                                      p(A) -f f(t jeudt

    is called the Fourier transform of f.
    (a) Show that


    (b) Show that


    (c) Show that


    (d) Show that
                                      (p 1 2  ( 0  t t f ( t ) d t

        for nr= 0, 1, 2,..
10. Let
                                     f (t) for t > 0,

    (a) With reference to the previous problem, show that the Fourier transform of f is

                                                 1i


﻿



(b) Use the results from (a) and Problem 9 to evaluate


                                           tJe-dt
                                       0o




        for n = 0,1, 2,3,4.
    (c) For s > 0, the function
\[ \Gamma(s) = \int_0^{\infty} t^{s-1}e^{-t}\,dt \]

        is called the gamma function. Show that

\[ \Gamma(n + 1) = n! \]

        for n = 0, 1, 2, ....
11. With reference to Problem 9, find the Fourier transform of


f(t)


   t2
e  2


and use it to evaluate


   t~e- 2
]the 2dt


for n = 0, 1, 2, 3, 4.


﻿


Section 7.4


       Differential Equations         The Two-Body Problem


In 1609 Johann Kepler (1571-1630) published the first two of his three laws of planetary
motion. The first of these states that the orbit of a planet about the sun is an ellipse with
the sun at one focus. He had reached this conclusion after painstaking analysis of the data
Tycho Brahe (1546-1601) had collected from observing the motion of Mars over a period
of more than 20 years. His work was a scientific triumph because it created a model for the
solar system that was not only more accurate than the models of Copernicus and Ptolemy,
but simpler as well. Yet, however brilliant, Kepler's result amounted to fitting a curve
to a set of data without discovering any fundamental principles underlying the motion of
planets that would cause their orbits to be as we observe them. In 1687 Newton provided
the missing principles. In his great work, Philosophiae naturalis principia mathematica,
Newton demonstrated that the elliptical orbit of a planet is a consequence of his three laws
of motion and the inverse square law of gravitation. Hence the behavior of the planets
could be explained by the same laws which govern the path of an apple as it falls from a tree
to the ground; for the first time it became clear that the so-called heavenly bodies behaved
no differently than the seemingly more substantial bodies of our everyday experience.
    In this section we will see how the motion of the planets may be explained using only
Newton's laws and tools from our study of calculus. The solution of this problem is one
of the greatest triumphs of the human intellect in general and of calculus in particular.


                Figure 7.4.1 Possible orbit of a body P about a body S


    To begin, suppose we have two bodies, one of mass m, which we denote by P, and the
other of mass M, which we denote by S. We may think of S as representing the sun and




Copyright @ by Dan Sloughter 2000


﻿



P as representing a planet. It is possible to show that Newton's laws of motion hold in
a coordinate system with the origin located at the center of mass of the two bodies; for
simplicity, we will assume that M is significantly larger than m (as it is if S is the sun and
P is a planet, asteroid, or comet), allowing us to assume that the center of mass is located
at S. Thus we choose a coordinate system for the complex plane so that S is at the origin
and we let z(t) represent the position of P with respect to S at time t. If we express z(t)
in polar coordinates, then
\[ z(t) = r(t)e^{i\theta(t)}, \tag{7.4.1} \]

where $r$ and $\theta$ are real-valued functions, as shown in Figure 7.4.1. For simplicity of notation,
we will usually drop the explicit reference to t and simply write

\[ z = re^{i\theta}. \tag{7.4.2} \]

By Newton's law of gravitation the magnitude of the gravitational force of attraction
between the two bodies is
\[ |F| = \frac{GMm}{r^2}, \tag{7.4.3} \]

where G is a constant, approximately

\[ 6.67 \times 10^{-11}\ \frac{\text{N}\,\text{m}^2}{\text{kg}^2} \]

if we measure force in Newtons, distance in meters, and mass in kilograms. Since gravity
is an attractive force and we are assuming S to be at rest at the origin, F is directed from
P toward the origin. Hence we have

\[ F = -\frac{GMm}{r^2}e^{i\theta}. \tag{7.4.4} \]

Moreover, we assume that this is the only force acting on the two bodies. Now if v(t) and
a(t) represent the velocity and acceleration, respectively, of P at time t, then, by Newton's
second law of motion, F = ma, we must have

\[ ma = -\frac{GMm}{r^2}e^{i\theta}. \tag{7.4.5} \]

Letting k = GM, this simplifies to

\[ a = -\frac{k}{r^2}e^{i\theta}. \tag{7.4.6} \]

    From our work in Section 7.3 we know that

\[ v = \frac{dz}{dt} = \frac{d}{dt}re^{i\theta} = ire^{i\theta}\frac{d\theta}{dt} + e^{i\theta}\frac{dr}{dt} \tag{7.4.7} \]


﻿


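and
\begin{align*}
a = \frac{dv}{dt} &= \frac{d}{dt}\left(ire^{i\theta}\frac{d\theta}{dt} + e^{i\theta}\frac{dr}{dt}\right) \\
 &= ire^{i\theta}\frac{d^2\theta}{dt^2} + \frac{d\theta}{dt}\left(ire^{i\theta}\,i\frac{d\theta}{dt} + ie^{i\theta}\frac{dr}{dt}\right) + e^{i\theta}\frac{d^2r}{dt^2} + ie^{i\theta}\frac{d\theta}{dt}\frac{dr}{dt} \\
 &= ire^{i\theta}\frac{d^2\theta}{dt^2} - re^{i\theta}\left(\frac{d\theta}{dt}\right)^2 + 2ie^{i\theta}\frac{d\theta}{dt}\frac{dr}{dt} + e^{i\theta}\frac{d^2r}{dt^2} \\
 &= -re^{i\theta}\left(\frac{d\theta}{dt}\right)^2 + e^{i\theta}\frac{d^2r}{dt^2} + i\left(re^{i\theta}\frac{d^2\theta}{dt^2} + 2e^{i\theta}\frac{d\theta}{dt}\frac{dr}{dt}\right). \tag{7.4.8}
\end{align*}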


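Putting (7.4.6) and (7.4.8) together gives us

\[ -\frac{k}{r^2}e^{i\theta} = -re^{i\theta}\left(\frac{d\theta}{dt}\right)^2 + e^{i\theta}\frac{d^2r}{dt^2} + i\left(re^{i\theta}\frac{d^2\theta}{dt^2} + 2e^{i\theta}\frac{d\theta}{dt}\frac{dr}{dt}\right). \]

After dividing through by $e^{i\theta}$ we have

\[ -\frac{k}{r^2} = -r\left(\frac{d\theta}{dt}\right)^2 + \frac{d^2r}{dt^2} + i\left(r\frac{d^2\theta}{dt^2} + 2\frac{d\theta}{dt}\frac{dr}{dt}\right). \tag{7.4.9} \]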


    The equality in (7.4.9) implies that the real part of the left-hand side of the equation
is equal to the real part of the right-hand side of the equation and the imaginary part of
the left-hand side of the equation is equal to the imaginary part of the right-hand side of
the equation. That is,

\[ -\frac{k}{r^2} = -r\left(\frac{d\theta}{dt}\right)^2 + \frac{d^2r}{dt^2} \tag{7.4.10} \]

and

\[ 0 = r\frac{d^2\theta}{dt^2} + 2\frac{d\theta}{dt}\frac{dr}{dt}. \tag{7.4.11} \]


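Multiplying both sides of (7.4.11) by r gives us

\[ 0 = r^2\frac{d^2\theta}{dt^2} + 2r\frac{d\theta}{dt}\frac{dr}{dt}. \tag{7.4.12} \]

However,

\[ \frac{d}{dt}\left(r^2\frac{d\theta}{dt}\right) = r^2\frac{d^2\theta}{dt^2} + 2r\frac{d\theta}{dt}\frac{dr}{dt}, \]

so (7.4.12) implies that

\[ \frac{d}{dt}\left(r^2\frac{d\theta}{dt}\right) = 0. \tag{7.4.13} \]

Since a function with 0 for its derivative must be a constant function, it follows that

\[ r^2\frac{d\theta}{dt} = c \tag{7.4.14} \]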


for some constant c. In any interval of time of interest, we will have r > 0, that is, S
and P are not at the same point in space, and so $r^2 > 0$. It follows that if c = 0, then
$\frac{d\theta}{dt} = 0$ for all t, corresponding to the relatively uninteresting case when $\theta$ is a constant
and P moves along a straight line passing through S. The more interesting cases are when
c < 0 or c > 0. Since the former case implies that $\frac{d\theta}{dt} < 0$ for all t and the latter implies
$\frac{d\theta}{dt} > 0$ for all t, the choice of sign for c ultimately depends on our choice of orientation in
our coordinate system, that is, the direction in which we measure positive angles. Hence,
without loss of generality, we may assume c > 0 or, equivalently, $\frac{d\theta}{dt} > 0$.
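
    The constancy of $r^2\frac{d\theta}{dt}$ in (7.4.14) can also be observed numerically. The following is a minimal sketch and not part of the derivation: it integrates the acceleration (7.4.6), written as $-kz/|z|^3$, with a classical fourth-order Runge-Kutta step and tracks $\Im(\bar{z}z')$, which equals $r^2\frac{d\theta}{dt}$ when $z = re^{i\theta}$. The value k = 1, the initial position and velocity, the step size, and all function names are assumptions chosen for illustration.

```python
def acceleration(z: complex, k: float) -> complex:
    """Inverse square attraction toward the origin: -(k / r^2) e^{i theta} = -k z / |z|^3."""
    return -k * z / abs(z) ** 3

def rk4_step(z: complex, v: complex, dt: float, k: float):
    """One classical Runge-Kutta step for the system z' = v, v' = acceleration(z)."""
    k1z, k1v = v, acceleration(z, k)
    k2z, k2v = v + 0.5 * dt * k1v, acceleration(z + 0.5 * dt * k1z, k)
    k3z, k3v = v + 0.5 * dt * k2v, acceleration(z + 0.5 * dt * k2z, k)
    k4z, k4v = v + dt * k3v, acceleration(z + dt * k3z, k)
    z_new = z + dt * (k1z + 2 * k2z + 2 * k3z + k4z) / 6
    v_new = v + dt * (k1v + 2 * k2v + 2 * k3v + k4v) / 6
    return z_new, v_new

def areal_constant(z: complex, v: complex) -> float:
    """Im(conj(z) * v), which equals r^2 * dtheta/dt for z = r e^{i theta}."""
    return (z.conjugate() * v).imag

k = 1.0                   # hypothetical value of k = GM
z, v = 1 + 0j, 1.2j       # hypothetical initial position and velocity
c0 = areal_constant(z, v)

# Follow the motion and record how far r^2 * dtheta/dt strays from its initial value.
drift = 0.0
for _ in range(5000):
    z, v = rk4_step(z, v, 0.002, k)
    drift = max(drift, abs(areal_constant(z, v) - c0))

print(c0, drift)
```

With these starting values the initial velocity is perpendicular to the initial position, so c0 is simply the product of the initial distance and speed; the drift that remains reflects only the error of the numerical method.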
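    We will now use the substitution $s = \frac{1}{r}$ to put (7.4.10) into a simpler form. With this
substitution, $r = \frac{1}{s}$, so

\[ \frac{dr}{dt} = \frac{d}{dt}\left(\frac{1}{s}\right) = -\frac{1}{s^2}\frac{ds}{dt} = -\frac{1}{s^2}\frac{ds}{d\theta}\frac{d\theta}{dt}. \tag{7.4.15} \]

Since, from (7.4.14),

\[ \frac{d\theta}{dt} = \frac{c}{r^2} = cs^2, \tag{7.4.16} \]

we have

\[ \frac{dr}{dt} = -c\frac{ds}{d\theta}. \tag{7.4.17} \]

Differentiating again,

\[ \frac{d^2r}{dt^2} = \frac{d}{dt}\left(-c\frac{ds}{d\theta}\right) = -c\frac{d}{dt}\left(\frac{ds}{d\theta}\right) = -c\frac{d}{d\theta}\left(\frac{ds}{d\theta}\right)\frac{d\theta}{dt} = -c\frac{d\theta}{dt}\frac{d^2s}{d\theta^2}. \tag{7.4.18} \]

Hence, using (7.4.16),

\[ \frac{d^2r}{dt^2} = -c^2s^2\frac{d^2s}{d\theta^2}. \tag{7.4.19} \]

Finally, substituting (7.4.16), (7.4.19), and $s = \frac{1}{r}$ into (7.4.10) gives us

\[ -ks^2 = -\frac{1}{s}(cs^2)^2 - c^2s^2\frac{d^2s}{d\theta^2} = -c^2s^3 - c^2s^2\frac{d^2s}{d\theta^2}. \tag{7.4.20} \]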


﻿


Dividing both sides of this equation by $-c^2s^2$, we have

\[ \frac{d^2s}{d\theta^2} + s = \frac{k}{c^2}. \tag{7.4.21} \]
This is the differential equation to which all our work has been leading. The solution of
this equation will be an expression for s as a function of $\theta$; since r is in turn a function of
s, namely, $r = \frac{1}{s}$, this will give us r as a function of $\theta$ and allow us to determine the path
of motion of P. Note, however, that we will not have found r as a function of t. In other
words, we will be able to determine the path of motion of P, but we will not be able to
determine where along that path P is at any specific time t.
    To solve (7.4.21), we first note that if $y(\theta)$ is a solution of the equation

\[ \frac{d^2y}{d\theta^2} + y = 0, \]
then the function
\[ x(\theta) = y(\theta) + \frac{k}{c^2} \]
satisfies the equation
\[ \frac{d^2x}{d\theta^2} + x = \frac{k}{c^2} \]

since
\[ \frac{d^2}{d\theta^2}\left(y + \frac{k}{c^2}\right) + \left(y + \frac{k}{c^2}\right) = \frac{d^2y}{d\theta^2} + y + \frac{k}{c^2} = \frac{k}{c^2}. \]
Hence to solve (7.4.21), we need only solve the equation

\[ \frac{d^2s}{d\theta^2} + s = 0. \tag{7.4.22} \]

That is, we need only find a function s of $\theta$ such that

\[ \frac{d^2s}{d\theta^2} = -s. \tag{7.4.23} \]

Now (7.4.22) simply says that s is a function with the property that its second derivative
is the negative of itself. But we already know two such functions, namely, $\sin(\theta)$ and
$\cos(\theta)$; moreover, for any constants A and B, the function $A\sin(\theta) + B\cos(\theta)$ also has this
property. Although the justification is beyond our resources at this point, it is in fact true
that any solution of (7.4.23) must be of the form

\[ A\sin(\theta) + B\cos(\theta) \tag{7.4.24} \]

for some constants A and B. From this it now follows that our sought after solution to
(7.4.21) must have the form

\[ s = A\sin(\theta) + B\cos(\theta) + \frac{k}{c^2} \tag{7.4.25} \]

for some constants A and B.
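
    The claim that (7.4.25) satisfies (7.4.21) can be spot-checked numerically. The sketch below is only an illustration under stated assumptions: the constants A, B, k, and c are arbitrary hypothetical values, the function names are made up, and the second derivative is approximated by a central second difference, so the residual is small but not exactly zero.

```python
import math

def s_solution(theta: float, A: float, B: float, k: float, c: float) -> float:
    """Candidate solution s = A sin(theta) + B cos(theta) + k / c^2 from (7.4.25)."""
    return A * math.sin(theta) + B * math.cos(theta) + k / c ** 2

def second_difference(func, x: float, h: float = 1e-4) -> float:
    """Central second difference, an approximation to the second derivative at x."""
    return (func(x + h) - 2 * func(x) + func(x - h)) / h ** 2

# Hypothetical constants, chosen only to exercise the formula.
A, B, k, c = 0.3, 0.25, 2.0, 2.0

def s(theta: float) -> float:
    return s_solution(theta, A, B, k, c)

# Residual of (7.4.21): d^2 s / d theta^2 + s - k / c^2 should be close to zero.
sample_angles = (-2.0, -0.5, 0.0, 1.0, 2.5)
residuals = [abs(second_difference(s, th) + s(th) - k / c ** 2) for th in sample_angles]
worst_residual = max(residuals)
print(worst_residual)
```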


﻿


We will now find values for the constants A and B so that

\[ \frac{ds}{d\theta}\bigg|_{\theta=0} = 0 \tag{7.4.26} \]

and

\[ \frac{d^2s}{d\theta^2}\bigg|_{\theta=0} < 0. \tag{7.4.27} \]

Intuitively, this means we are looking for values which satisfy conditions for s to have a
local maximum at $\theta = 0$. Equivalently, these conditions will hold if r has a local minimum
at $\theta = 0$. We may think of this as choosing the constants A and B in such a way that P
is closest to S when the path of P crosses the positive real axis. Now

\[ \frac{ds}{d\theta} = A\cos(\theta) - B\sin(\theta), \]

so

\[ \frac{ds}{d\theta}\bigg|_{\theta=0} = A, \tag{7.4.28} \]

and

\[ \frac{d^2s}{d\theta^2} = -A\sin(\theta) - B\cos(\theta), \]

so

\[ \frac{d^2s}{d\theta^2}\bigg|_{\theta=0} = -B. \tag{7.4.29} \]

Hence the conditions (7.4.26) and (7.4.27) are satisfied if we set A = 0 and we require
B > 0. In other words, the conditions (7.4.26) and (7.4.27) are satisfied by

\[ s = B\cos(\theta) + \frac{k}{c^2}, \tag{7.4.30} \]

where B > 0.
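    In terms of r, (7.4.30) gives us

\[ \frac{1}{r} = B\cos(\theta) + \frac{k}{c^2} = \frac{c^2B\cos(\theta) + k}{c^2}, \]

so

\[ r = \frac{c^2}{c^2B\cos(\theta) + k} = \frac{\dfrac{c^2}{k}}{1 + \dfrac{c^2B}{k}\cos(\theta)}. \tag{7.4.31} \]

If we let $\alpha = \frac{c^2}{k}$ and $\varepsilon = \alpha B$, then our expression for r as a function of $\theta$ reduces to

\[ r = \frac{\alpha}{1 + \varepsilon\cos(\theta)}, \tag{7.4.32} \]

where $\varepsilon > 0$ and $\alpha > 0$ are constants.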
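
    Formula (7.4.32) is easy to explore computationally. The following is a minimal sketch, with hypothetical function names, that evaluates r for a given angle and reports the closest and farthest distances, $\frac{\alpha}{1+\varepsilon}$ and $\frac{\alpha}{1-\varepsilon}$, which are the values of (7.4.32) at $\theta = 0$ and $\theta = \pi$; the second of these is a positive distance only under the assumption $0 \le \varepsilon < 1$. The sample values $\alpha = 2$ with $\varepsilon = 0.5$ and $\varepsilon = 0.95$ are used purely as an illustration.

```python
import math

def orbit_radius(theta: float, alpha: float, eps: float) -> float:
    """r = alpha / (1 + eps * cos(theta)), defined only where the denominator is positive."""
    denom = 1 + eps * math.cos(theta)
    if denom <= 0:
        raise ValueError("r is not defined for this angle")
    return alpha / denom

def radius_range(alpha: float, eps: float):
    """Smallest and largest r on a closed orbit; assumes 0 <= eps < 1."""
    if not 0 <= eps < 1:
        raise ValueError("a closed orbit requires 0 <= eps < 1")
    return alpha / (1 + eps), alpha / (1 - eps)

# Closest and farthest distances for two sample eccentricities.
r_min_half, r_max_half = radius_range(2.0, 0.5)
r_min_095, r_max_095 = radius_range(2.0, 0.95)

# The extremes occur at theta = 0 and theta = pi.
r_at_zero = orbit_radius(0.0, 2.0, 0.5)
r_at_pi = orbit_radius(math.pi, 2.0, 0.5)

print(r_min_half, r_max_half, r_min_095, r_max_095)
```

Points on the orbit itself can then be produced as complex numbers with orbit_radius(theta, alpha, eps) * cmath.exp(1j * theta), matching the polar form $z = re^{i\theta}$.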


﻿


                Figure 7.4.2 Circular orbit for P when $\varepsilon = 0$ and $\alpha = 2$


    Note that, as indicated above, our solution does not give us values of r and $\theta$ for
specified values of t, but rather (7.4.32) gives us a value of r for any specified value of $\theta$.
In other words, our solution does not give us the position of P for a given time t, but it
does tell us the location of P as a function of $\theta$. Indeed, if we plot the points $z = re^{i\theta}$
for all values of $\theta$ in the interval $[-\pi, \pi]$, with r given by (7.4.32), then the resulting curve
will be the path of the orbit of P about S. For example, if $\varepsilon = 0$, then $r = \alpha$ for all t and
the orbit of P is a circle of radius $\alpha$ with center at S, as shown in Figure 7.4.2 for $\alpha = 2$.
Note that because of our assumption that $\frac{d\theta}{dt} > 0$, the motion along this curve, and all
subsequent curves, will be in the counter-clockwise direction.
    If $0 < \varepsilon < 1$, then $\varepsilon\cos(\theta)$ has a maximum value of $\varepsilon$ when $\theta = 0$ and a minimum value
of $-\varepsilon$ when $\theta = -\pi$ or $\theta = \pi$. Thus the minimum value of r is

\[ r(0) = \frac{\alpha}{1 + \varepsilon} \]

and the maximum value of r is

\[ r(-\pi) = r(\pi) = \frac{\alpha}{1 - \varepsilon}. \]

Hence the orbit of P about S is a closed curve with
\[ \frac{\alpha}{1 + \varepsilon} \le r \le \frac{\alpha}{1 - \varepsilon} \]

for all $\theta$. An example for $\alpha = 2$ and $\varepsilon = 0.5$, in which case $\frac{4}{3} \le r \le 4$ for all $\theta$, is shown in
Figure 7.4.3.
    Note that
\[ \lim_{\varepsilon\to 1^-} r(0) = \lim_{\varepsilon\to 1^-} \frac{\alpha}{1 + \varepsilon} = \frac{\alpha}{2}, \]
whereas
\[ \lim_{\varepsilon\to 1^-} r(\pi) = \lim_{\varepsilon\to 1^-} \frac{\alpha}{1 - \varepsilon} = \infty. \]

﻿


Figure 7.4.3 Orbit of P for $\varepsilon = 0.5$ and $\alpha = 2$


Hence as $\varepsilon$ approaches 1 from the left, the point of closest approach of P to S shrinks
toward $\frac{\alpha}{2}$, but the point at which P is farthest from S increases without bound. Thus, as $\varepsilon$
varies from 0 to 1, the orbit of P flattens out, changing from a circle to a long oblong shape.
Figure 7.4.4 shows the orbit of P for $\alpha = 2$ and $\varepsilon = 0.95$, in which case $1.026 \le r \le 40$ for
all $\theta$. Because of this behavior, $\varepsilon$ is called the eccentricity of the orbit of P.

Figure 7.4.4 Orbit of P for $\varepsilon = 0.95$ and $\alpha = 2$


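When $\varepsilon = 1$, r is not defined for $\theta = -\pi$ and $\theta = \pi$. In fact, in this case

\[ \lim_{\theta\to\pi^-} r(\theta) = \lim_{\theta\to\pi^-} \frac{\alpha}{1 + \cos(\theta)} = \infty \]

and

\[ \lim_{\theta\to-\pi^+} r(\theta) = \lim_{\theta\to-\pi^+} \frac{\alpha}{1 + \cos(\theta)} = \infty. \]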


﻿


                      Figure 7.4.5 Orbit of P for $\varepsilon = 1$ and $\alpha = 2$


Hence the orbit of P is not closed; P makes its closest approach to S when 0 = 0, at which
point the distance from P to S is 2, and then follows a path which takes it ever farther
away from S. The situation for a = 2 and c= 1 is shown in Figure 7.4.5.
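    For ε > 1, there are angles θ_1 and θ_2, with

                                   -\pi < \theta_1 < -\frac{\pi}{2}

and

                                   \frac{\pi}{2} < \theta_2 < \pi,

such that

                     \cos(\theta_1) = \cos(\theta_2) = -\frac{1}{\varepsilon}.                       (7.4.33)

Whenever -π ≤ θ < θ_1 or θ_2 < θ ≤ π we have 1 + ε cos(θ) < 0. Since α > 0 and r > 0 for
all θ, the orbit of P in this case is defined by (7.4.32) only when θ_1 < θ < θ_2. Moreover,

    \lim_{\theta \to \theta_1^+} r(\theta) = \lim_{\theta \to \theta_1^+} \frac{\alpha}{1 + \varepsilon \cos(\theta)} = \infty

and

    \lim_{\theta \to \theta_2^-} r(\theta) = \lim_{\theta \to \theta_2^-} \frac{\alpha}{1 + \varepsilon \cos(\theta)} = \infty.

Thus again the orbit of P is not closed; P approaches S to within a distance of α/(1 + ε) at
θ = 0 and then follows a path away from S. See Figure 7.4.6 for the case α = 2 and ε = 2.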
                     Figure 7.4.6 Orbit of P for ε = 2 and α = 2

    The curves in Figures 7.4.3 through 7.4.6 should look familiar. Indeed, the curves in
Figures 7.4.3 and 7.4.4 are both ellipses, the curve in Figure 7.4.5 is a parabola, and the
curve in Figure 7.4.6 is a hyperbola. This is not hard to see if we rewrite the equation

                     r = \frac{\alpha}{1 + \varepsilon \cos(\theta)}                          (7.4.34)

in rectangular coordinates. Recall that if x and y are, respectively, the real and imaginary
parts of z = re^{iθ}, then

                                r = \sqrt{x^2 + y^2}


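and

                                \cos(\theta) = \frac{x}{\sqrt{x^2 + y^2}}.

Hence if z is a point on the curve with equation (7.4.34), we have

    \sqrt{x^2 + y^2} = \frac{\alpha}{1 + \frac{\varepsilon x}{\sqrt{x^2 + y^2}}} = \frac{\alpha \sqrt{x^2 + y^2}}{\sqrt{x^2 + y^2} + \varepsilon x}.

Dividing both sides by \sqrt{x^2 + y^2} gives us

                                1 = \frac{\alpha}{\sqrt{x^2 + y^2} + \varepsilon x},

and so

                                \sqrt{x^2 + y^2} = \alpha - \varepsilon x.

Squaring, we have

                     x^2 + y^2 = \alpha^2 - 2\alpha\varepsilon x + \varepsilon^2 x^2,

from which we obtain

          (1 - \varepsilon^2) x^2 + y^2 + 2\alpha\varepsilon x - \alpha^2 = 0.                (7.4.35)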
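
As a sanity check on this change of coordinates, the following sketch samples points on the polar curve r = α/(1 + ε cos(θ)) and evaluates the left-hand side of (7.4.35), read here as (1 - ε^2)x^2 + y^2 + 2αεx - α^2, at each of them. The function name rectangular_residual and the particular sample angles are assumptions made for illustration only.

```python
import math

def rectangular_residual(theta, alpha, eps):
    """Left-hand side of (7.4.35) at the point of the polar curve with angle theta."""
    r = alpha / (1.0 + eps * math.cos(theta))
    x = r * math.cos(theta)   # real part of z = r e^{i theta}
    y = r * math.sin(theta)   # imaginary part of z
    return (1.0 - eps**2) * x**2 + y**2 + 2.0 * alpha * eps * x - alpha**2

# Angles strictly between -pi and pi, so r is defined for eps = 0.5 and eps = 1.
thetas = [k * math.pi / 6 for k in range(-5, 6)]
worst_ellipse = max(abs(rectangular_residual(th, 2.0, 0.5)) for th in thetas)
worst_parabola = max(abs(rectangular_residual(th, 2.0, 1.0)) for th in thetas)
print(worst_ellipse, worst_parabola)
```

Both printed values should be zero up to floating-point round-off, which is what we expect if every point of the polar curve satisfies the rectangular equation.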

Thus if the polar coordinates of z satisfy (7.4.34), then the rectangular coordinates of z
must satisfy (7.4.35). Moreover, we know from analytic geometry that a curve in the plane
with equation


                     ax^2 + bxy + cy^2 + dx + ey + f = 0,                       (7.4.36)

where a, b, c, d, e, and f are all constants, is an ellipse if b^2 - 4ac < 0, a parabola if
b^2 - 4ac = 0, and a hyperbola if b^2 - 4ac > 0. Because of this result, we call the number

                                     D = b^2 - 4ac                               (7.4.37)

the discriminant of (7.4.36). In the case of (7.4.35), we have

                     D = 0 - 4(1 - \varepsilon^2) = -4(1 - \varepsilon^2).                    (7.4.38)

Thus D < 0 when 0 < ε < 1, D = 0 when ε = 1, and D > 0 when ε > 1. Since we have
already seen that the orbit of P is a circle when ε = 0 (a circle being a particular case of
an ellipse), we now have the following classification of the orbit of P about S in terms of
the eccentricity ε:

                      Eccentricity          Orbit of P
                      ε = 0                 Circle
                      0 < ε < 1             Ellipse
                      ε = 1                 Parabola
                      ε > 1                 Hyperbola
Recall that, collectively, these curves are known as the conic sections.
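
The classification can be expressed as a short routine. This is only a sketch: it assumes the discriminant of (7.4.35) is D = -4(1 - ε^2), as computed in (7.4.38), and the function name classify_orbit is illustrative.

```python
def classify_orbit(eps):
    """Name the conic section traced by an orbit with eccentricity eps >= 0."""
    if eps < 0:
        raise ValueError("eccentricity must be nonnegative")
    if eps == 0:
        return "circle"                 # special case of an ellipse
    d = -4.0 * (1.0 - eps**2)           # discriminant of (7.4.35)
    if d < 0:
        return "ellipse"
    if d == 0:
        return "parabola"
    return "hyperbola"

for eps in (0.0, 0.5, 0.95, 1.0, 2.0):
    print(eps, classify_orbit(eps))
```

In practice a measured eccentricity is never exactly 1, so the exact comparison d == 0 is useful only for illustrating the boundary case.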
    We have seen that starting with the assumptions of Newton's law of gravitation and
his second law of motion, we may conclude that the orbit of a body P about another
body S must be a conic section. As great as Newton's accomplishment was, scientifically,
mathematically, and philosophically, it is not the end of the story. The work we have
done only accounts for the interaction of two bodies, isolated without any forces acting
on them other than their mutual gravitational attraction. In reality, to model our entire
solar system we would have to consider, at the minimum, the effects of the gravitational
fields of the sun plus at least nine planets, as well as numerous moons, asteroids, and
comets. Because of these other considerations, the orbits of the planets are not true
ellipses, although, since by far the most dominant force acting on any one planet is the
gravitational attraction between it and the sun, the deviation from elliptical paths is small.
The problem of the motion of three or more bodies interacting under the influence of their
mutual gravitational attraction has challenged mathematicians since the time of Newton.
However, we now know that this problem, known as the n-body problem, cannot, in general,
be solved exactly. Since the work of Henri Poincaré (1854-1912), advances on this problem
have been directed toward qualitative and numerical descriptions of the orbits, not toward
exact analytic solutions. In fact it was Poincaré who first showed that even in the case
of only three bodies, the orbits can be highly complex, revealing a sensitivity to initial
conditions that would make predictions about the future path of a given body effectively
impossible. The work on this problem continues to the present.

Problems

1. The perihelion of the orbit of a planet is the point of the orbit which is closest to
    the sun. The following table gives the eccentricity and the distance from the sun at
    perihelion for each of the known planets in our solar system. Note the distances are
    given in astronomical units, where one astronomical unit is approximately 92.9 million
    miles, the mean distance from the earth to the sun.



                Planet           Eccentricity      Distance at Perihelion

                Mercury               0.21             0.31
                Venus                 0.01             0.72
                Earth                 0.02             0.98
                Mars                  0.09             1.38
                Jupiter               0.05             4.95
                Saturn                0.06             9.02
                Uranus                0.05             18.3
                Neptune               0.01             29.8
                Pluto                 0.25             29.8

   (a) Plot the orbits of each of the planets.
   (b) The aphelion of the orbit of a planet is the point of the orbit which is farthest from
       the sun. Find the distance of each planet from the sun at aphelion.
   (c) Which orbits are closest to being circular? Which ones deviate the most from being
       circular?
   (d) Plot the orbits of Neptune and Pluto together. How do they differ?

2. The orbit of the Comet Kohoutek has an eccentricity of 0.9999 and its distance from
   the sun at perihelion is 0.14 astronomical units. Plot the orbit of Comet Kohoutek
   and compare it with the orbit of Pluto from Problem 1. How far away from the sun is
   Comet Kohoutek at aphelion?

3. The orbit of Halley's comet has an eccentricity of 0.967 and its distance from the sun
   at perihelion is 0.59 astronomical units. Plot the orbit of Halley's comet and compare
   it with the orbits of Pluto and Comet Kohoutek as found in Problems 1 and 2. How
   far away from the sun is Halley's comet at aphelion?

4. The orbit of Encke's comet has an eccentricity of 0.847 and its distance from the sun
   at perihelion is 0.34 astronomical units. Plot the orbit of Encke's comet and compare
   it with the orbits of Pluto, Comet Kohoutek, and Halley's comet as found in Problems
   1, 2, and 3. How far away from the sun is Encke's comet at aphelion?

5. (a) Use the information in Problem 1 to find the equation for the orbit of the earth in
       rectangular coordinates (that is, an equation of the form (7.4.35)).
   (b) Use your result from (a) and the techniques of Section 4.8 to find the length of the
       earth's orbit. Convert your answer into miles.
   (c) What is the average speed of the earth in miles per hour?

6. (a) Use the information in Problem 1 to find the equation for the orbit of Pluto in
       rectangular coordinates (that is, an equation of the form (7.4.35)).
   (b) Use your result from (a) and the techniques of Section 4.8 to find the length of
       Pluto's orbit. Convert your answer into miles.
   (c) What is the average speed of Pluto in miles per hour? You will need to know that
       it takes Pluto 248 years to complete one orbit about the sun.

7. To solve the two-body problem we had to solve a differential equation of the form

                                \frac{d^2 y}{dt^2} = -y.

   In this problem we consider the equation

                                \frac{d^2 y}{dt^2} = y.                          (7.4.39)

   (a) Find two functions, y_1(t) and y_2(t), which satisfy (7.4.39) and are such that y_2(t)
       is not a constant multiple of y_1(t).
   (b) Show that

                                y(t) = A y_1(t) + B y_2(t)

       satisfies (7.4.39) for any constants A and B.
   (c) Find a solution y(t) of (7.4.39) such that y(0) = 2 and ẏ(0) = 4.


﻿


Section 8.1


       Numerical Solutions of Differential Equations


If x is a function of a real variable t and f is a function of both x and t, then the equation

                                   \dot{x}(t) = f(x(t), t)                          (8.1.1)

is called a first order differential equation. Solving such an equation involves more than
algebraic manipulation; indeed, although the equation itself involves three quantities, x,
ẋ, and t, to find a solution we must identify a function x, defined solely in terms of
the independent variable t, which satisfies the relationship of (8.1.1) for all t in some
open interval. For many equations, exact solution is not possible and we have to rely on
approximations. In this chapter we will discuss techniques for finding both approximate
and, where possible, exact solutions to differential equations.
    We have already seen many examples of differential equations: in Section 4.8 when
we discussed finding the position of an object moving in a straight line given its velocity
function and its initial position, in Section 6.3 when we discussed models for growth and
decay, in Section 7.3 when we discussed the motion of a projectile, and in Section 7.4
when we considered the two-body problem. Indeed, in many ways the study of differential
equations is at the heart of calculus. To study the interaction of physical bodies in the
world is to study the ramifications of physical laws such as the law of gravitation and
Newton's second law of motion, laws which frequently lead, as we saw in Section 7.4, to
questions involving the solution of differential equations. Newton was the first to realize
the power of calculus for solving a vast array of physical problems. The mathematicians
that followed him enlarged and refined his techniques until they began to believe that the
entire future of the universe, as well as its past, could be discerned from a knowledge of
the current positions and velocities of all physical bodies and the forces at work between
them. In such a world view, nothing is undetermined in itself; what appears to us as
undetermined is simply a reflection of our ignorance of the forces involved. As an example
of this view, writing in 1795, Pierre Simon Laplace (1749-1827) said:
    Given for one instant an intelligence which could comprehend all the forces by which
    nature is animated and the respective situation of the beings who compose it -- an
    intelligence sufficiently vast to submit these data to analysis -- it would embrace in
    the same formula the movements of the greater bodies of the universe and those of the
    lightest atom; for it, nothing would be uncertain and the future, as the past, would be
    present to its eyes. *

  * P. S. Laplace, Essai philosophique sur les probabilités (Paris, 1814), translated by F. W.
Truscott and F. L. Emory, A Philosophical Essay on Probabilities (New York, 1951), page
3.


Copyright © by Dan Sloughter 2000



Today we know more about the limits to our knowledge, but, nevertheless, the study of
differential equations remains a key component to our understanding of the universe.
    To begin our study, consider the equation

                                    \dot{x}(t) = f(x(t), t)                           (8.1.2)

with the initial condition x(t_0) = x_0. To simplify notation, we will frequently omit the
independent variable when referring to x(t) and write simply

                                    \dot{x} = f(x, t).                                (8.1.3)

Now if f(x, t) depends only on the value of t, that is, if f(x, t) = g(t) for all values of
x, where g is a function of t alone, then we may solve (8.1.3) by integration. That is,
integrating both sides of (8.1.3) gives us

                    \int_{t_0}^{t} \dot{x}(s)\,ds = \int_{t_0}^{t} g(s)\,ds.                          (8.1.4)

Substituting

                    \int_{t_0}^{t} \dot{x}(s)\,ds = x(t) - x(t_0)

into (8.1.4), we have

                    x(t) = x(t_0) + \int_{t_0}^{t} g(s)\,ds.                         (8.1.5)
Assuming we can compute the definite integral, which, provided g is continuous, can at
least always be done numerically, we have solved the differential equation. This is the type
of differential equation we considered in Section 4.8.
Example Consider the equation

                                    \dot{x} = 4\sin(3t)

with initial condition x(0) = 5. Then

                    x(t) = 5 + \int_0^t 4\sin(3s)\,ds
                         = 5 - \frac{4}{3}\cos(3s)\Big|_0^t
                         = 5 - \frac{4}{3}\cos(3t) + \frac{4}{3}
                         = \frac{1}{3}(19 - 4\cos(3t)).
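
When the integral cannot be found in closed form, the same formula x(t) = x(0) + ∫ g(s) ds can be evaluated numerically. The sketch below does this for the example ẋ = 4 sin(3t), x(0) = 5, using the trapezoid rule as one possible quadrature method (an assumption; any numerical integration rule would serve), and compares the result with the exact answer (19 - 4 cos(3t))/3. The names trapezoid, x_numeric, and x_exact are illustrative.

```python
import math

def trapezoid(g, a, b, n=1000):
    """Approximate the integral of g over [a, b] with n trapezoids."""
    h = (b - a) / n
    total = 0.5 * (g(a) + g(b))
    for i in range(1, n):
        total += g(a + i * h)
    return h * total

def x_numeric(t):
    """x(t) = 5 + integral from 0 to t of 4 sin(3s) ds, computed numerically."""
    return 5.0 + trapezoid(lambda s: 4.0 * math.sin(3.0 * s), 0.0, t)

def x_exact(t):
    """Closed-form solution found in the example."""
    return (19.0 - 4.0 * math.cos(3.0 * t)) / 3.0

for t in (0.5, 1.0, 2.0):
    print(t, x_numeric(t), x_exact(t))
```

The two columns should agree to several decimal places; increasing n tightens the agreement.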

    More generally, suppose f in (8.1.3) depends on both x and t. In that case, since
the right-hand side of the equation involves the unknown function x, we cannot simply
integrate both sides of the equation. We have solved some equations like this in earlier
sections, such as the inhibited population growth model

                                    \dot{x} = \beta x(M - x)

in Section 6.3, and we will discuss general techniques for finding exact solutions for several
types of equations in the coming sections. However, in this section we will concentrate
on direct numerical techniques for approximating solutions. Indeed, knowing the difficulty
of evaluating ordinary definite integrals, it is not hard to believe that many, if not most,
differential equations may be solved only through numerical approximation.
    Although the function x in equation (8.1.3) is unknown, we do have enough information
to find its best affine approximation at t_0. Namely, the best affine approximation to x at
t_0 is

          T(t) = x_0 + \dot{x}(t_0)(t - t_0) = x_0 + f(x_0, t_0)(t - t_0).    (8.1.6)

Hence, from our work with Taylor's theorem in Section 5.2 and assuming f(x(t), t) is a
continuous function, we have

          x(t) = x_0 + f(x_0, t_0)(t - t_0) + O((t - t_0)^2).       (8.1.7)

Equivalently,

          x(t_0 + h) = x_0 + h f(x_0, t_0) + O(h^2).                    (8.1.8)

Thus for a small value of h,

          x_1 = x_0 + h f(x_0, t_0)

will provide a good approximation to x(t_0 + h). However, we want to do more than this;
since x is a function, we want to be able to approximate its values over an entire interval,
say [t_0, t_1]. To do so, we choose a small value for h and iterate the process that gave us
x_1. Specifically, we let s_k = t_0 + kh, k = 0, 1, ..., n, where n is chosen large enough that
s_n ≥ t_1, and compute

          x_{k+1} = x_k + h f(x_k, s_k)                             (8.1.9)

for k = 0, 1, ..., n - 1. That is, we compute an approximation for x(s_{k+1}) by applying
the best affine approximation to our approximation for x(s_k), repeating the process
enough times until we have approximated x over the entire interval [t_0, t_1]. This method
of approximation is known as Euler's method.
Euler's method To approximate a solution to the equation

                                    \dot{x} = f(x, t)

with initial condition x(t_0) = x_0 on an interval [t_0, t_1], choose a small value for h > 0
and an integer n such that t_0 + nh ≥ t_1. Letting s_k = t_0 + kh, k = 0, 1, ..., n, compute
x_1, x_2, ..., x_n using the difference equation

                    x_{k+1} = x_k + h f(x_k, s_k).                       (8.1.10)

Then x_k is an approximation for x(t_0 + kh).
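
The difference equation (8.1.10) translates almost line for line into a program. The following is a minimal sketch in Python; the function name euler and its argument order are choices made here for illustration, and the caller is assumed to pick n so that t_0 + nh reaches the end of the interval.

```python
def euler(f, x0, t0, h, n):
    """Return [x_0, x_1, ..., x_n], where x_{k+1} = x_k + h * f(x_k, s_k)."""
    xs = [x0]
    for k in range(n):
        s_k = t0 + k * h                 # s_k = t_0 + k h
        x_k = xs[-1]
        xs.append(x_k + h * f(x_k, s_k))
    return xs

# Illustration: x' = 0.04 x with x(0) = 50 on [0, 50], step size h = 1.
xs = euler(lambda x, t: 0.04 * x, 50.0, 0.0, 1.0, 50)
print(round(xs[1], 2), round(xs[2], 2), round(xs[50], 2))
```

Computing s_k as t_0 + kh, rather than repeatedly adding h, avoids a slow drift in the time values caused by accumulated round-off.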


﻿




    Note that (8.1.10) makes use of a difference equation, a discrete time equation that we
first met in Section 1.4, in order to approximate the solution of a differential equation.
Example     Consider the differential equation ẋ = 0.04x with initial condition x(0) = 50.
From our work in Chapter 6 we know that the solution to this equation is

                                    x(t) = 50e^{0.04t}.

In particular, x(50) = 50e^2 = 369.45, rounding our answer to the second decimal place.
To approximate x on the interval [0, 50] using Euler's method, we will first take h = 1 and,
starting with x_0 = 50, compute x_1, x_2, ..., x_{50}, where in this case x_k will approximate
x(k). Using (8.1.10) with f(x, t) = 0.04x and rounding to two decimal places, we have

         x_1 = x_0 + h(0.04x_0) = 50 + (1)(0.04)(50) = 50 + 2 = 52.00,
         x_2 = x_1 + h(0.04x_1) = 52 + (1)(0.04)(52) = 52 + 2.08 = 54.08,
         x_3 = x_2 + h(0.04x_2) = 54.08 + (1)(0.04)(54.08) = 54.08 + 2.16 = 56.24,

and so on, ending with x_{50} = 355.33. The following table gives the values of x_t and x(t)
for t = 0, 5, 10, ..., 50:

                  t          x_t          x(t)
                  0         50.00         50.00
                  5         60.83         61.07
                  10        74.01         74.59
                  15        90.05         91.11
                  20       109.56        111.28
                  25       133.29        135.91
                  30       162.17        166.01
                  35       197.30        202.76
                  40       240.05        247.65
                  45       292.06        302.48
                  50       355.33        369.45

Notice that the error in our approximation, that is, the difference between x_k and x(k),
increases as k increases. For example,

                                 x(5) - x_5 = 0.24,

whereas

                                 x(50) - x_{50} = 14.12.
This should be expected since the error of the approximation on each step is compounded
by the errors made in each of the preceding steps. The only way we can control the amount
of error in our approximations is to decrease the step size. For example, if we reduce the
step size to h = 0.1, we obtain

   x_1 = x_0 + h(0.04x_0) = 50 + (0.1)(0.04)(50) = 50 + 0.2 = 50.2000,
   x_2 = x_1 + h(0.04x_1) = 50.2 + (0.1)(0.04)(50.2) = 50.2 + 0.2008 = 50.4008,
   x_3 = x_2 + h(0.04x_2) = 50.4008 + (0.1)(0.04)(50.4008) = 50.4008 + 0.2016 = 50.6024,

and so on, ending with x_{500} = 367.98 as our approximation for x(50). In general, with
h = 0.1, x_k is an approximation for x(0 + kh) = x(0.1k). Equivalently, x(t) is approximated
by x_k where k = 10t, provided 10t is an integer. The following table gives the values of
x_{10t} and x(t) for t = 0, 5, 10, ..., 50:

                  t         x_{10t}        x(t)
                  0         50.00         50.00
                  5         61.05         61.07
                  10        74.53         74.59
                  15        91.00         91.11
                  20       111.10        111.28
                  25       135.64        135.91
                  30       165.61        166.01
                  35       202.19        202.76
                  40       246.86        247.65
                  45       301.40        302.48
                  50       367.98        369.45

Figure 8.1.1 Approximate solutions of ẋ = 0.04x



As expected, reducing the step size from h = 1 to h = 0.1 greatly reduces the error
in the approximations. Figure 8.1.1 shows the graphs of both our approximations to x
(the graph of x is not shown since, on this scale, it is essentially the same as our second
approximation).
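
The effect of the step size on the error can be tabulated directly. The sketch below repeats Euler's method for ẋ = 0.04x, x(0) = 50, with three step sizes and prints the error at t = 50 against the exact value 50e^2. The helper name euler_final and the extra step size h = 0.01 are assumptions added for illustration.

```python
import math

def euler_final(f, x0, t0, h, n):
    """Run n Euler steps of size h and return only the last value x_n."""
    x = x0
    for k in range(n):
        x = x + h * f(x, t0 + k * h)
    return x

def f(x, t):
    return 0.04 * x

exact = 50.0 * math.exp(0.04 * 50.0)
errors = {}
for h, n in [(1.0, 50), (0.1, 500), (0.01, 5000)]:
    errors[h] = exact - euler_final(f, 50.0, 0.0, h, n)
    print(h, n, round(errors[h], 2))
```

In this example each tenfold reduction in h cuts the error at t = 50 by roughly a factor of ten, at the cost of ten times as many iterations.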

Example The differential equation in the previous example could be used to model a
population which is growing at a rate of approximately 4% per unit of time, starting with
an initial population of 50. As a modification of this model, consider a case where the rate
of growth of the population is decreasing over time. For example, the rate of growth might
start out at 4% but decrease over time toward 2%. If x(t) is the population after t units
of time, then the differential equation


                              \dot{x} = 0.02(1 + e^{-t/10})x

would describe one such situation. To approximate the solution to this equation over
the interval [0, 100], again assuming an initial population of x_0 = 50, we will use Euler's
method with h = 0.05 and compute x_1, x_2, ..., x_{2000}. Then, with

                              f(x, t) = 0.02(1 + e^{-t/10})x,

we have

     x_1 = x_0 + hf(50, 0) = 50 + (0.05)(2) = 50 + 0.1 = 50.10000,
     x_2 = x_1 + hf(50.1, 0.05) = 50.1 + (0.05)(1.99900) = 50.1 + 0.09995 = 50.19995,
     x_3 = x_2 + hf(50.19995, 0.10) = 50.19995 + (0.05)(1.99801)
         = 50.19995 + 0.09990 = 50.29985,

and so on, ending with x_{2000} = 450.91 as our approximation to x(100), where we have
rounded x_1, x_2, and x_3 to five decimal places and x_{2000} to two decimal places. In general,
x_k is an approximation for x(0 + 0.05k) = x(0.05k); that is, x(t) is approximated by
x_{20t}. The following table lists the values of x_{20t}, rounded to two decimal places, for
t = 0, 10, 20, ..., 100:

                              t           x_{20t} (≈ x(t))
                              0            50.00
                              10           69.30
                              20           88.67
                              30          110.17
                              40          135.40
                              50          165.74
                              60          202.59
                              70          247.50
                              80          302.30
                              90          369.20
                             100          450.91

The graph of our approximate solution is shown in Figure 8.1.2.

             Figure 8.1.2 An approximate solution of ẋ = 0.02(1 + e^{-t/10})x
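
This example differs from the previous one in that f depends on t as well as on x. The sketch below carries out the computation, taking the growth rate to be 0.02(1 + e^{-t/10}), which is the form implied by the tabulated values f(50, 0) = 2 and f(50.1, 0.05) = 1.99900. The variable names are illustrative.

```python
import math

def f(x, t):
    """Growth rate falls from 4% toward 2% as t increases."""
    return 0.02 * (1.0 + math.exp(-t / 10.0)) * x

h, n = 0.05, 2000            # 2000 steps of size 0.05 cover [0, 100]
xs = [50.0]                  # x_0 = 50
for k in range(n):
    xs.append(xs[-1] + h * f(xs[-1], k * h))

# x(t) is approximated by x_{20t}; print the values at t = 0, 10, ..., 100.
for t in range(0, 101, 10):
    print(t, round(xs[20 * t], 2))
```

The printed values can be compared with the table above; small differences in the last digit are possible because the program does not round intermediate results.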


﻿




    As we have seen, the accuracy of Euler's method depends on the step size h. In theory,
we can obtain any level of accuracy we desire by choosing h small enough; however, in
practice there are limitations on how small we can make h. These limitations include
the level of precision with which a given computer represents numbers, the time it takes
to perform the necessary computations (smaller values of h require more iterations of
(8.1.10)), and the accumulation of round-off error resulting from requiring large numbers
of iterations. Fortunately, there are many ways to improve upon Euler's method, one of
which we will consider now.
    For the equation
                                       \dot{x} = f(x, t)

with x(t_0) = x_0, Euler's method is based on using hf(x_0, t_0) as an approximation to the
difference x(t_0 + h) - x_0. The accuracy of this approximation will depend upon how much
ẋ(t) differs from f(x_0, t_0) over the interval [t_0, t_0 + h]. In general, the accuracy of our
approximation will improve if we use

                              \dot{x}\left(t_0 + \frac{h}{2}\right),

the slope of x at the midpoint of the interval [t_0, t_0 + h], in place of ẋ(t_0) = f(x_0, t_0), the
slope of x at the left-hand endpoint of this interval. Now

     \dot{x}\left(t_0 + \frac{h}{2}\right) = f\left(x\left(t_0 + \frac{h}{2}\right), t_0 + \frac{h}{2}\right).          (8.1.11)

However, we do not know x(t_0 + h/2). To get around this difficulty, we will first use Euler's
method to approximate x(t_0 + h/2) by

                              x_0 + \frac{h}{2} f(x_0, t_0),

and then approximate ẋ(t_0 + h/2) by

               f\left(x_0 + \frac{h}{2} f(x_0, t_0), t_0 + \frac{h}{2}\right).                        (8.1.12)

Replacing f(x_0, t_0) by (8.1.12) in Euler's method, we have

          x_1 = x_0 + h f\left(x_0 + \frac{h}{2} f(x_0, t_0), t_0 + \frac{h}{2}\right)               (8.1.13)

as our approximation for x(t_0 + h). It can be shown that in this case

                               x(t_0 + h) - x_1 = O(h^3),

whereas we have seen that the error in one step of Euler's method is O(h^2). This method of
approximation is known as the Runge-Kutta method of order 2, after the mathematicians
Carl Runge (1856-1927) and M. W. Kutta (1867-1944). In general, an approximation
method is said to be of order n if the error in one step is O(h^{n+1}). There are Runge-Kutta
formulas for approximations of higher order, but we will consider only this order 2 formula.
Second order Runge-Kutta To approximate the solution of the equation

                                       \dot{x} = f(x, t)

with initial condition x(t_0) = x_0 on an interval [t_0, t_1], choose a small value for h > 0 and
an integer n such that t_0 + nh ≥ t_1. Letting s_k = t_0 + kh, k = 0, 1, 2, ..., n, compute

                                   m = \frac{h}{2} f(x_k, s_k)

and

                    x_{k+1} = x_k + h f\left(x_k + m, s_k + \frac{h}{2}\right)                  (8.1.14)

for k = 0, 1, ..., n - 1. Then x_k is an approximation for x(t_0 + kh).
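
The two-stage update in (8.1.14) can be sketched in Python as follows. The function name rk2 and its argument order are assumptions chosen to mirror the statement of the method, not part of the text.

```python
def rk2(f, x0, t0, h, n):
    """Return [x_0, ..., x_n] from the second order Runge-Kutta update (8.1.14)."""
    xs = [x0]
    for k in range(n):
        s_k = t0 + k * h
        x_k = xs[-1]
        m = 0.5 * h * f(x_k, s_k)                      # half an Euler step
        xs.append(x_k + h * f(x_k + m, s_k + 0.5 * h)) # slope taken at the midpoint
    return xs

# Illustration: x' = 0.04 x with x(0) = 50 on [0, 50], step size h = 0.1.
xs = rk2(lambda x, t: 0.04 * x, 50.0, 0.0, 0.1, 500)
print(round(xs[1], 4), round(xs[2], 4), round(xs[500], 4))
```

Each step calls f twice, once at the left endpoint and once at the estimated midpoint, so the cost per step is about double that of Euler's method.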
Example     We will approximate the solution of ẋ = 0.04x with x(0) = 50 on the interval
[0, 50] using the second order Runge-Kutta method with h = 0.1. Using f(x, t) = 0.04x
and rounding to four decimal places, we have

                x_1 = 50 + 0.1f(50 + 0.05f(50, 0), 0.05)
                    = 50 + 0.1f(50.1, 0.05)
                    = 50 + 0.2004
                    = 50.2004,
                x_2 = 50.2004 + 0.1f(50.2004 + 0.05f(50.2004, 0.1), 0.15)
                    = 50.2004 + 0.1f(50.3008, 0.15)
                    = 50.2004 + 0.2012
                    = 50.4016,

and so on, up to x_{500} = 369.4508. Since h = 0.1, x_{10t} gives us an approximation for
x(t) whenever 10t is an integer. The following table gives the values of x_{10t} and x(t) for
t = 0, 5, 10, ..., 50:

                 t                 x_{10t}           x(t)
                 0                50.0000          50.0000
                 5                61.0701          61.0701
                 10               74.5912          74.5912
                 15               91.1058          91.1059
                 20              111.2768         111.2770
                 25              135.9137         135.9141
                 30              166.0053         166.0058
                 35              202.7592         202.7600
                 40              247.6506         247.6516
                 45              302.4809         302.4824
                 50              369.4508         369.4528



Comparing these values with the values we obtained with Euler's method for the same
equation, we see that we have decreased our error significantly without decreasing the step
size. Hence we have gained accuracy without greatly increasing the number of computa-
tions required.
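
The comparison can be made concrete by running both methods with the same step size and measuring the error at t = 50. The sketch below does this for ẋ = 0.04x, x(0) = 50, h = 0.1; the helper names euler_step, rk2_step, and integrate are illustrative.

```python
import math

def f(x, t):
    return 0.04 * x

def euler_step(x, t, h):
    return x + h * f(x, t)                    # one evaluation of f per step

def rk2_step(x, t, h):
    m = 0.5 * h * f(x, t)                     # half an Euler step
    return x + h * f(x + m, t + 0.5 * h)      # two evaluations of f per step

def integrate(step, x0, t0, h, n):
    """Apply the given one-step rule n times and return the final value."""
    x = x0
    for k in range(n):
        x = step(x, t0 + k * h, h)
    return x

exact = 50.0 * math.exp(2.0)                  # x(50) = 50 e^2
euler_error = exact - integrate(euler_step, 50.0, 0.0, 0.1, 500)
rk2_error = exact - integrate(rk2_step, 50.0, 0.0, 0.1, 500)
print(round(euler_error, 4), round(rk2_error, 4))
```

For this equation the Runge-Kutta error at t = 50 is several hundred times smaller than the Euler error, while the number of evaluations of f only doubles.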

Example In Problem 7 in Section 4.8 (which is repeated in Problem 2 at the end of
this section) we discussed the problem of a body in free fall near the surface of the earth,
neglecting the effects of air resistance. In that case, if the body has mass m and F is
the force acting on the body, then, since we are neglecting all forces except for that of
gravity, F = -mg, where g = 32 feet/second^2. However, if we consider the effects of
air resistance, then we have to modify F to account for this additional force acting in a
direction opposite to that of gravity. One common assumption is that air resistance is
proportional to velocity; in that case, we have


                                    F = -32m - kv,                              (8.1.15)


where k > 0 is a constant which depends on the particular body and v is the velocity of the
body. Note that kv < 0 since v < 0, so the additional force -kv is acting in the opposite
direction of gravity. Now if a is the acceleration of the body, F = ma implies


                                   ma = -32m - kv,                              (8.1.16)


and so
                                    a = -32 - \frac{k}{m} v.                                (8.1.17)

Since a = v̇, (8.1.17) gives us the differential equation

                                    \dot{v} = -32 - cv,                               (8.1.18)

where

                                    c = \frac{k}{m}

is a constant which depends both on the mass of the body and its air resistance. For
example, suppose we have a situation where c = 0.1 and the body is released from rest
high above the ground. Then we are interested in the solution of the equation

                                    \dot{v} = -32 - 0.1v

with the initial condition v(0) = 0. The following table gives the results from applying the
second order Runge-Kutta method with step size h = 0.1 over the interval [0, 70]:

                              t             v_{10t} (≈ v(t))
                              0                 0.0
                              5             -125.9
                              10            -202.3
                              15            -248.6
                              20            -276.7
                              25            -293.7
                              30            -304.1
                              35            -310.3
                              40            -314.1
                              45            -316.4
                              50            -317.8
                              55            -318.7
                              60            -319.2
                              65            -319.5
                              70            -319.7

                        Figure 8.1.3 Velocity of a body in free fall



From the table of values and from the graph of the approximate solution in Figure 8.1.3, it
appears that the velocity of the body approaches a limiting value. We shall see in the next
section that this is indeed the case. For this example, the velocity will approach -320 feet
per second. We call this velocity the terminal velocity of the body.
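
The table for this example can be reproduced with a few lines of code. The sketch below applies the second order Runge-Kutta update to v̇ = -32 - cv with c = 0.1 and v(0) = 0, and also computes the velocity at which v̇ = 0, namely -32/c. The names rk2, accel, and terminal are illustrative.

```python
def rk2(f, x0, t0, h, n):
    """Second order Runge-Kutta: returns the list [x_0, ..., x_n]."""
    xs = [x0]
    for k in range(n):
        s_k = t0 + k * h
        x_k = xs[-1]
        m = 0.5 * h * f(x_k, s_k)
        xs.append(x_k + h * f(x_k + m, s_k + 0.5 * h))
    return xs

c = 0.1                                   # ratio k/m for this body

def accel(v, t):
    """Right-hand side of v' = -32 - c v (gravity plus air resistance)."""
    return -32.0 - c * v

vs = rk2(accel, 0.0, 0.0, 0.1, 700)       # 700 steps of size 0.1 cover [0, 70]
for t in range(0, 71, 5):
    print(t, round(vs[10 * t], 1))

terminal = -32.0 / c                      # velocity at which the acceleration vanishes
print(terminal)
```

Setting the right-hand side of the differential equation to zero gives v = -32/c = -320 feet per second when c = 0.1, which is the limiting value the computed velocities approach.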

Problems

1. Solve each of the following differential equations using the given initial condition.

    (a) ẋ = t^2 - 2, x(0) = 3                       (b) ẋ = -sin(t), x(0) = 2


 2. Let x(t), v(t), and a(t) be the height, velocity, and acceleration, respectively, of an
    object of mass m in free fall near the surface of the earth. Let x_0 and v_0 be the
    height and velocity, respectively, of the object at time t = 0. If we ignore the effects
    of air resistance, the force acting on the body is -mg, where g is a constant (g =
    9.8 meters/second^2 or g = 32 feet/second^2). Thus by Newton's second law of motion,

                                     -mg = ma(t),

    and so

                                     a(t) = -g.

    Show that

                          x(t) = -\frac{1}{2} g t^2 + v_0 t + x_0.

3. Suppose an object is projected vertically upward from a height of 100 feet with an
   initial velocity of 20 feet per second. Use Problem 2 to answer the following questions.
   (a) Find x(t), the height of the object at time t.
   (b) At what time does the object reach its maximum height?
   (c) What is the maximum height reached by the object?
   (d) At what time will the object strike the ground?
4. For each of the following, use Euler's method to approximate the solution of the equa-
   tion over the given interval I using the step size h specified. Plot your result.
   (a)   =e-22 x(0) = 1, I = [0,10], h = 0.1
   (b) \dot{x} = -0.96x, x(0) = 100, I = [0, 100], h = 0.5
   (c) \dot{y} = 0.05y - 5, y(0) = 200, I = [0, 50], h = 0.05
   (d) \dot{x} = 0.002x(100 - x), x(0) = 10, I = [0, 200], h = 0.1
   (e) y   0.02(1 + 0.5e--)y, y(O) = 10, I= [0,150], h = 0.5
   (f) \dot{x} = \cos(x^2), x(0) = 0, I = [0, 10], h = 0.01
   (g) \dot{x} = 0.045x + 0.4t, x(0) = 30, I = [0, 40], h = 0.1
   (h) \dot{x} = 0.1x + \cos(t), x(0) = 0, I = [0, 10], h = 0.05
5. Use the second order Runge-Kutta method to approximate solutions to the equations
   in Problem 4.
6. In 1990 the population of India was 853.4 million. If P(t) is the population of India t
   years after 1990, suppose P satisfies the differential equation

                                       \dot{P} = k(t)P,

   where k(t) is the rate of growth of the population at time t. For example, at the start
   of 1990 the population of India was growing at a rate of 2.1% per year, so k(0) =0.021.

   (a) Suppose the rate of growth remains constant; that is, suppose k(t) = 0.021 for all
       t \geq 0. Find P(t). In what year will the population reach 1 billion? In what year
       will it reach 2 billion? In what year will it reach 3 billion?


﻿




   (b) Now suppose k(t) is decreasing toward 1% in such a way that

                                   k(t) = 0.01(1 +1.le--).

       Use the second order Runge-Kutta method to approximate P over the interval
       [0, 100] using a step size of h = 0.1. In what year will the population reach 1
       billion? In what year will it reach 2 billion? In what year will it reach 3 billion?
   (c) Plot your results from (a) and (b) together.
7. In the final example in this section we considered the problem of an object in free fall
   when the air resistance is proportional to the velocity of the object. Now consider the
   case where the air resistance is proportional to the square root of the speed.
   (a) If s is the speed of the object, in feet per second, t seconds after it is released,
       explain why s satisfies the differential equation

                                     \dot{s} = 32 - c\sqrt{s},

       s(0) = 0, for some constant c > 0.
   (b) Using c = 1, use the second order Runge-Kutta method to solve the equation in
       (a) over the interval [0, 500] using a step size of h = 0.5. Plot your results.
   (c) Does the object appear to approach a limiting speed? If so, what is the terminal
       speed?
   (d) Solve the equation
                                     32 - c\sqrt{s} = 0

       for s. Explain the connection between this answer and your answer in (c).
8. In the final example in this section we considered the problem of an object in free fall
   when the air resistance is proportional to the velocity of the object. Now consider the
   case where the air resistance is proportional to the square of the velocity.
   (a) If v is the velocity of the object, in feet per second, t seconds after it is released,
       explain why v satisfies the differential equation

                                     \dot{v} = -32 + cv^2,

       v(0) = 0, for some constant c > 0.
   (b) Using c = 0.01, use the second order Runge-Kutta method to approximate the
       solution to the equation in (a) over the interval [0, 20] using a step size of h =0.02.
       Plot your results.
   (c) Does the object appear to approach a limiting velocity? If so, what is the terminal
       velocity?
   (d) Solve the equation
                                     -32 + cv^2 = 0
       for v. Explain the connection between this answer and your answer in (c).


﻿




9. Suppose the population of a certain country was 56 million in 2000 and the natural rate
   of the growth of the population was 2% per year. Moreover, suppose k(t) represents
   the net rate of growth of the population due to immigration and emigration t years
   after 2000.
   (a) Let P(t) be the population of the country t years after 2000. Explain why P should
       satisfy the differential equation

                                     \dot{P} = 0.02P + k(t),

       with P(0) = 56.
   (b) If k(t) = 0.04t, use the second order Runge-Kutta method to approximate the
       solution to the equation in (a) over the interval [0, 25] using a step size of h = 0.05.
       Plot your results.
   (c) What does this model predict for the population of the country in the year 2010?
   (d) When will the population of the country reach 100 million?


﻿


Section 8.2


        Differential Equations         Separation of Variables


In the previous section we discussed two methods for approximating the solution of a
differential equation
                                     \dot{x} = f(x, t)
with initial condition x(t_0) = x_0. We will now consider, in this section as well as in
Sections 8.3 and 8.4, techniques for finding closed form solutions for such equations, that
is, solutions expressible in terms of the elementary functions of calculus. To do so will
require considering different classes of equations depending on the form of the function f.
As in ordinary integration, finding a closed form expression for the solution of a differential
equation is frequently a difficult, if not impossible, problem which requires us to exploit
whatever information we can gain from the form of the function. In this section we will
consider a class of equations known as separable equations and in Sections 8.3 and 8.4 we
will consider linear equations.
    We call a differential equation
                                     \dot{x} = f(x, t)                          (8.2.1)
with initial condition x(t_0) = x_0 separable, or say it has separable variables, if f(x, t) =
g(x)h(t) for some functions g and h, where g depends only on x and h depends only on t.
We will assume that g and h are both continuous and hence, in particular, integrable. In
that case, (8.2.1) becomes
                                     \dot{x} = g(x)h(t)                         (8.2.2)
which implies that

                                     \frac{\dot{x}}{g(x)} = h(t)                (8.2.3)

at all points for which g(x) \neq 0. Integrating (8.2.3) from t_0 to t (assuming g(x(s)) \neq 0 for
all s between t_0 and t), we have


          \int_{t_0}^{t} \frac{1}{g(x(s))} \dot{x}(s) ds = \int_{t_0}^{t} h(s) ds,          (8.2.4)

where we have used s as the variable of integration so that our answer will be in terms of
t. Now the substitution
                                      u = x(s)
                                     du = \dot{x}(s) ds
gives us

   \int_{t_0}^{t} \frac{1}{g(x(s))} \dot{x}(s) ds = \int_{x(t_0)}^{x(t)} \frac{1}{g(u)} du = \int_{x_0}^{x} \frac{1}{g(u)} du   (8.2.5)




Copyright © by Dan Sloughter 2000


﻿




for the integral on the left-hand side. Hence, putting (8.2.4) and (8.2.5) together,

               \int_{x_0}^{x} \frac{1}{g(u)} du = \int_{t_0}^{t} h(s) ds.               (8.2.6)


    Thus we can solve an equation with separable variables provided we are able to evaluate
both of the integrals in (8.2.6) and then solve the resulting equation for x. The process
may break down at either of these final two steps, in which case we must fall back on
numerical approximations even though the equation is separable.

Separation of variables If g and h are continuous functions of x and t, respectively,
and x satisfies the differential equation

                                   \dot{x} = g(x)h(t)                           (8.2.7)

with x(t_0) = x_0, then

               \int_{x_0}^{x} \frac{1}{g(u)} du = \int_{t_0}^{t} h(s) ds,               (8.2.8)

provided g(u) \neq 0 for all u between x_0 and x.
    Note that this is the same method we used to solve the inhibited growth model equation
in Section 6.3.
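When the integrals in (8.2.8) cannot be evaluated by hand, the same relation can still be used numerically. The following Python sketch is one possible way to do so, not a method taken from the text: it approximates both integrals with Simpson's rule and then finds x by bisection. The names simpson and solve_separable, and the bracketing interval [lo, hi] supplied by the caller, are assumptions made for this illustration; the sketch requires that g have no zero on [lo, hi] and that both x_0 and the value x(t) lie in that interval.

```python
import math


def simpson(f, a, b, n=200):
    """Composite Simpson's rule on [a, b] with n (even) subintervals."""
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3


def solve_separable(g, h, x0, t0, t, lo, hi, tol=1e-9):
    """Solve int_{x0}^{x} du/g(u) = int_{t0}^{t} h(s) ds for x by bisection.

    Assumes g has no zero on [lo, hi] and that x0 and x(t) both lie in [lo, hi].
    """
    rhs = simpson(h, t0, t)

    def residual(x):
        return simpson(lambda u: 1.0 / g(u), x0, x) - rhs

    a, b = lo, hi
    ra = residual(a)
    while b - a > tol:
        mid = 0.5 * (a + b)
        rm = residual(mid)
        if (ra < 0) == (rm < 0):
            a, ra = mid, rm   # same sign at a and mid: root lies in [mid, b]
        else:
            b = mid           # sign change: root lies in [a, mid]
    return 0.5 * (a + b)


# Test case: x' = 0.4x with x(0) = 100, using g(x) = x and h(t) = 0.4.
x1 = solve_separable(lambda x: x, lambda t: 0.4, 100.0, 0.0, 1.0, 50.0, 400.0)
print(x1, 100 * math.exp(0.4))
```

The two printed numbers should agree closely, since the exact solution of this test equation is x = 100e^{0.4t}.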
Example Consider the equation

                                     \dot{x} = 0.4x

with x(0) = 100. This is a separable equation with, in the notation used above, g(x) = x
and h(t) = 0.4. (Note that the choices for g and h are not unique.) Using (8.2.8), we have

                    \int_{100}^{x} \frac{1}{u} du = \int_{0}^{t} 0.4 ds.

Now, assuming x > 0,

   \int_{100}^{x} \frac{1}{u} du = \log(u) \Big|_{100}^{x} = \log(x) - \log(100) = \log\left(\frac{x}{100}\right),

and

                    \int_{0}^{t} 0.4 ds = 0.4s \Big|_{0}^{t} = 0.4t.

Hence we have

                    \log\left(\frac{x}{100}\right) = 0.4t,

from which we obtain

                    \frac{x}{100} = e^{0.4t}

and, finally,

                    x = 100e^{0.4t}.

Note that this is the solution we should expect from our study of equations of this form
in Sections 6.1 and 6.3.
Example Consider the equation

                                     \dot{y} = -2yt                             (8.2.9)

with y(0) = y_0 \neq 0. This is a separable equation with, in the notation used above, g(y) = y
and h(t) = -2t. Using (8.2.8), we have

                    \int_{y_0}^{y} \frac{1}{u} du = -\int_{0}^{t} 2s ds.

Now

   \int_{y_0}^{y} \frac{1}{u} du = \log|u| \Big|_{y_0}^{y} = \log|y| - \log|y_0| = \log\left|\frac{y}{y_0}\right|

and

                    -\int_{0}^{t} 2s ds = -s^2 \Big|_{0}^{t} = -t^2.

Hence we have

                    \log\left|\frac{y}{y_0}\right| = -t^2,

from which it follows that

                    \left|\frac{y}{y_0}\right| = e^{-t^2}.

Now e^{-t^2} > 0 for all t, so y(t) is never 0. Since y is continuous (which follows from our
assumption that it is differentiable), this means that either y(t) > 0 for all t or y(t) < 0
for all t. Since y(0) = y_0, y(t) > 0 for all t if y_0 > 0 and y(t) < 0 for all t if y_0 < 0. In
either case,

                    \frac{y(t)}{y_0} > 0

for all t, so

                    \left|\frac{y}{y_0}\right| = \frac{y}{y_0}.

Hence we have

                    \frac{y}{y_0} = e^{-t^2},

or

                    y = y_0 e^{-t^2}.                                           (8.2.10)

Note that (8.2.10) also specifies a solution of (8.2.9) when y_0 = 0, namely, the solution
y(t) = 0 for all t. By leaving the value of y_0 unspecified, we have found the general form of
all possible solutions for the equation. We call the family of all possible solutions given by
(8.2.10) the general solution of the equation (8.2.9). Any solution obtained by specifying
a value of y_0, for example, y_0 = 10, is called a particular solution of the equation.
Figure 8.2.1 shows the graphs of four particular solutions for this equation.

              Figure 8.2.1 Four particular solutions of \dot{y} = -2yt
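As a check on the general solution, one can differentiate y = y_0 e^{-t^2} directly and confirm that it satisfies both the differential equation and the initial condition. The short computation below is supplied here as a verification and uses only the chain rule.

```latex
\dot{y} = \frac{d}{dt}\left(y_0 e^{-t^2}\right)
        = y_0 e^{-t^2} \cdot (-2t)
        = -2t\,y,
\qquad
y(0) = y_0 e^{0} = y_0 .
```

The same computation holds when y_0 = 0, which is the constant solution y(t) = 0.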


    As noted in the first example, the choices for g and h are not unique. For example, in
the second example we could just as well have taken g(y) = -2y and h(t) = t. However, one
should attempt to choose g and h in such a way that the subsequent steps in the solution
are as simple as possible.
Example Consider the equation

                                     \dot{x} = -\frac{t}{x}

with x(0) = x_0 \neq 0. Separating the variables, we have

                    \int_{x_0}^{x} u du = -\int_{0}^{t} s ds.

Now

        \int_{x_0}^{x} u du = \frac{1}{2}u^2 \Big|_{x_0}^{x} = \frac{1}{2}x^2 - \frac{1}{2}x_0^2

and

        -\int_{0}^{t} s ds = -\frac{1}{2}s^2 \Big|_{0}^{t} = -\frac{1}{2}t^2,

and so

                    \frac{1}{2}x^2 - \frac{1}{2}x_0^2 = -\frac{1}{2}t^2,

or

                    x^2 + t^2 = x_0^2.

This equation implicitly defines x as a function of t. Indeed, from this equation we can see
that the graph of x is part of a circle of radius |x_0| centered at the origin. Solving explicitly
for x, we have

                    x = \sqrt{x_0^2 - t^2}

if x_0 > 0 and

                    x = -\sqrt{x_0^2 - t^2}

if x_0 < 0. Note that x is only defined for -|x_0| < t < |x_0|.
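A quick numerical check of this solution is sketched below in Python. It compares a central-difference estimate of the derivative of x(t) = ±\sqrt{x_0^2 - t^2} with the right-hand side -t/x at sample points inside the interval of definition. The function names, the sample points, and the difference step dt are choices made for this illustration.

```python
import math


def x_closed(t, x0):
    """Closed form x(t) = sign(x0) * sqrt(x0^2 - t^2), valid for |t| < |x0|."""
    return math.copysign(math.sqrt(x0 * x0 - t * t), x0)


def max_residual(x0, n=50, dt=1e-6):
    """Largest |x'(t) + t/x(t)| over sample points, with x' from a central difference."""
    worst = 0.0
    for i in range(1, n):
        # Sample points stay inside (-0.9|x0|, 0.9|x0|), away from the endpoints.
        t = -0.9 * abs(x0) + 1.8 * abs(x0) * i / n
        deriv = (x_closed(t + dt, x0) - x_closed(t - dt, x0)) / (2 * dt)
        worst = max(worst, abs(deriv + t / x_closed(t, x0)))
    return worst


for x0 in (2.0, -3.0):
    print(x0, max_residual(x0))
```

Both printed residuals should be very small, reflecting only the error of the finite-difference approximation.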
Example In Section 8.1 we considered the equation

                                     \dot{v} = -g - \frac{k}{m}v,

with v(0) = 0, as a model for the velocity of an object in free fall near the surface of the
earth when the force due to air resistance is proportional to velocity. Here v is the velocity
of the object, g, as usual, is 32 feet per second per second or 9.8 meters per second per
second, m is the mass of the object, and k > 0 is a constant which depends on the air
resistance of the particular object. If we write this equation in the form

                    \dot{v} = -g\left(1 + \frac{k}{gm}v\right)                  (8.2.11)

and separate variables, using

                    f(v) = 1 + \frac{k}{gm}v

and

                    h(t) = -g,

then we have

          \int_{0}^{v} \frac{1}{1 + \frac{k}{gm}u} du = -\int_{0}^{t} g ds.

Now

   \int_{0}^{v} \frac{1}{1 + \frac{k}{gm}u} du = \frac{gm}{k}\log\left|1 + \frac{k}{gm}u\right| \Big|_{0}^{v}
                                                = \frac{gm}{k}\log\left|1 + \frac{k}{gm}v\right|

and

                    -\int_{0}^{t} g ds = -gs \Big|_{0}^{t} = -gt,

so

                    \frac{gm}{k}\log\left|1 + \frac{k}{gm}v\right| = -gt.

Hence

                    \log\left|1 + \frac{k}{gm}v\right| = -\frac{kt}{m},

from which it follows that

                    \left|1 + \frac{k}{gm}v\right| = e^{-\frac{kt}{m}}.

Thus either

                    1 + \frac{k}{gm}v = e^{-\frac{kt}{m}}

or

                    1 + \frac{k}{gm}v = -e^{-\frac{kt}{m}}.

That is, either

                    v = -\frac{gm}{k}\left(1 - e^{-\frac{kt}{m}}\right)

or

                    v = -\frac{gm}{k}\left(1 + e^{-\frac{kt}{m}}\right).

Since our initial condition requires that v(0) = 0, we must have

                    v = -\frac{gm}{k}\left(1 - e^{-\frac{kt}{m}}\right).

Hence we now have a closed form solution for this model of free fall, whereas in the previous
section we could only compute a numerical approximation. Notice that one advantage of
the closed form solution is that we did not have to specify values for the parameters k and
m before finding the solution; as a result, we may now easily compute v for any specified
values of k and m. For example, using k/m = 0.1 and g = 32 as in our example in Section
8.1, we obtain a plot of v as shown in Figure 8.2.2. You should compare this with the
graph of our numerical solution in Figure 8.1.3. Also, the closed form solution allows us
to compute

   \lim_{t \to \infty} v(t) = \lim_{t \to \infty} -\frac{gm}{k}\left(1 - e^{-\frac{kt}{m}}\right) = -\frac{gm}{k},   (8.2.12)

showing that an object falling according to this model has a terminal velocity, as we
suspected from our numerical work in Section 8.1. Moreover, (8.2.12) gives us a general
expression for this velocity. For our example, k/m = 0.1 and g = 32 give us a terminal
velocity of

                    -\frac{gm}{k} = -320 feet per second.

               Figure 8.2.2 Graph of v(t) = -320(1 - e^{-0.1t})
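The comparison with the numerical work of Section 8.1 can be made concrete. The Python sketch below evaluates the closed form solution with g = 32 and k/m = 0.1 at the times listed in the table of Section 8.1 and prints both values side by side. The list named table is copied from that table, on the assumption that its first column is time in seconds and its second column is velocity in feet per second; the function name v_closed is a label chosen here.

```python
import math

G = 32.0         # feet per second per second
K_OVER_M = 0.1   # value of k/m used in this example


def v_closed(t, g=G, k_over_m=K_OVER_M):
    """Closed form v(t) = -(g m / k) (1 - e^{-k t / m})."""
    return -(g / k_over_m) * (1.0 - math.exp(-k_over_m * t))


# (t, v) pairs copied from the table of numerical values in Section 8.1.
table = [(5, -125.9), (10, -202.3), (15, -248.6), (20, -276.7),
         (25, -293.7), (30, -304.1), (35, -310.3), (40, -314.1),
         (45, -316.4), (50, -317.8), (55, -318.7), (60, -319.2),
         (65, -319.5), (70, -319.7)]

for t, v_table in table:
    print(f"t = {t:2d}  closed form = {v_closed(t):8.2f}  table = {v_table:8.1f}")
```

For large t the exponential term is negligible and v_closed(t) is indistinguishable from the terminal velocity -gm/k = -320.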


﻿




Problems

1. Solve each of the following differential equations using the given initial condition.
    (a) \dot{x} = -0.9x, x(0) = 75                   (b) \dot{x} = x^2, x(0) = 10
            t(w
            y                                           t


(e)

(g)


   t        x(0) = 4
   x + t
c =x(1 - x), x(0) = 0.2


Y" Y(O) = 0


2. (a) Solve the differential equation \dot{x} = -x^2 t, x(0) = x_0 \neq 0.
   (b) Graph x on the interval [-5, 5] for x_0 = 2, x_0 = 5, and x_0 = 10. Are the graphs
       similar?
   (c) What is the domain of x if x_0 > 0? What is the domain of x if x_0 < 0?
   (d) Graph x for x_0 = -1 and x_0 = 1. Are the graphs similar?



3. (a) A curve is defined so that whenever (x_0, y_0), with y_0 \neq 0, is a point on the curve,

                    \frac{dy}{dx} \Big|_{(x,y)=(x_0,y_0)} = -\frac{ax_0}{by_0},

       where a > 0 and b > 0 are constants. Show that the curve must be an ellipse.
       Under what conditions is the curve a circle?
   (b) A curve is defined so that whenever (x_0, y_0), with y_0 \neq 0, is a point on the curve,

                    \frac{dy}{dx} \Big|_{(x,y)=(x_0,y_0)} = \frac{ax_0}{by_0},

       where a > 0 and b > 0 are constants. Show that the curve must be a hyperbola.
4. In Chapter 6 we considered the consequences of the population growth model

                                       \dot{x} = kx,

   with x(0) = x_0, where x(t) represents the size of some population at time t and k > 0
   is a constant which depends on the rate at which the population is growing. In this
   problem we will see what happens if \dot{x} is proportional, not to x, but to some power of
   x. That is, consider the model

                                       \dot{x} = kx^b,                         (8.2.13)

   with x(0) = x_0 and b > 0 a constant.
   (a) Solve (8.2.13) when b = 2 and show that

                          \lim_{t \to (1/(kx_0))^-} x(t) = \infty.

       Plot x for x_0 = 50 and k = 0.001, k = 0.01, and k = 0.02.
   (b) Solve (8.2.13) when b > 1. Find c such that

                          \lim_{t \to c^-} x(t) = \infty.

       Plot x for x_0 = 50, k = 0.01, and b = 1.5, b = 1.2, and b = 1.01.
   (c) Solve (8.2.13) when b = 0.5. Show that x is a quadratic polynomial and

                          \lim_{t \to \infty} x(t) = \infty.

       Plot x for x_0 = 50 and k = 0.01, k = 0.02, and k = 0.05.
   (d) Solve (8.2.13) when 0 < b < 1 and show that

                          \lim_{t \to \infty} x(t) = \infty.

       Plot x for x_0 = 50, k = 0.01, and b = 0.2, b = 0.4, and b = 0.9.
   (e) Compare the rates of growth for 0 < b < 1, b = 1, and b > 1. Which model leads to
       the slowest population growth? Which model leads to the most rapid population
       growth? Why is the case b > 1 sometimes referred to as the doomsday model?
5. Suppose the force due to air resistance acting on a falling body of mass m is propor-
   tional to the square of the velocity v.
   (a) Explain why v satisfies the differential equation

                                 \dot{v} = -g + \frac{k}{m}v^2

       where k > 0 is a constant.
   (b) Assuming v(0) = 0, solve the equation in (a) for v.
   (c) Show that the terminal velocity of the object is -\sqrt{\frac{gm}{k}}.
   (d) Plot v over the interval [0, 20] using g = 32 and k/m = 0.01. Compare this plot with
       the plot of the numerical solution found in Problem 8 of Section 8.1.
6. In Section 1.4 we discussed the discrete time version of Newton's law of cooling. Briefly,
   this law says if an object with an initial temperature of T_0 is placed in an environment
   which is held at a constant temperature S, then the rate of change of the temperature T
   of the object is proportional to the difference between T and S. In terms of differential
   equations, this says that T must satisfy the equation

                                      \dot{T} = k(T - S)

   for some constant k.
   (a) Show that

                                      T = S + (T_0 - S)e^{kt}

       and verify that, when k < 0,

                                      \lim_{t \to \infty} T(t) = S.

    (b) A cup of coffee, initially at a temperature of 115°F, is placed on a table in a room
        held at a constant temperature of 72°F. If after five minutes the coffee has cooled
        to 105°F, what is the temperature of the coffee after 20 minutes? How long will it
        take the coffee to cool to 80°F? Graph T.
    (c) A glass of lemonade, initially at a temperature of 40°F, is placed on a table in a
        room held at a constant temperature of 75°F. If after 10 minutes the lemonade has
        warmed to 48°F, what is the temperature of the lemonade after 30 minutes? How
        long will it take the lemonade to warm to 65°F? Graph T.
    (d) A cup of coffee, initially at a temperature of 110°F, is placed on a table in a room.
        After five minutes the coffee has cooled to 100°F and after ten minutes the coffee
        has cooled to 92°F. What is the temperature of the room?


﻿


Section 8.3


       Differential Equations      First Order Linear Differential Equations


We will now consider closed form solutions for another important class of differential
equations. A differential equation
                                     \dot{x} = f(x, t)

with x(t_0) = x_0 is called a linear equation if

                                f(x, t) = p(t)x + q(t)                          (8.3.1)

for some functions p and q which depend only on t. We will assume that both p and q are
continuous functions. Note that under certain circumstances, such as q(t) = 0 for all t, a
linear equation is also separable. The solution of such equations is based on the following
observation: If we let

                    P(t) = \int_{t_0}^{t} p(s) ds,                              (8.3.2)

then

   \frac{d}{dt}\left(x e^{-P(t)}\right) = -x p(t) e^{-P(t)} + \dot{x} e^{-P(t)} = e^{-P(t)}\left(\dot{x} - p(t)x\right).   (8.3.3)

Now we want \dot{x} = p(t)x + q(t), that is, \dot{x} - p(t)x = q(t), so we are looking for a
function x such that

                    \frac{d}{dt}\left(x e^{-P(t)}\right) = q(t) e^{-P(t)}.      (8.3.4)

Integrating (8.3.4) from t_0 to t (using u for our variable of integration), we have

   \int_{t_0}^{t} \frac{d}{du}\left(x(u) e^{-P(u)}\right) du = \int_{t_0}^{t} q(u) e^{-P(u)} du.   (8.3.5)

Now

   \int_{t_0}^{t} \frac{d}{du}\left(x(u) e^{-P(u)}\right) du = x(u) e^{-P(u)} \Big|_{t_0}^{t}
                                                            = x(t) e^{-P(t)} - x(t_0) e^{-P(t_0)}
                                                            = x(t) e^{-P(t)} - x_0              (8.3.6)

since P(t_0) = 0 and x(t_0) = x_0. Hence we want

          x(t) e^{-P(t)} - x_0 = \int_{t_0}^{t} q(u) e^{-P(u)} du.              (8.3.7)




Solving (8.3.7) for x(t), we have

          x(t) = e^{P(t)}\left(\int_{t_0}^{t} q(u) e^{-P(u)} du + x_0\right).          (8.3.8)

Similar to our situation with separable equations, (8.3.8) provides a closed form solution
to our equation only if the requisite integrals may be computed in closed form. If not,
numerical techniques will be necessary.
Linear equations If p and q are continuous, x satisfies the differential equation

                                   \dot{x} = p(t)x + q(t)                       (8.3.9)

with x(t_0) = x_0, and

                    P(t) = \int_{t_0}^{t} p(s) ds,                              (8.3.10)

then

          x(t) = e^{P(t)}\left(\int_{t_0}^{t} q(u) e^{-P(u)} du + x_0\right).          (8.3.11)
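When the integrals in (8.3.10) and (8.3.11) cannot be computed in closed form, the formula can still be evaluated numerically. The Python sketch below is one way to do this, offered as an illustration rather than as a method from the text: it approximates both integrals with Simpson's rule. The names simpson and solve_linear and the number of subintervals n are assumptions made here. The test equation \dot{x} = x/t + 4t with x(1) = 5 has the solution x = 4t^2 + t, so the value at t = 3 should be close to 39.

```python
import math


def simpson(f, a, b, n=200):
    """Composite Simpson's rule on [a, b] with n (even) subintervals."""
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3


def solve_linear(p, q, x0, t0, t, n=200):
    """Evaluate x(t) = e^{P(t)} (int_{t0}^{t} q(u) e^{-P(u)} du + x0),
    where P(t) = int_{t0}^{t} p(s) ds, with both integrals done numerically."""
    def P(u):
        return simpson(p, t0, u, n)

    integral = simpson(lambda u: q(u) * math.exp(-P(u)), t0, t, n)
    return math.exp(P(t)) * (integral + x0)


# Test case: x' = x/t + 4t with x(1) = 5, so p(t) = 1/t and q(t) = 4t.
x3 = solve_linear(lambda t: 1.0 / t, lambda t: 4.0 * t, 5.0, 1.0, 3.0)
print(x3, 4 * 3.0**2 + 3.0)
```

The nested quadrature is simple but not efficient, since P is recomputed at every sample point; for a sketch of this size that cost is negligible.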

Example Consider the equation

                                     \dot{x} = \frac{x}{t} + 4t

with x(1) = 5. This is a linear equation with, in the notation used above,

                                     p(t) = \frac{1}{t}

and

                                     q(t) = 4t.

Then

          P(t) = \int_{1}^{t} \frac{1}{s} ds = \log(s) \Big|_{1}^{t} = \log(t),

where, in order for the integral to exist, we have restricted t to be positive. Thus, using
(8.3.11),

          x = e^{\log(t)}\left(\int_{1}^{t} 4u e^{-\log(u)} du + 5\right)

            = t\left(\int_{1}^{t} \frac{4u}{u} du + 5\right)

            = t\left(\int_{1}^{t} 4 du + 5\right)

            = t(4t - 4) + 5t
            = 4t^2 + t.
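It is worth confirming that this answer satisfies both the equation and the initial condition. The following short verification, supplied here as a check, substitutes x = 4t^2 + t into each side of the equation.

```latex
\dot{x} = \frac{d}{dt}\left(4t^2 + t\right) = 8t + 1,
\qquad
\frac{x}{t} + 4t = \frac{4t^2 + t}{t} + 4t = (4t + 1) + 4t = 8t + 1,
\qquad
x(1) = 4 + 1 = 5 .
```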


﻿


Example The equation

                                     \dot{x} = p(t)x

with x(0) = x_0 may be used as a model for growth or decay where the rate of growth
or decay is not necessarily a constant. Such an equation may be solved by separating
variables, but the solution also follows from (8.3.11): Since q(t) = 0 for all t, we have

                    x = x_0 e^{\int_{0}^{t} p(s) ds}.

For example, if p(t) = k for all t, where k is a constant, then we obtain the familiar solution

                    x = x_0 e^{kt}.

If p(t) = 0.02(1 + e^{-0.1t}), as in an example in Section 8.1, then

          \int_{0}^{t} p(s) ds = \int_{0}^{t} 0.02(1 + e^{-0.1s}) ds

                               = 0.02(s - 10e^{-0.1s}) \Big|_{0}^{t}

                               = 0.02(t - 10e^{-0.1t}) - 0.02(-10)
                               = 0.02t - 0.2e^{-0.1t} + 0.2.

Hence

                    x = x_0 e^{0.02t - 0.2e^{-0.1t} + 0.2}.

The graph of x when x_0 = 50 is shown in Figure 8.3.1. You should compare this with the
plot of an approximate solution in Figure 8.1.2.

           Figure 8.3.1 A solution of \dot{x} = 0.02(1 + e^{-0.1t})x
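The comparison between the closed form and a numerical approximation can also be carried out directly. The Python sketch below evaluates the closed form solution with x_0 = 50 and compares it at t = 100 with a second order Runge-Kutta approximation. The particular second order scheme used here (the improved Euler, or Heun, form) and the step size h = 0.1 are assumptions made for this illustration; they are not necessarily the ones used to produce Figure 8.1.2.

```python
import math


def x_closed(t, x0=50.0):
    """Closed form x(t) = x0 * exp(0.02 t - 0.2 e^{-0.1 t} + 0.2)."""
    return x0 * math.exp(0.02 * t - 0.2 * math.exp(-0.1 * t) + 0.2)


def rk2(f, x0, t0, t_end, h):
    """Second order Runge-Kutta (improved Euler form) for x' = f(x, t).

    Returns the approximate value of x at t_end.
    """
    n = round((t_end - t0) / h)
    x = x0
    for i in range(n):
        t = t0 + i * h
        k1 = f(x, t)                 # slope at the start of the step
        k2 = f(x + h * k1, t + h)    # slope at the Euler-predicted endpoint
        x += 0.5 * h * (k1 + k2)     # advance using the average slope
    return x


def growth(x, t):
    """Right-hand side of x' = 0.02 (1 + e^{-0.1 t}) x."""
    return 0.02 * (1.0 + math.exp(-0.1 * t)) * x


approx = rk2(growth, 50.0, 0.0, 100.0, 0.1)
print(approx, x_closed(100.0))
```

The two printed values should agree to several significant digits, which gives some confidence in both the closed form and the numerical scheme.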
Example A small reservoir holds 10,000 cubic feet of water. Water flows in at a rate
of 100 cubic feet per hour and out at the same rate. Suppose initially the water in the
reservoir contains 5 grams of salt per cubic foot, but the water flowing in contains 10 grams
of salt per cubic foot. Let x(t) be the amount of salt in the reservoir after t hours. Note
that salt is entering the reservoir at a rate of 1000 grams per hour. Assuming the water in
the reservoir is well-mixed at all times, the concentration of salt in the reservoir at time t
is

                                     \frac{x(t)}{10,000}

grams per cubic foot, from which it follows that salt is leaving the reservoir at a rate of

                    100 \cdot \frac{x(t)}{10,000} = \frac{x(t)}{100}

grams per hour. Thus the rate of change of salt in the reservoir is given by

                                     \dot{x} = 1000 - 0.01x.

That is, x satisfies a linear differential equation with p(t) = -0.01 and q(t) = 1000. Then

                    P(t) = -\int_{0}^{t} 0.01 ds = -0.01t,

and so, using x(0) = (5)(10,000) = 50,000 grams, we have

          x = e^{-0.01t}\left(\int_{0}^{t} 1000e^{0.01u} du + 50,000\right)

            = e^{-0.01t}\left(100,000e^{0.01u} \Big|_{0}^{t} + 50,000\right)

            = e^{-0.01t}(100,000e^{0.01t} - 100,000 + 50,000)

            = 100,000 - 50,000e^{-0.01t}.

In particular, note that

   \lim_{t \to \infty} x(t) = \lim_{t \to \infty} \left(100,000 - 50,000e^{-0.01t}\right) = 100,000,

and we see that over time the concentration of salt in the reservoir will approach the
concentration of salt in its intake water. The graph of x is shown in Figure 8.3.2.

               Figure 8.3.2 Graph of x = 100,000 - 50,000e^{-0.01t}
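Both claims can be verified in a line or two. The computation below, supplied here as a check, differentiates x = 100,000 - 50,000e^{-0.01t}, compares the result with 1000 - 0.01x, and converts the limiting amount of salt into a concentration.

```latex
\dot{x} = \frac{d}{dt}\left(100{,}000 - 50{,}000e^{-0.01t}\right) = 500e^{-0.01t},
\qquad
1000 - 0.01x = 1000 - 1000 + 500e^{-0.01t} = 500e^{-0.01t},

\lim_{t \to \infty} \frac{x(t)}{10{,}000} = \frac{100{,}000}{10{,}000} = 10 \text{ grams per cubic foot}.
```

The limiting concentration of 10 grams per cubic foot is exactly the concentration of the intake water.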


﻿




Problems

1. Solve each of the following linear differential equations.
    (a) \dot{x} = 3x + 2t, x(0) = 2                    (b) \dot{x} = 2x - t^2, x(0) = 1
    (c) \dot{y} = 0.4y + 3, y(0) = 5                   (d) \dot{w} = -w + e^{-2t}, w(0) = 3
    (e) \dot{x} = \frac{2x}{t} + t^2, x(1) = 4         (f) \dot{y} = -y + 2e^{-t} + t^2, y(0) = 1
 2. In 1990 the population of Botswana was 1.2 million. If x(t) is the population of
    Botswana t years after 1990, suppose x satisfies the differential equation

                                     \dot{x} = k(t)x,

    where k(t) represents the rate of growth of the population at time t. At the start of
    1990 the population of Botswana was growing at the rate of 2.9%, so k(0) = 0.029.
    (a) Suppose the rate of growth of the population is decreasing toward 1.5% in such a
        way that
                                 k(t) = 0.015(1 + 0.93e^{-0.04t}).

        Solve for x.
    (b) Compare your result in (a) with the result for a constant rate of growth of k(t)=
        0.029 by plotting both solutions together.
 3. Suppose the population of a certain country was 56 million in 2000 and the natural rate
    of the growth of the population was 2% per year. Moreover, suppose k(t) represents
    the net rate of growth of the population due to immigration and emigration t years
    after 2000.
    (a) Let y(t) be the population of the country t years after 2000. Explain why y should
        satisfy the differential equation

                                 \frac{dy}{dt} = 0.02y + k(t),

        with y(0) = 56.
    (b) Solve the equation if k(t) = 0.04t. Plot your results.
    (c) What does this model predict for the population of the country in the year 2010?
    (d) When will the population of the country reach 100 million?
    (e) Compare your results with the numerical results obtained for the same problem in
        Problem 9 of Section 8.1.
 4. A 500 gallon tank is initially filled with water with a concentration of 4 grams of salt
    per gallon. Water flows into the tank at the rate of 10 gallons per minute and is drawn
    off at the same rate. The concentration of salt in the intake water is 2 grams per
    gallon. Assume that the water in the tank is well-mixed at all times.


﻿




   (a) Let x(t) be the amount of salt in the tank at time t. Find a linear differential
       equation for x which models this situation.
   (b) Solve the equation from (a) and graph the solution. What happens to x as t
       increases?
5. Suppose a tank holds V liters of a liquid which contains a certain chemical at a
   concentration of k_0 grams per liter. Liquid flows into the tank at a rate of q liters per
   second and is drawn off at the same rate. The concentration of the chemical in the intake
   liquid is k grams per liter.
   (a) If x(t) is the amount of the chemical in the tank at time t, show that x satisfies
       the linear differential equation

                                 \dot{x} = qk - \frac{q}{V}x

       with x(0) = k_0 V.
   (b) Solve the equation in (a). What happens to x as t \to \infty?
6. An equation of the form

                                 \dot{x} = p(t)x + q(t)x^n                      (8.3.12)

   is called a Bernoulli equation. Note that the equation is linear if either n = 0 or n = 1.
   (a) Assume n \neq 0 and n \neq 1. Show that the substitution w = x^{1-n} in (8.3.12) results
       in the linear differential equation

                                 \dot{w} = (1 - n)p(t)w + (1 - n)q(t).

   (b) Use the result of (a) to solve the equation

                                                 2
                                            X~--x
                                              t

       with x(1) = 1.
   (c) Use the result of (a) to solve the equation

                                         x =   (1 - x)

       with x(0) =0.5.
7. Note that (c) of Problem 6 is a particular example of the logistic differential equation
   that we studied in Section 6.3 in our discussion of the inhibited population growth
   model. In general, we considered the logistic equation

                              $\dot{x} = \alpha x \, \frac{M - x}{M}$

   with $x(0) = x_0$, where x(t) is the size of the population at time t, $\alpha$ is the natural
   growth rate of the population, and M is the maximum size of the population that is
   sustainable in the given environment. Write this equation in the form of a Bernoulli
   equation and use the result from (a) of Problem 6 to show that

                              $x(t) = \frac{M}{1 + \beta e^{-\alpha t}}$,

   where

                              $\beta = \frac{M - x_0}{x_0}$.


﻿


Section 8.4


                to                     Second Order Linear
       Differential Equations          Differential Equations


To this point we have considered only first order differential equations. However,
many of the most interesting differential equations involve second derivatives. Indeed,
since acceleration is the second derivative of position, Newton's second law of motion,
F = ma, is a second order differential equation. In general, if f is a known function of
three variables, then the equation
                              $\ddot{x} = f(\dot{x}, x, t)$                            (8.4.1)

is called a second order differential equation. If we let $y = \dot{x}$, then (8.4.1) may be written
as a pair of first order differential equations

                              $\dot{x} = y$
                              $\dot{y} = f(y, x, t)$.                                   (8.4.2)

Hence moving from the study of first order differential equations to the study of second
order differential equations is analogous to moving from the study of one algebraic equation
in one unknown to the study of two algebraic equations in two unknowns. We will make
use of this fact when we consider numerical approximations to solutions of second order
equations in Section 8.6.
    As was the case with first order equations, the existence of a closed form solution to a
second order differential equation and our ability to find one when it exists depends very
much on the form of the function f in (8.4.1). We shall consider closed form solutions
for only one class of such equations, leaving other equations for either the numerical ap-
proximations of Section 8.6 or the infinite series techniques of Section 8.7. Here we are
concerned with equations of the form

                              $\ddot{x} + b\dot{x} + cx = 0$,                           (8.4.3)

which we call a second order homogeneous linear differential equation with constant
coefficients, corresponding to
                              $f(\dot{x}, x, t) = -b\dot{x} - cx$

in (8.4.1). The term homogeneous refers to the fact that the function x(t) = 0 for all t is
a solution of the equation and the phrase constant coefficients refers to the fact that b and
c are assumed to be constants.
    To begin our study of these equations, suppose $x_1(t)$ and $x_2(t)$ are both solutions of
(8.4.3) and let $x(t) = c_1 x_1(t) + c_2 x_2(t)$ for constants $c_1$ and $c_2$. Then


Copyright © by Dan Sloughter 2000


     $\ddot{x} + b\dot{x} + cx = (c_1\ddot{x}_1 + c_2\ddot{x}_2) + b(c_1\dot{x}_1 + c_2\dot{x}_2) + c(c_1 x_1 + c_2 x_2)$
                 $= c_1(\ddot{x}_1 + b\dot{x}_1 + c x_1) + c_2(\ddot{x}_2 + b\dot{x}_2 + c x_2)$
                 $= (c_1)(0) + (c_2)(0) = 0$.
That is, x is also a solution of (8.4.3).
Proposition     If $x_1$ and $x_2$ are both solutions of the equation

                              $\ddot{x} + b\dot{x} + cx = 0$,

then $x = c_1 x_1 + c_2 x_2$ is also a solution of this equation for any constants $c_1$ and $c_2$.
    The next proposition is key to our method of solving equations of the form (8.4.3),
although we will leave its justification to a more advanced course. First we introduce a
definition which will make the proposition, as well as our later results, easier to state.
Definition If f and g are functions for which neither one is a constant multiple of the
other, then we say f and g are linearly independent.
Proposition Suppose $x_1$ and $x_2$ are linearly independent solutions of the equation

                              $\ddot{x} + b\dot{x} + cx = 0$.

Then for any solution x, there exist constants $c_1$ and $c_2$ such that

                              $x = c_1 x_1 + c_2 x_2$.                              (8.4.4)

    The equation
                              $\ddot{x} + b\dot{x} + cx = 0$                            (8.4.5)
will have a unique solution only when we place some restrictions on x. For example, if we
specify initial conditions for both x and $\dot{x}$, say, $x(t_0) = x_0$ and $\dot{x}(t_0) = y_0$, then (8.4.5) will
have a unique solution which satisfies these conditions. This statement is far from obvious,
but should appear reasonable in light of the observation, made above, that we could write
this equation as a pair of first order equations

                              $\dot{x} = y$                                         (8.4.6)
                              $\dot{y} = -by - cx$.

Hence our method of attack in solving (8.4.5) will be to first find two linearly independent
solutions, say, $x_1$ and $x_2$, and then find values for constants $c_1$ and $c_2$ such that
$x = c_1 x_1 + c_2 x_2$ satisfies the given initial conditions.
    To find two linearly independent solutions of (8.4.5), we begin with the observation
that if x satisfies this equation, then $\ddot{x}$ is equal to a sum of constant multiples of x and
$\dot{x}$. Hence it would be reasonable to begin with $x = e^{kt}$, for some constant k, as an initial
guess. In that case,
                              $\dot{x} = ke^{kt}$
and
                              $\ddot{x} = k^2 e^{kt}$,

so x will be a solution of (8.4.5) if and only if

                 $k^2 e^{kt} + bke^{kt} + ce^{kt} = e^{kt}(k^2 + bk + c) = 0$                (8.4.7)

for all t. Since $e^{kt} \ne 0$ for all t, this will happen if and only if

                              $k^2 + bk + c = 0$.                              (8.4.8)

Hence $x = e^{kt}$ is a solution to (8.4.5) if and only if k is a root of (8.4.8).
Definition The equation
                              $k^2 + bk + c = 0$                               (8.4.9)

is called the characteristic equation of the differential equation

                              $\ddot{x} + b\dot{x} + cx = 0$.

    Since the characteristic equation is quadratic in k, its roots are given by the quadratic
formula, namely,
                              $k_1 = \frac{-b - \sqrt{b^2 - 4c}}{2}$                     (8.4.10)
and
                              $k_2 = \frac{-b + \sqrt{b^2 - 4c}}{2}$.                    (8.4.11)

At this point, our search for solutions breaks into three cases, depending on whether $k_1$
and $k_2$ are (1) distinct real numbers (that is, $b^2 - 4c > 0$), (2) distinct complex numbers
(that is, $b^2 - 4c < 0$), or (3) real, but equal (that is, $b^2 - 4c = 0$).
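
The three cases can be told apart by the sign of the discriminant $b^2 - 4c$. The following is a minimal Python sketch of that case analysis; the function name characteristic_roots and the case labels it returns are our own choices for illustration, not notation from the text.

```python
import cmath

def characteristic_roots(b, c):
    """Roots of k**2 + b*k + c = 0 together with the case they fall into.

    Hypothetical helper for illustration. The roots are returned as
    complex numbers so that all three cases share one code path.
    """
    disc = b * b - 4 * c
    if disc > 0:
        case = "distinct real"
    elif disc < 0:
        case = "complex"
    else:
        case = "repeated real"
    root = cmath.sqrt(disc)      # purely imaginary when disc < 0
    k1 = (-b - root) / 2         # matches (8.4.10)
    k2 = (-b + root) / 2         # matches (8.4.11)
    return k1, k2, case
```

With exact integer coefficients the test disc == 0 is reliable; with floating-point coefficients one would probably want a tolerance instead.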

Case 1: Distinct real roots
Suppose $k_1$ and $k_2$ are distinct real roots of the characteristic equation. In that case,
$x_1 = e^{k_1 t}$ and $x_2 = e^{k_2 t}$ are linearly independent solutions of

                              $\ddot{x} + b\dot{x} + cx = 0$

and all that remains is to find constants $c_1$ and $c_2$ such that

                              $x = c_1 e^{k_1 t} + c_2 e^{k_2 t}$                         (8.4.12)

satisfies the given initial conditions.

Example Consider the equation

                              $\ddot{x} + \dot{x} - 6x = 0$

with initial conditions x(0) = 0 and $\dot{x}(0) = 1$. For the characteristic equation we have

                              $0 = k^2 + k - 6 = (k + 3)(k - 2)$.

Hence the roots of the characteristic equation are $k_1 = -3$ and $k_2 = 2$. Thus we must have

                              $x = c_1 e^{-3t} + c_2 e^{2t}$

for some constants $c_1$ and $c_2$. Now

                              $\dot{x} = -3c_1 e^{-3t} + 2c_2 e^{2t}$,

so
                              $x(0) = c_1 + c_2$
and
                              $\dot{x}(0) = -3c_1 + 2c_2$.

Hence the initial conditions imply that

                              $c_1 + c_2 = 0$
                              $-3c_1 + 2c_2 = 1$.

The first equation implies $c_1 = -c_2$. Substituting into the second equation, we have

                              $1 = 3c_2 + 2c_2 = 5c_2$.

Hence
                              $c_2 = \frac{1}{5}$
and
                              $c_1 = -\frac{1}{5}$.

Thus
                              $x = -\frac{1}{5} e^{-3t} + \frac{1}{5} e^{2t}$.

The graph of x is shown in Figure 8.4.1.

Figure 8.4.1 Solution of $\ddot{x} + \dot{x} - 6x = 0$ with x(0) = 0 and $\dot{x}(0) = 1$

﻿




Case 2: Complex roots
Suppose $k_1$ and $k_2$ are distinct complex roots of the characteristic equation. As before,
$e^{k_1 t}$ and $e^{k_2 t}$ are linearly independent solutions of

                              $\ddot{x} + b\dot{x} + cx = 0$.

However, these are complex-valued functions and for most applications we are looking for
real-valued solutions. Now if we let

                              $p = -\frac{b}{2}$                                        (8.4.13)
and
                              $q = \frac{\sqrt{4c - b^2}}{2}$,                           (8.4.14)

then $k_1 = p - qi$ and $k_2 = p + qi$. Hence

          $e^{k_1 t} = e^{(p - qi)t} = e^{pt}e^{-iqt} = e^{pt}(\cos(qt) - i\sin(qt))$          (8.4.15)

and
          $e^{k_2 t} = e^{(p + qi)t} = e^{pt}e^{iqt} = e^{pt}(\cos(qt) + i\sin(qt))$.          (8.4.16)

Since these are both solutions, we know that

          $x_1 = \frac{1}{2}e^{k_1 t} + \frac{1}{2}e^{k_2 t} = e^{pt}\cos(qt)$                  (8.4.17)

and
          $x_2 = \frac{1}{2i}e^{k_2 t} - \frac{1}{2i}e^{k_1 t} = e^{pt}\sin(qt)$                (8.4.18)

are also solutions. Then $x_1$ and $x_2$ are linearly independent real-valued solutions, so any
real-valued solution must be of the form

          $x = c_1 x_1 + c_2 x_2 = e^{pt}(c_1\cos(qt) + c_2\sin(qt))$                        (8.4.19)

for some constants $c_1$ and $c_2$.
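
The passage from the complex exponentials to the real solutions in (8.4.17) and (8.4.18) can be checked numerically. The following sketch, which assumes only Python's standard cmath and math modules, forms the two combinations for given p, q, and t; the function name real_solutions_from_complex is hypothetical.

```python
import cmath
import math

def real_solutions_from_complex(p, q, t):
    """Form the combinations in (8.4.17) and (8.4.18) at time t.

    Returns (x1, x2) as complex numbers; their imaginary parts should
    vanish up to rounding, leaving e^{pt}cos(qt) and e^{pt}sin(qt).
    """
    k1 = complex(p, -q)
    k2 = complex(p, q)
    x1 = (cmath.exp(k1 * t) + cmath.exp(k2 * t)) / 2
    x2 = (cmath.exp(k2 * t) - cmath.exp(k1 * t)) / 2j
    return x1, x2
```

For instance, with p = -1 and q = 2 the two values agree, up to rounding, with math.exp(-t) * math.cos(2 * t) and math.exp(-t) * math.sin(2 * t).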

Example Consider the equation

                              $\ddot{x} + 2\dot{x} + 5x = 0$

with initial conditions x(0) = 2 and $\dot{x}(0) = 0$. The characteristic equation is

                              $k^2 + 2k + 5 = 0$,

which has roots

                              $k = \frac{-2 \pm \sqrt{4 - 20}}{2} = -1 \pm 2i$.

Hence, by (8.4.19), we must have

                              $x = e^{-t}(c_1\cos(2t) + c_2\sin(2t))$

for some constants $c_1$ and $c_2$. Now

          $\dot{x} = e^{-t}(-2c_1\sin(2t) + 2c_2\cos(2t)) - e^{-t}(c_1\cos(2t) + c_2\sin(2t))$,

so
                              $x(0) = c_1$

and
                              $\dot{x}(0) = 2c_2 - c_1$.

Hence the initial conditions x(0) = 2 and $\dot{x}(0) = 0$ imply that $c_1 = 2$ and

                              $0 = 2c_2 - 2$.

Thus $c_2 = 1$ and we have
                              $x = e^{-t}(2\cos(2t) + \sin(2t))$.

The graph of x is shown in Figure 8.4.2.

Figure 8.4.2 Solution of $\ddot{x} + 2\dot{x} + 5x = 0$ with x(0) = 2 and $\dot{x}(0) = 0$

Case 3: Single real root
Suppose the characteristic equation has a single real root. In this case,

                              $k_1 = k_2 = -\frac{b}{2}$.                               (8.4.20)

For simplicity, let us call this common value k. Then $x_1 = e^{kt}$ is a solution of the equation

                              $\ddot{x} + b\dot{x} + cx = 0$,

but in order to specify all possible solutions we need to find another solution which is
linearly independent of $x_1$. We will show that, in this case, $x_2 = te^{kt}$ is such a solution.
Now
                    $\dot{x}_2 = kte^{kt} + e^{kt} = (1 + kt)e^{kt}$                      (8.4.21)
and
                    $\ddot{x}_2 = (1 + kt)ke^{kt} + ke^{kt} = (2k + k^2 t)e^{kt}$.          (8.4.22)
Hence, remembering that k is a root of the characteristic equation (that is, $k^2 + bk + c = 0$)
and $k = -\frac{b}{2}$, we have

          $\ddot{x}_2 + b\dot{x}_2 + cx_2 = (2k + k^2 t)e^{kt} + b(1 + kt)e^{kt} + cte^{kt}$
                         $= e^{kt}(2k + k^2 t + b + bkt + ct)$
                         $= e^{kt}((k^2 + bk + c)t + 2k + b)$
                         $= e^{kt}(2k + b)$
                         $= e^{kt}\left(-\frac{2b}{2} + b\right)$
                         $= 0$

for all t. Hence $x_2$ is another solution, clearly linearly independent of $x_1$. Thus for any
solution x, there exist constants $c_1$ and $c_2$ such that

                    $x = c_1 x_1 + c_2 x_2 = c_1 e^{kt} + c_2 te^{kt}$.                   (8.4.23)


Example Consider the equation

                              $\ddot{x} + 2\dot{x} + x = 0$

with initial conditions x(0) = 10 and $\dot{x}(0) = -20$. The characteristic equation is

                              $0 = k^2 + 2k + 1 = (k + 1)^2$,

which has the single root k = -1. Hence

                              $x = c_1 e^{-t} + c_2 te^{-t}$

for some constants $c_1$ and $c_2$. Now

                              $\dot{x} = -c_1 e^{-t} + c_2 e^{-t} - c_2 te^{-t}$,

so $x(0) = c_1$ and $\dot{x}(0) = -c_1 + c_2$. Hence the initial conditions x(0) = 10 and $\dot{x}(0) = -20$
imply that $c_1 = 10$ and
                              $-20 = -10 + c_2$.

Thus $c_2 = -10$ and we have

                              $x = 10e^{-t} - 10te^{-t} = 10(1 - t)e^{-t}$.

The graph of x is shown in Figure 8.4.3.

Figure 8.4.3 Solution of $\ddot{x} + 2\dot{x} + x = 0$ with x(0) = 10 and $\dot{x}(0) = -20$

Summary
If $x_1$ and $x_2$ are linearly independent solutions of the equation

                              $\ddot{x} + b\dot{x} + cx = 0$,                           (8.4.24)

then any solution of (8.4.24) is of the form $x = c_1 x_1 + c_2 x_2$ for some constants $c_1$ and $c_2$.
The family of all solutions $x = c_1 x_1 + c_2 x_2$ is called the general solution of (8.4.24). A
solution with specified values for $c_1$ and $c_2$ is called a particular solution.
    Let $k_1$ and $k_2$ be the roots of the characteristic equation

                              $k^2 + bk + c = 0$.                            (8.4.25)

If $k_1$ and $k_2$ are real numbers with $k_1 \ne k_2$, then the general solution of (8.4.24) is

                              $x = c_1 e^{k_1 t} + c_2 e^{k_2 t}$.                        (8.4.26)

If $k_1$ and $k_2$ are complex numbers with $k_1 = p - qi$ and $k_2 = p + qi$, then the general
solution of (8.4.24) is
                              $x = e^{pt}(c_1\cos(qt) + c_2\sin(qt))$.                  (8.4.27)

Finally, if $k = k_1 = k_2$, then the general solution of (8.4.24) is

                              $x = c_1 e^{kt} + c_2 te^{kt}$.                            (8.4.28)

    In the next section we will discuss the motion of a pendulum and the motion of a mass
vibrating at the end of a spring as applications of the equations considered in this section.
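
The summary translates fairly directly into a small routine that returns the particular solution for given initial conditions at t = 0. The sketch below is one possible arrangement, not a definitive implementation; the function name solve_closed_form and the arguments x0 and v0 (standing for x(0) and the initial value of the derivative) are hypothetical, and the constants are found by solving the two linear equations that the initial conditions impose in each case.

```python
import math

def solve_closed_form(b, c, x0, v0):
    """Return a function t -> x(t) solving x'' + b x' + c x = 0
    with x(0) = x0 and x'(0) = v0, following the three cases of the summary.
    """
    disc = b * b - 4 * c
    if disc > 0:
        # Distinct real roots: x = c1 e^{k1 t} + c2 e^{k2 t}
        k1 = (-b - math.sqrt(disc)) / 2
        k2 = (-b + math.sqrt(disc)) / 2
        # Solve c1 + c2 = x0 and k1 c1 + k2 c2 = v0.
        c2 = (v0 - k1 * x0) / (k2 - k1)
        c1 = x0 - c2
        return lambda t: c1 * math.exp(k1 * t) + c2 * math.exp(k2 * t)
    if disc < 0:
        # Complex roots p +/- qi: x = e^{pt}(c1 cos(qt) + c2 sin(qt))
        p = -b / 2
        q = math.sqrt(-disc) / 2
        # x(0) = c1 and x'(0) = p c1 + q c2.
        c1 = x0
        c2 = (v0 - p * x0) / q
        return lambda t: math.exp(p * t) * (c1 * math.cos(q * t) + c2 * math.sin(q * t))
    # Single real root k: x = c1 e^{kt} + c2 t e^{kt}
    k = -b / 2
    # x(0) = c1 and x'(0) = k c1 + c2.
    c1 = x0
    c2 = v0 - k * x0
    return lambda t: (c1 + c2 * t) * math.exp(k * t)
```

Called with the data of the three examples in this section, namely (1, -6, 0, 1), (2, 5, 2, 0), and (2, 1, 10, -20), the returned functions agree with the solutions found by hand. The exact comparison of the discriminant with zero is adequate for coefficients like these but would need a tolerance for general floating-point input.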


﻿




Problems

1. Solve each of the following differential equations and plot the solution.
   (a) $\ddot{x} + \dot{x} - 2x = 0$, x(0) = 0, $\dot{x}(0) = 2$
   (b) $\ddot{x} = -x$, x(0) = 10, $\dot{x}(0) = 5$
   (c) $\ddot{x} + 3\dot{x} + 2x = 0$, x(0) = 1, $\dot{x}(0) = 0$
   (d) $\ddot{x} - 4\dot{x} + 4x = 0$, x(0) = 5, $\dot{x}(0) = 0$
   (e) $\ddot{x} - 2\dot{x} + 2x = 0$, x(0) = 10, $\dot{x}(0) = 4$
   (f) $\ddot{x} + 2\dot{x} - 4x = 0$, x(0) = 1, $\dot{x}(0) = 0$
   (g) $\ddot{x} + 4\dot{x} + 20x = 0$, x(0) = 0, $\dot{x}(0) = 3$
   (h) $2\ddot{x} + 3\dot{x} - 2x = 0$, x(0) = 0, $\dot{x}(0) = -2$
   (i) $\ddot{x} + 6\dot{x} + 9x = 0$, x(0) = -6, $\dot{x}(0) = 4$
2. Consider the equation $\ddot{x} + 2\dot{x} - 3x = 0$.
   (a) If $\dot{x}(0) = 1$, plot the solutions for x(0) = 0, x(0) = -5, and x(0) = 5. How do
       these solutions compare?
   (b) If x(0) = 0, plot the solutions for $\dot{x}(0) = 0$, $\dot{x}(0) = -2$, and $\dot{x}(0) = 2$. How do
       these solutions compare?
3. Consider the equation $\ddot{x} + 2\dot{x} + 10x = 0$.
   (a) If $\dot{x}(0) = 1$, plot the solutions for x(0) = 0, x(0) = -10, and x(0) = 10. How do
       these solutions compare?
   (b) If x(0) = 10, plot the solutions for $\dot{x}(0) = 0$, $\dot{x}(0) = -5$, and $\dot{x}(0) = 5$. How do
       these solutions compare?
4. Consider the equation $\ddot{x} + 4\dot{x} + 4x = 0$.
   (a) If $\dot{x}(0) = -15$, plot the solutions for x(0) = 0, x(0) = -5, and x(0) = 5. How do
       these solutions compare?
   (b) If x(0) = 10, plot the solutions for $\dot{x}(0) = 0$, $\dot{x}(0) = -20$, and $\dot{x}(0) = 20$. How do
       these solutions compare?
5. The techniques developed in this section may be used to solve higher order homogeneous
   linear differential equations with constant coefficients. Generalize the methods
   of this section to find the general solution for each of the following equations.
   (a) $\frac{d^3x}{dt^3} + 2\frac{d^2x}{dt^2} - \frac{dx}{dt} - 2x = 0$
   (b) $\frac{d^3x}{dt^3} + 3\frac{d^2x}{dt^2} + 3\frac{dx}{dt} + x = 0$
6. Show that if b and c are both positive and x is a solution of

                              $\ddot{x} + b\dot{x} + cx = 0$,

   then $\lim_{t \to \infty} x(t) = 0$.


﻿


Section 8.5


               to                    Applications: Pendulums
       Differential Equations        and Mass-Spring Systems


In this section we will investigate two applications of our work in Section 8.4. First, we
will consider the motion of a pendulum, a problem originally mentioned in Section 2.2
in connection with the trigonometric functions. Second, we will discuss the motion of an
object vibrating at the end of a spring.

The motion of a pendulum
Consider a pendulum consisting of a bob of mass m at the end of a rigid rod of length
b. We will assume that the mass of the rod is negligible in comparison with the mass of
the bob. Let x(t) be the angle between the rod and the vertical at time t, with x(t) > 0
for angles measured in the counterclockwise direction and x(t) < 0 for angles measured
in the clockwise direction. See Figure 8.5.1. Suppose the bob is pulled through an angle
a and then released. That is, suppose our initial conditions are x(0) = a and $\dot{x}(0) = 0$.
If we view the motion of the pendulum in the complex plane, with the real axis vertical,
positive direction downward, and the imaginary axis horizontal, positive direction to the
right, then the position of the bob at time t is given by

                              $z(t) = be^{ix(t)}$.                                 (8.5.1)

Figure 8.5.1 A pendulum

Then we have

                              $\dot{z} = ib\dot{x}e^{ix}$                                  (8.5.2)

and

     $\ddot{z} = -b\dot{x}^2 e^{ix} + ib\ddot{x}e^{ix}$
        $= -b\dot{x}^2(\cos(x) + i\sin(x)) + ib\ddot{x}(\cos(x) + i\sin(x))$
        $= (-b\dot{x}^2\cos(x) - b\ddot{x}\sin(x)) + i(-b\dot{x}^2\sin(x) + b\ddot{x}\cos(x))$.       (8.5.3)




Now $\ddot{z}$ is the acceleration of the pendulum, and so $m\ddot{z}$ must be equal to the force of gravity
acting on the bob, namely, a force of magnitude mg acting in the downward direction, the
direction of the positive real axis. Hence we must have $g = \ddot{z}$, that is,

     $g = (-b\dot{x}^2\cos(x) - b\ddot{x}\sin(x)) + i(-b\dot{x}^2\sin(x) + b\ddot{x}\cos(x))$.          (8.5.4)

Equating the real and imaginary parts of the two sides of (8.5.4) gives us

                    $g = -b\dot{x}^2\cos(x) - b\ddot{x}\sin(x)$                           (8.5.5)

and
                    $0 = -b\dot{x}^2\sin(x) + b\ddot{x}\cos(x)$.                          (8.5.6)

Multiplying (8.5.5) by $-\sin(x)$ and (8.5.6) by $\cos(x)$ gives us

                    $-g\sin(x) = b\dot{x}^2\cos(x)\sin(x) + b\ddot{x}\sin^2(x)$             (8.5.7)

and
                    $0 = -b\dot{x}^2\sin(x)\cos(x) + b\ddot{x}\cos^2(x)$.                   (8.5.8)

Adding (8.5.7) and (8.5.8) together yields

                    $-g\sin(x) = b\ddot{x}(\sin^2(x) + \cos^2(x)) = b\ddot{x}$.             (8.5.9)

Thus
                    $\ddot{x} = -\frac{g}{b}\sin(x)$.                                  (8.5.10)

    So we have reduced the problem of describing the motion of the pendulum to the
problem of solving the second order differential equation (8.5.10) subject to the initial
conditions x(0) = a and $\dot{x}(0) = 0$. Unfortunately, this equation is not linear. In fact, it is
not possible to find a closed form solution for this equation. In Section 8.6 we will discuss
how to study this equation using numerical approximations, but for now we will take a
different approach to finding an approximate solution. Since we know

                              $\sin(x) = x + o(x)$                               (8.5.11)

from our work on best affine approximations in Chapter 2, it is reasonable to replace sin(x)
by x for small values of x. Hence, if we restrict to the case where a is small, we may replace
(8.5.10) by the linear equation
                              $\ddot{x} = -\frac{g}{b} x$.                               (8.5.12)
Since this equation is homogeneous with constant coefficients, we may solve it using the
techniques of Section 8.4. Specifically, the characteristic equation for this equation is

                              $k^2 + \frac{g}{b} = 0$,                                 (8.5.13)


﻿


which has roots

                              $k_1 = -i\sqrt{\frac{g}{b}}$                               (8.5.14)

and

                              $k_2 = i\sqrt{\frac{g}{b}}$.                               (8.5.15)

Hence the general solution is

          $x = c_1\cos\left(\sqrt{\frac{g}{b}}\,t\right) + c_2\sin\left(\sqrt{\frac{g}{b}}\,t\right)$.          (8.5.16)

Then
          $\dot{x} = -c_1\sqrt{\frac{g}{b}}\sin\left(\sqrt{\frac{g}{b}}\,t\right) + c_2\sqrt{\frac{g}{b}}\cos\left(\sqrt{\frac{g}{b}}\,t\right)$,          (8.5.17)

and so $x(0) = c_1$ and $\dot{x}(0) = c_2\sqrt{\frac{g}{b}}$. Hence the initial conditions x(0) = a and $\dot{x}(0) = 0$
imply $c_1 = a$ and $c_2 = 0$. Thus

                              $x = a\cos\left(\sqrt{\frac{g}{b}}\,t\right)$.                          (8.5.18)

The graph of x for the case b = 1 meter and a = 0.1 radians, in which case we use g = 9.8
meters per second per second, is shown in Figure 8.5.2.

Figure 8.5.2 Motion of a pendulum

    One consequence of (8.5.18) is that the period of the motion, that is, the time it takes
the bob to make one complete oscillation, is

                              $\frac{2\pi}{\sqrt{g/b}} = 2\pi\sqrt{\frac{b}{g}}$,                        (8.5.19)

independent of the value of a. Of course, we are working under the approximation
$\sin(x) \approx x$, so (8.5.19) is actually only an approximation of the period. Nevertheless,
the approximation is very good for small oscillations and is the reason pendulums were
used to measure time in early clocks.
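
As a small numerical illustration of (8.5.18) and (8.5.19), the following Python sketch evaluates the small-angle period and position. The function names and the default value g = 9.8 are assumptions made for this illustration; both functions inherit the small-angle approximation and say nothing about large swings.

```python
import math

def small_angle_period(length, g=9.8):
    """Approximate period 2*pi*sqrt(b/g) from (8.5.19); length plays the role of b."""
    return 2 * math.pi * math.sqrt(length / g)

def small_angle_position(t, a, length, g=9.8):
    """Approximate angle a*cos(sqrt(g/b) t) from (8.5.18)."""
    return a * math.cos(math.sqrt(g / length) * t)
```

For a rod of length 1 meter, small_angle_period(1.0) is about 2.007 seconds, and evaluating small_angle_position one period after release returns the starting angle a, as expected for periodic motion.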

Vibrations in mechanical systems: mass-spring systems
In this example we consider the motion of an object of mass m suspended on a spring, as
shown in Figure 8.5.3. We will measure the position of the object along a vertical axis,
with the equilibrium position at 0 and the positive direction downward. Let x(t) denote
the position of the object at time t and suppose the object is released from rest at position
$x_0$. That is, we suppose that $x(0) = x_0$ and $\dot{x}(0) = 0$. If we ignore any damping forces,
such as resistance to the motion due to a surrounding medium like air or oil, then
the only forces acting on the object are the force of gravity, contributing a term of mg, and
the restorative force of the spring, given, according to Hooke's law, by $k\ell$ for some constant
k > 0, where $\ell$ is the amount the spring is stretched or compressed from its natural length.
If we let $\Delta\ell$ be the amount the spring is stretched when the object is at the equilibrium
position, that is, when x = 0, then at any time the spring is stretched or compressed by
$x + \Delta\ell$. Thus at any time t the force acting on the object is

                              $F = mg - k(x + \Delta\ell)$.                          (8.5.20)

Figure 8.5.3 Mass on a spring at equilibrium

In particular, if the object is at rest at its equilibrium position, then both x = 0 and F = 0.
Hence
                              $0 = mg - k\Delta\ell$,                                (8.5.21)
and so
                              $mg = k\Delta\ell$.                                    (8.5.22)
Thus (8.5.20) simplifies to F = -kx. Applying Newton's second law of motion, we have

                              $m\ddot{x} = -kx$,                                   (8.5.23)

from which we obtain
                              $\ddot{x} = -\frac{k}{m} x$.                               (8.5.24)


﻿


This equation is of the same form as the equation derived above for approximating the
motion of a pendulum. Hence, using the same reasoning, the solution is

                              $x = x_0\cos\left(\sqrt{\frac{k}{m}}\,t\right)$.                       (8.5.25)

The graph of x for k = 10, m = 5, and $x_0 = 2$ is shown in Figure 8.5.4.

Figure 8.5.4 Motion of a mass-spring system without damping

    Notice that the period of the motion is

                              $T = \frac{2\pi}{\sqrt{k/m}} = 2\pi\sqrt{\frac{m}{k}}$.                  (8.5.26)

The frequency of the motion, that is, the number of complete oscillations in one unit of
time, is
                              $f = \frac{1}{T} = \frac{1}{2\pi}\sqrt{\frac{k}{m}}$.                    (8.5.27)
Hence for a fixed mass, increasing the spring constant, that is, increasing the stiffness of
the spring, decreases the period and increases the frequency; for a fixed spring constant,
increasing the mass increases the period and decreases the frequency.
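
The dependence of the period and frequency on k and m described above is easy to explore numerically. Here is a minimal sketch based on (8.5.26) and (8.5.27); the function name spring_period_and_frequency is our own, and units are whatever consistent system k and m are given in.

```python
import math

def spring_period_and_frequency(k, m):
    """Return (T, f) for the undamped mass-spring system.

    T = 2*pi*sqrt(m/k) as in (8.5.26) and f = 1/T as in (8.5.27).
    """
    period = 2 * math.pi * math.sqrt(m / k)
    frequency = 1 / period
    return period, frequency
```

For k = 10 and m = 5, the values used in Figure 8.5.4, the period is roughly 4.44 time units. Doubling k with m fixed shortens the period, and doubling m with k fixed lengthens it.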
    Now suppose there is a damping force, a force resisting the motion of the object, which
is proportional to the velocity. This adds an additional term of $-c\dot{x}$, where c is a positive
constant, to the force acting on the object, giving us $F = -kx - c\dot{x}$. Thus

                              $m\ddot{x} = -kx - c\dot{x}$,                               (8.5.28)

and so
                              $\ddot{x} + \frac{c}{m}\dot{x} + \frac{k}{m}x = 0$                        (8.5.29)


﻿


replaces (8.5.24) as the equation describing the motion of the object. To simplify the
notation, we will let

                              $b = \frac{c}{2m}$                                        (8.5.30)
and

                              $a = \sqrt{\frac{k}{m}}$.                                  (8.5.31)

Then our differential equation becomes

                              $\ddot{x} + 2b\dot{x} + a^2 x = 0$,

with characteristic equation (using s for the variable)

                              $s^2 + 2bs + a^2 = 0$.

Hence the roots of the characteristic equation are

          $s_1 = \frac{-2b - \sqrt{4b^2 - 4a^2}}{2} = -b - \sqrt{b^2 - a^2}$                 (8.5.32)

and

          $s_2 = \frac{-2b + \sqrt{4b^2 - 4a^2}}{2} = -b + \sqrt{b^2 - a^2}$.                (8.5.33)

Thus the behavior of the system depends on whether $b^2 - a^2 > 0$, $b^2 - a^2 = 0$, or $b^2 - a^2 < 0$.
Equivalently, since

                              $b^2 - a^2 = \frac{c^2}{4m^2} - \frac{k}{m}$,

the behavior of the system depends on whether $c^2 > 4mk$, $c^2 = 4mk$, or $c^2 < 4mk$. In the
first case the system is said to be overdamped, in the second it is critically damped, and in
the third it is underdamped.
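
The classification by the sign of $c^2 - 4mk$ can be sketched as follows. The function name damping_regime and the relative tolerance used to detect the critically damped case are assumptions for illustration; a tolerance is used because a value such as c = 10*sqrt(2) cannot be squared exactly in floating-point arithmetic.

```python
import math

def damping_regime(m, c, k, rel_tol=1e-9):
    """Classify the damped mass-spring system m x'' + c x' + k x = 0.

    Compares c**2 with 4*m*k; values equal up to rel_tol count as critical.
    """
    lhs = c * c
    rhs = 4 * m * k
    if math.isclose(lhs, rhs, rel_tol=rel_tol):
        return "critically damped"
    if lhs > rhs:
        return "overdamped"
    return "underdamped"
```

With k = 10 and m = 5, the choices c = 20, c = 10*math.sqrt(2), and c = 5 fall into the overdamped, critically damped, and underdamped cases respectively.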
    First consider the overdamped case $b^2 - a^2 > 0$. In this case the characteristic equation
has distinct real roots, so the general solution is

                              $x = c_1 e^{s_1 t} + c_2 e^{s_2 t}$.                        (8.5.34)

Now

                              $\dot{x} = c_1 s_1 e^{s_1 t} + c_2 s_2 e^{s_2 t}$,                  (8.5.35)

so $x(0) = c_1 + c_2$ and $\dot{x}(0) = c_1 s_1 + c_2 s_2$. Hence the initial conditions, $x(0) = x_0$ and
$\dot{x}(0) = 0$, give us

                              $x_0 = c_1 + c_2$

and

                              $0 = c_1 s_1 + c_2 s_2$.

Multiplying the first equation by $s_1$ and subtracting from the second gives us

                              $-x_0 s_1 = c_2(s_2 - s_1)$.

Hence
                              $c_2 = -\frac{x_0 s_1}{s_2 - s_1}$
and

          $c_1 = x_0 - c_2 = \frac{x_0(s_2 - s_1)}{s_2 - s_1} + \frac{x_0 s_1}{s_2 - s_1} = \frac{x_0 s_2}{s_2 - s_1}$.

Thus
                    $x = \frac{x_0}{s_2 - s_1}\left(s_2 e^{s_1 t} - s_1 e^{s_2 t}\right)$.                   (8.5.36)

Now b > 0 and $b > \sqrt{b^2 - a^2}$, so

                              $s_2 = -b + \sqrt{b^2 - a^2} < 0$.

Hence
                              $s_1 < s_2 < 0$.

It follows that $e^{s_2 t} > e^{s_1 t}$, $s_2 - s_1 > 0$, and

          $s_2 e^{s_1 t} - s_1 e^{s_2 t} > s_2 e^{s_2 t} - s_1 e^{s_2 t} = e^{s_2 t}(s_2 - s_1) > 0$          (8.5.37)

for all t > 0. Hence if $x_0 < 0$, then x(t) < 0 for all $t \ge 0$, and if $x_0 > 0$, then x(t) > 0 for
all $t \ge 0$. Combining this with

                              $\lim_{t \to \infty} x(t) = 0$,                            (8.5.38)

we see that in this case the system does not oscillate at all. After release, the object simply
returns to the equilibrium position. Figure 8.5.5 shows this behavior for k = 10, m = 5,
c = 20, and $x_0 = 2$.

Figure 8.5.5 Motion of an overdamped mass-spring system


﻿


    Next consider the case when $b^2 - a^2 = 0$. In this case the characteristic equation has
only one real root, $s_1 = s_2 = -b$, so the general solution is

                              $x = c_1 e^{-bt} + c_2 te^{-bt}$.                         (8.5.39)

Then
                    $\dot{x} = -bc_1 e^{-bt} - bc_2 te^{-bt} + c_2 e^{-bt}$,                   (8.5.40)

so $x(0) = c_1$ and $\dot{x}(0) = -bc_1 + c_2$. Hence the initial conditions, $x(0) = x_0$ and $\dot{x}(0) = 0$,
give us $c_1 = x_0$ and $c_2 = bx_0$. Thus

                    $x = x_0 e^{-bt} + bx_0 te^{-bt} = x_0 e^{-bt}(1 + bt)$.                 (8.5.41)

Equivalently, since $b = \frac{c}{2m}$,
                    $x = x_0 e^{-\frac{c}{2m}t}\left(1 + \frac{c}{2m}t\right)$.                      (8.5.42)

Now for any $t \ge 0$,
                              $1 + \frac{c}{2m}t > 0$.

Hence, as in the overdamped case, the system does not oscillate. Once released, the object
moves back to the equilibrium position without ever crossing it. Figure 8.5.6 shows this
behavior for k = 10, m = 5, $c = 10\sqrt{2}$, and $x_0 = 2$. This motion is said to be critically
damped because any increase in c results in overdamped motion, while any decrease in c
results in underdamped motion, which we consider next.

Figure 8.5.6 Motion of a critically damped mass-spring system
    Finally, consider the case when $b^2 - a^2 < 0$. The roots of the characteristic equation
are now

    \[ s_1 = -b - \sqrt{b^2 - a^2} = -b - i\sqrt{a^2 - b^2} \tag{8.5.43} \]

and

    \[ s_2 = -b + \sqrt{b^2 - a^2} = -b + i\sqrt{a^2 - b^2}. \tag{8.5.44} \]

If we let $\alpha = \sqrt{a^2 - b^2}$, then the general solution is

    \[ x = e^{-bt}(c_1 \cos(\alpha t) + c_2 \sin(\alpha t)). \tag{8.5.45} \]

Then

    \[ \dot{x} = e^{-bt}(-\alpha c_1 \sin(\alpha t) + \alpha c_2 \cos(\alpha t)) - b e^{-bt}(c_1 \cos(\alpha t) + c_2 \sin(\alpha t)), \tag{8.5.46} \]

so $x(0) = c_1$ and $\dot{x}(0) = \alpha c_2 - b c_1$. Hence the initial conditions, $x(0) = x_0$ and $\dot{x}(0) = 0$,
imply that $c_1 = x_0$ and

    \[ c_2 = \frac{b x_0}{\alpha}. \]

Thus

    \[ x = e^{-bt}\left(x_0 \cos(\alpha t) + \frac{b x_0}{\alpha} \sin(\alpha t)\right) = \frac{x_0}{\alpha} e^{-bt}(\alpha \cos(\alpha t) + b \sin(\alpha t)). \tag{8.5.47} \]

This expression simplifies somewhat if we introduce the angle

    \[ \theta = \tan^{-1}\left(\frac{b}{\alpha}\right). \tag{8.5.48} \]

Then

    \[ \cos(\theta) = \frac{\alpha}{\sqrt{\alpha^2 + b^2}} \]

and

    \[ \sin(\theta) = \frac{b}{\sqrt{\alpha^2 + b^2}}. \]

Moreover, since $\alpha = \sqrt{a^2 - b^2}$,

    \[ \sqrt{\alpha^2 + b^2} = \sqrt{(a^2 - b^2) + b^2} = a = \sqrt{\frac{k}{m}}. \]

Hence

    \[ \begin{aligned}
    x &= \frac{x_0 \sqrt{\alpha^2 + b^2}}{\alpha} e^{-bt} \left( \frac{\alpha}{\sqrt{\alpha^2 + b^2}} \cos(\alpha t) + \frac{b}{\sqrt{\alpha^2 + b^2}} \sin(\alpha t) \right) \\
      &= \frac{x_0}{\alpha} \sqrt{\frac{k}{m}}\, e^{-bt} (\cos(\theta) \cos(\alpha t) + \sin(\theta) \sin(\alpha t)).
    \end{aligned} \]

Using the angle subtraction formula for cosine, this becomes

    \[ x = \frac{x_0}{\alpha} \sqrt{\frac{k}{m}}\, e^{-bt} \cos(\alpha t - \theta). \tag{8.5.49} \]

The presence of the cosine factor in this expression shows us that, even though we still
have

    \[ \lim_{t \to \infty} x(t) = 0, \]

the underdamped mass-spring system will oscillate about the equilibrium position with a
decreasing amplitude of

    \[ \frac{x_0}{\alpha} \sqrt{\frac{k}{m}}\, e^{-bt}. \tag{8.5.50} \]

Figure 8.5.7 shows this behavior for $k = 10$, $m = 5$, $c = 5$, and $x_0 = 2$.

              Figure 8.5.7 Motion of an underdamped mass-spring system
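
The three damping cases can be told apart numerically from the sign of b^2 - a^2. The following is a minimal sketch, assuming a^2 = k/m and b = c/(2m) as in the derivation above; the function names and the tolerance `tol` are illustrative choices, not part of the text. It also evaluates the underdamped solution in both the form (8.5.47) and the form (8.5.49) so that the two can be compared.

```python
import math

def damping_regime(k, m, c, tol=1e-9):
    """Classify x'' + (c/m) x' + (k/m) x = 0 by the sign of b^2 - a^2."""
    a_sq = k / m          # a^2 = k/m
    b = c / (2 * m)       # b = c/(2m)
    disc = b * b - a_sq
    if c == 0:
        return "undamped"
    if abs(disc) <= tol:  # tolerance absorbs rounding, e.g. for c = 10*sqrt(2)
        return "critically damped"
    return "overdamped" if disc > 0 else "underdamped"

def x_sum_form(t, k, m, c, x0):
    """Underdamped solution in the form (8.5.47)."""
    b = c / (2 * m)
    alpha = math.sqrt(k / m - b * b)
    return (x0 / alpha) * math.exp(-b * t) * (
        alpha * math.cos(alpha * t) + b * math.sin(alpha * t))

def x_phase_form(t, k, m, c, x0):
    """Underdamped solution in the form (8.5.49)."""
    a = math.sqrt(k / m)
    b = c / (2 * m)
    alpha = math.sqrt(a * a - b * b)
    theta = math.atan(b / alpha)
    return (x0 / alpha) * a * math.exp(-b * t) * math.cos(alpha * t - theta)

if __name__ == "__main__":
    print(damping_regime(10, 5, 10 * math.sqrt(2)))  # parameters of Figure 8.5.6
    print(damping_regime(10, 5, 5))                  # parameters of Figure 8.5.7
    for t in (0.0, 1.0, 2.5):
        print(t, x_sum_form(t, 10, 5, 5, 2.0), x_phase_form(t, 10, 5, 5, 2.0))
```

The two forms should agree up to rounding error, since (8.5.49) is only a rewriting of (8.5.47).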

Problems

1. In an experiment to determine g, a pendulum of length 50 centimeters is observed to
    have a period of oscillation of 1.42 seconds. Approximate g based on this observation.
 2. The period of oscillation of a pendulum of length b given in (8.5.19) is, as mentioned,
    only an approximation of the true period. It can be shown that the true period of a
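    pendulum released from an angle $\alpha$ is given by

    \[ T = 4\sqrt{\frac{b}{g}} \int_0^{\pi/2} \frac{1}{\sqrt{1 - k^2 \sin^2(\varphi)}} \, d\varphi, \]

    where $0 < \alpha < \pi$ and $k = \sin\left(\frac{\alpha}{2}\right)$.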
    (a) Find the period of oscillation for a pendulum of length 50 centimeters for a = ,
        a = j, a = 5,and a =10. Compare these results with the approximation given
        in (8.5.19).
    (b) Graph T as a function of a for -} 1a < {. For comparison, also plot the
       horizontal line


 3. Consider a mass-spring system with $x_0 = 10$, $\dot{x}(0) = 0$, $k = 10$, and $m = 10$. Plot
    $x(t)$ for $c = 0$, $c = 5$, $c = 10$, $c = 20$, $c = 25$, and $c = 30$. Identify each motion as
    overdamped, critically damped, underdamped, or undamped.

 4. Consider a mass-spring system with $x_0 = 10$, $\dot{x}(0) = 0$, $m = 10$, and $c = 20$. Plot $x(t)$
    for $k = 2$, $k = 5$, $k = 10$, and $k = 15$. Identify each motion as overdamped, critically
    damped, underdamped, or undamped.

5. Consider the underdamped motion of a mass-spring system expressed in (8.5.49).
   (a) Show that the maximum values of $x(t)$ occur at $t = 0, T, 2T, \ldots$, where

       \[ T = \frac{2\pi}{\sqrt{\dfrac{k}{m} - \dfrac{c^2}{4m^2}}}. \]

       Note that when $c = 0$, $T$ reduces to the period of the motion for the mass-spring
       system without damping.
   (b) Show that if $x_1$ and $x_2$ are two successive maximum values of $x(t)$, then

       \[ \frac{x_1}{x_2} = e^{\frac{cT}{2m}}. \]

6. Inside the earth, the force of gravity acting on an object is proportional to the distance
   between the object and the center of the earth.

   (a) Suppose a hole is drilled through the earth from pole to pole and a rock is dropped
       into the hole. If x(t) is the distance from the object to the center of the earth at
       time t, show that, ignoring any resistive forces,


       \[ x = R \cos\left(\sqrt{\frac{g}{R}}\, t\right), \]

       where $R$ is the radius of the earth.
   (b) How long, in minutes, does it take for the rock to make one complete trip from
       pole to pole and back? Use $R = 3950$ miles.
   (c) What is the velocity of the rock, in miles per hour, when it reaches the center of
       the earth?


Section 8.6


       The Geometry of Solutions: The Phase Plane


As mentioned in Section 8.4, we may represent a second order differential equation

    \[ \ddot{x} = f(x, \dot{x}, t) \tag{8.6.1} \]

as a system of two first order equations

    \[ \begin{aligned} \dot{x} &= y \\ \dot{y} &= f(x, y, t). \end{aligned} \tag{8.6.2} \]

More generally, if $g$ and $f$ are functions of $x$, $y$, and $t$, we may consider a system of
equations

    \[ \begin{aligned} \dot{x} &= g(x, y, t) \\ \dot{y} &= f(x, y, t), \end{aligned} \tag{8.6.3} \]

of which (8.6.2) is a particular case when $g(x, y, t) = y$. In this section we shall consider
the behavior of solutions to such systems of equations, paying particular attention to those
arising in the manner of (8.6.2).


Definition Suppose $x(t)$ and $y(t)$ are solutions of the system

    \[ \begin{aligned} \dot{x} &= g(x, y, t) \\ \dot{y} &= f(x, y, t) \end{aligned} \]

for $t$ in an interval $[a, b]$. The curve in the plane with coordinates $(x(t), y(t))$, $a \le t \le b$, is
called a phase curve of the system. The plane in which the phase curve is plotted is called
the phase plane of the system.

    Note that if the system of equations arises from a second order differential equation,
then a phase curve is a plot of $\dot{x}(t)$ versus $x(t)$. In many common cases, this is a plot of
velocity versus position.


Definition Suppose the constant functions $x(t) = x_0$ and $y(t) = y_0$ are a solution of the
system

    \[ \begin{aligned} \dot{x} &= g(x, y, t) \\ \dot{y} &= f(x, y, t). \end{aligned} \]

Then the point $(x_0, y_0)$ is called a stationary point of the system.


Copyright © by Dan Sloughter 2000


    If $(x_0, y_0)$ is a stationary point, then the phase curve of the solution

    \[ \begin{aligned} x(t) &= x_0 \\ y(t) &= y_0 \end{aligned} \]

consists of only the single point $(x_0, y_0)$. That is, if $(x_0, y_0)$ is a stationary point and the
system has initial conditions $x(t_0) = x_0$ and $y(t_0) = y_0$, then the system will remain at
the point $(x_0, y_0)$ for all time. Moreover, note that for this solution

    \[ \dot{x}(t) = 0 \]

and

    \[ \dot{y}(t) = 0 \]

for all $t$. Since we must have

    \[ \begin{aligned} \dot{x} &= g(x, y, t) \\ \dot{y} &= f(x, y, t), \end{aligned} \]

it follows that stationary points are precisely the points $(x_0, y_0)$ for which

    \[ g(x_0, y_0, t) = 0 \]

and

    \[ f(x_0, y_0, t) = 0 \]

for all $t$.
Example Consider the second order linear equation

    \[ \ddot{x} + \frac{k}{m} x = 0, \]

where $k$ and $m$ are positive constants. In Section 8.5 we saw how this equation models an
undamped mass-spring system consisting of an object of mass $m$ oscillating at the end of
a spring with spring constant $k$. This equation is equivalent to the system of equations

    \[ \begin{aligned} \dot{x} &= y \\ \dot{y} &= -\frac{k}{m} x. \end{aligned} \tag{8.6.4} \]

Clearly, the only stationary point of this system is $(0, 0)$, corresponding to the object being
at rest at the equilibrium position of the system. With initial conditions $x(0) = x_0$ and
$y(0) = 0$, this system has solution

    \[ \begin{aligned}
    x &= x_0 \cos\left(\sqrt{\frac{k}{m}}\, t\right) \\
    y &= -x_0 \sqrt{\frac{k}{m}} \sin\left(\sqrt{\frac{k}{m}}\, t\right).
    \end{aligned} \]

      Figure 8.6.1 A phase curve for the system $\dot{x} = y$, $\dot{y} = -2x$


    A plot of the phase curve for this solution is shown in Figure 8.6.1 for $k = 10$, $m = 5$,
and $x_0 = 2$ for $0 \le t \le \sqrt{2}\pi$ (that is, for one full period of the motion). You should
compare this plot with the graph of $x$ in Figure 8.5.4. The arrows on the curve point in
the direction of increasing $t$. At $t = 0$, the mass is released from a point 2 units below the
equilibrium position, hence $x = 2$ and $y = 0$; as $t$ increases from 0 to $\frac{\sqrt{2}}{4}\pi$, $x$ decreases
from 2 to 0 as the mass moves upward to the equilibrium position while $y$, the velocity,
decreases from 0 to $-2\sqrt{2}$; as $t$ increases from $\frac{\sqrt{2}}{4}\pi$ to $\frac{\sqrt{2}}{2}\pi$, $x$ continues to decrease from 0
to $-2$ as the mass moves to its highest point, at which point its velocity is $y = 0$; at this
time, the velocity becomes positive and the mass moves from $-2$, through the equilibrium
position, back to 2, at which point the velocity is again 0 and the motion begins all over
again. Notice that the phase curve is a closed curve because the motion is periodic: after a
period of $\sqrt{2}\pi$ units of time, both the position and the velocity of the object have returned
to their original values. Moreover, the stationary point $(0, 0)$ is at the center of this phase
curve. In fact, all the phase curves for this equation are closed curves about the stationary
point. Such a stationary point is called a center.
    Note that in this example the phase curves are all ellipses. The curve in Figure 8.6.1
satisfies

    \[ \frac{x^2}{4} + \frac{y^2}{8} = 1. \tag{8.6.5} \]
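
As a short worked check (a sketch that uses only the solution formulas of this example), dividing x by x_0 and y by x_0 times the square root of k/m, then squaring and adding, eliminates t:

```latex
\[
\frac{x^2}{x_0^2} + \frac{y^2}{x_0^2 \, k/m}
  = \cos^2\!\left(\sqrt{\frac{k}{m}}\, t\right) + \sin^2\!\left(\sqrt{\frac{k}{m}}\, t\right) = 1 .
\]
% With k = 10, m = 5, and x_0 = 2 the denominators are x_0^2 = 4 and x_0^2 k/m = 8,
% which gives the ellipse x^2/4 + y^2/8 = 1 of (8.6.5).
```

Every choice of x_0 other than 0 gives an ellipse of the same shape, which is why all the phase curves of this system are closed curves about (0, 0).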


Example Consider the second order linear equation

    \[ \ddot{x} + \frac{c}{m} \dot{x} + \frac{k}{m} x = 0, \]

where $c$, $k$, and $m$ are positive constants. We saw in Section 8.5 that this equation models
the motion of a damped mass-spring system consisting of an object of mass $m$ attached
to a spring with spring constant $k$ and moving through a medium offering a resistive force
proportional to $\dot{x}$. This second order equation is equivalent to the system

    \[ \begin{aligned} \dot{x} &= y \\ \dot{y} &= -\frac{c}{m} y - \frac{k}{m} x. \end{aligned} \tag{8.6.6} \]

As in the previous example, the only stationary point of this system is $(0, 0)$. We will
consider an example of the underdamped case, namely, $k = 10$, $m = 5$, and $c = 5$. In that
case, with initial conditions $x(0) = x_0$ and $y(0) = 0$, the solution of (8.6.6) is

    \[ \begin{aligned}
    x &= 2\sqrt{\frac{2}{7}}\, x_0 e^{-\frac{t}{2}} \cos\left(\frac{\sqrt{7}}{2} t - \theta\right) \\
    y &= -\sqrt{\frac{2}{7}}\, x_0 e^{-\frac{t}{2}} \left( \sqrt{7} \sin\left(\frac{\sqrt{7}}{2} t - \theta\right) + \cos\left(\frac{\sqrt{7}}{2} t - \theta\right) \right),
    \end{aligned} \]

where

    \[ \theta = \tan^{-1}\left(\frac{1}{\sqrt{7}}\right). \]

      Figure 8.6.2 A phase curve for the system $\dot{x} = y$, $\dot{y} = -y - 2x$

A plot of the phase curve for this solution is shown in Figure 8.6.2 for $x_0 = 2$ with
$0 \le t \le 20$. You should compare this plot with the graph of $x$ in Figure 8.5.7. Here we
see that the phase curve is not closed and the motion is not periodic; as $t$ increases, the
curve spirals in toward the stationary point $(0, 0)$. This is in fact the general behavior
of phase curves for this system: No matter what the initial condition, as $t$ increases,
both the position and velocity functions decay toward 0 as the mass performs smaller and
smaller oscillations about the equilibrium. In this case we call the stationary point a stable
equilibrium. In general, a stationary point $(x_0, y_0)$ is a stable equilibrium if for any initial
conditions sufficiently close to $(x_0, y_0)$, the resulting phase curve approaches $(x_0, y_0)$ in the
limit as $t \to \infty$. In this example, every phase curve approaches $(0, 0)$ as $t \to \infty$.

    A stationary point $(x_0, y_0)$ is called an unstable equilibrium if there is a fixed distance
$d$ such that it is possible to find initial conditions arbitrarily close to $(x_0, y_0)$ for which the
resulting phase curve will eventually be farther than $d$ away from $(x_0, y_0)$.

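Example Taking $k = 10$, $m = 5$, and $c = -5$ in the system (8.6.6) would lead to the
solution

    \[ \begin{aligned}
    x &= 2\sqrt{\frac{2}{7}}\, x_0 e^{\frac{t}{2}} \cos\left(\frac{\sqrt{7}}{2} t - \theta\right) \\
    y &= \sqrt{\frac{2}{7}}\, x_0 e^{\frac{t}{2}} \left( \cos\left(\frac{\sqrt{7}}{2} t - \theta\right) - \sqrt{7} \sin\left(\frac{\sqrt{7}}{2} t - \theta\right) \right),
    \end{aligned} \]

where

    \[ \theta = \tan^{-1}\left(-\frac{1}{\sqrt{7}}\right). \]

In this case, the stationary point $(0, 0)$ is an unstable equilibrium because, since $e^{\frac{t}{2}}$ increases
with $t$, the phase curves spiral away from the stationary point $(0, 0)$. A plot of the phase
curve for this solution is shown in Figure 8.6.3 for $x_0 = 0.01$ with $0 \le t \le 10$.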
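
The contrast between the last two examples can be seen numerically by evaluating the closed-form solutions and measuring the distance of the phase point from (0, 0). The sketch below assumes the general underdamped formulas of Section 8.5 with b = c/(2m), applied also to negative c; the function name `phase_point` is illustrative only.

```python
import math

def phase_point(t, k, m, c, x0):
    """Return (x(t), y(t)) for x'' + (c/m) x' + (k/m) x = 0, x(0) = x0, y(0) = 0.

    Uses x = (x0 a / alpha) e^{-bt} cos(alpha t - theta) and y = dx/dt,
    valid when b^2 < a^2 (b may be negative).
    """
    a = math.sqrt(k / m)
    b = c / (2 * m)
    alpha = math.sqrt(a * a - b * b)
    theta = math.atan(b / alpha)
    amp = (x0 * a / alpha) * math.exp(-b * t)
    u = alpha * t - theta
    x = amp * math.cos(u)
    y = amp * (-b * math.cos(u) - alpha * math.sin(u))
    return x, y

if __name__ == "__main__":
    # c = 5 with x0 = 2 (Figure 8.6.2) and c = -5 with x0 = 0.01 (Figure 8.6.3)
    for c, x0 in [(5, 2.0), (-5, 0.01)]:
        for t in (0.0, 5.0, 10.0):
            x, y = phase_point(t, 10, 5, c, x0)
            print(f"c = {c:2d}, t = {t:4.1f}, distance from (0, 0) = {math.hypot(x, y):.5f}")
```

For c = 5 the printed distances shrink toward 0, while for c = -5 they grow even though the starting point is only 0.01 away from the stationary point.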

    We will see another example of an unstable equilibrium when we return to the pendu-
lum example below.

Numerical approximations
Although useful in general, the ideas developed above are most helpful when exact solutions
are not available and we must rely upon numerical approximations in order to understand
the behavior of solutions. However, before we can consider such examples, we must first
modify our numerical techniques from Section 8.1 to the current situation. Since the second
order Runge-Kutta method is more accurate than Euler's method, we will discuss only the
modification of the former.
    Suppose we wish to approximate the solution of the system

    \[ \begin{aligned} \dot{x} &= g(x, y, t) \\ \dot{y} &= f(x, y, t), \end{aligned} \tag{8.6.7} \]

with initial conditions $x(t_0) = x_0$ and $y(t_0) = y_0$, at time $t_0 + h$. First we approximate $x$
at $t_0 + \frac{h}{2}$ by $x_0 + m_1$, where

    \[ m_1 = \frac{h}{2} \dot{x}(t_0) = \frac{h}{2} g(x_0, y_0, t_0), \tag{8.6.8} \]

and $y$ at $t_0 + \frac{h}{2}$ by $y_0 + m_2$, where

    \[ m_2 = \frac{h}{2} \dot{y}(t_0) = \frac{h}{2} f(x_0, y_0, t_0). \tag{8.6.9} \]

Then the second order Runge-Kutta approximations to $x(t_0 + h)$ and $y(t_0 + h)$ are given
by

    \[ x_1 = x_0 + h g\left(x_0 + m_1, y_0 + m_2, t_0 + \frac{h}{2}\right) \tag{8.6.10} \]

and

    \[ y_1 = y_0 + h f\left(x_0 + m_1, y_0 + m_2, t_0 + \frac{h}{2}\right), \tag{8.6.11} \]

respectively. As before, to approximate the solution over an interval $[t_0, t_1]$, we iterate the
above process as many times as necessary.

      Figure 8.6.3 A phase curve for the system $\dot{x} = y$, $\dot{y} = y - 2x$

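Second order Runge-Kutta To approximate the solution of the system of equations

    \[ \begin{aligned} \dot{x} &= g(x, y, t) \\ \dot{y} &= f(x, y, t) \end{aligned} \]

with initial conditions $x(t_0) = x_0$ and $y(t_0) = y_0$ on an interval $[t_0, t_1]$, choose a small value
for $h > 0$ and an integer $n$ such that $t_0 + nh \ge t_1$. Letting $s_k = t_0 + kh$, compute

    \[ \begin{aligned}
    m_1 &= \frac{h}{2} g(x_k, y_k, s_k) \\
    m_2 &= \frac{h}{2} f(x_k, y_k, s_k)
    \end{aligned} \tag{8.6.12} \]

and

    \[ \begin{aligned}
    x_{k+1} &= x_k + h g\left(x_k + m_1, y_k + m_2, s_k + \frac{h}{2}\right) \\
    y_{k+1} &= y_k + h f\left(x_k + m_1, y_k + m_2, s_k + \frac{h}{2}\right)
    \end{aligned} \tag{8.6.13} \]

for $k = 0, 1, 2, \ldots, n - 1$. Then $x_k$ is an approximation for $x(t_0 + kh)$ and $y_k$ is an
approximation for $y(t_0 + kh)$.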
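
The procedure above translates almost line for line into code. The following Python sketch is one possible rendering; the name `rk2_system` and the choice to return two lists are illustrative assumptions. As a sanity check it is applied to the system (8.6.4) with k/m = 2, whose exact solution was given in the first example of this section.

```python
import math

def rk2_system(g, f, x0, y0, t0, h, n):
    """Second order Runge-Kutta for x' = g(x, y, t), y' = f(x, y, t).

    Returns lists xs, ys where xs[k], ys[k] approximate x(t0 + k h), y(t0 + k h).
    """
    xs, ys = [x0], [y0]
    for k in range(n):
        s = t0 + k * h                      # s_k = t0 + k h
        x, y = xs[-1], ys[-1]
        m1 = (h / 2) * g(x, y, s)           # (8.6.12)
        m2 = (h / 2) * f(x, y, s)
        xs.append(x + h * g(x + m1, y + m2, s + h / 2))   # (8.6.13)
        ys.append(y + h * f(x + m1, y + m2, s + h / 2))
    return xs, ys

if __name__ == "__main__":
    # Test problem: x' = y, y' = -2x, x(0) = 2, y(0) = 0, with exact solution
    # x = 2 cos(sqrt(2) t), y = -2 sqrt(2) sin(sqrt(2) t).
    h, n = 0.01, 500
    xs, ys = rk2_system(lambda x, y, t: y, lambda x, y, t: -2 * x, 2.0, 0.0, 0.0, h, n)
    t_end = n * h
    w = math.sqrt(2)
    print("x error:", abs(xs[-1] - 2 * math.cos(w * t_end)))
    print("y error:", abs(ys[-1] + 2 * w * math.sin(w * t_end)))
```

Halving h should reduce both errors by roughly a factor of four, as expected of a second order method.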
Example In this example we consider a simple case for modeling a predator-prey envi-
ronment. Suppose animals of species A prey on animals of species B. For our example,
species A will be foxes and species B will be rabbits, although they could be any two
species that have the predator-prey relationship we are about to describe. We assume
that the food supply of the rabbits is essentially unlimited and the foxes are their only
natural enemy in the given environment, while, on the other hand, we assume the foxes
are dependent upon the rabbits for the bulk of their food supply. We also assume that the
foxes have no natural enemies. Let y(t) be the size of the fox population and let x(t) be
the size of the rabbit population at time t. If there were no foxes, the rabbits would enjoy
uninhibited growth and we would have

    \[ \dot{x} = \alpha x \]

for some constant $\alpha > 0$ representing the natural growth rate of rabbits in the given
environment. However, if we assume that the number of encounters between rabbits and
foxes is proportional to the product of the two populations and, furthermore, that a certain
proportion of these encounters results in a rabbit becoming a meal for a fox, then $\dot{x}$ will
be decreased by an amount $\beta x y$ for some constant $\beta > 0$. Hence we have

    \[ \dot{x} = \alpha x - \beta x y = x(\alpha - \beta y). \]

At the same time, if there were no rabbits, the fox population would decline for want of
food, that is, we would expect

    \[ \dot{y} = -\gamma y \]

for some constant $\gamma > 0$, while, if there are rabbits, encounters between rabbits and
foxes contribute positively to the growth of the fox population. Thus $y$, the size of the
fox population, should make a negative contribution to $\dot{y}$ and $xy$ should make a positive
contribution to $\dot{y}$. This leads us to suppose

    \[ \dot{y} = -\gamma y + \delta x y = -y(\gamma - \delta x) \]

for some constants $\gamma > 0$ and $\delta > 0$. Hence we now have a system of first order equations

    \[ \begin{aligned} \dot{x} &= x(\alpha - \beta y) \\ \dot{y} &= -y(\gamma - \delta x), \end{aligned} \tag{8.6.14} \]

where $\alpha$, $\beta$, $\gamma$, and $\delta$ are all positive constants.
    The stationary points for this system are solutions of

    \[ \begin{aligned} 0 &= x(\alpha - \beta y) \\ 0 &= -y(\gamma - \delta x). \end{aligned} \]

Clearly, $x = 0$, $y = 0$ is one solution. If $x \ne 0$, then, from the first equation, we must have

    \[ 0 = \alpha - \beta y, \]

and so

    \[ y = \frac{\alpha}{\beta}. \]

Thus $y \ne 0$, so, from the second equation, we must have

    \[ 0 = \gamma - \delta x, \]

from which we find

    \[ x = \frac{\gamma}{\delta}. \]

Hence the system (8.6.14) has two stationary points: $(0, 0)$ and $\left(\frac{\gamma}{\delta}, \frac{\alpha}{\beta}\right)$. The first
corresponds to the uninteresting situation when there are no foxes and no rabbits; the second
to an equilibrium condition in which the populations are in balance.
    For example, consider the case with parameters $\alpha = 0.06$, $\beta = 0.0008$, $\gamma = 0.2$, and
$\delta = 0.0008$, corresponding, in part, to a natural growth rate of 6% per year for the rabbits
and a decay rate, in the absence of any rabbits, of 20% per year for the foxes. The system
(8.6.14) then becomes

    \[ \begin{aligned} \dot{x} &= x(0.06 - 0.0008y) \\ \dot{y} &= -y(0.2 - 0.0008x), \end{aligned} \tag{8.6.15} \]

with nonzero stationary point

    \[ \left(\frac{0.2}{0.0008}, \frac{0.06}{0.0008}\right) = (250, 75). \]



Hence a population of 250 rabbits and 75 foxes would be in equilibrium and would not
change over time; the natural yearly increase in the rabbit population is accounted for
exactly by the appetite of the foxes. To see what happens in other cases, suppose we start
with an initial population of $x_0 = 400$ rabbits and $y_0 = 50$ foxes. We will approximate the
solution to (8.6.15) over the interval $[0, 150]$ using the second order Runge-Kutta method
with a step size of $h = 0.05$. To start, using

    \[ \begin{aligned}
    g(x, y, t) &= x(0.06 - 0.0008y) \\
    f(x, y, t) &= -y(0.2 - 0.0008x),
    \end{aligned} \]

we compute

    \[ \begin{aligned}
    m_1 &= \frac{h}{2} g(x_0, y_0, t_0) = (0.025)(400)(0.06 - (0.0008)(50)) = 0.2 \\
    m_2 &= \frac{h}{2} f(x_0, y_0, t_0) = -(0.025)(50)(0.2 - (0.0008)(400)) = 0.15
    \end{aligned} \]

and

    \[ \begin{aligned}
    x_1 &= x_0 + h g\left(x_0 + m_1, y_0 + m_2, t_0 + \frac{h}{2}\right) \\
        &= 400 + (0.05) g(400.2, 50.15, 0.025) \\
        &= 400 + (0.05)(400.2)(0.06 - (0.0008)(50.15)) \\
        &= 400.3978 \\
    y_1 &= y_0 + h f\left(x_0 + m_1, y_0 + m_2, t_0 + \frac{h}{2}\right) \\
        &= 50 + (0.05) f(400.2, 50.15, 0.025) \\
        &= 50 - (0.05)(50.15)(0.2 - (0.0008)(400.2)) \\
        &= 50.3013,
    \end{aligned} \]

where we have rounded $x_1$ and $y_1$ to four decimal places. Then $x_1$ is an approximation for
$x(0.05)$ and $y_1$ is an approximation for $y(0.05)$. In general, $x_{20t}$ and $y_{20t}$ are approximations
for $x(t)$ and $y(t)$ when $20t$ is an integer. Table 8.6.1 gives our results for $t = 0, 5, 10, \ldots, 150$,
where we have rounded the values to the nearest integer.
                               t          x_{20t}        y_{20t}
                               0            400             50
                               5            408             95
                              10            332            158
                              15            225            175
                              20            161            137
                              25            138             91
                              30            139             58
                              35            156             38
                              40            185             28
                              45            226             23
                              50            278             23
                              55            339             29
                              60            395             47
                              65            411             89
                              70            343            152
                              75            235            177
                              80            165            142
                              85            139             95
                              90            138             61
                              95            153             40
                             100            181             28
                             105            221             23
                             110            272             23
                             115            333             28
                             120            390             44
                             125            413             83
                             130            354            145
                             135            245            177
                             140            170            147
                             145            141            100
                             150            138             64

                         Table 8.6.1 Predator-prey populations

    Notice the cyclic nature of both $x$ and $y$. In the early years, the population of foxes
increases due to the plentiful supply of rabbits for food. However, eventually (sometime
between 15 and 20 years) the increasing fox population causes a decrease in the rabbit
population to the point where the population of foxes begins to decline. As the fox
population declines, there comes a point (between 30 and 35 years) when the rabbit population
begins to increase, which in turn eventually leads to an increase in the fox population,
starting sometime between 55 and 60 years. At this point, the cycle begins again. This
behavior is most evident in Figure 8.6.4, where the numerical solutions for $x$ and $y$ have
been plotted over the interval $[0, 150]$. Notice how the periods of the two curves are the
same, but their phases are different. This phase difference occurs because, for example,
a decrease in the rabbit population does not lead to an immediate decrease in the fox
population; in fact, the fox population will continue to grow until the rabbit population
is too small to support its growth, and it is at that point that the fox population begins
to decline. The phase curve for this solution is shown in Figure 8.6.5. Here the fact that
the phase curve is a closed curve reveals the periodic nature of the solution. Note that the
phase curve encloses the nonzero stationary point $(250, 75)$. This point is in fact a center.
Figure 8.6.6 shows several phase curves, all of which are closed curves about $(250, 75)$. We
have omitted arrows on the phase curves in Figure 8.6.6, but the direction of increasing $t$
is counter-clockwise, as it was in Figure 8.6.5.
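
The hand computation of the first step, and the rows of Table 8.6.1, can be reproduced with a few lines of code. The sketch below is one way to do so; the helper name `rk2_step` and the list `history` are illustrative, and the step is a direct transcription of (8.6.12) and (8.6.13) for the system (8.6.15).

```python
def g(x, y, t):
    """Rate of change of the rabbit population in (8.6.15)."""
    return x * (0.06 - 0.0008 * y)

def f(x, y, t):
    """Rate of change of the fox population in (8.6.15)."""
    return -y * (0.2 - 0.0008 * x)

def rk2_step(x, y, t, h):
    """One second order Runge-Kutta step from time t to time t + h."""
    m1 = (h / 2) * g(x, y, t)
    m2 = (h / 2) * f(x, y, t)
    return (x + h * g(x + m1, y + m2, t + h / 2),
            y + h * f(x + m1, y + m2, t + h / 2))

h = 0.05
x, y = 400.0, 50.0
history = [(x, y)]                 # history[k] approximates (x(k h), y(k h))
for k in range(3000):              # 3000 steps of size 0.05 cover [0, 150]
    x, y = rk2_step(x, y, k * h, h)
    history.append((x, y))

if __name__ == "__main__":
    x1, y1 = history[1]
    print(round(x1, 4), round(y1, 4))   # first step: 400.3978 50.3013
    for k in range(0, 3001, 100):       # every 100 steps is 5 years
        xk, yk = history[k]
        print(f"{k * h:5.0f} {round(xk):6d} {round(yk):6d}")
```

Plotting the second components of `history` against the first would give a phase curve like the one described for Figure 8.6.5.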

          Figure 8.6.4 Predator-prey populations of Table 8.6.1

 Figure 8.6.5 Phase curve for the predator-prey populations in Table 8.6.1

      Figure 8.6.6 Phase curves for the predator-prey system (8.6.15)

Example For our final example in this section, we return to the pendulum problem
discussed in Section 8.5. Suppose our pendulum consists of a bob of mass $m$ at the end
of a rigid rod of length $b$. We will assume that the upper end of the rod is attached to
another rod, held fixed and perpendicular to the plane of motion of the pendulum, in such
a way that the pendulum is free to move through complete circles about this axis. If we
let $x(t)$ be the angle between the rod and the vertical at time $t$, then we showed in Section
8.5 that $x$ must satisfy the equation

    \[ \ddot{x} = -\frac{g}{b} \sin(x). \]

Equivalently, as a system of first order equations, we have

    \[ \begin{aligned} \dot{x} &= y \\ \dot{y} &= -\frac{g}{b} \sin(x). \end{aligned} \tag{8.6.16} \]

      Figure 8.6.7 Phase curves for a pendulum: $\dot{x} = y$, $\dot{y} = -4.9 \sin(x)$


The stationary points for this system are the points $(x, y)$ where $y = 0$ and $\sin(x) = 0$.
Hence there are an infinite number of stationary points, namely, $(n\pi, 0)$ for $n = 0, \pm 1, \pm 2, \ldots$.
Note that for even values of $n$, the stationary points $(n\pi, 0)$ correspond to the pendulum
hanging at rest with the bob end down. We should expect these stationary points to be
centers since, without friction, any nearby initial conditions would result in the pendulum
oscillating about the given stationary point. For odd values of $n$, the stationary points
$(n\pi, 0)$ correspond to the pendulum balancing with the bob end up. We should expect that
any initial condition near one of these stationary points would result in motion away from
the given stationary point. That is, any slight motion away from the balanced position
should cause the pendulum to fall and begin an oscillatory motion. Hence these stationary
points should be unstable equilibriums. A look at the phase curves in Figure 8.6.7, shown
for a pendulum of length 2 meters, supports these statements: For any integer $k$, $(2k\pi, 0)$
is a center and $((2k - 1)\pi, 0)$ is an unstable equilibrium. We have again omitted arrows
on the phase curves in Figure 8.6.7, but the direction of increasing $t$ is from left to right
above the $x$-axis and from right to left below the $x$-axis.

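
The difference between the two kinds of stationary points can also be observed numerically. The sketch below applies second order Runge-Kutta steps to the pendulum system with g/b = 4.9 and records how far a phase curve strays from a nearby stationary point. The function names, the starting points, and the time span of 5 units are illustrative choices, and the output only suggests, rather than proves, the classification.

```python
import math

def rk2_step(g, f, x, y, t, h):
    """One second order Runge-Kutta step for x' = g(x, y, t), y' = f(x, y, t)."""
    m1 = (h / 2) * g(x, y, t)
    m2 = (h / 2) * f(x, y, t)
    return (x + h * g(x + m1, y + m2, t + h / 2),
            y + h * f(x + m1, y + m2, t + h / 2))

def max_distance(x_start, x_star, t_end=5.0, h=0.01):
    """Largest distance from (x_star, 0) reached on [0, t_end], starting at rest."""
    g = lambda x, y, t: y
    f = lambda x, y, t: -4.9 * math.sin(x)    # g/b = 9.8/2 for a 2 meter pendulum
    x, y = x_start, 0.0
    far = math.hypot(x - x_star, y)
    for k in range(round(t_end / h)):
        x, y = rk2_step(g, f, x, y, k * h, h)
        far = max(far, math.hypot(x - x_star, y))
    return far

near_bottom = max_distance(0.1, 0.0)                  # start 0.1 from (0, 0)
near_top = max_distance(math.pi - 0.01, math.pi)      # start 0.01 from (pi, 0)

if __name__ == "__main__":
    print("largest distance from (0, 0): ", near_bottom)
    print("largest distance from (pi, 0):", near_top)
```

The curve started near (0, 0) stays close to it, while the curve started ten times closer to (pi, 0) moves several units away, consistent with a center at (0, 0) and an unstable equilibrium at (pi, 0).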


Problems

1. For each of the following differential equations, find the general solution and then plot
    the phase curves of the solutions for the given initial conditions over the given time
    interval I. For each equation decide whether the stationary point (0, 0) is a center, a
    stable equilibrium, or an unstable equilibrium.


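    (a) $\ddot{x} + x = 0$, $I = [0, 2\pi]$
        Initial conditions: $x(0) = -2$, $\dot{x}(0) = 0$
                            $x(0) = -1$, $\dot{x}(0) = 0$
                            $x(0) = -0.5$, $\dot{x}(0) = 0$
                            $x(0) = 0.5$, $\dot{x}(0) = 0$
                            $x(0) = 1$, $\dot{x}(0) = 0$
                            $x(0) = 2$, $\dot{x}(0) = 0$
    (b) $\ddot{x} + 3\dot{x} + 2x = 0$, $I = [-2, 2]$
        Initial conditions: $x(0) = 0$, $\dot{x}(0) = -0.2$
                            $x(0) = 0$, $\dot{x}(0) = 0.2$
                            $x(0) = -0.2$, $\dot{x}(0) = 0$
                            $x(0) = 0.2$, $\dot{x}(0) = 0$
    (c) $\ddot{x} - x = 0$, $I = [-2, 2]$
        Initial conditions: $x(0) = 0$, $\dot{x}(0) = 1$
                            $x(0) = 0$, $\dot{x}(0) = -1$
                            $x(0) = -1$, $\dot{x}(0) = 0$
                            $x(0) = 1$, $\dot{x}(0) = 0$
    (d) $\ddot{x} + 2\dot{x} + 2x = 0$
        Initial conditions: $x(0) = 0$, $\dot{x}(0) = 1$
                            $x(0) = 0$, $\dot{x}(0) = -1$
                            $x(0) = -1$, $\dot{x}(0) = 0$
                            $x(0) = 1$, $\dot{x}(0) = 0$
    (e) $\ddot{x} - 2\dot{x} + 2x = 0$
        Initial conditions: $x(0) = 1$, $\dot{x}(0) = 0$
                            $x(0) = 2$, $\dot{x}(0) = 0$
                            $x(0) = 3$, $\dot{x}(0) = 0$
                            $x(0) = 4$, $\dot{x}(0) = 0$
                            $x(0) = 5$, $\dot{x}(0) = 0$
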


2. Plot the phase curves for the examples of overdamped and critically damped mass-
   spring systems given in Section 8.5.
3. Consider the equation of motion for a mass-spring system

       \[ \ddot{x} + \frac{c}{m} \dot{x} + \frac{k}{m} x = 0 \]

   with initial conditions $x(0) = 10$ and $\dot{x}(0) = 0$.
   (a) Suppose $k = 10$ and $m = 10$. Plot the phase curves for the solutions with $c = 0$,
       $c = 5$, $c = 10$, $c = 20$, $c = 25$, and $c = 30$. Compare your results with your plots of
       $x(t)$ from Problem 3 of Section 8.5.
   (b) Suppose $m = 10$ and $c = 20$. Plot the phase curves for the solutions with $k = 2$,
       $k = 5$, $k = 10$, and $k = 15$. Compare your results with your plots of $x(t)$ from
       Problem 4 of Section 8.5.



4. For each of the following second order differential equations, write the equation as
   a system of first order equations and approximate the solution for the given initial
   conditions over the interval I using the second order Runge-Kutta method with step
   size h. Plot both x(t) and the corresponding phase curve.


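   (a) $\ddot{x} = -x^2$, $x(0) = -2$, $\dot{x}(0) = 4$, $I = [0, 3]$, $h = 0.01$
   (b) $\ddot{x} + \dot{x} = -\sin(x)$, $x(0) = -3$, $\dot{x}(0) = 2$, $I = [0, 10]$, $h = 0.02$
   (c) $\ddot{x} + x^3 = 0$, $x(0) = 2$, $\dot{x}(0) = 0$, $I = [0, 4]$, $h = 0.02$
   (d) $\ddot{x} + (x^2 - 1)\dot{x} + x = 0$, $x(0) = 0.5$, $\dot{x}(0) = 0$, $I = [0, 20]$, $h = 0.01$
   (e) $\ddot{x} - (x^2 - 1)\dot{x} + x = 0$, $x(0) = -2$, $\dot{x}(0) = 0$, $I = [0, 20]$, $h = 0.01$
   (f)                $= 0$, $x(0) = 2$, $\dot{x}(0) = 0$, $I = [0, 10]$, $h = 0.05$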


5. For each of the following systems of first order differential equations, approximate
   the solution for the given initial conditions over the interval I using the second order
   Runge-Kutta method with step size h. Plot x(t), y(t), and the corresponding phase
   curve.


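   (a) $\dot{x} = 2xy$                        $x(0) = 0.05$    $I = [0, 10]$    $h = 0.05$
       $\dot{y} = y^2 - x^2$                  $y(0) = 0.5$

   (b) $\dot{x} = x(1 - y^2)$                 $x(0) = 3$       $I = [0, 10]$    $h = 0.02$
       $\dot{y} = -y(1 - x^2)$                $y(0) = 2.5$

   (c) $\dot{x} = -y + x(1 - x^2 - y^2)$      $x(0) = 0$       $I = [0, 20]$    $h = 0.04$
       $\dot{y} = x + y(1 - x^2 - y^2)$       $y(0) = 4$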


6. Consider the predator-prey model

       \[ \begin{aligned} \dot{x} &= x(\alpha - \beta y) \\ \dot{y} &= -y(\gamma - \delta x), \end{aligned} \]

   where $x$ is the size of the prey population, $y$ is the size of the predator population, and
   $\alpha$, $\beta$, $\gamma$, and $\delta$ are nonnegative constants.

   (a) Find explicit solutions for $x$ and $y$ if $\alpha$ and $\gamma$ are both positive, but $\beta = \delta = 0$.
       Describe the behavior of the solutions in this case.
   (b) Suppose $\alpha = 0.05$, $\beta = 0.001$, $\gamma = 0.25$, and $\delta = 0.0005$. Using the initial conditions
       $x(0) = 700$ and $y(0) = 50$, plot $x$ and $y$ over an interval of time long enough to
       capture at least two periods (use a step size of $h = 0.05$). Plot the corresponding
       phase curve. What is the nonzero stationary point in this case?
(c) For the solution found in (b), what are the maximum and minimum predator
    populations? What are the corresponding prey populations?
(d) For the solution found in (b), what are the maximum and minimum prey popula-
    tions? What are the corresponding predator populations?
   (e) Plot four more phase curves using the parameters specified in (b) with varying
       initial conditions, plotting two inside and two outside the phase curve plotted in
       (b). Be sure to plot a complete cycle in each case.

7. Consider the motion of a pendulum as described by the equation

       \[ \ddot{x} = -\frac{g}{b} \sin(x) \]

   as in the final example of the section. Use the second order Runge-Kutta method to
   approximate $x$ for a pendulum of length 2 meters over the interval $[0, 10]$ using the
   initial conditions $x(0) = 1$ and $\dot{x}(0) = 0$ and a step size of $h = 0.05$. Graph $x(t)$ and
   use your results to estimate the period of $x(t)$. How does your estimate compare with
   the period of the linearized system

       \[ \ddot{x} = -\frac{g}{b} x \]

   that we considered in Section 8.5?
8. For c > 0, consider the equation

\[ \ddot{x} = -\frac{g}{b} \sin(x) - c\dot{x}, \]

   the equation for the motion of a pendulum of length b with a damping force propor-
   tional to its angular velocity. Suppose b = 2 meters and c = 0.8.
   (a) Write this equation as a system of first order equations. What are the stationary
       points of this system? Which stationary points do you expect to be stable equilib-
       riums? Which stationary points do you expect to be unstable equilibriums? Which
       stationary points do you expect to be centers?
   (b) Plot phase curves corresponding to the initial conditions $x(0) = 0$ and, in turn,
       $\dot{x}(0) = -20$, $\dot{x}(0) = -15$, $\dot{x}(0) = -10$, $\dot{x}(0) = -5$, $\dot{x}(0) = 5$, $\dot{x}(0) = 10$,
       $\dot{x}(0) = 15$, and $\dot{x}(0) = 20$. Describe the behavior of the pendulum for each of these curves.
   (c) Plot phase curves corresponding to the initial conditions $x(0) = 0$ and, in turn,
       $\dot{x}(0) = 6.0$, $\dot{x}(0) = 6.2$, $\dot{x}(0) = 6.4$, $\dot{x}(0) = 6.6$, $\dot{x}(0) = 6.8$, and $\dot{x}(0) = 7.0$.
       Describe the behavior of the pendulum for each of these curves.
   (d) Do your results in (b) and (c) agree with your expectations from (a)?
9. Consider the equation

\[ \ddot{x} + a\dot{x} - x(1 - x^2) = 0, \]

   where a is a constant.

   (a) Write this equation as a system of first order equations and find all the stationary
       points.
   (b) Let a = 1. Plot enough phase curves to convince yourself of the proper classification
       of the stationary points found in (a).
   (c) Let a = -1. Plot enough phase curves to convince yourself of the proper classification
       of the stationary points found in (a). How do your answers compare with
       your results in (b)?


﻿


Section 8.7


       Differential Equations         Power Series Solutions


In this section we consider one more approach to finding solutions, or approximate so-
lutions, to differential equations. Although the method may be applied to first order
equations, our discussion will center on second order equations.
    The idea is simple: Assuming that the equation

\[ \ddot{x} = f(x, \dot{x}, t) \tag{8.7.1} \]

has a solution which is analytic on an interval about $t = t_0$, we express x as a power series

\[ x(t) = \sum_{n=0}^{\infty} a_n (t - t_0)^n, \tag{8.7.2} \]

compute $\dot{x}$ and $\ddot{x}$, substitute the results into the equation, solve for the coefficients $a_0$, $a_1$,
$a_2$, ..., and verify that the resulting series converges on an interval about $t_0$. As we shall
see, in practice the difficult part is solving for the coefficients. This method will lead us to a
closed form solution for the equation only in the rare case that we are able to recognize the
resulting power series as the Taylor series of some known function. One advantage of this
technique over numerical methods, such as the Runge-Kutta method, is that we are able
to work with general solutions and equations involving unspecified parameters, whereas
with a numerical method every quantity must be specified as a number. The disadvantage
of this technique is that it is not as widely applicable, due to the difficulty of solving for
the coefficients, and, when numerical results are needed, one must still approximate the
infinite series which results when evaluating x at a point.
    To illustrate the procedure, we will begin with an example which we know to be solvable
by the techniques of Section 8.4.

Example Consider the equation

\[ \ddot{x} = -x. \tag{8.7.3} \]

This is a constant coefficient homogeneous linear equation with characteristic equation
$k^2 + 1 = 0$. Since the roots of the characteristic equation are $-i$ and $i$, we know from our
work in Section 8.4 that the general solution of this equation is

\[ x = c_1 \cos(t) + c_2 \sin(t), \]

where $c_1$ and $c_2$ are arbitrary constants.


Copyright © by Dan Sloughter 2000


    We may obtain the same result using power series. If we suppose that x is analytic on
an interval about t = 0, then we may write

\[ x(t) = \sum_{n=0}^{\infty} a_n t^n \]

for some constants $a_0$, $a_1$, $a_2$, .... Differentiating, we have

\[ \dot{x}(t) = \sum_{n=1}^{\infty} n a_n t^{n-1} = \sum_{n=0}^{\infty} (n+1) a_{n+1} t^n \]

and

\[ \ddot{x}(t) = \sum_{n=2}^{\infty} n(n-1) a_n t^{n-2} = \sum_{n=0}^{\infty} (n+2)(n+1) a_{n+2} t^n. \]

Substituting into (8.7.3) gives us

\[ \sum_{n=0}^{\infty} (n+2)(n+1) a_{n+2} t^n = -\sum_{n=0}^{\infty} a_n t^n. \]


Since power series representations are unique, the coefficient of $t^n$ in the power series on
the left must equal the coefficient of $t^n$ in the power series on the right for all values of n.
That is, we must have

\[ (n+2)(n+1) a_{n+2} = -a_n \]

for n = 0, 1, 2, .... Hence the coefficients of the power series representation of x satisfy the
difference equation

\[ a_{n+2} = -\frac{a_n}{(n+2)(n+1)} \tag{8.7.4} \]


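for n = 0, 1, 2, .... Note that (8.7.4) does not restrict either $a_0$ or $a_1$, but determines all of
the other coefficients once these values are specified. Thus, given any values for $a_0$ and $a_1$,

\[
\begin{aligned}
a_2 &= -\frac{a_0}{(2)(1)} = -\frac{a_0}{2}, \\
a_3 &= -\frac{a_1}{(3)(2)} = -\frac{a_1}{3!}, \\
a_4 &= -\frac{a_2}{(4)(3)} = \frac{a_0}{(4)(3)(2)} = \frac{a_0}{4!}, \\
a_5 &= -\frac{a_3}{(5)(4)} = \frac{a_1}{(5)(4)(3)(2)} = \frac{a_1}{5!}, \\
a_6 &= -\frac{a_4}{(6)(5)} = -\frac{a_0}{(6)(5)(4)(3)(2)} = -\frac{a_0}{6!}, \\
a_7 &= -\frac{a_5}{(7)(6)} = -\frac{a_1}{(7)(6)(5)(4)(3)(2)} = -\frac{a_1}{7!},
\end{aligned}
\]

and so on. In fact, we see that for k = 0, 1, 2, ...,

\[ a_{2k} = \frac{(-1)^k a_0}{(2k)!} \]

and

\[ a_{2k+1} = \frac{(-1)^k a_1}{(2k+1)!}. \]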

In most cases, this is as far as we can go; we would now check for the interval of convergence
of the resulting power series and conclude that x is a solution of (8.7.3) on that interval.
However, in this case we see that
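\[
\begin{aligned}
x &= \sum_{n=0}^{\infty} a_n t^n \\
  &= a_0 + a_1 t - \frac{a_0}{2} t^2 - \frac{a_1}{3!} t^3 + \frac{a_0}{4!} t^4 + \frac{a_1}{5!} t^5 - \frac{a_0}{6!} t^6 - \frac{a_1}{7!} t^7 + \cdots \\
  &= a_0 \left( 1 - \frac{t^2}{2} + \frac{t^4}{4!} - \frac{t^6}{6!} + \cdots \right) + a_1 \left( t - \frac{t^3}{3!} + \frac{t^5}{5!} - \frac{t^7}{7!} + \cdots \right) \\
  &= a_0 \cos(t) + a_1 \sin(t),
\end{aligned}
\]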

the general solution that we noted above. Hence there is no need to check for the interval
of convergence since we recognize our power series representation of x as the Taylor series
of a familiar function.
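
    The difference equation (8.7.4) can also be iterated numerically. The following Python
sketch generates the coefficients from chosen values of a_0 and a_1 and compares a partial
sum of the series with a_0 cos(t) + a_1 sin(t). The function names and the sample values of
a_0, a_1, and t are illustrative assumptions, not part of the text.

```python
import math

def series_coefficients(a0, a1, count):
    """Return [a_0, ..., a_{count-1}] from a_{n+2} = -a_n / ((n+2)(n+1))."""
    coeffs = [a0, a1]
    for n in range(count - 2):
        coeffs.append(-coeffs[n] / ((n + 2) * (n + 1)))
    return coeffs[:count]

def partial_sum(coeffs, t):
    """Evaluate the polynomial sum of a_n t^n by Horner's rule."""
    total = 0.0
    for a in reversed(coeffs):
        total = total * t + a
    return total

# Arbitrary illustrative values for the initial conditions and evaluation point.
a0, a1, t = 2.0, -1.0, 1.5
coeffs = series_coefficients(a0, a1, 20)
approx = partial_sum(coeffs, t)
exact = a0 * math.cos(t) + a1 * math.sin(t)
print(approx, exact)
```

With twenty terms the two printed values should agree to many decimal places at this value
of t; more terms are needed as |t| grows.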
    In general, if

\[ x = \sum_{n=0}^{\infty} a_n (t - t_0)^n, \tag{8.7.5} \]

then $x(t_0) = a_0$ and $\dot{x}(t_0) = a_1$. Hence if we are seeking the solution of a differential
equation in this form, then the values of $a_0$ and $a_1$ are determined by any initial conditions
which specify $x(t_0)$ and $\dot{x}(t_0)$. Thus we shall see that all of our examples will be of the
general form of the previous example. Namely, after substituting $x$, $\dot{x}$, and $\ddot{x}$ into the
equation, we will find a difference equation which determines the coefficients, $a_2$, $a_3$, $a_4$,
..., in terms of $a_0$ and $a_1$. However, unlike the first example, our remaining examples
will not result in closed form expressions for our solutions. Nevertheless, we will find
power series representations for the solutions which may be used to approximate a specific
solution to any desired order on some interval of convergence.

Example Consider the equation

\[ \ddot{x} - tx = 0. \tag{8.7.6} \]

Suppose x is analytic on an interval about t = 0 and write

\[ x = \sum_{n=0}^{\infty} a_n t^n \]

for some constants $a_0$, $a_1$, $a_2$, .... Then, as in the previous example,

\[ \dot{x} = \sum_{n=1}^{\infty} n a_n t^{n-1} = \sum_{n=0}^{\infty} (n+1) a_{n+1} t^n \]

and

\[ \ddot{x} = \sum_{n=2}^{\infty} n(n-1) a_n t^{n-2} = \sum_{n=0}^{\infty} (n+2)(n+1) a_{n+2} t^n. \]


Substituting into (8.7.6) gives us

\[ \sum_{n=0}^{\infty} (n+2)(n+1) a_{n+2} t^n - t \sum_{n=0}^{\infty} a_n t^n = 0, \]

from which it follows that

\[ \sum_{n=0}^{\infty} (n+2)(n+1) a_{n+2} t^n = \sum_{n=0}^{\infty} a_n t^{n+1}. \]


Since the powers of t in the series on the left begin with 0 while the powers of t in
the series on the right begin with 1, we will move the constant term of the series on the
left out of the summation and adjust the index of the sum on the right so that it agrees
with the index of the sum on the left. We then have

\[ 2a_2 + \sum_{n=1}^{\infty} (n+2)(n+1) a_{n+2} t^n = \sum_{n=1}^{\infty} a_{n-1} t^n. \]


We can now use the uniqueness of power series representations to equate the coefficients
on the two sides of this equation, giving us

\[ 2a_2 = 0 \]

and, for n = 1, 2, 3, ...,

\[ (n+2)(n+1) a_{n+2} = a_{n-1}. \]


Hence the coefficients of the power series for x are specified by

\[ a_2 = 0 \]

and the difference equation

\[ a_{n+2} = \frac{a_{n-1}}{(n+2)(n+1)} \tag{8.7.7} \]

for n = 1, 2, 3, .... As in the previous example, these equations do not restrict the values
of $a_0$ and $a_1$. However, after specifying $a_0$ and $a_1$ by the initial conditions $x(0) = a_0$ and
$\dot{x}(0) = a_1$, we may compute

\[
\begin{aligned}
a_2 &= 0, \\
a_3 &= \frac{a_0}{(3)(2)} = \frac{a_0}{6}, \\
a_4 &= \frac{a_1}{(4)(3)} = \frac{a_1}{12}, \\
a_5 &= \frac{a_2}{(5)(4)} = 0, \\
a_6 &= \frac{a_3}{(6)(5)} = \frac{a_0}{180}, \\
a_7 &= \frac{a_4}{(7)(6)} = \frac{a_1}{504},
\end{aligned}
\]

and so on for as many terms as are desired. We then have

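\[
\begin{aligned}
x &= a_0 + a_1 t + \frac{a_0}{6} t^3 + \frac{a_1}{12} t^4 + \frac{a_0}{180} t^6 + \frac{a_1}{504} t^7 + \cdots \\
  &= a_0 \left( 1 + \frac{t^3}{6} + \frac{t^6}{180} + \cdots \right) + a_1 \left( t + \frac{t^4}{12} + \frac{t^7}{504} + \cdots \right).
\end{aligned}
\]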


To find the interval of convergence for x, we look at the two series on the right individually.
Applying the ratio test to the first series, and making use of the difference equation (8.7.7)
to find $a_{3n+3}$ in terms of $a_{3n}$, we have, for any value of t,

\[
\rho = \lim_{n \to \infty} \left| \frac{a_{3n+3} t^{3n+3}}{a_{3n} t^{3n}} \right|
     = \lim_{n \to \infty} |t|^3 \left| \frac{\dfrac{a_{3n}}{(3n+3)(3n+2)}}{a_{3n}} \right|
     = \lim_{n \to \infty} \frac{|t|^3}{(3n+3)(3n+2)} = 0.
\]

Hence $\rho < 1$ for all t and the series converges on $(-\infty, \infty)$. Similarly, for the second series
we have, for any value of t,

\[
\rho = \lim_{n \to \infty} \left| \frac{a_{3n+4} t^{3n+4}}{a_{3n+1} t^{3n+1}} \right|
     = \lim_{n \to \infty} |t|^3 \left| \frac{\dfrac{a_{3n+1}}{(3n+4)(3n+3)}}{a_{3n+1}} \right|
     = \lim_{n \to \infty} \frac{|t|^3}{(3n+4)(3n+3)} = 0.
\]

Again, $\rho < 1$ for all t and this series also converges on $(-\infty, \infty)$. Thus we have found a
solution for (8.7.6) which is analytic on $(-\infty, \infty)$.
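
    As a numerical check on this example, the following Python sketch builds the coefficients
from a_2 = 0 and the difference equation (8.7.7), then evaluates how nearly a truncated series
satisfies (8.7.6) by computing the second derivative of the partial sum minus t times the
partial sum. The function names, the choice a_0 = a_1 = 1, and the evaluation point t = 2
are assumptions made only for illustration.

```python
def airy_type_coefficients(a0, a1, count):
    """Return [a_0, ..., a_{count-1}] with a_2 = 0 and
    a_{n+2} = a_{n-1} / ((n+2)(n+1)) for n = 1, 2, 3, ..."""
    coeffs = [a0, a1, 0.0]
    for n in range(1, count - 2):
        coeffs.append(coeffs[n - 1] / ((n + 2) * (n + 1)))
    return coeffs[:count]

def residual(coeffs, t):
    """Second derivative of the partial sum minus t times the partial sum."""
    x = sum(a * t**n for n, a in enumerate(coeffs))
    x_ddot = sum(n * (n - 1) * a * t**(n - 2)
                 for n, a in enumerate(coeffs) if n >= 2)
    return x_ddot - t * x

coeffs = airy_type_coefficients(1.0, 1.0, 40)
print(coeffs[:8])             # a_0 through a_7
print(residual(coeffs, 2.0))  # should be very close to 0
```

The residual is not exactly zero because the series has been truncated and the arithmetic
is floating point, but it shrinks rapidly as more terms are kept.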

    The computation of the interval of convergence of a solution found in the manner of the
last example can be very involved. Although the justification of the following proposition
is itself too involved for us to go into at this point, we will make use of it in our final two
examples.


﻿




Proposition Suppose p(t) and q(t) are analytic on the interval $(t_0 - R, t_0 + R)$. Then for
any two constants $a_0$ and $a_1$, there is a unique function x(t), analytic on $(t_0 - R, t_0 + R)$,
which satisfies the differential equation

\[ \ddot{x} + p(t)\dot{x} + q(t)x = 0 \tag{8.7.8} \]

with initial conditions $x(t_0) = a_0$ and $\dot{x}(t_0) = a_1$.

    In our previous example, we have, in the notation of the proposition, p(t) = 0 and
q(t) = -t, both of which are analytic on $(-\infty, \infty)$. Hence it follows from the proposition,
as we saw by direct computation, that our power series solution converges on $(-\infty, \infty)$.
    Note that this proposition also tells us that analytic solutions to an equation of the
form (8.7.8) will exist provided p and q are both analytic. Equation (8.7.8) is similar to
the equations we studied in Section 8.4, the difference being that (8.7.8) does not require
the coefficients of $\dot{x}$ and $x$ to be constants.

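Example Consider the equation

\[ (1 - t)\ddot{x} + x = 0. \tag{8.7.9} \]

Suppose x is analytic on an interval about t = 0 and write

\[ x = \sum_{n=0}^{\infty} a_n t^n \]

for some constants $a_0$, $a_1$, $a_2$, .... Then, as before,

\[ \dot{x} = \sum_{n=1}^{\infty} n a_n t^{n-1} = \sum_{n=0}^{\infty} (n+1) a_{n+1} t^n \]

and

\[ \ddot{x} = \sum_{n=2}^{\infty} n(n-1) a_n t^{n-2} = \sum_{n=0}^{\infty} (n+2)(n+1) a_{n+2} t^n. \]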


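Substituting into (8.7.9) gives us

\[ (1 - t) \sum_{n=0}^{\infty} (n+2)(n+1) a_{n+2} t^n + \sum_{n=0}^{\infty} a_n t^n = 0. \]

Expanding the first term, we have

\[ \sum_{n=0}^{\infty} (n+2)(n+1) a_{n+2} t^n - \sum_{n=0}^{\infty} (n+2)(n+1) a_{n+2} t^{n+1} + \sum_{n=0}^{\infty} a_n t^n = 0. \]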


﻿




To adjust for the fact that the powers of t begin with 1 in the middle series, but with 0
for the other series, we move the constant terms of the latter series out of the summation
and adjust the index of the middle series to obtain

\[ 2a_2 + \sum_{n=1}^{\infty} (n+2)(n+1) a_{n+2} t^n - \sum_{n=1}^{\infty} (n+1) n a_{n+1} t^n + a_0 + \sum_{n=1}^{\infty} a_n t^n = 0, \]

from which we obtain

\[ a_0 + 2a_2 + \sum_{n=1}^{\infty} \bigl( (n+2)(n+1) a_{n+2} - (n+1) n a_{n+1} + a_n \bigr) t^n = 0. \]

Using the uniqueness of power series representations, we conclude that all the coefficients
on the left-hand side of this equation must be 0. Hence

\[ a_0 + 2a_2 = 0 \]

and, for n = 1, 2, 3, ...,

\[ (n+2)(n+1) a_{n+2} - (n+1) n a_{n+1} + a_n = 0. \]

Thus

\[ a_2 = -\frac{a_0}{2} \tag{8.7.10} \]

and

\[ a_{n+2} = \frac{(n+1) n a_{n+1} - a_n}{(n+2)(n+1)} \tag{8.7.11} \]

for n = 1, 2, 3, .... Since (8.7.11) becomes (8.7.10) when n = 0, we may combine them
into a single difference equation,

\[ a_{n+2} = \frac{(n+1) n a_{n+1} - a_n}{(n+2)(n+1)} \tag{8.7.12} \]


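for n = 0, 1, 2, .... As always, $a_0$ and $a_1$ are determined by the initial conditions and $a_2$,
$a_3$, $a_4$, ... may be computed from (8.7.12). For example,

\[
\begin{aligned}
a_2 &= -\frac{a_0}{2}, \\
a_3 &= \frac{2a_2 - a_1}{(3)(2)} = -\frac{a_0 + a_1}{6}, \\
a_4 &= \frac{(3)(2)a_3 - a_2}{(4)(3)} = \frac{-(a_0 + a_1) + \dfrac{a_0}{2}}{12} = -\frac{a_0 + 2a_1}{24},
\end{aligned}
\]

and

\[
a_5 = \frac{(4)(3)a_4 - a_3}{(5)(4)} = \frac{-\dfrac{1}{2}(a_0 + 2a_1) + \dfrac{1}{6}(a_0 + a_1)}{20} = -\frac{2a_0 + 5a_1}{120}.
\]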


﻿


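Hence

\[
\begin{aligned}
x &= a_0 + a_1 t - \frac{a_0}{2} t^2 - \frac{(a_0 + a_1)}{6} t^3 - \frac{(a_0 + 2a_1)}{24} t^4 - \frac{(2a_0 + 5a_1)}{120} t^5 - \cdots \\
  &= a_0 \left( 1 - \frac{t^2}{2} - \frac{t^3}{6} - \frac{t^4}{24} - \frac{t^5}{60} - \cdots \right) + a_1 \left( t - \frac{t^3}{6} - \frac{t^4}{12} - \frac{t^5}{24} - \cdots \right).
\end{aligned}
\]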

Finally, if we rewrite (8.7.9) as

\[ \ddot{x} + \frac{1}{1 - t} x = 0, \]

then, in the notation of the previous proposition,

\[ p(t) = 0 \]

and

\[ q(t) = \frac{1}{1 - t}. \]

Now p is analytic on $(-\infty, \infty)$, but, considering intervals about 0, q is analytic on only
(-1, 1). Thus the proposition guarantees only that our solution will be analytic on (-1, 1).
That is, we know that the two power series in the expression for x converge at least on
(-1, 1).
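
    The hand computation of a_2 through a_5 can be checked with exact rational arithmetic.
The following Python sketch iterates the difference equation (8.7.12) using the standard
library's Fraction type; the function name and the two choices of initial values (which
isolate the a_0 series and the a_1 series) are illustrative assumptions.

```python
from fractions import Fraction

def coefficients(a0, a1, count):
    """Return [a_0, ..., a_{count-1}] from
    a_{n+2} = ((n+1) n a_{n+1} - a_n) / ((n+2)(n+1)), n = 0, 1, 2, ..."""
    coeffs = [Fraction(a0), Fraction(a1)]
    for n in range(count - 2):
        numerator = (n + 1) * n * coeffs[n + 1] - coeffs[n]
        coeffs.append(numerator / ((n + 2) * (n + 1)))
    return coeffs[:count]

# a_0 = 1, a_1 = 0 gives the coefficients of the series multiplying a_0.
first = coefficients(1, 0, 6)
# a_0 = 0, a_1 = 1 gives the coefficients of the series multiplying a_1.
second = coefficients(0, 1, 6)
print([str(c) for c in first])
print([str(c) for c in second])
```

The two lists should reproduce the coefficients of the two series in the expression for x
above, through the t^5 terms.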

Example For an example involving an unspecified parameter, consider the equation

\[ \ddot{x} - 2t\dot{x} + 2rx = 0, \tag{8.7.13} \]

where r is a constant. Known as Hermite's equation, the solutions to this equation are
important in certain areas of mathematics and quantum mechanics. As usual, we suppose
x is analytic on an interval about t = 0, write

\[ x = \sum_{n=0}^{\infty} a_n t^n \]

for some constants $a_0$, $a_1$, $a_2$, ..., and compute

\[ \dot{x} = \sum_{n=1}^{\infty} n a_n t^{n-1} = \sum_{n=0}^{\infty} (n+1) a_{n+1} t^n \]

and

\[ \ddot{x} = \sum_{n=2}^{\infty} n(n-1) a_n t^{n-2} = \sum_{n=0}^{\infty} (n+2)(n+1) a_{n+2} t^n. \]


Substituting into (8.7.13), we have

\[ \sum_{n=0}^{\infty} (n+2)(n+1) a_{n+2} t^n - 2t \sum_{n=0}^{\infty} (n+1) a_{n+1} t^n + 2r \sum_{n=0}^{\infty} a_n t^n = 0. \]


﻿




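Thus

\[ \sum_{n=0}^{\infty} (n+2)(n+1) a_{n+2} t^n - \sum_{n=0}^{\infty} 2(n+1) a_{n+1} t^{n+1} + \sum_{n=0}^{\infty} 2r a_n t^n = 0. \]

Adjusting all these series to start with t raised to the first power gives us

\[ 2a_2 + \sum_{n=1}^{\infty} (n+2)(n+1) a_{n+2} t^n - \sum_{n=1}^{\infty} 2n a_n t^n + 2r a_0 + \sum_{n=1}^{\infty} 2r a_n t^n = 0. \]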


Hence

\[ 2r a_0 + 2a_2 + \sum_{n=1}^{\infty} \bigl( (n+2)(n+1) a_{n+2} + 2(r - n) a_n \bigr) t^n = 0. \]

Therefore, by the uniqueness of power series representations, we must have

\[ 2r a_0 + 2a_2 = 0 \]

and, for n = 1, 2, 3, ...,

\[ (n+2)(n+1) a_{n+2} + 2(r - n) a_n = 0. \]


Thus

\[ a_2 = -r a_0 \tag{8.7.14} \]

and

\[ a_{n+2} = -\frac{2(r - n) a_n}{(n+2)(n+1)} \tag{8.7.15} \]

for n = 1, 2, 3, .... Since (8.7.15) becomes (8.7.14) when n = 0, we see that, after $a_0$ and
$a_1$, the coefficients of the solution are determined by the difference equation

\[ a_{n+2} = -\frac{2(r - n) a_n}{(n+2)(n+1)}, \tag{8.7.16} \]

for n = 0, 1, 2, .... For example, we have

\[
\begin{aligned}
a_2 &= -r a_0, \\
a_3 &= -\frac{2(r-1) a_1}{(3)(2)} = -\frac{2(r-1) a_1}{3!}, \\
a_4 &= -\frac{2(r-2) a_2}{(4)(3)} = \frac{2^2 r(r-2) a_0}{4!}, \\
a_5 &= -\frac{2(r-3) a_3}{(5)(4)} = \frac{2^2 (r-1)(r-3) a_1}{5!}, \\
a_6 &= -\frac{2(r-4) a_4}{(6)(5)} = -\frac{2^3 r(r-2)(r-4) a_0}{6!},
\end{aligned}
\]

and

\[ a_7 = -\frac{2(r-5) a_5}{(7)(6)} = -\frac{2^3 (r-1)(r-3)(r-5) a_1}{7!}. \]


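Thus

\[
\begin{aligned}
x &= a_0 + a_1 t - r a_0 t^2 - \frac{2(r-1) a_1}{3!} t^3 + \frac{2^2 r(r-2) a_0}{4!} t^4 + \frac{2^2 (r-1)(r-3) a_1}{5!} t^5 \\
  &\qquad - \frac{2^3 r(r-2)(r-4) a_0}{6!} t^6 - \frac{2^3 (r-1)(r-3)(r-5) a_1}{7!} t^7 + \cdots \\
  &= a_0 \left( 1 - r t^2 + \frac{2^2 r(r-2)}{4!} t^4 - \frac{2^3 r(r-2)(r-4)}{6!} t^6 + \cdots \right) \\
  &\qquad + a_1 \left( t - \frac{2(r-1)}{3!} t^3 + \frac{2^2 (r-1)(r-3)}{5!} t^5 - \frac{2^3 (r-1)(r-3)(r-5)}{7!} t^7 + \cdots \right).
\end{aligned}
\]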


In the notation of the previous proposition, we have p(t) = -2t and q(t) = 2r, both of which
are analytic on $(-\infty, \infty)$. Hence it follows that the two series in our solution converge for
all values of t.
    Moreover, note that if we let

\[ x_1(t) = 1 - r t^2 + \frac{2^2 r(r-2)}{4!} t^4 - \frac{2^3 r(r-2)(r-4)}{6!} t^6 + \cdots \]

and

\[ x_2(t) = t - \frac{2(r-1)}{3!} t^3 + \frac{2^2 (r-1)(r-3)}{5!} t^5 - \frac{2^3 (r-1)(r-3)(r-5)}{7!} t^7 + \cdots, \]

so that

\[ x(t) = a_0 x_1(t) + a_1 x_2(t), \]

then $x_1$ is a polynomial when r is a nonnegative even integer and $x_2$ is a polynomial when
r is a positive odd integer. That is, when r is a nonnegative integer, Hermite's equation
will have a polynomial solution. When suitably normalized, as described in Problem 6
below, these polynomials are called Hermite polynomials.
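
    The termination of the series for integer r can be seen directly by iterating the
difference equation (8.7.16). The following Python sketch does so with exact rational
arithmetic; the function name and the sample values r = 2 and r = 3 are assumptions chosen
for illustration.

```python
from fractions import Fraction

def hermite_series_coefficients(r, a0, a1, count):
    """Return [a_0, ..., a_{count-1}] from
    a_{n+2} = -2 (r - n) a_n / ((n+2)(n+1)), n = 0, 1, 2, ..."""
    coeffs = [Fraction(a0), Fraction(a1)]
    for n in range(count - 2):
        coeffs.append(-2 * (r - n) * coeffs[n] / ((n + 2) * (n + 1)))
    return coeffs[:count]

# r = 2 with a_0 = 1, a_1 = 0: the even series stops after the t^2 term.
x1_r2 = hermite_series_coefficients(2, 1, 0, 10)
# r = 3 with a_0 = 0, a_1 = 1: the odd series stops after the t^3 term.
x2_r3 = hermite_series_coefficients(3, 0, 1, 10)
print([str(c) for c in x1_r2])
print([str(c) for c in x2_r3])
```

For r = 2 the factor (r - n) vanishes at n = 2, so every later even coefficient is zero, and
similarly for r = 3 at n = 3. Multiplying the resulting polynomials by a suitable constant
gives the normalization used for Hermite polynomials in the problems.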

    Our final example shows the strength of the power series method of solving differential
equations. Through one computation we have found analytic solutions to an entire family
of equations parametrized by the real number r. As an added consequence, we have
discovered that the equation has polynomial solutions for certain values of the parameter
r. If we were only interested in numerical values of a solution of Hermite's equation for
one value of r and one set of initial conditions, then using a numerical method, such as
the Runge-Kutta method of Section 8.6, would be the proper approach; however, we can
see that the power series approach leads to a much richer understanding of the solutions
to the general form of the equation.


﻿




Problems

1. Solve the following first order differential equations using power series with the initial
    condition $x(0) = a_0$. Verify your answer by finding a closed form solution for the
    equation using the techniques of Sections 8.2 and 8.3.
    (a) $\dot{x} = 3x$                         (b) $\dot{x} = 2tx$
    (c) $\dot{x} = x - 1$                      (d) $\dot{x} = -x$
 2. Solve the following second order differential equations using power series with the
    initial conditions $x(0) = a_0$ and $\dot{x}(0) = a_1$. Write the solution out through the first six
    nonzero terms and give an interval of convergence for each solution.
    (a) $\ddot{x} + tx = 0$                    (b) $\ddot{x} + \dot{x} - tx = 0$
    (c) $\ddot{x} + t\dot{x} + x = 0$          (d)    -1     2z=
    (e) $(1 - t^2)\ddot{x} - 2t\dot{x} - x = 0$    (f) $(1 + t)\ddot{x} - x = 0$
 3. (a) Use power series to show that the solution of

\[ \ddot{x} = x \]

        satisfying $x(0) = a_0$ and $\dot{x}(0) = a_1$ is given by $x = a_0 \cosh(t) + a_1 \sinh(t)$.
    (b) Solve the equation in (a) using the techniques of Section 8.4 and show that your
        answer agrees with the answer in (a).
 4. Use the ratio test to verify that the solutions $x_1$ and $x_2$ of Hermite's equation found
    in the last example of this section converge for all t in $(-\infty, \infty)$.
 5. Find polynomial solutions of Hermite's equation for r = 0, r = 1, r = 2, r = 3, r = 4,
    and r = 5.
 6. A polynomial solution of Hermite's equation with highest degree term of the form $2^n t^n$
    is called a Hermite polynomial and is denoted $H_n(t)$.
    (a) Show that $H_0(t) = 1$, $H_1(t) = 2t$, $H_2(t) = 4t^2 - 2$, and $H_3(t) = 8t^3 - 12t$.
    (b) Find $H_4(t)$ and $H_5(t)$.
 7. The equation

\[ (1 - t^2)\ddot{x} - 2t\dot{x} + r(r + 1)x = 0, \]

    where r is a constant, is known as Legendre's equation.
    (a) Show that the general solution to Legendre's equation may be written as

\[ x(t) = a_0 x_1(t) + a_1 x_2(t), \]

        where

\[ x_1(t) = 1 - \frac{r(r+1)}{2!} t^2 + \frac{r(r-2)(r+1)(r+3)}{4!} t^4 - \frac{r(r-2)(r-4)(r+1)(r+3)(r+5)}{6!} t^6 + \cdots, \]

\[ x_2(t) = t - \frac{(r-1)(r+2)}{3!} t^3 + \frac{(r-1)(r-3)(r+2)(r+4)}{5!} t^5 - \frac{(r-1)(r-3)(r-5)(r+2)(r+4)(r+6)}{7!} t^7 + \cdots, \]

        and $a_0$ and $a_1$ are constants.
    (b) Explain why the radius of convergence of each of these series is at least 1.
    (c) Note that if r is a nonnegative even integer, then $x_1$ is a polynomial, and if r is a
        positive odd integer, then $x_2$ is a polynomial. If r is an even nonnegative integer,
        let

\[ P_r(t) = \frac{x_1(t)}{x_1(1)}, \]

        and if r is a positive odd integer let

\[ P_r(t) = \frac{x_2(t)}{x_2(1)}. \]

        Then $P_r(t)$, r = 0, 1, 2, ..., is a polynomial solution of Legendre's equation, known
        as a Legendre polynomial, normalized so that $P_r(1) = 1$. Find $P_0(t)$, $P_1(t)$, $P_2(t)$,
        $P_3(t)$, $P_4(t)$, and $P_5(t)$ and plot their graphs on the interval [-1, 1].
8. Discuss all the interconnections we have seen between difference equations and differ-
   ential equations.