
BASIC CALCULUS REFRESHER

Ismor Fischer, Ph.D. Dept. of Statistics UW-Madison

1. Introduction.

This is a very condensed and simplified version of basic calculus, which is a prerequisite for many courses in Mathematics, Statistics, Engineering, Pharmacy, etc. It is not comprehensive, and absolutely not intended to be a substitute for a one-year freshman course in differential and integral calculus. You are strongly encouraged to do the included Exercises to reinforce the ideas. Important mathematical terms are in boldface; key formulas and concepts are boxed and highlighted. To view a color .pdf version of this document (recommended), see http://www.stat.wisc.edu/~ifischer.

2. Exponents – Basic Definitions and Properties

For any real number base $x$, we define powers of $x$: $x^0 = 1$, $x^1 = x$, $x^2 = x \cdot x$, $x^3 = x \cdot x \cdot x$, etc. (The exception is $0^0$, which is considered indeterminate.) Powers are also called exponents.

Examples: $5^0 = 1$, $(-11.2)^1 = -11.2$, $(8.6)^2 = 8.6 \times 8.6 = 73.96$, $10^3 = 10 \times 10 \times 10 = 1000$, $(-3)^4 = (-3) \times (-3) \times (-3) \times (-3) = 81$.

Also, we can define fractional exponents in terms of roots, such as $x^{1/2} = \sqrt{x}$, the square root of $x$. Similarly, $x^{1/3} = \sqrt[3]{x}$, the cube root of $x$, $x^{2/3} = (\sqrt[3]{x})^2$, etc. In general, we have $x^{m/n} = (\sqrt[n]{x})^m$, i.e., the $n$th root of $x$, raised to the $m$th power.

Examples: $64^{1/2} = \sqrt{64} = 8$, $64^{3/2} = (\sqrt{64})^3 = 8^3 = 512$, $64^{1/3} = \sqrt[3]{64} = 4$, $64^{2/3} = (\sqrt[3]{64})^2 = 4^2 = 16$.

Finally, we can define negative exponents: $x^{-r} = \frac{1}{x^r}$. Thus, $x^{-1} = \frac{1}{x^1}$, $x^{-2} = \frac{1}{x^2}$, $x^{-1/2} = \frac{1}{x^{1/2}} = \frac{1}{\sqrt{x}}$, etc.

Examples: $10^{-1} = \frac{1}{10^1} = 0.1$, $7^{-2} = \frac{1}{7^2} = \frac{1}{49}$, $36^{-1/2} = \frac{1}{\sqrt{36}} = \frac{1}{6}$, $9^{-5/2} = \frac{1}{(\sqrt{9})^5} = \frac{1}{3^5} = \frac{1}{243}$.

Properties of Exponents

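1. $x^a \cdot x^b = x^{a+b}$ Examples: $x^3 \cdot x^2 = x^5$, $x^{1/2} \cdot x^{1/3} = x^{5/6}$, $x^3 \cdot x^{-1/2} = x^{5/2}$

2. $\frac{x^a}{x^b} = x^{a-b}$ Examples: $\frac{x^5}{x^3} = x^2$, $\frac{x^3}{x^5} = x^{-2}$, $\frac{x^3}{x^{1/2}} = x^{5/2}$

3. $(x^a)^b = x^{ab}$ Examples: $(x^3)^2 = x^6$, $(x^{-1/2})^7 = x^{-7/2}$, $(x^{2/3})^{5/7} = x^{10/21}$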
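The three properties can be spot-checked numerically. The following is a minimal sketch in Python, not part of the original notes; the helper name `check_exponent_rules` is hypothetical, and the comparison uses a floating-point tolerance because fractional powers are not computed exactly.

```python
import math

def check_exponent_rules(x: float, a: float, b: float) -> bool:
    """Return True if all three exponent properties hold numerically for a base x > 0."""
    product_ok = math.isclose(x**a * x**b, x**(a + b))    # Property 1
    quotient_ok = math.isclose(x**a / x**b, x**(a - b))   # Property 2
    power_ok = math.isclose((x**a)**b, x**(a * b))        # Property 3
    return product_ok and quotient_ok and power_ok

# Exponents taken from the examples in the text (a = 1/2, b = 1/3).
print(check_exponent_rules(64.0, 1/2, 1/3))
# Fractional and negative exponents from the earlier examples.
print(64 ** (2/3), 9 ** (-5/2), 1/243)
```

The base is restricted to positive values here because, in Python, a negative float raised to a fractional power yields a complex number.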

Descartes ~ 1640

[Figure: graph of the horizontal line y = 7, with the points (–1.8, 7), (0, 7), and (2.5, 7) marked.]

3. Functions and Their Graphs

Input x → f → Output y

If a quantity y always depends on another quantity x in such a way that every value of x corresponds to one and only one value of y, then we say that “y is a function of x,” written y = f(x); x is said to be the independent variable, y is the dependent variable. (Example: “Distance traveled per hour (y) is a function of velocity (x).”) For a given function y = f(x), the set of all ordered pairs of (x, y)-values that algebraically satisfy its equation is called the graph of the function, and can be represented geometrically by a collection of points in the XY-plane. (Recall that the XY-plane consists of two perpendicular copies of the real number line – a horizontal X-axis, and a vertical Y-axis – that intersect at a reference point (0, 0) called the origin, and which partition the plane into four disjoint regions called quadrants. Every point P in the plane can be represented by the ordered pair (x, y), where the first value is the x-coordinate – indicating its horizontal position relative to the origin – and the second value is the y-coordinate – indicating its vertical position relative to the origin. Thus, the point P(4, 7) is 4 units to the right of, and 7 units up from, the origin.)

Examples: $y = f(x) = 7$; $y = f(x) = 2x + 3$; $y = f(x) = x^2$; $y = f(x) = x^{1/2}$; $y = f(x) = x^{-1}$; $y = f(x) = 2^x$.

The first three are examples of polynomial functions. (In particular, the first is constant, the second is linear, the third is quadratic.) The last is an exponential function; note that x is an exponent!

Let’s consider these examples, one at a time.


• y = f (x) = 7: If x = any value, then y = 7. That is, no matter what value of x is chosen, the value of the height y remains at a constant level of 7. Therefore, all points that satisfy this equation must have the form (x, 7), and thus determine the graph of a horizontal line, 7 units up. A few typical points are plotted in the figure.

Exercise: What would the graph of the equation y = −4 look like? x = −4 ? y = 0 ? x = 0 ?

• y = f (x) = 2x + 3: If x = 0, then y = f (0) = 2(0) + 3 = 3, so the point (0, 3) is on the graph of this function. Likewise, if x = −1.5, then y = f (−1.5) = 2(−1.5) + 3 = 0, so the point (−1.5, 0) is also on the graph of this function. (However, many points, such as (1, 1), do not satisfy the equation, and so do not lie on the graph.) The set of all points (x, y) that do satisfy this linear equation forms the graph of a line in the XY-plane, hence the name.

Exercise: What would the graph of the line y = x look like? y = −x ? The absolute value function y = |x| ?

[Figure: graph of the line y = 2x + 3, with the point (0, 3) marked.]

Notice that the line has the generic equation y = f (x) = mx + b, where b is the Y-intercept (in this example, b = +3), and m is the slope of the line (in this example, m = +2). In general, the slope of any line is defined as the ratio of “height change” Δy to “length change” Δx, that is,

$$m = \frac{\Delta y}{\Delta x} = \frac{y_2 - y_1}{x_2 - x_1}$$

for any two points $(x_1, y_1)$ and $(x_2, y_2)$ that lie on the line. For example, for the two points (0, 3) and (−1.5, 0) on our line, the slope is $m = \frac{\Delta y}{\Delta x} = \frac{0 - 3}{-1.5 - 0} = 2$, which confirms our observation.
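As a small illustration of the slope formula, here is a sketch in Python (the function names `slope` and `line` are illustrative, not from the original notes). It recomputes the slope from the two points used in the text, and then from two other points on the same line.

```python
def slope(p1, p2):
    """Slope of the line through points p1 = (x1, y1) and p2 = (x2, y2)."""
    x1, y1 = p1
    x2, y2 = p2
    if x2 == x1:
        raise ValueError("vertical line: slope is undefined")
    return (y2 - y1) / (x2 - x1)

def line(x):
    """The example line y = 2x + 3 from the text."""
    return 2 * x + 3

print(slope((0, 3), (-1.5, 0)))              # 2.0
print(slope((1, line(1)), (4, line(4))))     # 2.0
```

Any pair of distinct points on the line gives the same ratio, which is what makes the slope a property of the line rather than of the chosen points.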

• $y = f(x) = x^2$: This is not the equation of a straight line (because of the “squaring” operation). The set of all points that satisfy this quadratic equation – e.g., (–3, 9), (–2, 4), (–1, 1), (0, 0), (1, 1), (2, 4), (3, 9), etc. – forms a curved parabola in the XY-plane. (In this case, the curve is said to be concave up, i.e., it “holds water.” Similarly, the graph of $-x^2$ is concave down; it “spills water.”)

Exercise: How does this graph differ from $y = f(x) = x^3$ ? $x^4$ ? Find a pattern for $y = x^n$, for n = 1, 2, 3, 4,…

• $y = f(x) = x^{1/2} = \sqrt{x}$: The “square root” operation is not defined for negative values of x (e.g., $\sqrt{-64}$ does not exist as a real number, since both $(+8)^2 = +64$ and $(-8)^2 = +64$). Hence the real-valued domain of this function is restricted to x ≥ 0 (i.e., positive values and zero), where the “square root” operation is defined (e.g., $\sqrt{+64} = +8$). Pictured here is its graph, along with the first-quadrant portion of $y = x^2$ for comparison.

Exercise: How does this graph differ from $y = f(x) = x^{1/3} = \sqrt[3]{x}$ ? Hint: What is its domain? (E.g., $\sqrt[3]{+64}$ = ??, $\sqrt[3]{-64}$ = ???) Graph this function together with $y = x^3$.

The “square” operation $x^2$ and “square root” operation $x^{1/2} = \sqrt{x}$ are examples of inverse functions of one another, for x ≥ 0. That is, the effect of applying either one, followed immediately by the other, lands you back where you started. More precisely, starting with a domain value x, the composition of the function f(x) with its inverse – written $f^{-1}(x)$ – is equal to the initial value x itself. As in the figure above, when graphed together, the two functions exhibit symmetry across the “diagonal” line y = x (not explicitly drawn).

• $y = f(x) = x^{-1} = \frac{1}{x}$: This is a bit more delicate. Let’s first restrict our attention to positive domain x-values, i.e., x > 0. If x = 1, then $y = f(1) = \frac{1}{1} = 1$, so the point (1, 1) lies on the graph of this function. Now from here, as x grows larger (e.g., x = 10, 100, 1000, etc.), the values of the height $y = \frac{1}{x}$ (e.g., $\frac{1}{10} = 0.1$, $\frac{1}{100} = 0.01$, $\frac{1}{1000} = 0.001$, etc.) become smaller, although they never actually reach 0. Therefore, as we continue to move to the right, the graph approaches the X-axis as a horizontal asymptote, without ever actually touching it. Moreover, as x gets smaller from the point (1, 1) on the graph (e.g., x = 0.1, 0.01, 0.001, etc.), the values of the height $y = \frac{1}{x}$ (e.g., $\frac{1}{0.1} = 10$, $\frac{1}{0.01} = 100$, $\frac{1}{0.001} = 1000$, etc.) become larger. Therefore, as we continue to move to the left, the graph shoots upwards, approaching the Y-axis as a vertical asymptote, without ever actually touching it. (If x = 0, then y becomes infinite (+∞), which is undefined as a real number, so x = 0 is not in the domain of this function.) A similar situation exists for negative domain x-values, i.e., x < 0. This is the graph of a hyperbola, which has two symmetric branches, one in the first quadrant and the other in the third.

Exercise: How does this graph differ from that of $y = f(x) = x^{-2} = \frac{1}{x^2}$ ? Why?


NOTE: The preceding examples are special cases of power functions, which have the general form $y = x^p$, for any real value of p, for x > 0. If p > 0, then the graph starts at the origin and continues to rise to infinity. (In particular, if p > 1, then the graph is concave up, such as the parabola $y = x^2$. If p = 1, the graph is the straight line y = x. And if 0 < p < 1, then the graph is concave down, such as the parabola $y = x^{1/2} = \sqrt{x}$.) However, if p < 0, such as $y = x^{-1} = \frac{1}{x}$, or $y = x^{-2} = \frac{1}{x^2}$, then the Y-axis acts as a vertical asymptote for the graph, and the X-axis is a horizontal asymptote.

Exercise: Why is $y = x^x$ not a power function? Sketch its graph for x > 0.

$$f(x) = \begin{cases} x^2, & \text{if } x \le 1 \\ x^3, & \text{if } x > 1 \end{cases}$$

This graph is the parabola $y = x^2$ up to and including the point (1, 1), then picks up with the curve $y = x^3$ after that. Note that this function is therefore continuous at x = 1, and hence for all real values of x.

$$g(x) = \begin{cases} x^2, & \text{if } x \le 1 \\ x^3 + 5, & \text{if } x > 1 \end{cases}$$

This graph is the parabola $y = x^2$ up to and including the point (1, 1), but then abruptly changes over to the curve $y = x^3 + 5$ after that, starting at (1, 6). Therefore, this graph has a break, or “jump discontinuity,” at x = 1. (Think of switching a light from off = 0 to on = 1.) However, since it is continuous before and after that value, g is described as being piecewise continuous.
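The difference between the two piecewise functions can be seen numerically by comparing values just to the left and just to the right of x = 1. This is a rough sketch in Python; the helper `jump_at` and its step size `h` are illustrative assumptions, and a small numerical gap only suggests continuity rather than proving it.

```python
def f(x):
    """x^2 for x <= 1, then x^3: the two pieces meet at (1, 1)."""
    return x**2 if x <= 1 else x**3

def g(x):
    """x^2 for x <= 1, then x^3 + 5: the right-hand piece starts at (1, 6)."""
    return x**2 if x <= 1 else x**3 + 5

def jump_at(func, c, h=1e-9):
    """Estimate (value just right of c) minus (value just left of c)."""
    return func(c + h) - func(c - h)

print(jump_at(f, 1))   # very close to 0: no break at x = 1
print(jump_at(g, 1))   # very close to 5: a jump discontinuity of size 5
```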


We saw above that as the values of x grow ever larger, the values of $\frac{1}{x}$ become ever smaller. We can’t actually reach 0 exactly, but we can “sneak up” on it as closely as we like, simply by making x large enough. (For ins[…]500.) In this context, we say that 0 is a limiting value of $\frac{1}{x}$. A mathematically concise way to express this is as a limit: $\lim_{x \to \infty} \frac{1}{x} = 0$.

Many other limits are possible, but we now wish to consider a special kind. To motivate this, consider again the parabola example $y = f(x) = x^2$. The average rate of change between the two points P(3, 9) and Q(4, 16) on the graph can be calculated as the slope of the secant line connecting them: $m_{sec} = \frac{\Delta y}{\Delta x} = \frac{16 - 9}{4 - 3} = 7$. Now suppose that we slide to a new point Q(3.5, 12.25) on the graph, closer to P(3, 9). The average rate of change is now $m_{sec} = \frac{\Delta y}{\Delta x} = \frac{12.25 - 9}{3.5 - 3} = 6.5$, the slope of the new secant line between P and Q. If we now slide to a new point Q(3.1, 9.61) still closer to P(3, 9), then the new slope is $m_{sec} = \frac{\Delta y}{\Delta x} = \frac{9.61 - 9}{3.1 - 3} = 6.1$, and so on.

As Q approaches P, the slopes $m_{sec}$ of the secant lines appear to get ever closer to 6 – the slope $m_{tan}$ of the tangent line to the curve $y = x^2$ at the point P(3, 9) – thus measuring the instantaneous rate of change of this function at this point P(3, 9). (The same thing also happens if we approach P from the left side.) We can actually verify this by an explicit computation: for any nearby point $Q(3 + \Delta x, (3 + \Delta x)^2)$ on the graph of $y = x^2$, we have

$$m_{sec} = \frac{\Delta y}{\Delta x} = \frac{(3 + \Delta x)^2 - 9}{\Delta x} = \frac{9 + 6\,\Delta x + (\Delta x)^2 - 9}{\Delta x} = \frac{6\,\Delta x + (\Delta x)^2}{\Delta x} = \frac{\Delta x \cdot (6 + \Delta x)}{\Delta x} = 6 + \Delta x.$$

(We can check this formula against the $m_{sec}$ values that we already computed: if Δx = 1, then $m_{sec} = 7$; if Δx = 0.5, then $m_{sec} = 6.5$; if Δx = 0.1, then $m_{sec} = 6.1$.) As Q approaches P – i.e., as Δx approaches 0 – this quantity approaches $m_{tan} = 6$ as its limiting value.
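The shrinking secant slopes can also be tabulated directly. The sketch below, in Python, is an illustration rather than part of the original notes; `secant_slope` and `square` are hypothetical names, and the printed values carry small floating-point rounding errors.

```python
def secant_slope(f, x, dx):
    """Average rate of change of f between x and x + dx."""
    return (f(x + dx) - f(x)) / dx

def square(x):
    return x**2

# Slide Q toward P(3, 9) from the right, then from the left.
for dx in (1, 0.5, 0.1, 0.01, 0.001, -0.001, -0.01):
    print(dx, secant_slope(square, 3, dx))
```

Each printed slope agrees with 6 + Δx up to rounding, so the values close in on 6 from either side.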

Suppose we now wish to find the instantaneous rate of change at some other point P on the graph, say at P(4, 16) or P(−5, 25) or even P(0, 0). We can use the same calculation as we did above: the average rate of change of $y = x^2$ between any two generic points $P(x, x^2)$ and $Q(x + \Delta x, (x + \Delta x)^2)$ on its graph, is given by

$$m_{sec} = \frac{\Delta y}{\Delta x} = \frac{(x + \Delta x)^2 - x^2}{\Delta x} = \frac{x^2 + 2x\,\Delta x + (\Delta x)^2 - x^2}{\Delta x} = \frac{2x\,\Delta x + (\Delta x)^2}{\Delta x} = \frac{\Delta x \cdot (2x + \Delta x)}{\Delta x} = 2x + \Delta x.$$

As Q approaches P – i.e., as Δx approaches 0 – this quantity approaches $m_{tan} = 2x$ “in the limit,” thereby defining the instantaneous rate of change of the function at the point P. (Note that if x = 3, these calculations agree with those done previously.)

[Figure: tangent lines to $y = x^2$ at P(0, 0), $m_{tan} = 0$; P(3, 9), $m_{tan} = 6$; P(4, 16), $m_{tan} = 8$; P(−2, 4), $m_{tan} = -4$; P(−5, 25), $m_{tan} = -10$.]

In principle, there is nothing that prevents us from applying these same ideas to other functions y = f(x). To find the instantaneous rate of change at an arbitrary point P on its graph, we first calculate the average rate of change between P(x, f(x)) and a nearby point Q(x + Δx, f(x + Δx)) on its graph, as measured by $m_{sec} = \frac{\Delta y}{\Delta x} = \frac{f(x + \Delta x) - f(x)}{\Delta x}$. As Q approaches P – i.e., as Δx approaches 0 from both sides – this quantity becomes the instantaneous rate of change at P, defined by:

$$m_{tan} = \lim_{\Delta x \to 0} \frac{\Delta y}{\Delta x} = \lim_{\Delta x \to 0} \frac{f(x + \Delta x) - f(x)}{\Delta x}$$

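This limit – denoted $f'(x)$ or $\frac{dy}{dx}$ – is called the derivative of the function y = f(x). For example, $\frac{d}{dx}(x^2) = 2x$.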

Using methods very similar to those above, it is possible to prove that such a general rule exists for any power function $x^p$, not just p = 2. Namely, if $y = f(x) = x^p$, then $\frac{dy}{dx} = f'(x) = p\,x^{p-1}$, i.e.,

Power Rule: $\frac{d}{dx}(x^p) = p\,x^{p-1}$.

Examples: If $y = x^3$, then $\frac{dy}{dx} = 3x^2$. If $y = x^{1/2}$, then $\frac{dy}{dx} = \frac{1}{2} x^{-1/2}$. If $y = x^{-1}$, then $\frac{dy}{dx} = -x^{-2}$. Also note that if $y = x = x^1$, then $\frac{dy}{dx} = 1\,x^0 = 1$, as it should! (The line y = x has slope m = 1 everywhere.)
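The Power Rule examples can be compared against a numerical estimate of the derivative. The sketch below is in Python and is only an illustration: `numerical_derivative` uses a symmetric (central) difference quotient, which is a swapped-in approximation and not the one-sided quotient used in the text, and the evaluation point x = 2 is an arbitrary choice.

```python
def numerical_derivative(f, x, h=1e-6):
    """Central difference estimate: (f(x + h) - f(x - h)) / (2h)."""
    return (f(x + h) - f(x - h)) / (2 * h)

def power_rule(p, x):
    """Derivative of x^p according to the Power Rule: p * x^(p - 1)."""
    return p * x ** (p - 1)

x = 2.0  # arbitrary positive test point
for p in (3, 0.5, -1, 1):
    estimate = numerical_derivative(lambda t: t**p, x)
    print(p, estimate, power_rule(p, x))
```

For each exponent, the two printed values should agree to several decimal places.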


More examples: If $y = e^{x/2}$, then $\frac{dy}{dx} = \frac{1}{2} e^{x/2}$. If $y = e^{-x}$, then $\frac{dy}{dx} = (-1)\,e^{-x} = -e^{-x}$. Finally, exploiting the fact that exponentials and logarithms are inverse functions, we have for x > 0,

Logarithm Rule: $\frac{d}{dx}[\log_b(x)] = \frac{1}{x} \cdot \frac{1}{\ln(b)}$ ⇒ Special case (b = e): $\frac{d}{dx}[\ln(x)] = \frac{1}{x}$

If y = f(x) = C (any constant) for all x, then the derivative $\frac{dy}{dx} = f'(x) = 0$ for all x. (However, observe that a vertical line, having equation x = C, has an infinite – or undefined – slope.) Not every function has a derivative everywhere! For example, the functions $y = f(x) = |x|$, $x^{1/3}$, and $x^{-1}$ are not differentiable at x = 0, all for different reasons. Although the first two are continuous through the origin (0, 0), the first has a V-shaped graph; a uniquely defined tangent line does not exist at the “corner.” The second graph has a vertical tangent line there, hence the slope is infinite. And as we’ve seen, the last function is undefined at the origin; x = 0 is not even in its domain, so any talk of a tangent line there is completely meaningless. But, many complex functions are indeed differentiable…


Properties of Derivatives

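For any two differentiable functions f(x) and g(x),

Sum and Difference Rules: $[f(x) \pm g(x)]' = f'(x) \pm g'(x)$

Product Rule: $[f(x)\,g(x)]' = f'(x)\,g(x) + f(x)\,g'(x)$ Example: If $y = x^{11} e^{6x}$, then $\frac{dy}{dx} = (11x^{10})(e^{6x}) + (x^{11})(6e^{6x}) = (11 + 6x)\,x^{10} e^{6x}$.

Quotient Rule: $\left[\frac{f(x)}{g(x)}\right]' = \frac{f'(x)\,g(x) - f(x)\,g'(x)}{[g(x)]^2}$, provided g(x) ≠ 0. Example: If $y = \frac{[\ldots]}{x^7 + 8}$, then $\frac{dy}{dx} = \frac{[\ldots]}{(x^7 + 8)^2}$.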

5. Chain Rule: $[f(g(x))]' = f'(g(x))\,g'(x)$ NOTE: See below for a more detailed explanation. Example: If $y = (x^{2/3} + 2e^{-9x})^6$, then $\frac{dy}{dx} = 6\,(x^{2/3} + 2e^{-9x})^5 \left(\frac{2}{3} x^{-1/3} - 18e^{-9x}\right)$.
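The Product Rule and Chain Rule examples can be checked against a numerical derivative. The following Python sketch is an illustration only; the function names are hypothetical, the test points x = 0.5 and x = 1 are arbitrary, and the central difference quotient is an approximation, so agreement is expected only to several digits.

```python
import math

def numerical_derivative(f, x, h=1e-6):
    """Central difference estimate of f'(x)."""
    return (f(x + h) - f(x - h)) / (2 * h)

def product_example(x):
    return x**11 * math.exp(6 * x)

def product_example_derivative(x):
    # Product Rule result from the text: (11 + 6x) x^10 e^(6x)
    return (11 + 6 * x) * x**10 * math.exp(6 * x)

def chain_example(x):
    return (x ** (2 / 3) + 2 * math.exp(-9 * x)) ** 6

def chain_example_derivative(x):
    # Chain Rule result from the text: outer derivative times inner derivative.
    inner = x ** (2 / 3) + 2 * math.exp(-9 * x)
    inner_derivative = (2 / 3) * x ** (-1 / 3) - 18 * math.exp(-9 * x)
    return 6 * inner**5 * inner_derivative

print(numerical_derivative(product_example, 0.5), product_example_derivative(0.5))
print(numerical_derivative(chain_example, 1.0), chain_example_derivative(1.0))
```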

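Example: If $y = e^{-x^2/2}$, then $\frac{dy}{dx} = e^{-x^2/2} \left(-\frac{2x}{2}\right) = -x\,e^{-x^2/2}$.

The graph of this function is related to the “bell curve” of probability and statistics. Note that you cannot calculate its derivative by the “exponential rule” given above, since its exponent, $-x^2/2$, is not of the form covered by that rule; the Chain Rule is needed instead.

Example: If $y = \ln(7x^{10} + 8x^6 - 4x + 11)$, then $\frac{dy}{dx} = \frac{1}{7x^{10} + 8x^6 - 4x + 11}\,(70x^9 + 48x^5 - 4)$.

Example: If $y = 5x^{[\ldots]}$, then $\frac{dy}{dx} = [\ldots]$.

Example: If $y = 9x$, then $\frac{dy}{dx} = 9\,(1) = 9$. Example: If $y = -3e^{2x}$, then $\frac{dy}{dx} = -3\,(2e^{2x}) = -6e^{2x}$.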

The Chain Rule, which can be written several different ways, bears some further explanation. It is a rule for differentiating a composition of two functions f and g, that is, a function of a function $y = f(g(x))$. The function in the first example above can be viewed as composing the “outer” function $f(u) = u^6$, with the “inner” function $u = g(x) = x^{2/3} + 2e^{-9x}$. […]

[…] of speed over a given time interval […] equal to the ratio $\frac{\Delta B}{\Delta C} = \frac{40}{20} = \frac{2}{1}$, twice as much. Therefore […] $\frac{dy}{dx} = \frac{dy}{du} \times \frac{du}{dx}$.

5. Applications: Estimation, Roots and Maxima & Minima, Related Rates

As seen, for nearby points P and Q on the graph of a function f(x), it follows that $m_{sec} \approx m_{tan}$, at least informally. That is, for a small change Δx, we have $\frac{\Delta y}{\Delta x} \approx \frac{dy}{dx} = f'(x)$, or $\Delta y \approx f'(x)\,\Delta x$. Hence the first derivative f′(x) can be used in a crude local estimate of the amount of change Δy of the function y = f(x).

Example: Let $y = f(x) = (x^2 - 2x + 2)\,e^x$. The change in function value from, say, f(1) to f(1.03), can be estimated by $\Delta y \approx f'(x)\,\Delta x = (x^2 e^x)\,\Delta x$, when x = 1 and Δx = 0.03, i.e., $\Delta y \approx 0.03\,e = 0.0815$, to four decimal places.
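To see how crude this local estimate is, it can be compared with the actual change in the function. The Python sketch below is illustrative only; the variable names are not from the original notes.

```python
import math

def f(x):
    return (x**2 - 2 * x + 2) * math.exp(x)

def f_prime(x):
    # Product Rule: (2x - 2) e^x + (x^2 - 2x + 2) e^x simplifies to x^2 e^x
    return x**2 * math.exp(x)

x, dx = 1.0, 0.03
estimate = f_prime(x) * dx       # linear estimate of the change
actual = f(x + dx) - f(x)        # true change in function value

print(round(estimate, 4))        # 0.0815, as in the text
print(actual, actual - estimate)
```

The actual change comes out slightly larger than the estimate, because the linear estimate ignores the curvature of the graph over the interval.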

NOTE: A better estimate of the difference Δy = f(x + Δx) – f(x) can be obtained by adding information from the second derivative f′′(x) (= derivative of f′(x)), the third derivative f′′′(x), etc., to the previous formula:

$$\Delta y \approx \frac{1}{1!} f'(x)\,\Delta x + \frac{1}{2!} f''(x)\,(\Delta x)^2 + \frac{1}{3!} f'''(x)\,(\Delta x)^3 + \ldots + \frac{1}{n!} f^{(n)}(x)\,(\Delta x)^n$$

(where 1! = 1, 2! = 2 × 1 = 2, 3! = 3 × 2 × 1 = 6, and in general n! = n × (n – 1) × (n – 2) × … × 1). The formal mathematical statement of equality between Δy and the sum of higher derivatives of f is an extremely useful result known as Taylor’s Formula.

Suppose we wish to solve for the roots of the equation f(x) = 0, i.e., the values where the graph of f(x) intersects the X-axis (also called the zeros of the function f(x)). Algebraically, this can be extremely tedious or even impossible, so we often turn to numerical techniques which yield computer-generated approximations. In the popular Newton-Raphson Method, we start with an initial guess $x_0$, then produce a sequence of values $x_0, x_1, x_2, x_3, \ldots$ that converges to a numerical solution, via the formula $x_{n+1} = x_n - \frac{f(x_n)}{f'(x_n)}$ (that is, by repeatedly applying the same formula to its own output, as in a continual “feedback loop”).

Example: Solve for x: $f(x) = x^3 - 21x^2 + 135x - 220 = 0$.

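Note that f(2) = –26 < 0, and f(3) = +23 > 0. Hence, by continuity, the polynomial must have a zero somewhere between 2 and 3. Applying Newton’s Method, the formula $x - \frac{f(x)}{f'(x)} = x - \frac{x^3 - 21x^2 + 135x - 220}{3x^2 - 42x + 135}$ generates the sequence $x_1 = 2.412698$, $x_2 = 2.461290$, $x_3 = 2.[\ldots]$ ([…] difference from 0 is due to roundoff error.)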
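The iteration for this example can be sketched in a few lines of Python. This is an illustrative implementation under stated assumptions, not the author's code: the starting guess x0 = 2 is an assumption (it does reproduce the first iterate 2.412698 quoted in the text), and the stopping tolerance and iteration cap are arbitrary choices.

```python
def newton_raphson(f, f_prime, x0, tol=1e-10, max_iter=50):
    """Repeat x -> x - f(x)/f'(x) until successive values differ by less than tol."""
    x = x0
    for _ in range(max_iter):
        step = f(x) / f_prime(x)   # fails if f'(x) is exactly 0 (horizontal tangent)
        x -= step
        if abs(step) < tol:
            return x
    raise RuntimeError("Newton-Raphson did not converge")

def f(x):
    return x**3 - 21 * x**2 + 135 * x - 220

def f_prime(x):
    return 3 * x**2 - 42 * x + 135

root = newton_raphson(f, f_prime, x0=2.0)   # x0 = 2 is an assumed starting guess
print(root, f(root))
```

A poor starting guess, for example one near a critical value where f′(x) is close to 0, can send the iterates far away, so in practice the bracketing step (checking the signs of f(2) and f(3)) is worth doing first.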

Notice the extremely rapid convergence to the root. In fact, it can be shown that the small error (between each value and the true solution) in each iteration is approximately squared in the next iteration, resulting in a much smaller error. This feature of quadratic convergence is a main reason why this is a favorite method. Why does it work at all? Suppose $P_0$ is a point on the graph of f(x), whose x-coordinate $x_0$ is reasonably close to a root. Generally speaking, the tangent line at $P_0$ will then intersect the X-axis at a value much closer to the root. This value $x_1$ can then be used as the x-coordinate of a new point $P_1$ on the graph, and the cycle repeated until some predetermined error tolerance is reached. Algebraically formalizing this process results in the general formula given above.

If a function f(x) has either a relative maximum (i.e., local maximum) or a relative minimum (i.e., local minimum) at some value of x, and if f(x) is differentiable there, then its tangent line must be horizontal, i.e., slope $m_{tan} = f'(x) = 0$. This suggests that, in order to find such relative extrema (i.e., local extrema), we set the derivative f′(x) equal to zero, and solve the resulting algebraic equation for the critical values of f, perhaps using a numerical approximation technique like Newton’s Method described above. (But beware: Not all critical values necessarily correspond to relative extrema!)

Example (cont’d): Find and classify the critical points of $y = f(x) = x^3 - 21x^2 + 135x - 220$.

We have $f'(x) = 3x^2 - 42x + 135 = 0$. As this is a quadratic (degree 2) polynomial equation, a numerical approximation technique is not necessary. We can use the quadratic formula to solve this explicitly, or simply observe that, via factoring, $3x^2 - 42x + 135 = 3\,(x - 5)(x - 9) = 0$.

Hence, the critical values are x = 5 and x = 9.

Once obtained, it is necessary to determine the exact nature of these critical points. Consider the first critical value, x = 5, where f′ = 0. Let us now evaluate the derivative f′(x) at two nearby values that bracket x = 5 on the left and right, say x = 4 and x = 6. We calculate that:

$m_{tan}(4) = f'(4) = +15 > 0$, which indicates that the original function f is increasing at x = 4,
$m_{tan}(5) = f'(5) = 0$, which indicates that f is neither increasing nor decreasing at x = 5,
$m_{tan}(6) = f'(6) = -9 < 0$, which indicates that the original function f is decreasing at x = 6.

Hence, as we move from left to right in a local neighborhood of x = 5, the function f changes from increasing to decreasing, so the point (5, 55) is a relative maximum for f. This demonstrates an application of the “First Derivative Test” for determining the nature of critical points. In an alternate method, the “Second Derivative Test,” we evaluate $f''(x) = 6x - 42$ at the critical value x = 5: $f''(5) = -12 < 0$, which indicates that the original function f is concave down (“spills”) at this value. Hence, this also shows that the point (5, 55) is a relative maximum for f, consistent with the above.

Exercise: Show that: (1) f′(8) < 0, f′(10) > 0 (First Derivative Test), (2) f′′(9) > 0 (Second Derivative Test).

Finally, notice that $f''(x) = 6x - 42 = 0$ when x = 7, and in a local neighborhood of that value,

$f''(6) = -6 < 0$, which indicates that the original function f is concave down (“spills”) at x = 6,
$f''(7) = 0$, which indicates that f is neither concave down nor concave up at x = 7,
$f''(8) = +6 > 0$, which indicates that the original function f is concave up (“holds”) at x = 8.

Hence, across x = 7, there is a change in concavity; the point (7, 39) is a point of inflection.
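The critical values and their classification can be reproduced with a short Python sketch. It is only an illustration: the helper `classify` is a hypothetical name, and it applies the Second Derivative Test, which is inconclusive whenever f′′ is zero at the critical value.

```python
import math

def f(x):
    return x**3 - 21 * x**2 + 135 * x - 220

def f_prime(x):
    return 3 * x**2 - 42 * x + 135

def f_double_prime(x):
    return 6 * x - 42

# Solve f'(x) = 3x^2 - 42x + 135 = 0 with the quadratic formula.
a, b, c = 3, -42, 135
root_of_discriminant = math.sqrt(b * b - 4 * a * c)
critical_values = sorted([(-b - root_of_discriminant) / (2 * a),
                          (-b + root_of_discriminant) / (2 * a)])

def classify(x):
    """Second Derivative Test at a critical value x."""
    curvature = f_double_prime(x)
    if curvature < 0:
        return "relative maximum"
    if curvature > 0:
        return "relative minimum"
    return "inconclusive"

for x in critical_values:
    print(x, f(x), classify(x))
print("f'' changes sign at x = 7:", f_double_prime(6), f_double_prime(7), f_double_prime(8))
```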

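The full graph of $f(x) = x^3 - 21x^2 + 135x - 220$ is shown below, using all of this information.

[Figure: graph of f, with the root (2.461941, 0), the relative maximum (5, 55), the point of inflection (7, 39), and the relative minimum (9, 23) marked; f increases up to x = 5, decreases between x = 5 and x = 9, and increases after x = 9.]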

As we have shown, this function has a relative maximum (i.e., local maximum) value = 55 at x = 5, and a relative minimum (i.e., local minimum) value = 23 at x = 9. However, there are both higher and lower points on the graph! For example, if x ≥ 11, then f(x) ≥ 55; likewise, if x ≤ 3, then f(x) ≤ 23. Hence this function has no absolute maximum (i.e., global maximum) value, and no absolute minimum (i.e., global minimum) value. However, if we restrict the domain to a closed interval, then absolute extrema are attained. For instance, in the interval […], the relative extreme points are also the absolute extreme points […]. In the interval [4, 12], the global maximum of the function is equal to 104, attained at the right endpoint x = 12. Similarly, in the interval [0, […]

Exercise: Graph each of the following.

$f(x) = x^3 - 21x^2 + 135x - 243$

$f(x) = x^3 - 21x^2 + 135x - 265$

$f(x) = x^3 - 21x^2 + 147x - 265$

Exercise: The origin (0, 0) is a critical point for both $f(x) = x^3$ and $f(x) = x^4$. (Why?) Using the tests above, formally show that it is a relative minimum of the latter, but a point of inflection of the former. Graph both of these functions.

The volume V of a spherical cell is functionally related to its radius r via the formula $V = \frac{4}{3} \pi r^3$. […]

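[Figure: graph of f over the interval [a, x + Δx], showing the area F(x) over [a, x], the additional area F(x + Δx) − F(x) over [x, x + Δx], and a value z between x and x + Δx.]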

We formally define a new function

F(x) = Area under the graph of f in the interval [a, x].

Clearly, because every value of x results in one and only one area (shown highlighted above in blue), this is […]. Moreover,

F(x + Δx) = Area under the graph of f in the interval [a, x + Δx],

and so

F(x + Δx) − F(x) = Area under the graph of f in the interval [x, x + Δx] = Area of the rectangle with height f(z) and width Δx (where z is some value in the interval [x, x + Δx]) = f(z) Δx.

Therefore, we have $\frac{F(x + \Delta x) - F(x)}{\Delta x} = f(z)$.

Now, as Δx → 0, we see that the right hand side f(z) approaches f(x), while the left hand side approaches F′(x). Hence F′(x) = f(x), i.e., F is an antiderivative of f. We formally express… $F(x) = \int_a^x f(t)\,dt$, where $\int_a^x f(t)\,dt$ represents the definite integral of the function f from a to x. (The function f is called the integrand.) More generally, if F is any antiderivative of f, then the two functions are related via the indefinite integral: $\int f(x)\,dx = F(x) + C$, where C is an arbitrary constant.

Example 1: $F(x) = \frac{1}{10} x^{10} + C$ (where C is any constant) is the general antiderivative of $f(x) = x^9$, because $F'(x) = \frac{1}{10}(10x^9) + 0 = x^9 = f(x)$.
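The relation between the area function and the integrand can be illustrated numerically. The Python sketch below is an assumption-laden illustration, not the author's method: it approximates the area with a midpoint Riemann sum (the helper `area_under` and the values a = 0, x = 1.2, Δx = 0.001 are arbitrary choices), and uses the integrand f(t) = t⁹ from Example 1.

```python
def area_under(f, a, x, n=20_000):
    """Midpoint Riemann-sum approximation of the area under f on [a, x]."""
    width = (x - a) / n
    return sum(f(a + (i + 0.5) * width) for i in range(n)) * width

def f(t):
    return t**9

a, x, dx = 0.0, 1.2, 1e-3
F_x = area_under(f, a, x)              # approximates F(x)
F_x_dx = area_under(f, a, x + dx)      # approximates F(x + dx)
ratio = (F_x_dx - F_x) / dx            # should equal f(z) for some z in [x, x + dx]

print(F_x, x**10 / 10)                 # area vs. the antiderivative x^10/10
print(f(x), ratio, f(x + dx))          # ratio lies between f(x) and f(x + dx)
```

Because f is increasing on this interval, the difference quotient is squeezed between f(x) and f(x + Δx), which is the numerical counterpart of the statement that it equals f(z) for some z in between.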

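Example 2: $F(x) = 8e^{x/8} + C$ is the general antiderivative of $f(x) = e^{x/8}$, because $F'(x) = 8\left(\frac{1}{8} e^{x/8}\right) + 0 = e^{x/8} = f(x)$.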

We can write this relation succinctly as $\int e^{x/8}\,dx = 8e^{x/8} + C$.

Integrals possess the analogues of Properties 1 and 2 for derivatives […]. That is, the integral of a constant multiple of a function, c f(x), is equal to that constant multiple of the integral of the function f(x). Also, the integral of a sum (respectively, difference) of two functions is equal to the sum (respectively, difference) of the integrals. (The integral analogue for products corresponds to a technique known as integration by parts […].) These are extremely important properties for the applications that follow.

[…] exponential functions can be i[…], and […] multiple). To illustrate…

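$$\int u^p\,du = \begin{cases} \dfrac{u^{p+1}}{p+1} + C, & \text{if } p \ne -1 \\ \ln|u| + C, & \text{if } p = -1 \end{cases}$$

Properties of Integrals

For any constant c, and any integrable function f(x), $\int [c\,f(x)]\,dx = c \int f(x)\,dx$

$\int [f(x) \pm g(x)]\,dx = \int f(x)\,dx \pm \int g(x)\,dx$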

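Example 3: $\int (x^5 + 2)^9\,5x^4\,dx = \frac{(x^5 + 2)^{10}}{10} + C$, using $\int u^9\,du = \frac{u^{10}}{10} + C$.

There are two ways to solve this problem. The first is to expand out the algebraic expression in the integrand, and integrate the resulting polynomial (of degree 49) term-by-term… Yuk. The second way, as illustrated, is to recognize that if we substitute $u = x^5 + 2$, then $du = 5x^4\,dx$, which is precisely the other factor in the integrand, as is! Therefore, in terms of the variable u, this is essentially just a “power rule” integration, carried out above. (To check the answer, take the derivative of the right-hand function and verify that the original integrand is restored. Don’t forget to use the Chain Rule!) Note that if the constant multiple 5 were absent from the original integrand, we could introduce and compensate for it. (T[…] would not be able to carry out the integration […] balance constant multiples […], not f[…])

$\int u^{-1}\,du = \ln|u| + C$   $\int e^u\,du = e^u + C$

Example 4: $\int \frac{x^2}{\sqrt{1 + x^3}}\,dx = \frac{1}{3} \int (1 + x^3)^{-1/2}\,3x^2\,dx = \frac{1}{3} \times \frac{u^{1/2}}{1/2} + C = \frac{2}{3}\,(1 + x^3)^{1/2} + C$, where $u = 1 + x^3$, so that $du = 3x^2\,dx$. This time, the constant multiple 3 must be introduced and balanced (by the factor $\frac{1}{3}$ outside the integral sign), revealing that this is again a “power rule” integration. […] have been able to carry out the integration exactly […]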

Example 5: $\int \frac{x^2}{1 + x^3}\,dx = \frac{1}{3} \int (1 + x^3)^{-1}\,3x^2\,dx = \frac{1}{3} \ln|1 + x^3| + C$.

Again, if the z were missing from the integrand, we would not be able to introduce and balance for it. In fact, it can be shown t[…]

Finally, all these results can be summarized into one […]: $\int_a^b f(x)\,dx$ […]. (Other techniques of integration – such as integration by parts, trigonometric substitution, partial fractions, etc. – will not be reviewed here.)

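Example 7: $\int_0^1 x^3 (1 - x^4)^2\,dx$. This definite integral represents the amount of area under the curve $f(x) = x^3 (1 - x^4)^2$, from x = 0 to x = 1.

Method 1. Expand and integrate termwise:

$$\int_0^1 x^3 (1 - 2x^4 + x^8)\,dx = \int_0^1 (x^3 - 2x^7 + x^{11})\,dx = \left[\frac{x^4}{4} - \frac{2x^8}{8} + \frac{x^{12}}{12}\right]_0^1 = \frac{1}{4} - \frac{1}{4} + \frac{1}{12} - 0 = \frac{1}{12}.$$

Method 2. Substitute $u = 1 - x^4$, so that $du = -4x^3\,dx$, and convert the x-limits to u-limits: when x = 0, we get $u = 1 - 0^4 = 1$; when x = 1, we get $u = 1 - 1^4 = 0$. Then

$$\int_{x=0}^{x=1} (1 - x^4)^2\,x^3\,dx = -\frac{1}{4} \int_{u=1}^{u=0} u^2\,du = -\frac{1}{4} \left[\frac{u^3}{3}\right]_1^0 = -\frac{1}{4}\left(0 - \frac{1}{3}\right) = \frac{1}{12}.$$

NOTE: Numerical integration techniques, such as the Trapezoidal Rule, are sometimes used also.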
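Both methods, and the Trapezoidal Rule mentioned in the NOTE, can be compared in a short Python sketch. This is an illustration only: the function names are hypothetical, exact fractions are used so that the two antiderivative evaluations can be compared without rounding, and the number of subintervals in `trapezoid` is an arbitrary choice.

```python
from fractions import Fraction

def method_1_antiderivative(x):
    """Termwise antiderivative of x^3 - 2x^7 + x^11."""
    x = Fraction(x)
    return x**4 / 4 - 2 * x**8 / 8 + x**12 / 12

def method_2_antiderivative(u):
    """Antiderivative in u after substituting u = 1 - x^4: -(1/4) * u^3 / 3."""
    u = Fraction(u)
    return Fraction(-1, 4) * u**3 / 3

exact_1 = method_1_antiderivative(1) - method_1_antiderivative(0)   # x runs from 0 to 1
exact_2 = method_2_antiderivative(0) - method_2_antiderivative(1)   # u runs from 1 to 0

def trapezoid(f, a, b, n=1000):
    """Trapezoidal Rule with n equal subintervals on [a, b]."""
    width = (b - a) / n
    total = 0.5 * (f(a) + f(b))
    for i in range(1, n):
        total += f(a + i * width)
    return total * width

approx = trapezoid(lambda x: x**3 * (1 - x**4) ** 2, 0.0, 1.0)
print(exact_1, exact_2, approx)
```

Note that the u-limits are used in the order 1 to 0, matching the converted limits of integration in Method 2.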


7. Differential Equations

As a first example, suppose we wish to find a function y = f(x) whose derivative $\frac{dy}{dx}$ is given, e.g., $\frac{dy}{dx} = x^2$. Formally, by separation of variables – y on the left, and x on the right – we can rewrite this ordinary differential equation as $dy = x^2\,dx$. In this form, we can now integrate both sides to obtain $y = \frac{1}{3} x^3 + C$, where C is an arbitrary constant. Note that the “solution” therefore represents an entire family of functions; each corresponds to a different value of C. Further specifying an initial value (or initial condition), such as […], singles out exactly one of them passing through […], e.g., $y = \frac{1}{3} x^3 + \frac{7}{3}$.

Now consider the case of finding a function y = f(x) whose rate of change $\frac{dy}{dx}$ is proportional to y itself, i.e., $\frac{dy}{dx} = a\,y$, where a is a known constant of proportionality (either positive or negative). Separating variables gives $\frac{1}{y}\,dy = a\,dx$; integrating both sides yields $\ln(y) = ax + C$, and solving gives $y = e^{ax + C} = e^{ax} e^C$, or $y = A\,e^{ax}$, where A is an arbitrary (positive) multiplicative constant. Hence this is a family of exponential curves, increasing for a > 0 […] and decreasing for a < 0 (as in radioactive isotope decay). Specifying an initial amount $y(0) = y_0$ “when the clock […] zero” yields the unique solution $y = y_0\,e^{ax}$.

Finally, suppose that population size y = f(x) is restricted between 0 and 1, such that the rate of change $\frac{dy}{dx}$ is proportional to the product y (1 – y), i.e., $\frac{dy}{dx} = a\,y\,(1 - y)$, w[…]. The solution is given by $y = \frac{y_0\,e^{ax}}{y_0\,e^{ax} + (1 - y_0)}$. This is known as the logistic curve, […].

NOTE: Many types of differential equation exist, including those that cannot be explicitly solved using “elementary” techniques. Like […]
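The exponential and logistic solutions can be checked by substituting them back into their differential equations numerically. The Python sketch below is an illustration under assumed values: the parameters y0 = 0.1 and a = 0.8 are hypothetical, and the derivative is estimated with a central difference quotient, so both sides agree only approximately.

```python
import math

def exponential_solution(x, y0, a):
    """Candidate solution of dy/dx = a*y with y(0) = y0."""
    return y0 * math.exp(a * x)

def logistic_solution(x, y0, a):
    """Candidate solution of dy/dx = a*y*(1 - y) with y(0) = y0."""
    growth = y0 * math.exp(a * x)
    return growth / (growth + (1 - y0))

def numerical_derivative(func, x, h=1e-6):
    """Central difference estimate of func'(x)."""
    return (func(x + h) - func(x - h)) / (2 * h)

y0, a = 0.1, 0.8   # hypothetical initial value and rate constant

for x in (0.0, 1.0, 3.0):
    y_exp = exponential_solution(x, y0, a)
    y_log = logistic_solution(x, y0, a)
    lhs_exp = numerical_derivative(lambda t: exponential_solution(t, y0, a), x)
    lhs_log = numerical_derivative(lambda t: logistic_solution(t, y0, a), x)
    # Each pair should agree: left side dy/dx, right side of the equation.
    print(x, lhs_exp, a * y_exp, lhs_log, a * y_log * (1 - y_log))
```

The logistic values stay between 0 and 1 and level off toward 1 as x grows, whereas the exponential values grow without bound when a > 0.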

8. Summary of Main Points

The instantaneous rate of change of a function y = f(x) at a value of x in its domain, is given by its derivative $\frac{dy}{dx} = f'(x)$. This function is mathematically defined as a limit of average rates of change over progressively smaller intervals, and can be interpreted as the slope of the line tangent to the graph of y = f(x). […] from simpler ones by taking sums, differences, products, […] for computing their derivatives as well. In particular, […] we have:

$\frac{d}{dx}(u^p) = p\,u^{p-1}\,\frac{du}{dx}$   General Power Rule

$\frac{d}{dx}(e^u) = e^u\,\frac{du}{dx}$   General Exponential Rule

$\frac{d}{dx}[\ln(u)] = \frac{1}{u}\,\frac{du}{dx}$   General Logarithm Rule

Derivatives can be applied to estimate functions locally, […] rel[…].

A function f(x) has an antiderivative F(x) if its derivative $\frac{dF}{dx} = f(x)$, expressed in terms of an indefinite integral: $\int f(x)\,dx = F(x) + C$. […]

$\int u^p\,du = \frac{u^{p+1}}{p+1} + C$, if p ≠ −1   General Power Rule

$\int u^{-1}\,du = \ln|u| + C$   General Logarithm Rule

$\int e^u\,du = e^u + C$   General Exponential Rule

The corresponding definite integral $\int_a^b f(x)\,dx = F(b) - F(a)$ is commonly interpreted as the area under the graph of y = f(x) in the interval [a, b], though other interpretations are possible. Other quantities that can be interpreted as definite integrals include […] amount of work done over a path, average value, probability […].

Derivatives and integrals can generally be used to analyze […] systems, for example, via differential equations of various types. […] used when explicit solutions are intractable.

Suggestions for improving this document? Sen