## Description

1 Introduction

The root finding problem is the task of finding one or more values for a variable x so that an equation

f(x) = 0

is satisfied. Denote the desired solution as x

∗

. As a computational problem, we are interested in effective and

efficient ways of finding a value x that is “close enough” to x

∗

. There are two common ways to measuring

“close enough.”

1. The “residual error,” |f(x)|, is small, or

2. The “approximation error” or “true error,” |x − x

∗

| is small.

Of course, the true error is not known, so it must be approximated in some way. In this lab we will look

at some nonlinear equations and some simple methods of finding an approximate solution. We will use an

estimate of the true error when it is readily available and will use the residual error the rest of the time.

2 Programming style

In our discussions, we will often present samples of Python codes. There are many ways to write a correct

program; a programming style aims to write a program that is not only correct, but readable, and more

easily debugged, documented, and modified. We encourage you to write programs in such a way that it is

easy to see the underlying mathematical or computational thinking.

Some of the style rules followed in these labs include:

• One statement per line;

• Loop counters and other insignificant variables have short names, such as i, k, m or n.

• Significant variables are named with long names.

• The elements of a formula can be separated by spaces to improve readability;

• The interior blocks of loops and if-tests must be indented.

• Functions have comments following the initial declaration.

Clarity is important for debugging because if code is harder to read then it is harder to debug. Debugging

is one of the most difficult and least rewarding activities you will engage in, so anything that simplifies

debugging pays off. Clarity is also important when others read your code. Since we need to read your code

in order to give you a grade, please format your code similarly to that in the labs.

3 A Sample Problem

Suppose we want to know if there is a solution to the equation:

cos x = x ,

1

This is not a polynomial or algebraic equation, and there is little hope of forming an explicit expression for

the solution.

Since we can’t solve this equation exactly, it’s worth knowing whether there is anything we can say about

it. In fact, if there is a solution x0 to the equation, we can probably find a computer representable number

x which approximately satisfies the equation (has low residual error) and which is close to x0 (has a low

approximation error). The following exercises use a plot to illustrate the existance of a solution.

We will make plots by constructing a pair of vectors of x-values and corresponding y-values and then

plotting the points with lines connecting them.

4 Exercise 1

The following steps show how to plot the functions y = cos x and y = x together in the same graph from

−π to π.

1. Import numpy and call it np;

2. Import matplotlib and call it plt;

3. Define a vector of x values with the appropriate range;

4. Define vectors y1 and y2 by evaluating functions at x;

5. Use the plot() command twice, for (x,y1) and (x,y2).

6. Use the show() command to make the plot appear;

7. Save a copy of the graph using the savefig() command.

A basic version of your Python script might look like this:

import m a t pl o tli b . p y pl o t a s p l t

import numpy a s np

x = np . l i n s p a c e ( −np . pi , np . pi , 100 )

y1 = x . copy ( ) # Python r e q u i r e s t h i s , i n s t e a d o f ” y1 = x ”

y2 = np . c o s ( x )

p l t . pl o t ( x , y1 )

p l t . pl o t ( x , y2 )

p l t . s a v e f i g ( ’ e x e r c i s e 1 . jp g ’ ) # O p t i on al command . Saves a copy o f t h e p l o t in a f i l e .

p l t . show ( )

You might save your script as exercise1.py. On a computer running Linux, I can execute it by the

command

python3 e x e r c i s e 1 . py

When you see the plot, notice that the two lines intersect. You can even estimate the value of x where this

occurs.

As a report for this exercise, state your estimated value of the root, and include a copy of your plot.

5 Exercise 2

Let’s reorganize our investigation. Instead of solving the equation cos(x) = x, we will construct a function

f(x) = cos(x) − x, and think of the equivalent problem of finding a root x such that f(x) = 0.

We begin by packaging f(x) as an actual Python function. Recall the structure of a Python function:

2

1. A declaration statement of the form:

def function name ( list of arguments ):

2. indented computational statements

3. indented a return statement that reports the function value;

Your Python function might be stored as cosmx.py, and have the form:

def cosmx ( x ) :

””” v al u e = cosmx ( x )

In pu t :

x : t h e argument

Output :

v al u e : t h e v al u e o f cos ( x )−x .

”””

import numpy a s np

v al u e = np . c o s ( x ) − x

return v al u e

Let’s test your function at x = 0.5. Because it’s a function, not a script, we can’t execute it by a command

like “python3 cosmx” or “python3 cosmx(0.5)”. Instead, we have to enter an interactive Python session,

import the function, call it, and then quit. Here’s how I do it on an Ubuntu system:

python3

>>from cosmx import cosmx

>>cosmx ( 0 . 5 )

>>q ui t ( )

Your result should be about 0.37758.

Now let’s use graphics again, but this time, we will plot y1 = cosmx(x) and y2 = 0.

1. From your previous script exercise1.py, make a fresh copy exercise2.py;

2. To access your function, you will need to insert the statement

from cosmx import cosmx

3. To create the y2 data, you can’t simply write y2=0, which creates a single zero value. Instead, use the

command y2 = np.zeros ( len ( x ) ) to create a zero vector that is the same length as x.

4. After your plot statements, you should add the statement plt.grid ( True ), so that we will see

helpful grid lines.

5. Make sure you modify your savefig() statement too!

To execute the program, I type

python3 e x e r c i s e 2 . py

Does the value of x at the point where the curve crosses the x-axis agree with the value you saw in the

previous exercise?

Send me the plot file as exercise2.jpg.

6 The Bisection Idea

The idea of the bisection method is very simple. We assume that we are given two values x = a and x = b

(a < b), and that the function f(x) is positive at one of these values (say at a) and negative at the other.

When this occurs, we say that [a, b] is a “change-of-sign interval” for the function. Assuming f is continuous,

we know that there must be at least one root in the interval.

3

Intuitively speaking, if you divide a change-of-sign interval in half, one or the other half must end up

being a change-of-sign interval, and it is only half as long. To figure out which half contains the root, simply

evaluate the function at the break point. Keep the half interval whose endpoint has a function value of

opposite sign to the midpoint function value. If you repeatedly chop the interval in half this way, you can

decrease the size of the change-of-sign interval below any specified tolerance.

To start thinking of the program we must write, consider the point x = (a + b)/2. If f(x) = 0 we are

done. (This is pretty unlikely.) Otherwise, depending on the sign of f(x), we know that the root lies in

[a, x] or [x, b]. In any case, our change-of-sign interval is now half as large as before. Depending on how the

interval split, replace the value of a or b by the value of x. Repeat this process with the new change of sign

interval [a, b] until it is sufficiently small.

We are guaranteed to converge. We can even compute the maximum number of steps this could take. If

the original change-of-sign interval has length ` then after one bisection the current change-of-sign interval

is of length `/2, and each subsequent step again halves the interval size. We know in advance how well we

will approximate the root x0. These are very powerful facts, which make bisection a robust algorithm—that

is, it is very hard to defeat it.

7 Exercise 3

1. If we know the start points a and b and the interval size tolerance , we can predict beforehand the

number of steps required to reach the specified accuracy. The bisection method will always find the

root in that number or fewer steps. What is the formula for that number?

2. Give an example of a continuous function which has only one root in the interior of the interval [-2, 1],

but for which our bisection method could not be used.

8 Programming Bisection

Before we look at a sample bisection program, let’s discuss some programming issues. If you haven’t done

much programming before, this is a good time to try to understand the logic behind how we choose variables

to set up, what names to give them, and how to control the calculation.

We plan to write the bisection algorithm as a function. This separates the algorithm from the data it

will work on. It allows us to concentrate on getting the algorithm right. And once we have gotten it right,

it will be encapsulated as a function, and can be used over and over on different problems.

Another reasonable recommendation for using functions is that by breaking up a calculation into independent portions, we can hide the algorithm details and temporary variables. The names and values of

variables used within the function will disappear once the function is complete, except for results reported by

the return statement. This reduces the chance that we will accidentally use the same name in two different

parts of a single huge program, which can cause confusion and error.

The bisection algorithm can be expressed in the following way:

Given a function f : R → R and a, b ∈ R so that f(a) · f(b) < 0, construct sequences a

(k) and b

(k)

for

k = 0, . . . with f(a

(k)

)· f(b

(k)

) < 0 for each k by starting out with a

(1) = a, b

(1) = b and then, for each k > 0,

1. Set x

(k) = (a

(k) + b

(k)

)/2.

2. If |x

k − b

(k)

| is small enough, or if f(x

(k)

) = 0 exit the algorithm.

3. If f(x

(k)

) · f(a

(k)

) < 0, then set a

(k+1) = a

(k) and b

(k+1) = x

(k)

.

4. If f(x

(k)

) · f(b

(k)

) ≤ 0, then set a

(k+1) = x

(k) and b

(k+1) = b

(k)

.

5. Return to step 1.

4

Mathematically, it makes sense to think of the sequences a

k and b

k

. But computationally, we really only

need the current values of the left and right endpoints of our interval. Keeping the entire sequence just

clutters up our program with extra indices and data. So if we look at bisection computationally, we get

something like the following:

Algorithm Bisect(f,a,b,)

1. Set x:=(a + b)/2.

2. If |b − a| ≤ , or if f(x) happens to be exactly zero, then exit the function, returning the value x;

3. If sign(f(b)) * sign(f(x)) < 0, then a:=x; otherwise, b:=x.

4. Return to step 1.

The second version of the algorithm is expressed in a form that is easily translated into a loop, with an

explicit exit condition. To check the sign condition, the algorithm employs a sign() function rather than

looking at the value of f(a)∗f(b). This is a better strategy because of roundoff errors if very small quantities

are involved. In Python, there is a smallest positive real number, which we might nickname as “tiny”. Its

value can be returned by the Python value numpy.finfo(float).tiny. If f(a) > 0 and f(b) > 0 are each

less than √

tiny, then their product is zero, not positive.

The language of this description, an example of “pseudocode,” is based on a computer language called

Algol but is intended merely to be a clear means of specifying algorithms in books and papers. The term

“pseudocode” is used for any combination of a computer coding language with natural language. One

objective of this lab is to show how to convert pseudocode into a workable Python program. As a starting

point, let’s fix f(x) to be the function cosmx that you just wrote. Later you will learn how to add it to the

calling sequence. Let’s also fix = 10−10

.

Here is a function that carries out the bisection algorithm for our cosmx() function.

def bi s e c t c o sm x ( a , b ) :

”””x , i tC o un t = b i s e c t c o sm x ( a , b )

b i s e c t c o sm x ( ) u se s b i s e c t i o n f o r a r o o t o f cosmx ( x ) = 0 .

In pu t :

a , b : l e f t and r i g h t ends o f a change o f s i g n i n t e r v a l .

Output :

x : t h e e s t im a t e d r o o t .

i tC o un t : t h e number o f b i s e c t i o n s .

”””

from cosmx import cosmx

import numpy a s np

#

# I n i t i a l i z a t i o n .

#

EPSILON = 1. 0 e −10;

f a = cosmx ( a )

f b = cosmx ( b )

i tC oun t = 0

#

# Loop

#

while ( True ) :

x = ( a + b ) / 2. 0

f x = cosmx ( x )

print ( itCount , ’ : ’ , \

’ f ( ’ , a , ’ ) = ’ , fa , \

5

’ f ( ’ , x , ’ ) = ’ , fx , \

’ f ( ’ , b , ’ ) = ’ , f b )

i f ( f x == 0 ) :

return x , i tC oun t

e l i f ( abs ( b − a ) < EPSILON ) :

return x , i tC oun t

i tC oun t = i tC oun t + 1

i f ( np . si g n ( f a ) ∗ np . si g n ( f x ) <= 0 ) :

b = x

f b = f x

e l s e :

a = x

f a = f x

Note that in the expression fx == 0 a double equals sign is used. In Python, a single equal sign indicates

assignment, which might better have been written as <=. It transfers a value. Whenever we are asking

whether two items are equal, we need to use the double equal sign instead. This is a tricky distinction, and

if you make the wrong choice, sometimes the compiler will complain, but sometimes it will simply run and

compute nonsense.

Note the text at the beginning of the function, which is delimited by triple quotes. In Python, this

is known as a docstring. Because the function has a docstring, a user can ask for information about it

by typing help ( bisect cosmx ). You are encouraged to write a docstring for your Python functions.

However, Python is a little fussy about the format of a docstring:

• It must be the first item following the function statement;

• It must begin and end with triple double quotes;

• The first line should summarize the function;

• If there is more than one line, the second should be blank, and then you should describe the input and

output variables, and any other information a user might find helpful.

9 Exercise 4

Create the file bisect cosmx.py by typing it in yourself, or by using copy-and-paste, and then be prepared

to run it to answer some of the following questions.

1. In your own words, what does the np.sign(x) function do? What if x is 0?

2. The print() command is used only to monitor progress. Note the use of \ as the continuation

character, which allows the print() statement to extend beyond a single line.

3. What would the final value of itCount be if a and b are already closer together than the tolerance

when the function starts?

4. Try the command:

x , i tC oun t = bi s e c t c o sm x ( 0 . 0 , 3. 0 )

How many iterations were required? Is this value smaller than the value from your formula in Exercise

3?

5. Why did I use abs(b-a) in the convergence test instead of simply writing b-a?

6. The program may seem to be working, but try this command:

x , i tC oun t = bi s e c t c o sm x ( 0 . 0 , 0. 5 )

6

What do you notice about the result? Apparently, in this case, the input values are not be suitable for

the bisection method. Rather than carry out a worthless computation, the function should check the

input and refuse to process it if it is illegal. Such a check can be made by code like this:

i f ( c o n di ti o n ) :

ra is e Excep ti on ( ”The i n p u t i s i l l e g a l ! ” )

Replace condition by the appropriate computational check. Perhaps you should also replace the error

message by a more understandable message. Rerun your program and verify that it will correctly

process the input (0, 3) and reject (0, 0.5)

10 Functions as Input

Now we have written the file bisect cosmx.py that can find a root of f(x)=cos(x)-x, but it can’t handle

any other function. Just as we are allowed to vary the endpoints a and b as input quantities, we’d like to be

able to be able to specify as an input quantity the function whose root we are seeking.

11 Exercise 5

Copy the file bisect cosmx.py to a new file named bisect.py.

1. In the declaration line for bisect(), add one more input argument. We don’t know what the user

calls the function, but we have to make up a name for it in the argument list, so we’ll just call it f.

2. If you used a docstring to list your input arguments, include a statement about your new input quantity

f.

3. The import cosmx statement is no longer appropriate. Remove it.

4. There are three places where we evaluate the function with a statement like fa = cosmx ( a ) In

each such place, replace cosmx() by f().

Our first test of the new code simply repeats the analysis of cosmx(). However, now, we have to import

that function during the Python session, and then include it on the argument list to bisect(). On my

system, this might look like:

python3

>>from cosmx import cosmx

>>from b i s e c t import b i s e c t

>>x , i tC oun t = b i s e c t ( 0 , 3 , cosmx )

>> q ui t ( )

For our second test, we need a new function. Using the model of cosmx.py, write a Python function called

poly.py that evaluates the function: f(x) = (x − 5)(32x − 7)(x + 1). Call bisect() using the initial change

of sign interval [0, 1]. We have already seen that we can predict the maximum number of steps necessary to

shrink the interval below the limit of EPSILON. However, for this example, the program may stop early, and

successfully. Can you explain?

12 Exercise 6

The following table contains formulas and search intervals for several functions. Write function files, f1.py

through f5.py, and use bisect() to find a root in the given interval.

Name Formula Interval approxRoot No. Steps

f1 x^2-9 [0,5] _________ _________

7

f2 x^5-x-1 [1,2] _________ _________

f3 x*exp(-x) [-1,2] _________ _________

f4 2*cos(3*x)-exp(x) [0,6] _________ _________

f5 (x-1)^5 [0,3] _________ _________

13 The secant method

The bisection algorithm allows the user to predict in advance the number of iterations necessary to reach

the interval tolerance. For root-finding algorithms that do not simply shrink the interval, it can be harder

to determine whether the computation should be continued. One common, but not always reliable, measure

is the magnitude of the residual, which we might compare to a tolerance as in |f(x)| ≤ ). Our next root

finder algorithm, the secant method, will use a test like this for convergence.

Just like the bisection method, the secant method starts out with two points a and b. It does not require

that these points constitute a change of sign interval. The algorithm simply evaluates the function at those

two values, constructs the straight line though (a,f(a)) and (b,f(b)) and sets c to the argument where

the straight line crosses the axis. The formula is simply:

c =

f(b)a − f(a)b

f(b) − f(a)

If f() was actually a linear function, this would be the root. In general, it’s not going to be the root. We

can’t even be sure that f(c) is smaller than f(a) or f(b). Nonetheless, we now discard a. We use the pair of

values b and c to construct a new linear function, and we find its zero at d.

We can continue this process indefinitely, but of course we hope that the sequence of values is converging

to a root, and we want to know whether we are getting close, or whether we need to cancel the process

because we have taken too many steps, or the function values are blowing up.

The secant method can be described in the following manner.

Algorithm Secant(f,a,b,)

1. Set

x =

f(b) a − f(a) b

f(b) − f(a)

2. If |f(x)| ≤ , return x as the approximate root.

3. Replace a with b and b with x.

4. Go back to step 1.

When the secant method converges, it is usually much faster than bisection. But if the initial values

are far from the root, the method can converge very slowly, or even diverge to infinity. Unlike bisection,

the secant method can be generalized to two or more dimensions, and the generalization is usually called

Broyden’s method.

14 Exercise 7

Write a Python function named secant.py. You might start from the text of bisect.py. The function should

begin as follows:

def s e c a n t ( a , b , f ) :

””” s e c a n t ( a , b , f ) a p p l i e s t h e s e c a n t method f o r a r o o t o f f ( x )

???

8

”””

# I n i t i a l i z e

? ? ? Se t fa , f b ? ? ?

EPSILON = 1. 0 e−10

ITMAX = 1000

# Loop

fo r i tC oun t in range ( 0 , ITMAX ) :

? ? ? Determine x , e v al u a t e f ( x )

i f ( c o n di ti o n )

return x , i tC oun t

? ? ? a r e pl a c e d by b , f a r e pl a c e d by fb ,

? ? ? b r e pl a c e d by x , and f b r e pl a c e d by f x

# E x i t e d from l o o p w i t h o u t c onvergence

ra is e Excep ti on ( ” s e c a n t : No c o n v e r g e n c e a f t e r maximum number o f i t e r a t i o n s . ” )

You will need to replace the sections marked ??? by the appropriate Python statements. Notice that our

loop section now uses a Python for statement, set up to run no more than 1000 iterations. If we don’t leave

the loop with convergence, then we trigger a final error message.

Compare the bisection and secant methods by filling in the following table.

Bisection Bisection Secant Secant

Name Formula [a,b] approxRoot No. Steps approxRoot No. Steps

f1 x^2-9 [0,5] _________ _________ _________ _________

f2 x^5-x-1 [1,2] _________ _________ _________ _________

f3 x*exp(-x) [-1,2] _________ _________ _________ _________

f3 x*exp(-x) [-1.5,1] _________ _________ _________ _________

f4 2*cos(3*x)-exp(x) [0,6] _________ _________ _________ _________

f5 (x-1)^5 [0,3] _________ _________ _________ _________

In the above table, you can see that the secant method can be either faster or slower than bisection. You

may also observe convergence failures: either convergence to a value that is not near a root or convergence

to a value substantially less accurate than expected. Regarding the bisection roots as accurate, are there

any examples of convergence to a value that is not near a root in the table? If so, which? Are there any

examples of inaccurate roots? If so, which?

15 The Regula Falsi method

The regula falsi method is a sort of hybrid of bisection and the secant method. We require that the user

supplies a starting change-of-sign interval. Each step of the method reduces the size of the change-of-sign

interval, so the iterates should not be able to diverge. Regula falsi can be faster than bisection, and “safer”

than the secant method; however it is possible to find functions for which regula falsi does worse than either

of the other two methods.

Assuming we begin with a change of sign interval [a, b], the algorithm can be described as follows:

Algorithm Regula(f,a,b,)

9

1. Set

x =

f(b) ∗ a − f(a) ∗ b

f(b) − f(a)

2. If |f(x)| ≤ , return x.

3. If f(x) has the same sign as f(a), replace a by x, otherwise replace b by x.

4. Return to step 1.

16 Exercise 8

Starting from your bisect.py file, write a Python file named regula.py to carry out the regula falsi algorithm.

Fill in the following table, to compare Bisection, Secant, and Regula Falsi.

Bisection Secant Regula Falsi

Name Formula Interval approxRoot No. Steps No. Steps No. Steps

f1 x^2-9 [0,5] _________ _________ _________ _________

f2 x^5-x-1 [1,2] _________ _________ _________ _________

f3 x*exp(-x) [-1,2] _________ _________ _________ _________

f3 x*exp(-x) [-1.5,1] _________ _________ _________ _________

f4 2*cos(3*x)-exp(x) [0,6] _________ _________ _________ _________

f5 (x-1)^5 [0,3] _________ _________ _________ _________

You should observe both faster and slower convergence, compared with bisection and secant. You should

not observe lack of convergence or convergence to an incorrect solution, except for function f5().

Function f5, (x-1)^5 turns out to be very difficult for regula falsi. Loosen your convergence criterion to

a tolerance of EP SILON = 10−6 and increase the maximum allowable number of iterations to ITMAX =

500, 000 and try again.

SECOND CHANCE:

Regula

Name Formula Interval approxRoot No. Steps

f5 (x-1)^5 [0,3] _________ _________

You should observe convergence, but it takes a very large number of iterations.

regula() and bisect() both keep the current iterate, x in a change-of-sign interval. Why would it be

wrong to have the regula() function use the same convergence criterion as bisect() used?

10