Roots of Equations MATH2070 Lab #3 solution




5/5 - (6 votes)

1 Introduction
The root finding problem is the task of finding one or more values for a variable x so that an equation
f(x) = 0
is satisfied. Denote the desired solution as x

. As a computational problem, we are interested in effective and
efficient ways of finding a value x that is “close enough” to x

. There are two common ways to measuring
“close enough.”
1. The “residual error,” |f(x)|, is small, or
2. The “approximation error” or “true error,” |x − x

| is small.
Of course, the true error is not known, so it must be approximated in some way. In this lab we will look
at some nonlinear equations and some simple methods of finding an approximate solution. We will use an
estimate of the true error when it is readily available and will use the residual error the rest of the time.
2 Programming style
In our discussions, we will often present samples of Python codes. There are many ways to write a correct
program; a programming style aims to write a program that is not only correct, but readable, and more
easily debugged, documented, and modified. We encourage you to write programs in such a way that it is
easy to see the underlying mathematical or computational thinking.
Some of the style rules followed in these labs include:
• One statement per line;
• Loop counters and other insignificant variables have short names, such as i, k, m or n.
• Significant variables are named with long names.
• The elements of a formula can be separated by spaces to improve readability;
• The interior blocks of loops and if-tests must be indented.
• Functions have comments following the initial declaration.
Clarity is important for debugging because if code is harder to read then it is harder to debug. Debugging
is one of the most difficult and least rewarding activities you will engage in, so anything that simplifies
debugging pays off. Clarity is also important when others read your code. Since we need to read your code
in order to give you a grade, please format your code similarly to that in the labs.
3 A Sample Problem
Suppose we want to know if there is a solution to the equation:
cos x = x ,
This is not a polynomial or algebraic equation, and there is little hope of forming an explicit expression for
the solution.
Since we can’t solve this equation exactly, it’s worth knowing whether there is anything we can say about
it. In fact, if there is a solution x0 to the equation, we can probably find a computer representable number
x which approximately satisfies the equation (has low residual error) and which is close to x0 (has a low
approximation error). The following exercises use a plot to illustrate the existance of a solution.
We will make plots by constructing a pair of vectors of x-values and corresponding y-values and then
plotting the points with lines connecting them.
4 Exercise 1
The following steps show how to plot the functions y = cos x and y = x together in the same graph from
−π to π.
1. Import numpy and call it np;
2. Import matplotlib and call it plt;
3. Define a vector of x values with the appropriate range;
4. Define vectors y1 and y2 by evaluating functions at x;
5. Use the plot() command twice, for (x,y1) and (x,y2).
6. Use the show() command to make the plot appear;
7. Save a copy of the graph using the savefig() command.
A basic version of your Python script might look like this:
import m a t pl o tli b . p y pl o t a s p l t
import numpy a s np
x = np . l i n s p a c e ( −np . pi , np . pi , 100 )
y1 = x . copy ( ) # Python r e q u i r e s t h i s , i n s t e a d o f ” y1 = x ”
y2 = np . c o s ( x )
p l t . pl o t ( x , y1 )
p l t . pl o t ( x , y2 )
p l t . s a v e f i g ( ’ e x e r c i s e 1 . jp g ’ ) # O p t i on al command . Saves a copy o f t h e p l o t in a f i l e .
p l t . show ( )
You might save your script as On a computer running Linux, I can execute it by the
python3 e x e r c i s e 1 . py
When you see the plot, notice that the two lines intersect. You can even estimate the value of x where this
As a report for this exercise, state your estimated value of the root, and include a copy of your plot.
5 Exercise 2
Let’s reorganize our investigation. Instead of solving the equation cos(x) = x, we will construct a function
f(x) = cos(x) − x, and think of the equivalent problem of finding a root x such that f(x) = 0.
We begin by packaging f(x) as an actual Python function. Recall the structure of a Python function:
1. A declaration statement of the form:
def function name ( list of arguments ):
2. indented computational statements
3. indented a return statement that reports the function value;
Your Python function might be stored as, and have the form:
def cosmx ( x ) :
””” v al u e = cosmx ( x )
In pu t :
x : t h e argument
Output :
v al u e : t h e v al u e o f cos ( x )−x .
import numpy a s np
v al u e = np . c o s ( x ) − x
return v al u e
Let’s test your function at x = 0.5. Because it’s a function, not a script, we can’t execute it by a command
like “python3 cosmx” or “python3 cosmx(0.5)”. Instead, we have to enter an interactive Python session,
import the function, call it, and then quit. Here’s how I do it on an Ubuntu system:
>>from cosmx import cosmx
>>cosmx ( 0 . 5 )
>>q ui t ( )
Your result should be about 0.37758.
Now let’s use graphics again, but this time, we will plot y1 = cosmx(x) and y2 = 0.
1. From your previous script, make a fresh copy;
2. To access your function, you will need to insert the statement
from cosmx import cosmx
3. To create the y2 data, you can’t simply write y2=0, which creates a single zero value. Instead, use the
command y2 = np.zeros ( len ( x ) ) to create a zero vector that is the same length as x.
4. After your plot statements, you should add the statement plt.grid ( True ), so that we will see
helpful grid lines.
5. Make sure you modify your savefig() statement too!
To execute the program, I type
python3 e x e r c i s e 2 . py
Does the value of x at the point where the curve crosses the x-axis agree with the value you saw in the
previous exercise?
Send me the plot file as exercise2.jpg.
6 The Bisection Idea
The idea of the bisection method is very simple. We assume that we are given two values x = a and x = b
(a < b), and that the function f(x) is positive at one of these values (say at a) and negative at the other.
When this occurs, we say that [a, b] is a “change-of-sign interval” for the function. Assuming f is continuous,
we know that there must be at least one root in the interval.
Intuitively speaking, if you divide a change-of-sign interval in half, one or the other half must end up
being a change-of-sign interval, and it is only half as long. To figure out which half contains the root, simply
evaluate the function at the break point. Keep the half interval whose endpoint has a function value of
opposite sign to the midpoint function value. If you repeatedly chop the interval in half this way, you can
decrease the size of the change-of-sign interval below any specified tolerance.
To start thinking of the program we must write, consider the point x = (a + b)/2. If f(x) = 0 we are
done. (This is pretty unlikely.) Otherwise, depending on the sign of f(x), we know that the root lies in
[a, x] or [x, b]. In any case, our change-of-sign interval is now half as large as before. Depending on how the
interval split, replace the value of a or b by the value of x. Repeat this process with the new change of sign
interval [a, b] until it is sufficiently small.
We are guaranteed to converge. We can even compute the maximum number of steps this could take. If
the original change-of-sign interval has length ` then after one bisection the current change-of-sign interval
is of length `/2, and each subsequent step again halves the interval size. We know in advance how well we
will approximate the root x0. These are very powerful facts, which make bisection a robust algorithm—that
is, it is very hard to defeat it.
7 Exercise 3
1. If we know the start points a and b and the interval size tolerance , we can predict beforehand the
number of steps required to reach the specified accuracy. The bisection method will always find the
root in that number or fewer steps. What is the formula for that number?
2. Give an example of a continuous function which has only one root in the interior of the interval [-2, 1],
but for which our bisection method could not be used.
8 Programming Bisection
Before we look at a sample bisection program, let’s discuss some programming issues. If you haven’t done
much programming before, this is a good time to try to understand the logic behind how we choose variables
to set up, what names to give them, and how to control the calculation.
We plan to write the bisection algorithm as a function. This separates the algorithm from the data it
will work on. It allows us to concentrate on getting the algorithm right. And once we have gotten it right,
it will be encapsulated as a function, and can be used over and over on different problems.
Another reasonable recommendation for using functions is that by breaking up a calculation into independent portions, we can hide the algorithm details and temporary variables. The names and values of
variables used within the function will disappear once the function is complete, except for results reported by
the return statement. This reduces the chance that we will accidentally use the same name in two different
parts of a single huge program, which can cause confusion and error.
The bisection algorithm can be expressed in the following way:
Given a function f : R → R and a, b ∈ R so that f(a) · f(b) < 0, construct sequences a
(k) and b
k = 0, . . . with f(a
)· f(b
) < 0 for each k by starting out with a
(1) = a, b
(1) = b and then, for each k > 0,
1. Set x
(k) = (a
(k) + b
2. If |x
k − b
| is small enough, or if f(x
) = 0 exit the algorithm.
3. If f(x
) · f(a
) < 0, then set a
(k+1) = a
(k) and b
(k+1) = x
4. If f(x
) · f(b
) ≤ 0, then set a
(k+1) = x
(k) and b
(k+1) = b
5. Return to step 1.
Mathematically, it makes sense to think of the sequences a
k and b
. But computationally, we really only
need the current values of the left and right endpoints of our interval. Keeping the entire sequence just
clutters up our program with extra indices and data. So if we look at bisection computationally, we get
something like the following:
Algorithm Bisect(f,a,b,)
1. Set x:=(a + b)/2.
2. If |b − a| ≤ , or if f(x) happens to be exactly zero, then exit the function, returning the value x;
3. If sign(f(b)) * sign(f(x)) < 0, then a:=x; otherwise, b:=x.
4. Return to step 1.
The second version of the algorithm is expressed in a form that is easily translated into a loop, with an
explicit exit condition. To check the sign condition, the algorithm employs a sign() function rather than
looking at the value of f(a)∗f(b). This is a better strategy because of roundoff errors if very small quantities
are involved. In Python, there is a smallest positive real number, which we might nickname as “tiny”. Its
value can be returned by the Python value numpy.finfo(float).tiny. If f(a) > 0 and f(b) > 0 are each
less than √
tiny, then their product is zero, not positive.
The language of this description, an example of “pseudocode,” is based on a computer language called
Algol but is intended merely to be a clear means of specifying algorithms in books and papers. The term
“pseudocode” is used for any combination of a computer coding language with natural language. One
objective of this lab is to show how to convert pseudocode into a workable Python program. As a starting
point, let’s fix f(x) to be the function cosmx that you just wrote. Later you will learn how to add it to the
calling sequence. Let’s also fix  = 10−10
Here is a function that carries out the bisection algorithm for our cosmx() function.
def bi s e c t c o sm x ( a , b ) :
”””x , i tC o un t = b i s e c t c o sm x ( a , b )
b i s e c t c o sm x ( ) u se s b i s e c t i o n f o r a r o o t o f cosmx ( x ) = 0 .
In pu t :
a , b : l e f t and r i g h t ends o f a change o f s i g n i n t e r v a l .
Output :
x : t h e e s t im a t e d r o o t .
i tC o un t : t h e number o f b i s e c t i o n s .
from cosmx import cosmx
import numpy a s np
# I n i t i a l i z a t i o n .
EPSILON = 1. 0 e −10;
f a = cosmx ( a )
f b = cosmx ( b )
i tC oun t = 0
# Loop
while ( True ) :
x = ( a + b ) / 2. 0
f x = cosmx ( x )
print ( itCount , ’ : ’ , \
’ f ( ’ , a , ’ ) = ’ , fa , \
’ f ( ’ , x , ’ ) = ’ , fx , \
’ f ( ’ , b , ’ ) = ’ , f b )
i f ( f x == 0 ) :
return x , i tC oun t
e l i f ( abs ( b − a ) < EPSILON ) :
return x , i tC oun t
i tC oun t = i tC oun t + 1
i f ( np . si g n ( f a ) ∗ np . si g n ( f x ) <= 0 ) :
b = x
f b = f x
e l s e :
a = x
f a = f x
Note that in the expression fx == 0 a double equals sign is used. In Python, a single equal sign indicates
assignment, which might better have been written as <=. It transfers a value. Whenever we are asking
whether two items are equal, we need to use the double equal sign instead. This is a tricky distinction, and
if you make the wrong choice, sometimes the compiler will complain, but sometimes it will simply run and
compute nonsense.
Note the text at the beginning of the function, which is delimited by triple quotes. In Python, this
is known as a docstring. Because the function has a docstring, a user can ask for information about it
by typing help ( bisect cosmx ). You are encouraged to write a docstring for your Python functions.
However, Python is a little fussy about the format of a docstring:
• It must be the first item following the function statement;
• It must begin and end with triple double quotes;
• The first line should summarize the function;
• If there is more than one line, the second should be blank, and then you should describe the input and
output variables, and any other information a user might find helpful.
9 Exercise 4
Create the file bisect by typing it in yourself, or by using copy-and-paste, and then be prepared
to run it to answer some of the following questions.
1. In your own words, what does the np.sign(x) function do? What if x is 0?
2. The print() command is used only to monitor progress. Note the use of \ as the continuation
character, which allows the print() statement to extend beyond a single line.
3. What would the final value of itCount be if a and b are already closer together than the tolerance
when the function starts?
4. Try the command:
x , i tC oun t = bi s e c t c o sm x ( 0 . 0 , 3. 0 )
How many iterations were required? Is this value smaller than the value from your formula in Exercise
5. Why did I use abs(b-a) in the convergence test instead of simply writing b-a?
6. The program may seem to be working, but try this command:
x , i tC oun t = bi s e c t c o sm x ( 0 . 0 , 0. 5 )
What do you notice about the result? Apparently, in this case, the input values are not be suitable for
the bisection method. Rather than carry out a worthless computation, the function should check the
input and refuse to process it if it is illegal. Such a check can be made by code like this:
i f ( c o n di ti o n ) :
ra is e Excep ti on ( ”The i n p u t i s i l l e g a l ! ” )
Replace condition by the appropriate computational check. Perhaps you should also replace the error
message by a more understandable message. Rerun your program and verify that it will correctly
process the input (0, 3) and reject (0, 0.5)
10 Functions as Input
Now we have written the file bisect that can find a root of f(x)=cos(x)-x, but it can’t handle
any other function. Just as we are allowed to vary the endpoints a and b as input quantities, we’d like to be
able to be able to specify as an input quantity the function whose root we are seeking.
11 Exercise 5
Copy the file bisect to a new file named
1. In the declaration line for bisect(), add one more input argument. We don’t know what the user
calls the function, but we have to make up a name for it in the argument list, so we’ll just call it f.
2. If you used a docstring to list your input arguments, include a statement about your new input quantity
3. The import cosmx statement is no longer appropriate. Remove it.
4. There are three places where we evaluate the function with a statement like fa = cosmx ( a ) In
each such place, replace cosmx() by f().
Our first test of the new code simply repeats the analysis of cosmx(). However, now, we have to import
that function during the Python session, and then include it on the argument list to bisect(). On my
system, this might look like:
>>from cosmx import cosmx
>>from b i s e c t import b i s e c t
>>x , i tC oun t = b i s e c t ( 0 , 3 , cosmx )
>> q ui t ( )
For our second test, we need a new function. Using the model of, write a Python function called that evaluates the function: f(x) = (x − 5)(32x − 7)(x + 1). Call bisect() using the initial change
of sign interval [0, 1]. We have already seen that we can predict the maximum number of steps necessary to
shrink the interval below the limit of EPSILON. However, for this example, the program may stop early, and
successfully. Can you explain?
12 Exercise 6
The following table contains formulas and search intervals for several functions. Write function files,
through, and use bisect() to find a root in the given interval.
Name Formula Interval approxRoot No. Steps
f1 x^2-9 [0,5] _________ _________
f2 x^5-x-1 [1,2] _________ _________
f3 x*exp(-x) [-1,2] _________ _________
f4 2*cos(3*x)-exp(x) [0,6] _________ _________
f5 (x-1)^5 [0,3] _________ _________
13 The secant method
The bisection algorithm allows the user to predict in advance the number of iterations necessary to reach
the interval tolerance. For root-finding algorithms that do not simply shrink the interval, it can be harder
to determine whether the computation should be continued. One common, but not always reliable, measure
is the magnitude of the residual, which we might compare to a tolerance as in |f(x)| ≤ ). Our next root
finder algorithm, the secant method, will use a test like this for convergence.
Just like the bisection method, the secant method starts out with two points a and b. It does not require
that these points constitute a change of sign interval. The algorithm simply evaluates the function at those
two values, constructs the straight line though (a,f(a)) and (b,f(b)) and sets c to the argument where
the straight line crosses the axis. The formula is simply:
c =
f(b)a − f(a)b
f(b) − f(a)
If f() was actually a linear function, this would be the root. In general, it’s not going to be the root. We
can’t even be sure that f(c) is smaller than f(a) or f(b). Nonetheless, we now discard a. We use the pair of
values b and c to construct a new linear function, and we find its zero at d.
We can continue this process indefinitely, but of course we hope that the sequence of values is converging
to a root, and we want to know whether we are getting close, or whether we need to cancel the process
because we have taken too many steps, or the function values are blowing up.
The secant method can be described in the following manner.
Algorithm Secant(f,a,b,)
1. Set
x =
f(b) a − f(a) b
f(b) − f(a)
2. If |f(x)| ≤ , return x as the approximate root.
3. Replace a with b and b with x.
4. Go back to step 1.
When the secant method converges, it is usually much faster than bisection. But if the initial values
are far from the root, the method can converge very slowly, or even diverge to infinity. Unlike bisection,
the secant method can be generalized to two or more dimensions, and the generalization is usually called
Broyden’s method.
14 Exercise 7
Write a Python function named You might start from the text of The function should
begin as follows:
def s e c a n t ( a , b , f ) :
””” s e c a n t ( a , b , f ) a p p l i e s t h e s e c a n t method f o r a r o o t o f f ( x )
# I n i t i a l i z e
? ? ? Se t fa , f b ? ? ?
EPSILON = 1. 0 e−10
ITMAX = 1000
# Loop
fo r i tC oun t in range ( 0 , ITMAX ) :
? ? ? Determine x , e v al u a t e f ( x )
i f ( c o n di ti o n )
return x , i tC oun t
? ? ? a r e pl a c e d by b , f a r e pl a c e d by fb ,
? ? ? b r e pl a c e d by x , and f b r e pl a c e d by f x
# E x i t e d from l o o p w i t h o u t c onvergence
ra is e Excep ti on ( ” s e c a n t : No c o n v e r g e n c e a f t e r maximum number o f i t e r a t i o n s . ” )
You will need to replace the sections marked ??? by the appropriate Python statements. Notice that our
loop section now uses a Python for statement, set up to run no more than 1000 iterations. If we don’t leave
the loop with convergence, then we trigger a final error message.
Compare the bisection and secant methods by filling in the following table.
Bisection Bisection Secant Secant
Name Formula [a,b] approxRoot No. Steps approxRoot No. Steps
f1 x^2-9 [0,5] _________ _________ _________ _________
f2 x^5-x-1 [1,2] _________ _________ _________ _________
f3 x*exp(-x) [-1,2] _________ _________ _________ _________
f3 x*exp(-x) [-1.5,1] _________ _________ _________ _________
f4 2*cos(3*x)-exp(x) [0,6] _________ _________ _________ _________
f5 (x-1)^5 [0,3] _________ _________ _________ _________
In the above table, you can see that the secant method can be either faster or slower than bisection. You
may also observe convergence failures: either convergence to a value that is not near a root or convergence
to a value substantially less accurate than expected. Regarding the bisection roots as accurate, are there
any examples of convergence to a value that is not near a root in the table? If so, which? Are there any
examples of inaccurate roots? If so, which?
15 The Regula Falsi method
The regula falsi method is a sort of hybrid of bisection and the secant method. We require that the user
supplies a starting change-of-sign interval. Each step of the method reduces the size of the change-of-sign
interval, so the iterates should not be able to diverge. Regula falsi can be faster than bisection, and “safer”
than the secant method; however it is possible to find functions for which regula falsi does worse than either
of the other two methods.
Assuming we begin with a change of sign interval [a, b], the algorithm can be described as follows:
Algorithm Regula(f,a,b,)
1. Set
x =
f(b) ∗ a − f(a) ∗ b
f(b) − f(a)
2. If |f(x)| ≤ , return x.
3. If f(x) has the same sign as f(a), replace a by x, otherwise replace b by x.
4. Return to step 1.
16 Exercise 8
Starting from your file, write a Python file named to carry out the regula falsi algorithm.
Fill in the following table, to compare Bisection, Secant, and Regula Falsi.
Bisection Secant Regula Falsi
Name Formula Interval approxRoot No. Steps No. Steps No. Steps
f1 x^2-9 [0,5] _________ _________ _________ _________
f2 x^5-x-1 [1,2] _________ _________ _________ _________
f3 x*exp(-x) [-1,2] _________ _________ _________ _________
f3 x*exp(-x) [-1.5,1] _________ _________ _________ _________
f4 2*cos(3*x)-exp(x) [0,6] _________ _________ _________ _________
f5 (x-1)^5 [0,3] _________ _________ _________ _________
You should observe both faster and slower convergence, compared with bisection and secant. You should
not observe lack of convergence or convergence to an incorrect solution, except for function f5().
Function f5, (x-1)^5 turns out to be very difficult for regula falsi. Loosen your convergence criterion to
a tolerance of EP SILON = 10−6 and increase the maximum allowable number of iterations to ITMAX =
500, 000 and try again.
Name Formula Interval approxRoot No. Steps
f5 (x-1)^5 [0,3] _________ _________
You should observe convergence, but it takes a very large number of iterations.
regula() and bisect() both keep the current iterate, x in a change-of-sign interval. Why would it be
wrong to have the regula() function use the same convergence criterion as bisect() used?