## Description

## Problem 1 Gradient Descent Method. (50 pts)

Implement the gradient descent method to solve the optimization problem:

min

x∈R2

f(x) = e

1−x1−x2 + e

x1+x2−1 + x

2

1 + x1x2 + x

2

2 + x1 − 3×2.

The following input parameters should be considered:

• x

0 = (0, 0)⊤ – the initial point.

• tol = 10−5 – the tolerance parameter. The method should stop whenever the current iterate

x

k

satisfies the criterion ∥∇f(x

k

)∥ ≤ tol.

• σ, γ =

1

2

– parameters for backtracking and the Armijo condition.

Implement different methods using the given parameter.

a) Implement the gradient descent method using constant stepsize. Choose α1 = 1 and α2 = 0.1,

report the number of iterations, the final objective function value, and the point to which the

methods converged. (15 pts)

b) Implement the gradient descent method using backtracking (Armijo) line search method,

report the number of iterations, the final objective function value, and the point to which the

methods converged. (15 pts)

c) Plot figures of the solution path for each of the different step size strategies. (20 pts)

Please include (1) essential parts of code (2) output results and figures in your answer.

Your plots should look similar to Figure 1.

1

Figure 1: Solution path using constant step for α2 = 0.1 and backtracking, respectively.

## Problem 2 Newton’s Method. (30 pts)

Implement the Newton’s method to solve the same problem in Problem 1.

Use only Armijo backtracking line search. Report the number of iterations, the final objective

function value, the point to which the methods converged. (10 pts)

Plot figures of the solution path. (10 pts)

Hint: You could use some implemented functions to calculate matrix inversion. And

your plot should look similar to Figure 2.

Figure 2: Solution path using Newton’s Method.

## Problem 3 Projection onto Convex Sets. (20 pts)

In the class, we have seen interesting examples of projection onto some convex sets, described by

linear constraints, box constraints, and ball constraints (Page 8, Lec 23 of Prof.Pu or Page 17, Lec

24 of Prof.Wu).

Another important convex set, ∆d, is standard simplex (or so-called probability simplex), where

∆d :=

x ∈ R

d

:

X

d

i=1

xi = 1, xi ≥ 0 ∀i

Obviously, it is a convex set.

Now, you are supposed to prove the following statement.

Let x

∗

:= argminx∈∆d

∥x − v∥

2

2

, which means we project a point v in R

d onto a standard simplex.

Under the assumption that v1 ≥ v2 ≥ v3 ≥ · · · ≥ vd, there exists p ∈ {1, 2, · · · , d}, which is also

unique, such that

x

∗

i > 0 i ≤ p

x

∗

i = 0 i > p