A New SR1 Formula for Solving Nonlinear
Optimization Problems
By:
Dastan A. Haji
Fathila A. Taher
Supervisor:
Alaa L. Ibrahim
Ministry of Higher Education and Scientific
Research
University of Zakho
Faculty of Science
Department of Mathematics
Proposal submitted in partial fulfilment of the requirements of the University of
Zakho for the degree of Bachelor of Science in Mathematics
Abstract
Quasi-Newton methods play a main role in iterative processes for solving
nonlinear unconstrained optimization problems. In this proposal, we propose a
new correction of the quasi-Newton symmetric rank-one (SR1) updating formula
for unconstrained optimization, based on the Barzilai-Borwein step size.
Introduction
We deal with a class of methods for local unconstrained minimization, i.e., for
finding a local minimum point 𝑥∗ ∈ ℝ𝑛 such that

$$f(x^*) = \min_{x \in \mathbb{R}^n} f(x), \qquad (1)$$

where 𝑓: ℝ𝑛 → ℝ is a twice continuously differentiable objective function and 𝑥 is
an 𝑛-dimensional vector. The conjugate gradient method and the quasi-Newton
(QN) methods use only the first derivative of the function.
They are therefore often used in applications when only the first derivative is
known, or when higher derivatives are very expensive to calculate. In some
applications the optimal solution might not be of particular interest; instead, the
interest might be the behaviour of a particular method. The methods can be
used in a great variety of fields; some examples from different areas are given
below.
In electromagnetics, problems of distributed parameter estimation can be solved,
as in [1], where a specially constructed QN update matrix takes advantage of the
structure of the problem. Optical tomography is another area where these
techniques are used [2]. These two methods can also be used in neural-network
training; see, e.g., [3-4].
Finally, the methods can be used in radiation therapy optimization [5], from which
this study has evolved. Preconditioning is an important technique used to develop
an efficient conjugate gradient method solver for challenging problems in scientific
computing [6]. The preconditioned conjugate gradient (PCG) method is one of the
most effective tools for solving large-scale unconstrained optimization problems.
Problem statement
In this section we focus on deriving the modified SR1 algorithm for
unconstrained optimization. The Barzilai-Borwein step length is defined as

$$\alpha_k^{BB} = \frac{v_k^T v_k}{y_k^T v_k};$$

for more details see [7].
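For illustration, the step size above can be computed directly; the following is a minimal Python sketch, assuming NumPy arrays v and y that hold 𝑣𝑘 and 𝑦𝑘:

```python
import numpy as np

def bb_step(v, y):
    """Barzilai-Borwein step size: alpha_k^BB = (v_k^T v_k) / (y_k^T v_k)."""
    return (v @ v) / (y @ v)
```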
Now, suppose that the quasi-Newton equation is defined by the following
formula:

$$H_{k+1}^{New} y_k = v_k. \qquad (2)$$
In the SR1 formula, the correction term is symmetric and has the form
$\alpha_k L_k L_k^T$, where $\alpha_k \in \mathbb{R}$, $t \ge 0$ and $L_k \in \mathbb{R}^n$. Therefore, the update equation is

$$H_{k+1}^{New} = t\alpha_k^{BB} H_k + \alpha_k L_k L_k^T. \qquad (3)$$
Our goal now is to determine $\alpha_k$ and $L_k$. Multiplying both sides of the above
equation by $y_k$ and using equation (2), we get

$$H_{k+1}^{New} y_k = t\alpha_k^{BB} H_k y_k + \alpha_k L_k L_k^T y_k = v_k.$$
Since $L_k^T y_k$ is a scalar, we have

$$v_k - t\alpha_k^{BB} H_k y_k = (\alpha_k L_k^T y_k)\, L_k, \qquad (4)$$

and hence $L_k = \dfrac{v_k - t\alpha_k^{BB} H_k y_k}{\alpha_k L_k^T y_k}$. So,

$$\alpha_k L_k L_k^T = \frac{(v_k - t\alpha_k^{BB} H_k y_k)(v_k - t\alpha_k^{BB} H_k y_k)^T}{\alpha_k (L_k^T y_k)^2}. \qquad (5)$$
Now, multiplying (4) by $y_k^T$, we obtain

$$y_k^T (v_k - t\alpha_k^{BB} H_k y_k) = (\alpha_k L_k^T y_k)\, y_k^T L_k.$$

Observe that $\alpha_k$ is a scalar and $y_k^T L_k = L_k^T y_k$; then the above equation
becomes

$$y_k^T (v_k - t\alpha_k^{BB} H_k y_k) = \alpha_k (L_k^T y_k)^2. \qquad (6)$$
Substituting equation (6) into equation (5), we have

$$\alpha_k L_k L_k^T = \frac{(v_k - t\alpha_k^{BB} H_k y_k)(v_k - t\alpha_k^{BB} H_k y_k)^T}{y_k^T (v_k - t\alpha_k^{BB} H_k y_k)}. \qquad (7)$$

Then

$$H_{k+1}^{New} = t\alpha_k^{BB} H_k + \frac{(v_k - t\alpha_k^{BB} H_k y_k)(v_k - t\alpha_k^{BB} H_k y_k)^T}{y_k^T (v_k - t\alpha_k^{BB} H_k y_k)}. \qquad (8)$$
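To make the update concrete, the following Python sketch applies formula (8) to a single pair (𝑣𝑘, 𝑦𝑘). It assumes NumPy; the function name, the parameter t and the safeguard on the denominator are our own illustrative choices, not part of the proposal itself.

```python
import numpy as np

def new_sr1_update(H, v, y, t=1.0, eps=1e-10):
    """Sketch of the proposed SR1 update, formula (8).

    H : current inverse-Hessian approximation H_k (n x n)
    v : step v_k = x_{k+1} - x_k
    y : gradient difference y_k = g_{k+1} - g_k
    t : nonnegative scaling parameter from formula (3)
    """
    alpha_bb = (v @ v) / (y @ v)           # Barzilai-Borwein step size
    r = v - t * alpha_bb * (H @ y)         # v_k - t*alpha^BB*H_k*y_k
    denom = y @ r                          # y_k^T (v_k - t*alpha^BB*H_k*y_k)
    if abs(denom) < eps * np.linalg.norm(y) * np.linalg.norm(r):
        return t * alpha_bb * H            # skip the rank-one term when unstable
    return t * alpha_bb * H + np.outer(r, r) / denom
```

By construction, the returned matrix satisfies the quasi-Newton condition (2); numerically, np.allclose(new_sr1_update(H, v, y) @ y, v) should hold whenever the rank-one term is not skipped.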
Research questions
Through this work we will also try to answer the following questions:
1. What is the benefit of the new suggestion?
2. Does the new SR1 formula satisfy the quasi-Newton equation?
3. Does the new SR1 formula satisfy the positive definiteness property?
4. How can it be shown that the new SR1 method is more efficient than the standard
method?
Literature review
Some Basic Definitions [8,9,10]
Definition 5.1: Optimization means finding the best solution for a given
problem. Mathematically, this means finding the minimum or the maximum
value of a function of 𝑛 variables, 𝑓(𝑥1, 𝑥2, … , 𝑥𝑛), where 𝑛 may be any
integer greater than zero.
Definition 5.2: By unconstrained optimization we mean that a function 𝑓(𝑥) is
minimized subject to no constraints.
Definition 5.3: A point 𝑥∗ ∈ 𝑅𝑛 is said to be a stationary (or critical) point of
the differentiable function 𝑓 if ∇𝑓(𝑥∗) = 0.
Figure (1) Stationary Points.
Definition 5.4: Let 𝑥∗ be an interior point of a region 𝑅 at which 𝑓 has a local
minimum or a local maximum. If 𝑓 is differentiable at 𝑥∗, then ∇𝑓(𝑥∗) = 0; this is
the necessary condition for 𝑓 to be minimized. Also, if 𝑓 is twice continuously
differentiable, ∇𝑓(𝑥∗) = 0 and 𝑧𝑇∇2𝑓(𝑥∗)𝑧 > 0 for any non-zero vector 𝑧, then 𝑓
has a local minimum at 𝑥∗; this is the sufficient condition to be minimized.
Definition 5.5: The point 𝑥∗ is said to be the global minimizer of the function
𝑓(𝑥) if 𝑓(𝑥∗) ≤ 𝑓(𝑥) for all 𝑥 ∈ 𝑅𝑛. The value of 𝑓(𝑥∗) is the corresponding
global minimum. It is called a local minimizer of the function 𝑓(𝑥) if there
exists a small positive number 𝜀 such that 𝑓(𝑥∗) ≤ 𝑓(𝑥) for all 𝑥 ∈ 𝑅𝑛 which
satisfy ‖𝑥 − 𝑥∗‖ < 𝜀. The value of 𝑓(𝑥∗) is the corresponding local minimum.
Figure (2) Global and Local Minimum.
Definition 5.6: A set 𝐶 is convex if the line segment between any two points in 𝐶 lies
in 𝐶, i.e., if for any 𝑥1, 𝑥2 ∈ 𝐶 and any 𝜃 with 0 ≤ 𝜃 ≤ 1, we have 𝜃𝑥1 + (1 − 𝜃)𝑥2 ∈ 𝐶.
Figure (3) Some examples of convex and non-convex sets.
Definition 5.7: If for any non-zero vector 𝑥 in 𝑅𝑛 we have 𝑥𝑇𝐺𝑥 > 0, then 𝐺 is
called a positive definite matrix.
Definition 5.8: The quadratic function 𝑓(𝑥) = (1/2)𝑥𝑇𝐺𝑥 + 𝑥𝑇𝑏 + 𝑐, where 𝑐
is a scalar, 𝑏 is a constant vector and 𝐺 is a positive definite symmetric matrix,
has a minimum at the point 𝑥∗ = −𝐺−1𝑏.
Definition 5.9: Quadratic termination means that the method will locate the
minimizer 𝑥∗ of a quadratic function in a finite number of iterations.
Definition 5.10: The condition number of a positive definite matrix is the ratio
of the largest to the smallest eigenvalue of the Hessian matrix at the minimum,
𝐾 = 𝜆𝑚𝑎𝑥/𝜆𝑚𝑖𝑛.
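As a quick numerical illustration of Definitions 5.7, 5.8 and 5.10, the Python sketch below uses a small made-up matrix 𝐺 and vector 𝑏; it checks positive definiteness, computes the minimizer 𝑥∗ = −𝐺−1𝑏 and evaluates the condition number 𝐾.

```python
import numpy as np

G = np.array([[4.0, 1.0], [1.0, 3.0]])    # symmetric positive definite matrix
b = np.array([1.0, 2.0])
c = 5.0

eigvals = np.linalg.eigvalsh(G)            # eigenvalues of the symmetric matrix G
print("positive definite:", bool(np.all(eigvals > 0)))       # Definition 5.7

x_star = -np.linalg.solve(G, b)            # minimizer x* = -G^{-1} b (Definition 5.8)
f = lambda x: 0.5 * x @ G @ x + x @ b + c
print("f(x*) =", f(x_star), "f(x* + d) =", f(x_star + np.array([0.1, -0.2])))

K = eigvals.max() / eigvals.min()          # condition number K (Definition 5.10)
print("condition number K =", K)
```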
Definition 5.11: Each iteration of a line search method computes a search
direction 𝑑𝑘 and then decides how far to move along that direction. The iteration is
given by
𝑥𝑘+1 = 𝑥𝑘 + 𝛼𝑘𝑑𝑘
where the positive scalar 𝛼𝑘 is called the step length. The success of a line search
method depends on effective choices of both the direction 𝑑𝑘 and the step length
𝛼𝑘. Most line search algorithms require 𝑑𝑘 to be a descent direction, i.e. 𝑑𝑘^𝑇𝑔𝑘 < 0
for all 𝑘, because this property guarantees that the value of the function 𝑓 can be
reduced along this direction.
Definition 5.12: The optimization algorithm is said to use an exact line search if
it satisfies
𝑔𝑘+1^𝑇 𝑑𝑘 = 0 , 𝑘 = 0, 1, 2, … , 𝑛,
where 𝑔 is the gradient of 𝑓(𝑥) and 𝑑 is the search direction. It uses an inexact line
search if 𝑔𝑘+1^𝑇 𝑑𝑘 ≠ 0 , 𝑘 = 0, 1, 2, … , 𝑛.
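To illustrate Definitions 5.11 and 5.12, the sketch below (again with made-up quadratic data) takes the exact step along a descent direction of a quadratic, 𝛼 = −𝑔^𝑇𝑑/(𝑑^𝑇𝐺𝑑), and verifies that the new gradient is orthogonal to the search direction.

```python
import numpy as np

G = np.array([[4.0, 1.0], [1.0, 3.0]])     # Hessian of the quadratic model
b = np.array([1.0, 2.0])
grad = lambda x: G @ x + b                  # gradient of f(x) = 0.5 x^T G x + b^T x

x = np.array([2.0, -1.0])
g = grad(x)
d = -g                                      # steepest-descent direction, d^T g < 0
alpha = -(g @ d) / (d @ G @ d)              # exact line search step for a quadratic
x_new = x + alpha * d
print("g_{k+1}^T d_k =", grad(x_new) @ d)   # ~0 up to rounding: exact line search
```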
Overview of Quasi-Newton (QN) Methods
Quasi-Newton methods revolutionized nonlinear optimization in the 1960s
because they avoid costly computations of Hessian matrices and perform well in
practice. Several variants have been proposed, but since the 1970s the BFGS
method has become more and more popular, and today it is accepted as the best
QN method [11]. The search direction is defined as

$$d_k = -H_k g_k, \qquad (9)$$

where $H_k$ is an approximation of the inverse Hessian $G_k^{-1}$ and satisfies the QN condition

$$H_{k+1} y_k = v_k, \qquad (10)$$

where $v_k = x_{k+1} - x_k = \alpha_k d_k$ and $y_k = g_{k+1} - g_k$.
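As a sketch of how these pieces fit together, the following Python loop runs the quasi-Newton iteration 𝑥𝑘+1 = 𝑥𝑘 + 𝛼𝑘𝑑𝑘 with 𝑑𝑘 = −𝐻𝑘𝑔𝑘. The function names, the fixed step length and the stopping rule are illustrative assumptions; a practical implementation would use a proper line search.

```python
import numpy as np

def quasi_newton(grad_f, x0, update, alpha=1.0, tol=1e-6, max_iter=200):
    """Generic quasi-Newton loop: d_k = -H_k g_k with an inverse-Hessian update."""
    x = np.asarray(x0, dtype=float)
    H = np.eye(x.size)                      # initial inverse-Hessian approximation
    g = grad_f(x)
    for _ in range(max_iter):
        if np.linalg.norm(g) < tol:
            break
        d = -H @ g                          # search direction, formula (9)
        x_new = x + alpha * d               # fixed step; replace with a line search
        g_new = grad_f(x_new)
        v, y = x_new - x, g_new - g         # secant pair from the QN condition (10)
        H = update(H, v, y)                 # e.g. the proposed SR1 update, formula (8)
        x, g = x_new, g_new
    return x
```

For example, quasi_newton(grad_f, x0, new_sr1_update) would run this loop with the update sketched after formula (8).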
Many modifications have been applied to QN methods in a bid to improve their
performance. In 1974, Oren [12] developed the self-scaling VM algorithms; Oren's
formula can be written as

$$H_{k+1} = \left[ H_k - \frac{H_k y_k y_k^T H_k}{y_k^T H_k y_k} + \vartheta R_k R_k^T \right] \gamma_k + \frac{s_k s_k^T}{s_k^T y_k}, \qquad (11)$$

where $\gamma_k = \dfrac{s_k^T y_k}{y_k^T H_k y_k}$ and $\vartheta = 1$.
Clearly, when $\gamma_k = 1$, formula (11) reduces to the Broyden class update. Also, to
improve the performance of VM updates, Biggs [13] proposed choosing $H_{k+1}$ to
satisfy the modified equation $H_{k+1} y_k = \epsilon_k s_k$, where $\epsilon_k > 0$ is a scaling
parameter. The modified BFGS update may be written as

$$H_{k+1} = H_k - \frac{H_k y_k s_k^T + s_k y_k^T H_k}{s_k^T y_k} + \left( \frac{1}{\tau_k} + \frac{y_k^T H_k y_k}{s_k^T y_k} \right) \frac{s_k s_k^T}{s_k^T y_k}, \qquad (12)$$

where

$$\tau_k = \frac{1}{\epsilon_k} = \frac{6}{s_k^T y_k} \left[ h(x_k) - h(x_{k+1}) + s_k^T g_{k+1} \right] - 2.$$
Also, S. Shareef and A. Ibrahim [14] made a modification of the self-scaling symmetric
rank-one update with the QN condition $H_{k+1} y_k = \tau s_k$, as follows:

$$H_{k+1} = H_k + \frac{(\tau s_k - H_k y_k)(\tau s_k - H_k y_k)^T}{(\tau s_k - H_k y_k)^T y_k}, \qquad (13)$$

where $\tau = t\left(1 + (1 - \vartheta)\rho_k\right)$, $t \ge 0$, $\vartheta \in (0,1)$ and $\rho_k = \dfrac{s_k^T y_k}{\|s_k\|^2}$.
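A minimal Python sketch of update (13), under the same assumptions as the earlier snippets (NumPy; the values chosen for t and 𝜗 are arbitrary illustrative choices), might look as follows:

```python
import numpy as np

def scaled_sr1_update(H, s, y, t=1.0, theta=0.5, eps=1e-10):
    """Sketch of the self-scaling SR1 update (13) of Shareef and Ibrahim [14]."""
    rho = (s @ y) / (s @ s)                 # rho_k = s_k^T y_k / ||s_k||^2
    tau = t * (1.0 + (1.0 - theta) * rho)   # scaling parameter tau
    r = tau * s - H @ y                     # tau*s_k - H_k*y_k
    denom = r @ y                           # (tau*s_k - H_k*y_k)^T y_k
    if abs(denom) < eps * np.linalg.norm(r) * np.linalg.norm(y):
        return H                            # skip the update when the denominator is tiny
    return H + np.outer(r, r) / denom
```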
Objective of the study
The main goal of our project is to obtain a new quasi-Newton method for
solving unconstrained optimization problems, based on the standard QN (SR1)
method and the Barzilai-Borwein step size, under some hypotheses.
Method of research
This work consists of two chapters. The first chapter gives a general
introduction to numerical unconstrained optimization problems. In Chapter
two, we present the quasi-Newton methods together with the derivation of a new
quasi-Newton (SR1) method, and we discuss whether the new SR1 formula
satisfies the quasi-Newton condition and preserves positive definiteness.
Finally, we show the performance of the new method numerically.
References
[1].E. Haber. Quasi-Newton methods for large-scale electromagnetic inverse problems.
Inverse Problems, 21(1):305–333, 2005.
[2].A. D. Klose and A. H. Hielscher. Quasi-Newton methods in optical tomographic
image reconstruction. Inverse Problems, 19(2):387–409, 2003.
[3].J. W. Denton and M. S. Hung. A comparison of nonlinear optimization methods
for supervised learning in multilayer feedforward neural networks. European Journal
of Operational Research, 93:358–368, 1996.
[4] A. L. Ibrahim and M. G. Mohammed, “A new three-term conjugate gradient method for
training neural networks with global convergence” Indonesian Journal of Electrical
Engineering and Computer Science, Vol. 28, No. 1, October 2022, pp. 547~554. DOI:
http://doi.org/10.11591/ijeecs.v28.i1.pp551-558
[5].F. Carlsson and A. Forsgren. Iterative regularization in intensity-modulated radiation
therapy optimization. Medical Physics, 33(1):225–234, 2006.
[6]. M. Benzi, Preconditioning Techniques for Large Linear Systems: A Survey, Journal of
Computational Physics, 182, pp. 418-477, 2002.
[7]. Barzilai, J. and Borwein, J.M. (1988), Two-point step size gradient methods, IMA J. Numer.
Anal., 8, 141-148.
[8]. Fletcher R. (1987), Practical Methods of Optimization, Wiley, Chichester.
[9].Nocedal J. and Wright S. J., (2006), Numerical Optimization, Springer Series in
Operations Research, 2nd Edition, Springer Verlag, New York.
[10]. Sun W. and Yuan Y., (2006), Optimization Theory and Methods: Nonlinear
Programming, Springer Science+Business Media, LLC, New York, USA.
[11]. Mascarenhas W. F., (2004), The BFGS method with exact line searches fails for non-convex
objective functions, Math. Program., Ser. A, 99.
[12].Oren S. S. (1974), On the selection of parameters in self-scaling variable metric
algorithm, Mathematical Programming, 3, 351-367.
[13].Biggs M. C. (1973), A note on minimization algorithms which make use of non-
quadratic properties of objective function, J. Inst. Maths Applics, 12, 337-338.
[14]. Shareef S. and Ibrahim A., (2019), A New Quasi-Newton (SR1) With PCG
Method for Unconstrained Nonlinear Optimization, International Journal of
Advanced Trends in Computer Science and Engineering, 8, 3124 – 3128.