On three-term conjugate gradient method for optimization problems with applications on COVID-19 model and robotic motion control

This article has been updated

Abstract

The three-term conjugate gradient (CG) algorithms are among the efficient variants of CG algorithms for solving optimization models. This is due to their simplicity and low memory requirements. On the other hand, the regression model is one of the statistical relationship models whose solution is obtained using one of the least squares methods, including CG-like methods. In this paper, we present a modification of a three-term conjugate gradient method for unconstrained optimization models and further establish its global convergence under an inexact line search. The proposed method was extended to formulate a regression model for the novel coronavirus (COVID-19). The study considers the globally infected cases from January to October 2020 in parameterizing the model. Preliminary results show that the proposed method is promising and produces an efficient regression model for the COVID-19 pandemic. The method was also extended to solve a motion control problem involving a two-joint planar robot.

Introduction

Consider the following optimization model:

$$ \min f(x), \quad x \in \mathbb{R}^{n}, $$
(1.1)

where \(f:\mathbb{R}^{n} \to \mathbb{R}\) is a smooth function whose gradient \(\nabla f(x)=g(x)\) is available. Problems of the form (1.1) can be traced to many professional fields of science, astronomy, engineering, economics, and many more (see, for example, [1, 2]). Throughout this paper, we shall abbreviate \(g(x_{k})\) and \(f(x_{k})\) by \(g_{k}\) and \(f_{k}\), respectively. Also, \(\|\cdot\|\) represents the Euclidean norm of vectors.

The nonlinear CG methods play an important role in solving large-scale optimization models due to the modesty of their memory requirements and nice convergence properties. Generally, the iterates of the CG methods are usually determined through the following recursive computational scheme:

$$ x_{k+1}=x_{k}+s_{k}, \quad s_{k}=t_{k} d_{k},\quad k \geq 0, $$
(1.2)

where \(t_{k}\) is the step-size computed along the search direction \(d_{k}\). For the first iteration, \(d_{0}\) is always the steepest descent direction, that is, \(d_{0}=-g_{0}\) [3]. However, subsequent directions are recursively determined by

$$ d_{k}=-g_{k}+\beta _{k} d_{k-1},\quad k\geq 1, $$
(1.3)

where the scalar \(\beta _{k}\) is known as the CG coefficient; different choices of \(\beta _{k}\) yield different CG methods.

The following line search procedures have been used in the convergence analysis and implementations of the already existing CG methods [4]. The convergence analysis often requires the line search to satisfy the exact line search, the Wolfe or strong Wolfe (SWP) line search. The exact line search requires the step-size \(t_{k}\) to satisfy

$$ f(x_{k}+t_{k} d_{k} ):=\min _{t \geq 0} f(x_{k}+t d_{k}). $$
(1.4)

The standard Wolfe line search requires computing \(t_{k}\) such that the following two conditions hold:

$$\begin{aligned} & f(x_{k}+t_{k} d_{k} )\leq f(x_{k} )+\delta t_{k} g_{k}^{T} d_{k}, \end{aligned}$$
(1.5)
$$\begin{aligned} & g(x_{k}+t_{k} d_{k})^{T} d_{k}\geq \sigma g_{k}^{T} d_{k}. \end{aligned}$$
(1.6)

The SWP is to compute \(t_{k}\) satisfying (1.5) and

$$ \bigl\vert g(x_{k}+t_{k} d_{k})^{T} d_{k} \bigr\vert \leq \sigma \bigl\vert g_{k}^{T} d_{k} \bigr\vert , $$
(1.7)

where \(0<\delta <\sigma <1\).
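The line search conditions can be checked numerically for a trial step. The following is a minimal sketch, not the authors' implementation: the helper names `dot` and `wolfe_flags` and the toy objective \(f(x)=\frac{1}{2}\|x\|^{2}\) are assumptions made for illustration, and the strong Wolfe check uses the usual absolute-value form of the curvature condition.

```python
def dot(u, v):
    """Euclidean inner product of two equal-length lists."""
    return sum(a * b for a, b in zip(u, v))


def wolfe_flags(f, grad, x, d, t, delta=0.01, sigma=0.1):
    """Return (armijo, curvature, strong_curvature) for the step t along d.

    armijo           : sufficient decrease condition (1.5)
    curvature        : standard Wolfe curvature condition (1.6)
    strong_curvature : absolute-value (strong Wolfe) curvature condition
    """
    g0d = dot(grad(x), d)                       # g_k^T d_k
    x_new = [xi + t * di for xi, di in zip(x, d)]
    g1d = dot(grad(x_new), d)                   # g(x_k + t d_k)^T d_k
    armijo = f(x_new) <= f(x) + delta * t * g0d
    curvature = g1d >= sigma * g0d
    strong_curvature = abs(g1d) <= sigma * abs(g0d)
    return armijo, curvature, strong_curvature


# Toy problem (hypothetical): f(x) = 0.5 * ||x||^2, so grad f(x) = x.
f = lambda x: 0.5 * dot(x, x)
grad = lambda x: list(x)
x0 = [1.0, -2.0]
d0 = [-1.0, 2.0]                                # steepest descent direction

flags_exact = wolfe_flags(f, grad, x0, d0, 1.0)   # exact minimizer along d0
flags_tiny = wolfe_flags(f, grad, x0, d0, 1e-3)   # very short step
```

In this toy case the very short step still satisfies (1.5) but violates the curvature conditions, which is exactly the situation the curvature requirement is meant to rule out.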

Presently, there are several known formulas for different CG parameters (see [4–10]). One of the most efficient algorithms among the well-known formulas is the PRP [7, 8] defined by

$$ \beta _{k}^{\mathrm{PRP}}=\frac{g_{k}^{T} y_{k-1}}{ \Vert g_{k-1} \Vert ^{2}}, $$
(1.8)

where \(y_{k-1}=g_{k}-g_{k-1}\). From the computational point of view, the PRP algorithm performs better than most CG algorithms, and the convergence result has been established under some line search procedures. However, for a general function, the PRP method fails with regard to the global convergence under the Wolfe line search procedure. This is because the direction of search \(d_{k}\) is not descent for a general objective function [4]. This problem inspired numerous researchers to study the global convergence of PRP method under inexact line search. Interestingly, considering the general function, Yuan et al. [11] proved the global convergence of PRP method using a modified Wolfe line search procedure. More practical approaches of the line search have been employed to identify a step-size capable of achieving adequate reduction in the objective function \(f(x)\) at minimal cost.

Recently, Rivaie et al. [12] proposed a variant of PRP method by replacing the term \(\|g_{k-1}\|^{2}\) in the denominator of PRP with \(\|d_{k-1}\|^{2}\) as follows:

$$ \beta _{k}^{\mathrm{RMIL}}=\frac{g_{k}^{T} y_{k-1}}{ \Vert d_{k-1} \Vert ^{2}}, $$
(1.9)

and showed that the method converges globally under the exact line search. However, Dai [13] pointed out a wrong inequality used in the convergence result of RMIL method and suggested some necessary corrections as follows:

$$ \beta _{k}^{\mathrm{RMIL}+}= \textstyle\begin{cases} \frac{g_{k}^{T} y_{k-1}}{ \Vert d_{k-1} \Vert ^{2}}, & \text{if } 0\leq g_{k}^{T} g_{k-1}\leq \Vert g_{k} \Vert ^{2}, \\ 0, &\text{otherwise}, \end{cases} $$
(1.10)

and further established the global convergence under the exact line search. Preliminary results have been presented using the same benchmark test problems with different initial guesses to illustrate the efficiency of the modified method. More recently, Yousif [14] modified the work of Dai [13] and showed that RMIL+ converges globally under the strong Wolfe line search. For more on the convergence analysis of the CG method, please refer to [15–19].
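The three coefficients (1.8), (1.9), and (1.10) differ only in the denominator and in the safeguard condition. The sketch below illustrates this with plain Python lists; the function names are hypothetical and chosen only for this example.

```python
def dot(u, v):
    """Euclidean inner product of two equal-length lists."""
    return sum(a * b for a, b in zip(u, v))


def beta_prp(g, g_prev):
    """PRP coefficient (1.8): g_k^T y_{k-1} / ||g_{k-1}||^2."""
    y = [a - b for a, b in zip(g, g_prev)]
    return dot(g, y) / dot(g_prev, g_prev)


def beta_rmil(g, g_prev, d_prev):
    """RMIL coefficient (1.9): g_k^T y_{k-1} / ||d_{k-1}||^2."""
    y = [a - b for a, b in zip(g, g_prev)]
    return dot(g, y) / dot(d_prev, d_prev)


def beta_rmil_plus(g, g_prev, d_prev):
    """RMIL+ coefficient (1.10): RMIL if 0 <= g_k^T g_{k-1} <= ||g_k||^2, else 0."""
    if 0.0 <= dot(g, g_prev) <= dot(g, g):
        return beta_rmil(g, g_prev, d_prev)
    return 0.0


# Illustrative vectors (hypothetical data).
g, d_prev = [1.0, 0.0], [-2.0, 0.0]
b_prp = beta_prp(g, [0.5, 0.0])
b_rmil = beta_rmil(g, [0.5, 0.0], d_prev)
b_plus_kept = beta_rmil_plus(g, [0.5, 0.0], d_prev)   # condition holds
b_plus_reset = beta_rmil_plus(g, [2.0, 0.0], d_prev)  # g^T g_prev > ||g||^2
```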

It is worth noting that the sufficient descent property

$$ g_{k}^{T} d_{k}\leq -\lambda \Vert g_{k} \Vert ^{2}, \quad \lambda >0, $$
(1.11)

plays a crucial role in the convergence analysis of the CG methods including the RMIL method. In this regard, several variants of the CG methods have been defined to satisfy (1.11) independent of the line search technique used.

One of the efficient variants of the CG methods is the three-term CG method where the search direction \(d_{k}\) contains three terms. One of the classical three-term methods was proposed by Beale [20], using the coefficient \(\beta _{k}^{\mathrm{HS}}\) [5]. The author constructed a new direction of search as follows:

$$ d_{k}=-g_{k}+\beta _{k}d_{k-1}+\gamma _{k} d_{t}, $$

where \(d_{t}\) is the restart direction and

$$ \gamma _{k}= \textstyle\begin{cases} 0, & \text{if } k=t+1, \\ \frac{g_{k}^{T} y_{t}}{d_{t}^{T} y_{t}},& \text{if } k>t+1. \end{cases} $$

The performance of this method was improved using an efficient restart strategy developed by McGuire [21]. The first three-term PRP algorithm (TTPRP) was defined by Zhang et al. [22] with the formula given as

$$ d_{k}=-g_{k}+\beta _{k} d_{k-1}+\theta _{k} y_{k-1}, $$

where \(\beta _{k}\) is the PRP coefficient defined in (1.8) and \(\theta _{k}=- \frac{g_{k}^{T} d_{k-1}}{g_{k-1}^{T} g_{k-1}}\). An attractive feature of this method is that the descent condition

$$ g_{k}^{T} d_{k}\leq - \Vert g_{k} \Vert ^{2}, $$
(1.12)

holds independent of any line search, and the global convergence was established under a modified Armijo line search. Based on the structure of TTPRP, Liu et al. [23] extended the coefficient of RMIL (1.9) to define a three-term CG method known as TTRMIL as follows:

$$ d_{0}=-g_{0}, d_{k}=-g_{k}+ \beta _{k} d_{k-1}+\theta _{k} y_{k-1}, \quad k\geq 1, $$
(1.13)

where \(\beta _{k}\) is defined by (1.9) and \(\theta _{k}=-\frac{g_{k}^{T} d_{k-1}}{\|d_{k-1}\|^{2}}\).

The global convergence of this method was proved under the standard Wolfe line search. However, the TTRMIL method in (1.13) employs the RMIL coefficient, for which Dai [13] pointed out an error in the convergence result and suggested the correction (1.10); see also [14]. Motivated by this, we propose a modification of TTRMIL in the next section. For more references on three-term CG methods, interested readers may refer to [24–27].

The rest of the paper is structured as follows. In the next section, a modified TTRMIL method is given with its algorithm. The sufficient descent property and the global convergence of the new modification are studied in Sect. 3. Preliminary results based on some unconstrained optimization problems are presented in Sect. 4 to illustrate the performance of the method. The proposed modification is extended to formulate a parameterized model for cases of COVID-19 in Sect. 5. In Sect. 6, the application in motion control is presented. Finally, concluding remarks and some recommendations are given in Sect. 7.

TTRMIL+ method and its algorithm

Motivated by the comments made by Dai [13] on the convergence of RMIL method, as discussed in the preceding section, we propose a modified TTRMIL, named TTRMIL+, by replacing \(\beta _{k}\) in (1.13) with the \(\beta _{k}\) given in (1.10) as follows:

$$ d_{k}= \textstyle\begin{cases} -g_{k}, &k=0, \\ -g_{k}+\beta _{k}d_{k-1}+\theta _{k}y_{k-1}, &k\geq 1, \end{cases} $$
(2.1)

where

$$ \theta _{k}=-\frac{g_{k}^{T} d_{k-1}}{d_{k-1}^{T} d_{k-1}}. $$
(2.2)

From (1.13) and (2.1), it is clear that the only difference between the two methods is the CG parameter \(\beta _{k}\) used in defining the search direction \(d_{k}\). This small change has a great impact on the convergence analysis. It is interesting to note that TTRMIL+ reduces to the classical RMIL+ method under the exact line search, since then \(g_{k}^{T} d_{k-1}=0\) and hence \(\theta _{k}=0\). The following algorithm describes the proposed TTRMIL+.

Algorithm 1

The modified TTRMIL+ algorithm.

  1. Stage 0.

    Given \(x_{0} \in \mathbb{R}^{n}\), \(d_{0}=-g_{0}=-\nabla {f_{0}}\), set \(k:=0\).

  2. Stage 1.

    Check if \(\|g_{k}\|\leq \epsilon \), then stop.

  3. Stage 2.

    Compute \(t_{k}\) using (1.5) and (1.6).

  4. Stage 3.

    Update the new point \(x_{k+1}\) based on (1.2). If \(\|g_{k+1}\|\leq \epsilon \), terminate the process.

  5. Stage 4.

    Compute \(\beta _{k+1}\) by (1.10) and update \(d_{k+1}\) by (2.1).

  6. Stage 5.

    Go to Stage 2 with \(k:=k+1\).
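The stages of Algorithm 1 can be sketched in a few lines of Python. This is an illustrative sketch under stated assumptions rather than the authors' Matlab code: the bracketing routine `wolfe_step` is one simple way to satisfy (1.5) and (1.6), the restart to steepest descent when the new direction is not a descent direction is an extra safeguard that is not part of the paper, and the quadratic test function is hypothetical.

```python
import math


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def step(x, t, d):
    """Return x + t*d."""
    return [xi + t * di for xi, di in zip(x, d)]


def wolfe_step(f, grad, x, d, delta=0.01, sigma=0.1, max_trials=60):
    """Bracketing search for a step satisfying (1.5) and (1.6)."""
    fx, gd = f(x), dot(grad(x), d)
    lo, hi, t = 0.0, math.inf, 1.0
    for _ in range(max_trials):
        x_new = step(x, t, d)
        if f(x_new) > fx + delta * t * gd:        # (1.5) fails: step too long
            hi = t
        elif dot(grad(x_new), d) < sigma * gd:    # (1.6) fails: step too short
            lo = t
        else:
            return t
        t = 0.5 * (lo + hi) if hi < math.inf else 2.0 * t
    return t


def ttrmil_plus(f, grad, x0, eps=1e-6, max_iter=10000):
    """Sketch of Algorithm 1; returns (approximate minimizer, iterations)."""
    x = list(x0)
    g = grad(x)
    d = [-gi for gi in g]                         # Stage 0
    for k in range(max_iter):
        if math.sqrt(dot(g, g)) <= eps:           # Stage 1
            return x, k
        t = wolfe_step(f, grad, x, d)             # Stage 2
        x = step(x, t, d)                         # Stage 3
        g_new = grad(x)
        y = [a - b for a, b in zip(g_new, g)]
        dd = dot(d, d)
        # Stage 4: beta from (1.10), theta from (2.2), direction from (2.1)
        if 0.0 <= dot(g_new, g) <= dot(g_new, g_new):
            beta = dot(g_new, y) / dd
        else:
            beta = 0.0
        theta = -dot(g_new, d) / dd
        d = [-gn + beta * di + theta * yi for gn, di, yi in zip(g_new, d, y)]
        if dot(g_new, d) >= 0.0:                  # safeguard, not in the paper
            d = [-gn for gn in g_new]
        g = g_new                                 # Stage 5
    return x, max_iter


# Hypothetical test function: f(x) = 0.5 * (x1^2 + 4 * x2^2).
f = lambda x: 0.5 * (x[0] ** 2 + 4.0 * x[1] ** 2)
grad = lambda x: [x[0], 4.0 * x[1]]
x_star, iters = ttrmil_plus(f, grad, [4.0, 1.0])
```

The default line search parameters \(\delta =0.01\) and \(\sigma =0.1\) mirror the values reported for TTRMIL+ in the numerical experiments.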

The following assumptions are very important and usually required in the convergence analysis of most CG algorithms.

Assumption 2.1

  1. (A1)

    The level set \(\Omega =\{x\in \mathbb{R}^{n}|f(x)\leq f(x_{0})\}\) is bounded, where \(x_{0}\) is an arbitrary initial point.

  2. (A2)

    In some open convex neighborhood N of Ω, f is smooth and its gradient \(g(x)\) is Lipschitz continuous, that is, there exists a constant \(L>0\) satisfying

    $$ \bigl\Vert g(x)-g(y) \bigr\Vert \leq L \Vert x-y \Vert , \quad \forall x, y \in N. $$
    (2.3)

From Assumption 2.1 and [16, 28], there exist positive constants γ and b such that

$$\begin{aligned} \bigl\Vert g(x_{k}) \bigr\Vert \leq \gamma , \quad \forall x_{k}\in \Omega , \end{aligned}$$
(2.4)
$$\begin{aligned} \Vert x-y \Vert \leq b, \quad \forall x, y \in \Omega . \end{aligned}$$
(2.5)

Since \(f(x_{k})\) decreases as k increases, Assumption 2.1 implies that the sequence \(\{x_{k}\}\) generated by Algorithm 1 is contained in the bounded level set Ω. Hence, the sequence \(\{x_{k}\}\) is bounded.

The convergence analysis of the new method is studied in the next section.

Convergence analysis

In this section, we establish the sufficient descent condition and global convergence properties of the proposed TTRMIL+ method.

The following theorem indicates that the search direction of TTRMIL+ method satisfies the sufficient descent condition.

Theorem 3.1

Suppose that the sequence \(\{x_{k}\}\) is generated by Algorithm 1. The search direction \(d_{k}\) defined by (2.1) with \(\beta _{k}=\beta _{k}^{\mathrm{RMIL}+}\) (1.10) satisfies the sufficient descent condition (1.12).

Proof

We will prove by induction. For \(k=0\) and from (2.1), we have \(g_{0}^{T} d_{0}=-\|g_{0}\|^{2}\), so that the sufficient descent condition (1.12) is satisfied. Suppose that (1.12) is true for \(k-1\), that is, \(g_{k-1}^{T} d_{k-1}=-\|g_{k-1}\|^{2}\). According to the value of \(\beta _{k}^{\mathrm{RMIL}+}\) (1.10), we have two cases.

  • Case 1: \(\beta _{k}^{\mathrm{RMIL}+}=0\). Since (1.6), (2.1), (2.2), and \(g_{k}^{T}g_{k-1}>\|g_{k}\|^{2}\) hold, we have

    $$ \begin{aligned} g_{k}^{T} d_{k}&=-g_{k}^{T} g_{k}- \frac{g_{k}^{T} d_{k-1}}{d_{k-1}^{T} d_{k-1}} g_{k}^{T} y_{k-1} \\ &\leq - \Vert g_{k} \Vert ^{2}+\sigma \frac{ \Vert g_{k-1} \Vert ^{2}}{ \Vert d_{k-1} \Vert ^{2}}g_{k}^{T} y_{k-1} \\ &=- \Vert g_{k} \Vert ^{2}+\sigma \frac{ \Vert g_{k-1} \Vert ^{2}}{ \Vert d_{k-1} \Vert ^{2}} \bigl( \Vert g_{k} \Vert ^{2}-g_{k}^{T}g_{k-1} \bigr) \\ &\leq - \Vert g_{k} \Vert ^{2}. \end{aligned} $$
  • Case 2: \(\beta _{k}^{\mathrm{RMIL}+}= \frac{g_{k}^{T} y_{k-1}}{\|d_{k-1}\|^{2}}\). From (2.1) and (2.2), we get

    $$ g_{k}^{T} d_{k}=- \Vert g_{k} \Vert ^{2}+ \frac{g_{k}^{T} y_{k-1}}{ \Vert d_{k-1} \Vert ^{2}}g_{k}^{T} d_{k-1}- \frac{g_{k}^{T} d_{k-1}}{ \Vert d_{k-1} \Vert ^{2}} g_{k}^{T} y_{k-1}=- \Vert g_{k} \Vert ^{2}. $$

Hence, the search direction \(d_{k}\) defined by the TTRMIL+ method always satisfies the sufficient descent condition (1.12). □
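The identity in Case 2 is purely algebraic, so it can be checked with exact rational arithmetic. The sketch below uses Python's `fractions` module; the function name `ttrmil_direction` and the sample vectors are assumptions made for illustration only.

```python
from fractions import Fraction as F


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def ttrmil_direction(g, g_prev, d_prev):
    """Direction (2.1) with beta taken as the RMIL quotient (Case 2) and theta from (2.2)."""
    y = [a - b for a, b in zip(g, g_prev)]
    dd = dot(d_prev, d_prev)
    beta = dot(g, y) / dd
    theta = -dot(g, d_prev) / dd
    return [-gi + beta * di + theta * yi for gi, di, yi in zip(g, d_prev, y)]


# Hypothetical data; here 0 <= g^T g_prev = 1 <= ||g||^2 = 5, so Case 2 applies.
g = [F(1), F(2)]
g_prev = [F(3), F(-1)]
d_prev = [F(2), F(1)]

d = ttrmil_direction(g, g_prev, d_prev)
lhs = dot(g, d)        # g_k^T d_k
rhs = -dot(g, g)       # -||g_k||^2
```

The two cross terms \(\beta _{k} g_{k}^{T} d_{k-1}\) and \(\theta _{k} g_{k}^{T} y_{k-1}\) cancel exactly, which is why `lhs` and `rhs` agree for any choice of vectors with \(d_{k-1}\neq 0\).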

Remark 3.2

Since the proposed method satisfies the sufficient descent condition (1.12), then, for all \(k\geq 0\), we have

$$ \Vert d_{k} \Vert \geq \Vert g_{k} \Vert . $$
(3.1)

Now, we will establish the global convergence of the TTRMIL+ method by first providing the following lemma to show that the standard Wolfe line search gives a lower bound for the step-size \(t_{k}\) as follows.

Lemma 3.3

Suppose that the sequence \(\{x_{k}\}\) is generated by Algorithm 1, where \(d_{k}\) is a descent direction and Assumption 2.1 holds. If \(t_{k}\) is calculated by the standard Wolfe line search (1.5) and (1.6), then we have

$$ t_{k}\geq \frac{(1-\sigma ) \Vert g_{k} \Vert ^{2}}{L \Vert d_{k} \Vert ^{2}}. $$
(3.2)

Proof

From the standard Wolfe condition (1.6), subtracting \(g_{k}^{T} d_{k}\) from both sides and using the Lipschitz continuity (2.3), we get

$$ \begin{aligned} (\sigma -1)g_{k}^{T} d_{k} &\leq (g_{k+1}-g_{k})^{T} d_{k} \\ &\leq \Vert g_{k+1}-g_{k} \Vert \Vert d_{k} \Vert \\ &\leq L \Vert x_{k+1}-x_{k} \Vert \Vert d_{k} \Vert \\ &=L t_{k} \Vert d_{k} \Vert ^{2}.\end{aligned} $$

Since \(d_{k}\) satisfies the sufficient descent condition (1.12), we have \(-g_{k}^{T} d_{k}\geq \|g_{k}\|^{2}\). Together with \(\sigma <1\), dividing by \(L\|d_{k}\|^{2}\) gives (3.2). □

The following lemma is the Zoutendijk condition [29], which plays an important role in the analysis of the global convergence properties for CG method.

Lemma 3.4

Let Assumption 2.1 hold and \(d_{k}\) be generated by (1.10), (2.1), and (2.2), where \(t_{k}\) is calculated by the standard Wolfe line search (1.5) and (1.6). Then

$$ \sum_{k=0}^{\infty } \frac{(g_{k}^{T} d_{k})^{2}}{ \Vert d_{k} \Vert ^{2}}< + \infty . $$
(3.3)

Proof

From the standard Wolfe condition (1.5) and (3.2), we have

$$\begin{aligned} f(x_{k} )-f(x_{k}+t_{k} d_{k} )\geq -\delta t_{k} g_{k}^{T} d_{k} \geq \delta \frac{(1-\sigma )(g_{k}^{T} d_{k})^{2}}{L \Vert d_{k} \Vert ^{2}}. \end{aligned}$$

Summing this inequality over k and noting that, by Assumption 2.1, f is bounded below on Ω, we obtain the Zoutendijk condition (3.3), which completes the proof. □

We now present the global convergence result of the proposed TTRMIL+ CG method under the standard Wolfe line search.

Theorem 3.5

Suppose that the sequence \(\{x_{k}\}\) is generated by Algorithm 1. Then

$$ \liminf_{k \to \infty } \Vert g_{k} \Vert =0. $$
(3.4)

Proof

Suppose, by contradiction, that (3.4) is not true. Then there exists a constant \(c>0\) such that, for all \(k\geq 0\),

$$ \Vert g_{k} \Vert \geq c. $$
(3.5)

Here, we have two cases.

  • Case 1: If \(\beta _{k}^{\mathrm{RMIL}+}=0\), then based on the Cauchy–Schwarz inequality and from (2.1), (2.2), (2.3), (2.4), (2.5), (3.1), and (3.5), we get

    $$\begin{aligned} \Vert d_{k} \Vert =& \Vert -g_{k}+\theta _{k} y_{k-1} \Vert \\ =& \biggl\Vert -g_{k}-\frac{g_{k}^{T} d_{k-1}}{d_{k-1}^{T} d_{k-1}} y_{k-1} \biggr\Vert \\ \leq & \Vert g_{k} \Vert + \frac{ \Vert g_{k} \Vert \Vert d_{k-1} \Vert \Vert y_{k-1} \Vert }{ \Vert d_{k-1} \Vert ^{2}} \\ \leq &\gamma +\frac{ \Vert g_{k} \Vert L \Vert x_{k}-x_{k-1} \Vert }{ \Vert d_{k-1} \Vert } \\ \leq &\gamma +\frac{ \Vert g_{k} \Vert L b}{ \Vert d_{k-1} \Vert }\\ \leq& \gamma + \frac{ \Vert g_{k} \Vert L b}{ \Vert g_{k-1} \Vert } \\ \leq &\gamma +\frac{\gamma L b}{c} \triangleq \nu . \end{aligned}$$
    (3.6)

    Furthermore, by using (1.12), (3.5), and (3.6), we obtain

    $$ \sum_{k=0}^{\infty }\frac{(g_{k}^{T} d_{k})^{2}}{ \Vert d_{k} \Vert ^{2}}\geq \sum _{k=0}^{\infty }\frac{ \Vert g_{k} \Vert ^{4}}{ \Vert d_{k} \Vert ^{2}}\geq \sum _{k=0}^{ \infty }\frac{c^{4}}{\nu ^{2}}=+\infty . $$

    This is a contradiction with (3.3). Hence, (3.4) holds.

  • Case 2: If \(\beta _{k}^{\mathrm{RMIL}+}=\beta _{k}^{\mathrm{RMIL}}\), then based on the Cauchy–Schwarz inequality and from (1.9), (2.1), (2.2), (2.3), (2.4), (2.5), and (3.1), we obtain

    $$\begin{aligned} \Vert d_{k} \Vert =& \bigl\Vert -g_{k}+\beta _{k}^{\mathrm{RMIL}}d_{k-1}+\theta _{k} y_{k-1} \bigr\Vert \\ \leq & \Vert g_{k} \Vert +\frac{ \vert g_{k}^{T} y_{k-1} \vert }{ \Vert d_{k-1} \Vert ^{2}} \Vert d_{k-1} \Vert + \biggl\Vert -\frac{g_{k}^{T} d_{k-1}}{d_{k-1}^{T} d_{k-1}} y_{k-1} \biggr\Vert \\ \leq & \Vert g_{k} \Vert + \frac{ \Vert g_{k} \Vert \Vert g_{k}-g_{k-1} \Vert \Vert d_{k-1} \Vert }{ \Vert d_{k-1} \Vert ^{2}}+ \frac{ \Vert g_{k} \Vert \Vert d_{k-1} \Vert \Vert g_{k}-g_{k-1} \Vert }{ \Vert d_{k-1} \Vert ^{2}} \\ \leq & \Vert g_{k} \Vert +2\frac{ \Vert g_{k} \Vert \Vert g_{k}-g_{k-1} \Vert }{ \Vert g_{k-1} \Vert } \\ \leq &\gamma +\frac{2\gamma L b}{c}\triangleq \zeta . \end{aligned}$$

    By using the same argument as in Case 1, we obtain (3.4) and the proof is complete. □

Numerical experiments

In this part, we report the numerical experiments to demonstrate the efficiency of the TTRMIL+ method in comparison with the RMIL [12], RMIL+ [13], PRP [7, 8], and TTRMIL [23] methods. For comparing the computational performance, we consider some test problems from Andrei [30], and Jamil and Yang [31]. Most of the initial points are also considered by Andrei [30], and the problems are implemented using dimensions ranging from 2 to 50,000. The test problems and their initial points are presented in Table 1. The codes were written in Matlab R2019a and run on a personal laptop with an Intel Core i7 processor, 16 GB RAM, and the 64-bit Windows 10 Pro operating system. All algorithms are terminated when \(\|g_{k}\|\leq 10^{-6}\), and for objective comparison, all the methods are executed under the standard Wolfe line search (1.5) and (1.6) with parameters \(\delta =10^{-4}\), \(\sigma =0.8\) for the TTRMIL method, and \(\delta =0.01\), \(\sigma =0.1\) for the RMIL, RMIL+, PRP, and TTRMIL+ methods. The metrics used for comparison include the number of iterations (NOI), the number of function evaluations (NOF), and the central processing unit (CPU) time.

Table 1 List of test problems, dimensions, and initial points

All numerical results of the RMIL, RMIL+, and PRP methods are listed in Table 2 and those of the TTRMIL and TTRMIL+ methods in Table 3. A method is said to have failed if the NOI is more than 10,000 and the terminating criteria stated above have not been satisfied. The failure is symbolized with ‘F’. We also use the performance profile tool of Dolan and Moré [32] to show the performance profile curve of RMIL, RMIL+, PRP, TTRMIL, and TTRMIL+. The performance profile figures on NOI, NOF, and CPU are presented in Figs. 1, 2, and 3, respectively.

Figure 1 Performance profiles based on NOI

Figure 2 Performance profiles based on NOF

Figure 3 Performance profiles based on CPU time

Table 2 Numerical results of the RMIL, RMIL+, and PRP methods using weak Wolfe line search
Table 3 Numerical results of the TTRMIL and TTRMIL+ methods using weak Wolfe line search

Let P be the set of test problems with \(n_{p}\) being the number of test problems, and let S be the set of methods with \(n_{s}\) being the number of methods. For each method \(s \in S\) and problem \(p \in P\), let \(j_{p,s}\) denote either the NOI, NOF, or CPU time required to solve problem p by method s. Then the performance profile is defined as follows:

$$ \rho _{s}(\tau )=\frac{1}{n_{p}}\mathit{size}\{p\in P:\log _{2} r_{p,s}\leq \tau \}, $$

where \(\tau >0\), and \(r_{p,s}\) is the performance ratio that can be obtained by

$$ r_{p,s}=\frac{j_{p,s}}{\min \{j_{p,s}: s\in S\}}. $$

Generally, the method with the highest performance profile value \(\rho _{s}(\tau )\) is considered the best method for a given τ value. In other words, the method whose curve lies on top is the most efficient compared to the others.
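The performance profile can be computed directly from a table of costs. The following is a minimal sketch with hypothetical data: the function name `performance_profile`, the method labels "A" and "B", and the cost values are assumptions for illustration, failures are encoded as `math.inf`, and every problem is assumed to be solved by at least one method.

```python
import math


def performance_profile(costs, taus):
    """Return {method: [rho_s(tau) for tau in taus]}.

    costs maps each method name to a list of costs (NOI, NOF, or CPU time),
    one entry per problem; math.inf marks a failure.
    """
    methods = list(costs)
    n_p = len(next(iter(costs.values())))
    # Best (smallest) cost over all methods for each problem.
    best = [min(costs[s][p] for s in methods) for p in range(n_p)]
    profile = {}
    for s in methods:
        log_ratios = [math.log2(costs[s][p] / best[p]) for p in range(n_p)]
        profile[s] = [sum(r <= tau for r in log_ratios) / n_p for tau in taus]
    return profile


# Hypothetical costs for two methods on three problems.
costs = {"A": [10, 20, math.inf], "B": [20, 10, 40]}
profile = performance_profile(costs, taus=[0.5, 1.0])
```

Here method "B" solves every problem within a factor of two of the best cost, so its curve reaches 1 at \(\tau =1\), while method "A" stays below because of its failure on the third problem.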

According to Table 2, the RMIL method was able to solve 66% of the problems, RMIL+ 75%, and PRP 71%. Meanwhile, based on Table 3, the TTRMIL method solved 93% of the problems and the proposed TTRMIL+ 94%. In this regard, the TTRMIL+ method is considered a better method when compared to the RMIL, RMIL+, and PRP methods, but competes with the TTRMIL method in terms of NOI, CPU time, and NOF. From the performance profiles in Figs. 1–3, we can see that the TTRMIL+ method is efficient and promising with regard to solving unconstrained optimization problems compared to the RMIL, RMIL+, PRP, and TTRMIL methods.

Application of TTRMIL+ to parameterized COVID-19 model

Coronavirus disease, often called COVID-19, is an acute infectious respiratory disease that surfaced in 2019. This disease is caused by the newly discovered coronavirus (SARS-CoV-2) and can be transmitted through droplets produced when an infected person exhales, sneezes, or coughs. Most people infected by the virus will develop mild to moderate symptoms, such as mild fever, cold, and difficulty in breathing, and recover without special treatment. As of 3:05 pm CEST, 20 October 2020, a total of 40,251,950 confirmed cases of COVID-19 with 1,116,131 deaths were recorded from 215 territories and countries around the globe since the disease was first reported in Wuhan, China [WHO].

Recently, numerous studies modeled various aspects of the coronavirus outbreak, and application of numerical methods on some COVID-19 models was also studied. In this paper, we consider the global COVID-19 outbreak from January to September, 2020, model the confirmed cases into an unconstrained optimization problem, and finally apply TTRMIL+ to obtain the solution of the parameterized model.

Consider the following function of regression analysis:

$$ y=h(x_{1}, x_{2}, \ldots , x_{p})+\varepsilon , $$
(5.1)

where \(x_{i}\), \(i=1, 2, \ldots , p\), \(p>0\), are the predictors, y is the response variable, and ε is the error. This type of problem often arises in the fields of management, finance, economics, accounting, physics, and many more. Regression analysis is a statistical modeling tool used to estimate the relationships between a dependent variable and one or more independent variables. To derive the linear regression function, we compute y such that

$$ y=a_{0}+a_{1}x_{1}+a_{2}x_{2}+ \cdots +a_{p} x_{p} +\varepsilon , $$
(5.2)

where the parameters of the regression are \(a_{0},\ldots ,a_{p}\). The regression analysis estimates the parameters \(a_{0},a_{1},\ldots ,a_{p}\) such that the value of the error ε is minimized. The linear regression method applies when the relationship between y and x can be approximated by a straight line. However, such a case rarely occurs in practice, and thus the nonlinear regression process is often used. In this study, we consider the nonlinear regression approach.

To formulate the approximate function, we consider the data from the global confirmed cases of COVID-19 from January to September, 2020. The detailed description of the process follows from the statistics presented in Table 4, which are taken from data obtained by the World Health Organization (WHO) [33]. We have data for nine months (Jan–Sept); the months are denoted by the x-variable and the corresponding confirmed cases by the y-variable. For fitting the data, we only consider the data for eight months (Jan–Aug) and reserve the data for September for error analysis.

Table 4 Statistics of confirmed cases of COVID-19, Jan–Sept, 2020

From the above data, we obtain the following approximate function for the nonlinear least square method:

$$ f(x)=-26{,}029.59+14{,}557.39x+3290.077x^{2}. $$
(5.3)

Function (5.3) is used to approximate the values of y based on the values of x from Jan–Aug. Denoting the number of months by \(x_{j}\) and the corresponding confirmed cases by \(y_{j}\), we can transform the least squares problem (5.3) into the following unconstrained minimization model in the parameter vector \(u=(u_{0},u_{1},u_{2})\):

$$ \min_{u \in \mathbb{R}^{3}} f(u)=\sum _{j=1}^{n} \bigl( \bigl(u_{0}+u_{1} x_{j}+u_{2} x_{j}^{2} \bigr)-y_{j} \bigr)^{2}. $$
(5.4)

The quadratic function for the least squares problem is derived using the data from Jan–Aug, 2020, and is then used to formulate the corresponding unconstrained optimization model. The data \(x_{j}\) and the values \(y_{j}\) exhibit a parabolic relation through the regression parameters \(u_{0}\), \(u_{1}\), and \(u_{2}\) and the regression function (5.4), so we minimize the sum of the squared residuals \(E_{j}\):

$$ \min_{u \in \mathbb{R}^{3}} \sum_{j=1}^{n} E_{j}^{2}=\sum_{j=1}^{n} \bigl( \bigl(u_{0}+u_{1} x_{j}+u_{2} x_{j}^{2} \bigr)-y_{j} \bigr)^{2}. $$
(5.5)

Using the data from Table 4, we can transform (5.5) to obtain our nonlinear quadratic unconstrained minimization model as follows:

$$ \begin{aligned} &9u_{0}^{2}+90u_{0} u_{1}+570u_{0} u_{2}-2{,}482{,}956u_{0}+285u_{1}^{2}+4050u_{1} u_{2}\\ &\quad {}-17{,}172{,}778u_{1}+15{,}333u_{2}^{2} \\ &\quad {}-126{,}050{,}318u_{2}+275{,}210{,}100{,}844. \end{aligned} $$
(5.6)
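To see how (5.5) expands into a model of the form (5.6), note that the coefficients of the purely quadratic terms in \(u_{0}\), \(u_{1}\), \(u_{2}\) depend only on power sums of the \(x_{j}\). The sketch below assumes month indices \(x_{j}=1,\ldots ,9\); this is an assumption, chosen because it reproduces the printed coefficients. The terms involving \(y_{j}\) require the data of Table 4 and are not reproduced here. The function name is hypothetical.

```python
def quadratic_term_coefficients(xs):
    """Coefficients of the u-quadratic terms in sum_j (u0 + u1*x_j + u2*x_j^2 - y_j)^2."""
    xs = list(xs)
    s0 = len(xs)                      # sum of x^0
    s1 = sum(xs)                      # sum of x
    s2 = sum(x ** 2 for x in xs)      # sum of x^2
    s3 = sum(x ** 3 for x in xs)      # sum of x^3
    s4 = sum(x ** 4 for x in xs)      # sum of x^4
    return {
        "u0^2": s0,
        "u0*u1": 2 * s1,
        "u0*u2": 2 * s2,
        "u1^2": s2,
        "u1*u2": 2 * s3,
        "u2^2": s4,
    }


# Assumed month indices 1..9.
coeffs = quadratic_term_coefficients(range(1, 10))
```

With this assumption the computed values are 9, 90, 570, 285, 4050, and 15,333, matching the corresponding terms of (5.6).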

The data considered to generate the unconstrained optimization model are data from Jan–August, and the data for Sept is reserved for computing the relative errors of the predicted data. Applying the proposed TTRMIL+ method on model (5.6) under the strong Wolfe line search, we obtain the following results presented in Table 5.

Table 5 Test results for optimization of quadratic model for TTRMIL+

One of the major challenges is computing the values of \(u_{0}\), \(u_{1}\), \(u_{2}\) using the matrix inverse [34]. To overcome this difficulty, we implement the proposed TTRMIL+ using different initial points. The computation is terminated if one of the following conditions holds.

  1. The algorithm fails to solve the model.

  2. The number of iterations exceeds 1000. This point is denoted as ‘Fail’.

Trend line method

A trend line is a line drawn under pivot lows or over pivot highs to show the prevailing direction of price. In this section, we estimate the data for COVID-19 for a period of nine (9) months using the proposed TTRMIL+ and least squares methods. The trend line is plotted using the Microsoft Excel software based on data from Table 4. The trend line equation appears in a form of nonlinear quadratic equation. Representing the y-axis by y and x-axis by x, we obtain the plot presented in Fig. 4 using the actual data from Table 4. Further, to illustrate the efficiency of the proposed method, we compare the approximation functions of TTRMIL+ method with the functions of trend line and least square methods as follows.

Figure 4 Nonlinear quadratic trend line for confirmed cases of COVID-19

The purpose of regression analysis is to estimate the parameters \(a_{0},a_{1},\ldots ,a_{p}\) such that the error ε is minimized. From the results presented in Table 6, the proposed TTRMIL+ CG method has the least relative error compared to the least squares and trend line methods, which suggests that the method is applicable to real-life situations. For other references regarding modeling, analysis, and prediction of COVID-19 cases, see [35].

Table 6 Estimation point and relative errors for 2020 data

Application of TTRMIL+ in motion control

This section demonstrates the performance of the proposed TTRMIL+ CG method on motion control of a two-joint planar robotic manipulator. As presented in [36], the following model describes a discrete-time kinematics equation of two-joint planar robot manipulator at the position level

$$ \Gamma (\mu _{k}) = \eta _{k}, $$
(6.1)

where \(\mu _{k}\in \mathbb{R}^{2}\) and \(\eta _{k}\in \mathbb{R}^{2}\) denote the joint angle vector and the end effector vector position, respectively. The vector-valued function \(\Gamma (\cdot )\) represents the kinematics function which has the following structure:

$$ \Gamma (\mu _{k})= \begin{bmatrix} \tau _{1}\cos (\mu _{1})+\tau _{2}\cos (\mu _{1}+\mu _{2}) \\ \tau _{1}\sin (\mu _{1})+\tau _{2}\sin (\mu _{1}+\mu _{2}) \end{bmatrix}, $$
(6.2)

with \(\tau _{1}\) and \(\tau _{2}\) denoting the length of the first and second rod, respectively. In the case of motion control, at each instantaneous computational time interval \([t_{k}, t_{k+1})\subseteq [0, t_{f}]\) with \(t_{f}\) being the end of task duration, the following nonlinear least squares model is to be minimized:

$$ \min_{\Gamma _{k}\in \mathbb{R}^{2}}\frac{1}{2} \Vert \Gamma _{k}- \widehat{\Gamma }_{k} \Vert ^{2}, $$
(6.3)

where \(\widehat{\Gamma }_{k}\) denotes the end effector controlled track.

Similar to the approach presented in [37–39], the end effector, used in this experiment, is controlled to track a Lissajous curve given as

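$$ \widehat{\Gamma }_{k}= \begin{bmatrix} \frac{3}{2}+\frac{1}{5}\sin (\frac{\pi t_{k}}{5} ) \\ \frac{\sqrt{3}}{2}+\frac{1}{5}\sin (\frac{2\pi t_{k}}{5}+\frac{\pi }{3} ) \end{bmatrix}. $$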
(6.4)

The parameters used in the implementation of the proposed TTRMIL+ CG method are \(\tau _{1}=1\), \(\tau _{2}=1\), and \(t_{f}=10\) seconds. The starting point is \(\mu _{0}=[\mu _{1}, \mu _{2}]^{T}=[0, \frac{\pi }{3}]^{T}\), and the task duration \([0, 10]\) is divided into 200 equal parts.
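To apply a gradient-based method such as TTRMIL+ at each time step, one needs the objective of the least squares model and its gradient with respect to the joint angles. The sketch below is an illustration under stated assumptions, not the authors' code: it writes the objective as a function of the joint angle vector \(\mu \) through the kinematics (6.2), obtains the gradient from the Jacobian of (6.2) by the chain rule, and uses hypothetical function names and a generic `target` point in place of the tracked curve.

```python
import math

TAU1, TAU2 = 1.0, 1.0          # rod lengths from the text
T_FINAL, N_PARTS = 10.0, 200   # task duration and number of equal parts


def forward(mu):
    """Kinematics (6.2): end effector position for joint angles mu = (mu1, mu2)."""
    m1, m2 = mu
    return (TAU1 * math.cos(m1) + TAU2 * math.cos(m1 + m2),
            TAU1 * math.sin(m1) + TAU2 * math.sin(m1 + m2))


def objective(mu, target):
    """0.5 * ||Gamma(mu) - target||^2, the least squares tracking error."""
    px, py = forward(mu)
    return 0.5 * ((px - target[0]) ** 2 + (py - target[1]) ** 2)


def gradient(mu, target):
    """Gradient J(mu)^T (Gamma(mu) - target), with J the Jacobian of (6.2)."""
    m1, m2 = mu
    px, py = forward(mu)
    rx, ry = px - target[0], py - target[1]
    j11 = -TAU1 * math.sin(m1) - TAU2 * math.sin(m1 + m2)
    j12 = -TAU2 * math.sin(m1 + m2)
    j21 = TAU1 * math.cos(m1) + TAU2 * math.cos(m1 + m2)
    j22 = TAU2 * math.cos(m1 + m2)
    return (j11 * rx + j21 * ry, j12 * rx + j22 * ry)


# Time grid t_k and the end effector position at the starting point mu_0.
times = [T_FINAL * k / N_PARTS for k in range(N_PARTS + 1)]
mu0 = (0.0, math.pi / 3)
start = forward(mu0)
```

At the starting point the end effector sits at \((\frac{3}{2}, \frac{\sqrt{3}}{2})\), and the gradient vanishes whenever the target coincides with the current end effector position.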

The results of the motion control experiments are depicted in Figs. 5(a)–5(d). The robot trajectories synthesized by TTRMIL+ are shown in Fig. 5(a), while the end effector trajectory and the desired path are plotted in Fig. 5(b). The errors recorded on the horizontal and vertical axes are shown in Figs. 5(c) and 5(d), respectively. Figures 5(a) and 5(b) show that TTRMIL+ successfully accomplished the task at hand. The error recorded in the course of the task is relatively low, as can be seen from Figs. 5(c) and 5(d), which supports the efficiency of the proposed TTRMIL+.

Figure 5 Numerical results generated in the course of the robotic motion control experiment: (a) Robot trajectories synthesized by TTRMIL+. (b) End effector trajectory and desired path by TTRMIL+. (c) Residual error by TTRMIL+ on x-axis. (d) Residual error by TTRMIL+ on y-axis

Conclusion

This paper presented a modified conjugate gradient method for unconstrained optimization models. The proposed TTRMIL+ method replaced RMIL in TTRMIL with a new modification known as RMIL+. The sufficient descent condition and the convergence proof of TTRMIL+ are studied under the standard Wolfe line search. Some unconstrained benchmark test problems are considered to illustrate the performance of the proposed method. The results obtained showed that the TTRMIL+ method is efficient and promising. The method was further applied to a parameterized COVID-19 model, and the results showed that TTRMIL+ produced a good regression model and thus can be used in regression analysis. Finally, we applied the method to solve a practical problem of motion control. Future work includes studying the new algorithm on nonlinear least squares problems as discussed in [40]. Furthermore, we shall consider other problems in our future research as presented in [41–44].

Availability of data and materials

Not applicable.

Change history

  • 18 January 2021

    The journal title in the pdf of the article has been updated.

Abbreviations

CG: conjugate gradient

RMIL: Rivaie, Mustafa, Ismail, and Leong

TTRMIL: Three-term Rivaie, Mustafa, Ismail, and Leong

PRP: Polak–Ribière–Polyak

NOI: Number of iterations

NOF: Number of function evaluations

CPU: CPU time

SARS-CoV-2: Severe acute respiratory syndrome coronavirus 2

COVID-19: Coronavirus disease caused by SARS-CoV-2

References

  1. Xia, Z., Wang, X., Sun, X., Wang, Q.: A secure and dynamic multi-keyword ranked search scheme over encrypted cloud data. IEEE Trans. Parallel Distrib. Syst. 27(2), 340–352 (2015)

  2. Yuan, G., Lu, S., Wei, Z.: A new trust-region method with line search for solving symmetric nonlinear equations. Int. J. Comput. Math. 88(10), 2109–2123 (2011)

  3. Sulaiman, I.M., Supian, S., Mamat, M.: New Class of Hybrid Conjugate Gradient Coefficients with Guaranteed Descent and Efficient Line Search. In IOP Conference Series: Materials Science and Engineering, vol. 621, p. 012021. IOP Publishing, Bristol (2019)

  4. Hager, W.W., Zhang, H.: A survey of nonlinear conjugate gradient methods. Pac. J. Optim. 2(1), 35–58 (2006)

  5. Stiefel, E.: Methods of conjugate gradients for solving linear systems. J. Res. Natl. Bur. Stand. 49, 409–435 (1952)

  6. Fletcher, R., Powell, M.J.D.: A rapidly convergent descent method for minimization. Comput. J. 6(2), 163–168 (1963)

  7. Polak, E., Ribiere, G.: Note sur la convergence de méthodes de directions conjuguées. ESAIM: Math. Model. Numer. Anal. 3(R1), 35–43 (1969)

  8. Polyak, B.T.: The conjugate gradient method in extremal problems. USSR Comput. Math. Math. Phys. 9(4), 94–112 (1969)

  9. Liu, Y., Storey, C.: Efficient generalized conjugate gradient algorithms, part 1: theory. J. Optim. Theory Appl. 69(1), 129–137 (1991)

  10. Dai, Y., Han, J., Liu, G., Sun, D., Yin, H., Yuan, Y.X.: Convergence properties of nonlinear conjugate gradient methods. SIAM J. Optim. 10(2), 345–358 (2000)

  11. Yuan, G., Wei, Z., Lu, X.: Global convergence of BFGS and PRP methods under a modified weak Wolfe–Powell line search. Appl. Math. Model. 47, 811–825 (2017)

  12. Rivaie, M., Mamat, M., June, L.W., Mohd, I.: A new class of nonlinear conjugate gradient coefficients with global convergence properties. Appl. Math. Comput. 218(22), 11323–11332 (2012)

  13. Dai, Z.: Comments on a new class of nonlinear conjugate gradient coefficients with global convergence properties. Appl. Math. Comput. 276, 297–300 (2016)

  14. Yousif, O.O.O.: The convergence properties of RMIL+ conjugate gradient method under the strong Wolfe line search. Appl. Math. Comput. 367, 124777 (2020)

  15. Al-Baali, M.: Descent property and global convergence of the Fletcher–Reeves method with inexact line search. IMA J. Numer. Anal. 5(1), 121–124 (1985)

  16. Gilbert, J.C., Nocedal, J.: Global convergence properties of conjugate gradient methods for optimization. SIAM J. Optim. 2(1), 21–42 (1992)

  17. Touati-Ahmed, D., Storey, C.: Efficient hybrid conjugate gradient techniques. J. Optim. Theory Appl. 64(2), 379–397 (1990)

  18. Hu, Y.F., Storey, C.: Global convergence result for conjugate gradient methods. J. Optim. Theory Appl. 71(2), 399–405 (1991)

  19. Awwal, A.M., Sulaiman, I.M., Malik, M., Mamat, M., Kumam, P., Sitthithakerngkiet, K.: A spectral RMIL+ conjugate gradient method for unconstrained optimization with applications in portfolio selection and motion control. IEEE Access 9, 75398–75414 (2021)

  20. Beale, E.M.L.: A derivation of conjugate gradients. In: Numerical Methods for Nonlinear Optimization, pp. 39–43 (1972)

  21. McGuire, M.F., Wolfe, P.: Evaluating a restart procedure for conjugate gradients. IBM Thomas J. Watson Research Division (1973)

  22. Zhang, L., Zhou, W., Li, D.H.: A descent modified Polak–Ribière–Polyak conjugate gradient method and its global convergence. IMA J. Numer. Anal. 26(4), 629–640 (2006)

  23. Liu, J.K., Feng, Y.M., Zou, L.M.: Some three-term conjugate gradient methods with the inexact line search condition. Calcolo 55(2), 1–16 (2018)

  24. Zhang, L., Zhou, W., Li, D.: Some descent three-term conjugate gradient methods and their global convergence. Optim. Methods Softw. 22(4), 697–711 (2007)

  25. Andrei, N.: A simple three-term conjugate gradient algorithm for unconstrained optimization. J. Comput. Appl. Math. 241, 19–29 (2013)

  26. Al-Bayati, A.Y., Altae, H.W.: A new three-term non-linear conjugate gradient method for unconstrained optimization. Can. J. Sci. Eng. Math. Can. 1, 108–124 (2010)

  27. Dong, X., Liu, H., He, Y., Babaie-Kafaki, S., Ghanbari, R.: A new three–term conjugate gradient method with descent direction for unconstrained optimization. Math. Model. Anal. 21(3), 399–411 (2016)

  28. Sun, M., Liu, J.: Three modified Polak–Ribiere–Polyak conjugate gradient methods with sufficient descent property. J. Inequal. Appl. 2015(1), 1 (2015)

  29. Zoutendijk, G.: Nonlinear programming, computational methods. In: Integer and Nonlinear Programming, pp. 37–86 (1970)

  30. Andrei, N.: Nonlinear Conjugate Gradient Methods for Unconstrained Optimization. Springer, Berlin (2020)

  31. Jamil, M., Yang, X.S.: A literature survey of benchmark functions for global optimisation problems. Int. J. Math. Model. Numer. Optim. 4(2), 150–194 (2013)

  32. Dolan, E.D., Moré, J.J.: Benchmarking optimization software with performance profiles. Math. Program. 91(2), 201–213 (2002)

  33. World Health Organization: Report on coronavirus (COVID-19) (2020)

  34. Sulaiman, I.M., Mamat, M.: A new conjugate gradient method with descent properties and its application to regression analysis. J. Numer. Anal. Ind. Appl. Math. 14(1–2), 25–39 (2020)

  35. ul Rehman, A., Singh, R., Agarwal, P.: Modeling, analysis and prediction of new variants of COVID-19 and Dengue co-infection on complex network. Chaos Solitons Fractals 2021, 111008 (2021)

  36. Zhang, Y., He, L., Hu, C., Guo, J., Li, J., Shi, Y.: General four-step discrete-time zeroing and derivative dynamics applied to time-varying nonlinear optimization. J. Comput. Appl. Math. 347, 314–329 (2019)

  37. Awwal, A.M., Kumam, P., Wang, L., Huang, S., Kumam, W.: Inertial-based derivative-free method for system of monotone nonlinear equations and application. IEEE Access 8, 226921–226930 (2020)

  38. Yahaya, M.M., Kumam, P., Awwal, A.M., Aji, S.: A structured quasi–Newton algorithm with nonmonotone search strategy for structured NLS problems and its application in robotic motion control. J. Comput. Appl. Math. 395, 113582 (2021)

  39. Aji, S., Kumam, P., Awwal, A.M., Yahaya, M.M., Kumam, W.: Two hybrid spectral methods with inertial effect for solving system of nonlinear monotone equations with application in robotics. IEEE Access 9, 30918–30928 (2021)

  40. Awwal, A.M., Kumam, P., Mohammad, H.: Iterative algorithm with structured diagonal Hessian approximation for solving nonlinear least squares problems. J. Nonlinear Convex Anal. 22(6), 1173–1188 (2021)

  41. Agarwal, P., Ahsan, S., Akbar, M., Nawaz, R., Cesarano, C.: A reliable algorithm for solution of higher dimensional nonlinear \((1+ 1)\) and \((2+ 1)\) dimensional Volterra–Fredholm integral equations. Dolomites Res. Notes Approx. 14(2), 18–25 (2021)

  42. Shah, N.A., Agarwal, P., Chung, J.D., El-Zahar, E.R., Hamed, Y.S.: Analysis of optical solitons for nonlinear Schrödinger equation with detuning term by iterative transform method. Symmetry 12(11), 1850 (2020)

  43. Saoudi, K., Agarwal, P., Mursaleen, M.: A multiplicity result for a singular problem with subcritical nonlinearities. J. Nonlinear Funct. Anal., 1–18 (2017)

  44. Rahmoune, A., Ouchenane, D., Boulaaras, S., Agarwal, P.: Growth of solutions for a coupled nonlinear Klein–Gordon system with strong damping, source, and distributed delay terms. Adv. Differ. Equ. 2020(1), 1 (2020)

Acknowledgements

The authors acknowledge the financial support provided by the Center of Excellence in Theoretical and Computational Science (TaCS-CoE), KMUTT, and the Center under Computational and Applied Science for Smart Innovation Research Cluster (CLASSIC), Faculty of Science, KMUTT. Aliyu Muhammed Awwal gratefully acknowledges the Postdoctoral Fellowship from King Mongkut’s University of Technology Thonburi (KMUTT), Thailand.

Funding

This project is funded by the National Research Council of Thailand (NRCT) under Research Grants for Talented Mid-Career Researchers (Contract no. N41A640089).

Author information

Contributions

The authors contributed equally to this paper. All authors have read and approved this version of the manuscript.

Corresponding author

Correspondence to Poom Kumam.

Ethics declarations

Competing interests

The authors declare that they have no competing interests.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.

About this article

Cite this article

Sulaiman, I.M., Malik, M., Awwal, A.M. et al. On three-term conjugate gradient method for optimization problems with applications on COVID-19 model and robotic motion control. Adv Cont Discr Mod 2022, 1 (2022). https://doi.org/10.1186/s13662-021-03638-9

  • DOI: https://doi.org/10.1186/s13662-021-03638-9

MSC

  • 90C30
  • 90C06
  • 90C56

Keywords

  • Finite difference
  • Three-term CG algorithms
  • Optimization models
  • Motion control
  • Line search procedure
  • Coronavirus (COVID-19)
  • Regression analysis