Theory and Modern Applications

# Novel forward–backward algorithms for optimization and applications to compressive sensing and image inpainting

## Abstract

The forward–backward algorithm is a splitting method for solving convex minimization problems in which the objective is the sum of two functions. It has received great attention in optimization due to its broad application to many disciplines, such as image and signal processing, optimal control, regression, and classification problems. In this work, we introduce new forward–backward algorithms for solving both unconstrained and constrained convex minimization problems using a linesearch technique. We discuss the convergence under mild conditions that do not depend on the Lipschitz continuity assumption of the gradient. Finally, we provide some applications to solving compressive sensing and image inpainting problems. Numerical results show that the proposed algorithm is more efficient than some algorithms in the literature. We also discuss the optimal choice of parameters in algorithms via numerical experiments.

## Introduction

In a real Hilbert space H, the unconstrained minimization problem of the sum of two convex functions is modeled in the following form:

$$\min_{x\in H} \bigl(f(x)+g(x)\bigr),$$
(1.1)

where $$f,g:H \to \mathbb{R}\cup \{+\infty \}$$ are proper lower semicontinuous convex functions. It is well known that (1.1) is equivalent to the problem of finding a zero of the subdifferential of $$f+g$$. This problem is called the variational inclusion problem, see . We denote by $$\operatorname{argmin}(f+g)$$ the solution set of (1.1). If f is differentiable on H, then (1.1) can be characterized by the fixed point equation

$$x=\mathrm{prox}_{\alpha g}\bigl(x-\alpha \nabla f(x) \bigr),$$
(1.2)

where $$\alpha >0$$, and $$\mathrm{prox}_{g}$$ is the proximal operator of g defined by $$\mathrm{prox}_{g}=(\mathrm{Id}+\partial g)^{-1}$$, where Id denotes the identity operator in H, and ∂g is the subdifferential of g. In this connection, we can define a simple splitting method

$$x^{k+1}= \underbrace{\mathrm{prox}_{\alpha _{k} g}}_{ \text{backward step}} \underbrace{(\mathrm{Id}-\alpha _{k}\nabla f)}_{ \text{forward step}} \bigl(x^{k}\bigr),\quad k\geq 0,$$
(1.3)

where $$\alpha _{k}$$ is a suitable stepsize. This method is often called the forward–backward algorithm. Due to its simplicity and efficiency, there have been many modifications of (1.3) in the literature; see, for example, [1, 6–8, 12, 14]. The relaxed version of (1.3) was proposed by Combettes and Wajs  as follows.

### Algorithm 1.1


Given $$\varepsilon \in (0,\min \{1,\frac{1}{\alpha }\})$$, let $$x^{0}\in \mathbb{R}^{N}$$ and, for $$k\geq 0$$,

\begin{aligned}& y^{k} = x^{k}-\alpha _{k} \nabla f\bigl(x^{k}\bigr), \\& x^{k+1} = x^{k}+\lambda _{k}\bigl( \mathrm{prox}_{\alpha _{k} g}y^{k}-x^{k}\bigr), \end{aligned}
(1.4)

where $$\alpha _{k}\in [\varepsilon ,\frac{2}{\alpha }-\varepsilon ]$$, $$\lambda _{k}\in [\varepsilon ,1]$$, and α is the Lipschitz constant of the gradient of f.
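A minimal Python sketch of iteration (1.4) may help fix ideas. It assumes a constant stepsize and a constant relaxation parameter, and it uses a small separable test problem that is not taken from the paper: $$f(x)=\frac{1}{2}\sum_{i} a_{i}(x_{i}-b_{i})^{2}$$ and $$g=\lambda \Vert x \Vert _{1}$$, for which the proximal operator is componentwise soft-thresholding and the minimizer is known in closed form. The names `relaxed_forward_backward`, `soft_threshold`, and `lipschitz` are illustrative. Setting `relax=1.0` recovers the plain forward–backward iteration (1.3).

```python
import math

def soft_threshold(v, t):
    """Componentwise prox of t*|.|: shrink each entry toward zero by t."""
    return [math.copysign(max(abs(vi) - t, 0.0), vi) for vi in v]

def relaxed_forward_backward(grad_f, prox_g, x0, alpha, relax=1.0, iters=200):
    """Iteration (1.4) with constant alpha_k = alpha and lambda_k = relax."""
    x = list(x0)
    for _ in range(iters):
        y = [xi - alpha * gi for xi, gi in zip(x, grad_f(x))]  # forward step
        p = prox_g(y, alpha)                                   # backward step
        x = [xi + relax * (pi - xi) for xi, pi in zip(x, p)]   # relaxation
    return x

# Hypothetical test problem (an assumption, not from the paper):
# f(x) = 0.5 * sum(a_i * (x_i - b_i)^2), g(x) = lam * ||x||_1.
a = [1.0, 2.0, 4.0]
b = [3.0, -0.1, 1.0]
lam = 0.5
lipschitz = max(a)  # Lipschitz constant of grad f for this diagonal quadratic

def grad_f(x):
    return [ai * (xi - bi) for ai, xi, bi in zip(a, x, b)]

def prox_g(v, alpha):
    return soft_threshold(v, alpha * lam)

alpha = 1.0 / lipschitz  # lies in the admissible interval (0, 2 / lipschitz)
x_fb = relaxed_forward_backward(grad_f, prox_g, [0.0, 0.0, 0.0], alpha, relax=1.0)
x_relaxed = relaxed_forward_backward(grad_f, prox_g, [0.0, 0.0, 0.0], alpha, relax=0.8)

# Closed-form minimizer of this separable problem, used only for comparison.
x_exact = [math.copysign(max(abs(bi) - lam / ai, 0.0), bi) for ai, bi in zip(a, b)]
```

Both runs should agree with `x_exact` up to numerical precision. Note that the stepsize is chosen from the Lipschitz constant, which is exactly the dependence that the linesearch-based methods discussed next try to avoid.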

Based on the fixed point concept, there have been many optimization algorithms and fixed point algorithms for solving such problems; see [13, 15–17, 21]. In 2016, Cruz and Nghia  introduced a new forward–backward method using the linesearch technique. This method does not require the Lipschitz constant in computation.

### Algorithm 1.2

Given $$\sigma >0$$, $$\theta \in (0,1)$$, and $$\delta \in (0,\frac{1}{2})$$, let $$x^{0}\in \operatorname{dom}g$$ and, for $$k\geq 0$$, calculate

$$x^{k+1}=\mathrm{prox}_{\alpha _{k} g}\bigl(x^{k}-\alpha _{k}\nabla f\bigl(x^{k}\bigr)\bigr),$$

where $$\alpha _{k}=\sigma \theta ^{m_{k}}$$ with $$m_{k}$$ the smallest nonnegative integer satisfying the following linesearch rule:

$$\alpha _{k} \bigl\Vert \nabla f\bigl(x^{k+1}\bigr)- \nabla f\bigl(x^{k}\bigr) \bigr\Vert \leq \delta \bigl\Vert x^{k+1}-x^{k} \bigr\Vert .$$

It was proved that the sequence $$(x^{k})$$ converges weakly to a minimizer of $$f+g$$ under suitable conditions.
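The backtracking rule of Algorithm 1.2 can be sketched as follows. This is an illustrative implementation under assumptions: vectors are plain Python lists, `grad_f` and `prox_g` are user-supplied callables, the cap `max_backtracks` is a safeguard that is not part of the algorithm, and the test problem is the same hypothetical separable one ($$f(x)=\frac{1}{2}\sum_{i} a_{i}(x_{i}-b_{i})^{2}$$, $$g=\lambda \Vert x \Vert _{1}$$), not an example from the paper.

```python
import math

def norm(v):
    return math.sqrt(sum(vi * vi for vi in v))

def diff(u, v):
    return [ui - vi for ui, vi in zip(u, v)]

def linesearch_step(x, grad_f, prox_g, sigma, theta, delta, max_backtracks=60):
    """One step of Algorithm 1.2: try alpha = sigma * theta^m for m = 0, 1, ..."""
    gx = grad_f(x)
    alpha = sigma
    for _ in range(max_backtracks):
        x_new = prox_g([xi - alpha * gi for xi, gi in zip(x, gx)], alpha)
        lhs = alpha * norm(diff(grad_f(x_new), gx))
        if lhs <= delta * norm(diff(x_new, x)):  # linesearch rule satisfied
            return x_new, alpha
        alpha *= theta
    raise RuntimeError("linesearch did not terminate within max_backtracks")

# Hypothetical test problem (an assumption, not from the paper).
a = [1.0, 2.0, 4.0]
b = [3.0, -0.1, 1.0]
lam = 0.5

def grad_f(x):
    return [ai * (xi - bi) for ai, xi, bi in zip(a, x, b)]

def prox_g(v, alpha):
    # soft-thresholding: prox of alpha * lam * ||.||_1
    return [math.copysign(max(abs(vi) - alpha * lam, 0.0), vi) for vi in v]

x = [0.0, 0.0, 0.0]
alphas = []
for _ in range(400):
    x, alpha = linesearch_step(x, grad_f, prox_g, sigma=1.0, theta=0.5, delta=0.4)
    alphas.append(alpha)

x_exact = [2.5, 0.0, 0.875]  # closed-form minimizer of this separable problem
```

No Lipschitz constant enters the iteration itself. For this particular problem the gradient happens to be 4-Lipschitz, so the accepted stepsizes stay above $$\min \{\sigma ,\delta \theta /4\}$$, which is the kind of lower bound used later in the convergence analysis.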

In practical applications, many real-world problems, such as image inpainting, can be modeled as a constrained version of problem (1.1). To investigate them, we suggest a projected forward–backward algorithm for solving the constrained convex minimization problem modeled as follows:

$$\min_{x\in \Omega } \bigl(f(x)+g(x)\bigr),$$
(1.5)

where Ω is a nonempty closed convex subset of H, f and g are convex functions on H, and f is differentiable on H.

In variational theory, Tseng  introduced the modified forward–backward splitting algorithms for finding zeros of the sum of two monotone operators. Let $$X\subset \operatorname{dom}A$$ be a closed convex set.

### Algorithm 1.3

Given $$x^{0}\in \operatorname{dom}A$$ and $$\alpha _{k}\in (0,+\infty )$$, calculate

\begin{aligned}& y^{k} = (\mathrm{Id}+\alpha _{k}B)^{-1}( \mathrm{Id}-\alpha _{k}A) \bigl(x^{k}\bigr), \\& x^{k+1} = P_{X}\bigl[y^{k}-\alpha _{k}\bigl(A\bigl(y^{k}\bigr)-A\bigl(x^{k} \bigr)\bigr)\bigr], \end{aligned}
(1.6)

where A is L-Lipschitz continuous on $$X\cup \operatorname{dom}B$$, and $$\alpha _{k}\in (0,1/L)$$. It was proved that $$(x^{k})$$ converges weakly to zeros of $$A+B$$ that are also contained in X.
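To make the structure of (1.6) concrete, here is a small Python sketch of one Tseng step together with a toy run. Everything in the example is an assumption chosen for checkability: A is the affine map $$A(x)=Mx-q$$ with a monotone matrix M, B is the zero operator (so its resolvent is the identity), and X is the whole space (so $$P_{X}$$ is the identity). The function name `tseng_step` and its argument names are illustrative.

```python
import math

def tseng_step(x, A, resolvent_B, project_X, alpha):
    """One iteration of (1.6): forward-backward step, then a correction and projection."""
    Ax = A(x)
    y = resolvent_B([xi - alpha * ai for xi, ai in zip(x, Ax)], alpha)
    Ay = A(y)
    return project_X([yi - alpha * (ayi - axi) for yi, ayi, axi in zip(y, Ay, Ax)])

# Hypothetical example: A(x) = M x - q with M = [[1, 1], [-1, 1]].
# The symmetric part of M is the identity, so A is monotone, and ||M|| = sqrt(2).
M = [[1.0, 1.0], [-1.0, 1.0]]
q = [3.0, 1.0]  # chosen so that M applied to (1, 2) equals q

def A(x):
    return [M[0][0] * x[0] + M[0][1] * x[1] - q[0],
            M[1][0] * x[0] + M[1][1] * x[1] - q[1]]

identity_resolvent = lambda v, alpha: v  # B = 0, so (Id + alpha B)^{-1} = Id
identity_projection = lambda v: v        # X = R^2, so P_X = Id

L = math.sqrt(2.0)
alpha = 0.3  # any value in (0, 1/L) is admissible
x = [0.0, 0.0]
for _ in range(200):
    x = tseng_step(x, A, identity_resolvent, identity_projection, alpha)
```

With these choices the unique zero of $$A+B$$ is $$(1,2)$$, and the iterates should approach it. Replacing `identity_resolvent` by a proximal map and `identity_projection` by a projection onto a closed convex set gives the general form of the step.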

Most works on minimizing the sum of two convex functions assume the Lipschitz condition on the gradient of f. This restriction can be relaxed by using a linesearch technique. We therefore suggest new forward–backward algorithms to solve the unconstrained and constrained convex minimization problems, which are based on a new linesearch technique . Then we prove weak convergence theorems for the proposed algorithms. As applications, we apply our main results to solving compressed sensing and image inpainting problems. Then we compare the performance of our algorithms with Algorithms 1.1 and 1.2. Moreover, we discuss numerical results of the comparative analysis to show the optimal choice of parameters.

The content is organized as follows. In Sect. 2, we recall some useful concepts. In Sect. 3, we establish the main theorems on our algorithms. In Sect. 4, we give numerical experiments to support the convergence of our algorithms. Finally, in Sect. 5, we conclude the paper.

## Preliminaries

In this section, we give some definitions and lemmas that play an essential role in our analysis. Let H be a real Hilbert space equipped with inner product $$\langle \cdot ,\cdot \rangle$$ and norm $$\|\cdot \|$$. Let $$h:H\rightarrow \bar{\mathbb{R}}$$ be a proper lower semicontinuous convex function. We use the following notations:

• $$\rightharpoonup$$ denotes weak convergence.

• $$\operatorname{dom}h:=\{x\in H|h(x) < +\infty \}$$ denotes the domain of h.

• $$\operatorname{Gph}(A)=\{(x,y)\in H\times H:y\in Ax\}$$, where $$A:H\to 2^{H}$$ is a multivalued operator, denotes the graph of A.

• $$\omega _{w}(x^{k})=\{x:\exists (x^{k_{n}})\subset (x^{k})\text{ such that } x^{k_{n}}\rightharpoonup x\}$$ denotes the set of all weak limit points.

• $$F(T)=\{x\in C:x=Tx\}$$ denotes the set of fixed points of $$T:C\to C$$.

We recall the following definitions:

(1) A mapping $$T:H\to H$$ is said to be nonexpansive if, for all $$x,y\in H$$,

$$\Vert Tx-Ty \Vert \leq \Vert x-y \Vert .$$

(2) A mapping $$T:H\to H$$ is said to be firmly nonexpansive if, for all $$x,y\in H$$,

$$\Vert Tx-Ty \Vert ^{2}\leq \langle x-y,Tx-Ty\rangle .$$

(3) A mapping $$T:H\to H$$ is said to be monotone if, for all $$x,y\in H$$,

$$\langle x-y,Tx-Ty\rangle \geq 0.$$

(4) A monotone operator $$A:H\to 2^{H}$$ is said to be maximal monotone if there is no monotone operator $$B:H\to 2^{H}$$ such that $$\operatorname{Gph}(B)$$ properly contains $$\operatorname{Gph}(A)$$, that is, for every $$(x,u)\in H\times H$$,

$$(x,u)\in \operatorname{Gph}(A)\quad \Longleftrightarrow \quad \langle x-y,u-v\rangle \geq 0$$

for all $$(y,v)\in \operatorname{Gph}(A)$$.

(5) A function $$h:H\to \mathbb{R}$$ is said to be convex if

$$h\bigl(\lambda x+(1-\lambda )y\bigr)\leq \lambda h(x)+(1-\lambda )h(y)$$

for all $$\lambda \in (0,1)$$ and $$x,y\in H$$.

(6) A differentiable function h is convex if and only if

$$h(x)+\bigl\langle \nabla h(x), y-x\bigr\rangle \leq h(y)$$

for all $$x,y\in H$$.

(7) An element $$g\in H$$ is said to be a subgradient of $$h : H\rightarrow \mathbb{R}$$ at x if

$$h(x)+\langle g, y-x\rangle \leq h(y)$$

for all $$y\in H$$.

(8) The subdifferential of h at x is defined by

$$\partial h(x)=\bigl\{ v \in H:\langle v,y-x\rangle +h(x)\leq h(y) \text{ for all } y\in H \bigr\} .$$

(9) A function $$f : H\rightarrow \mathbb{R}$$ is said to be weakly lower semicontinuous at x if $$x_{n}\rightharpoonup x$$ implies

$$f(x)\leq \liminf_{n\rightarrow \infty } f(x_{n}).$$

(10) The projection of $$x\in H$$ onto a nonempty closed convex subset C of H is defined by

$$P_{C} x:= \operatorname*{argmin}_{y\in C} \Vert x - y \Vert ^{2} .$$

(11) The proximal operator $$\mathrm{prox}_{g} : H \rightarrow H$$ of g is defined by

$$\mathrm{prox}_{g}(z) = (\mathrm{Id} + \partial g)^{-1}(z),\quad z \in H.$$

It is known that the proximal operator is single-valued with full domain. Moreover, from  we have

$$\frac{z-\mathrm{prox}_{\alpha g}(z)}{\alpha }\in \partial g\bigl( \mathrm{prox}_{\alpha g}(z) \bigr)\quad \text{for all } z\in H, \alpha >0.$$
(2.1)
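As a quick sanity check of inclusion (2.1), consider the one-dimensional function $$g=|\cdot |$$ (an assumed example, not from the paper). Its proximal operator is soft-thresholding, and its subdifferential is $$\{1\}$$ for positive arguments, $$\{-1\}$$ for negative arguments, and $$[-1,1]$$ at zero. The following sketch tests (2.1) on a few sample values; the helper names and the tolerance are illustrative.

```python
import math

def prox_abs(z, alpha):
    """prox of alpha*|.| on the real line (soft-thresholding)."""
    return math.copysign(max(abs(z) - alpha, 0.0), z)

def in_subdiff_abs(v, p, tol=1e-12):
    """Check, up to a rounding tolerance, that v belongs to the subdifferential of |.| at p."""
    if p > 0:
        return abs(v - 1.0) <= tol
    if p < 0:
        return abs(v + 1.0) <= tol
    return abs(v) <= 1.0 + tol  # at p = 0 the subdifferential is [-1, 1]

checks = []
for z in [3.0, 0.2, -1.7, -0.5, 0.0]:
    for alpha in [0.1, 0.5, 2.0]:
        p = prox_abs(z, alpha)
        # (2.1): (z - prox(z)) / alpha must be a subgradient of g at prox(z)
        checks.append(in_subdiff_abs((z - p) / alpha, p))
```

Every entry of `checks` should be `True`, which is the scalar instance of the inclusion used repeatedly in the convergence proof.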

### Lemma 2.1


Let C be a nonempty closed convex subset of a real Hilbert space H. Then for any $$x \in H$$, the following statements hold:

(i) $$\langle x-P_{C}x, y-P_{C}x\rangle \leq 0$$ for all $$y\in C$$;

(ii) $$\|P_{C}x-P_{C}y\|^{2}\leq \langle P_{C}x-P_{C}y, x-y\rangle$$ for all $$x, y\in H$$;

(iii) $$\|P_{C}x-y\|^{2}\leq \|x-y\|^{2}-\|P_{C}x-x\|^{2}$$ for all $$y\in C$$.
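The three inequalities of Lemma 2.1 can be checked numerically for a simple set. The sketch below assumes that C is the box $$[-1,1]^{4}$$ in $$\mathbb{R}^{4}$$, whose projection is componentwise clipping; the box, the sample sizes, and the helper names are assumptions made for illustration. For each property it records the largest violation, which should be nonpositive up to rounding.

```python
import random

def project_box(x, lo, hi):
    """Projection onto the box [lo, hi]^n: clip every coordinate."""
    return [min(max(xi, lo), hi) for xi in x]

def dot(u, v):
    return sum(ui * vi for ui, vi in zip(u, v))

def diff(u, v):
    return [ui - vi for ui, vi in zip(u, v)]

rng = random.Random(0)  # fixed seed so the run is reproducible
lo, hi = -1.0, 1.0
worst_i = worst_ii = worst_iii = float("-inf")
for _ in range(1000):
    x = [rng.uniform(-3.0, 3.0) for _ in range(4)]
    x2 = [rng.uniform(-3.0, 3.0) for _ in range(4)]
    y = [rng.uniform(lo, hi) for _ in range(4)]  # a point of C
    px, px2 = project_box(x, lo, hi), project_box(x2, lo, hi)
    # (i): <x - Px, y - Px> <= 0
    worst_i = max(worst_i, dot(diff(x, px), diff(y, px)))
    # (ii): ||Px - Px2||^2 - <Px - Px2, x - x2> <= 0
    e = diff(px, px2)
    worst_ii = max(worst_ii, dot(e, e) - dot(e, diff(x, x2)))
    # (iii): ||Px - y||^2 - ||x - y||^2 + ||Px - x||^2 <= 0
    worst_iii = max(worst_iii,
                    dot(diff(px, y), diff(px, y)) - dot(diff(x, y), diff(x, y))
                    + dot(diff(px, x), diff(px, x)))
```

Property (ii) says that $$P_{C}$$ is firmly nonexpansive, and property (iii) is the inequality that makes a final projection step harmless for distance-to-solution estimates when the solution lies in C.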

### Lemma 2.2


The subdifferential operator ∂h of a convex function h is maximal monotone. Moreover, the graph of $$\partial h, \operatorname{Gph}(\partial h)=\{(x,v)\in H\times H:v\in \partial h(x)\}$$, is demiclosed, that is, if a sequence $$(x^{k},v^{k})\subset \operatorname{Gph}(\partial h)$$ is such that $$(x^{k})$$ converges weakly to x and $$(v^{k})$$ converges strongly to v, then $$(x,v)\in \operatorname{Gph}(\partial h)$$.

### Lemma 2.3


Let H be a real Hilbert space. Let C be a nonempty closed convex subset of H, and let $$T:C\rightarrow C$$ be a nonexpansive mapping such that $$F(T)\neq \emptyset$$. If $$(x^{k})\subset C$$, $$x^{k}\rightharpoonup z$$, and $$\|Tx^{k}-x^{k}\|\rightarrow 0$$, then $$Tz=z$$.

### Lemma 2.4


Let H be a real Hilbert space. Let S be a nonempty closed convex subset of H, and let $$(x^{k})$$ be a sequence in H satisfying:

(i) $$\lim_{k\rightarrow \infty }\|x^{k}-x\|$$ exists for each $$x \in S$$;

(ii) $$\omega _{w}(x^{k})\subset S$$.

Then $$(x^{k})$$ weakly converges to an element of S.

## Main results

In this section, we assume that the set $$S_{*}$$ of all solutions of problem (1.1) is nonempty. We propose new algorithms that incorporate a new linesearch technique and prove weak convergence theorems. We assume that

(1) $$f , g : H \rightarrow \mathbb{R}\cup \{+\infty \}$$ are proper lower semicontinuous convex functions, and f is differentiable on H;

(2) the gradient $$\nabla f$$ is uniformly continuous and bounded on bounded subsets of H.

Note that the latter condition holds if $$\nabla f$$ is Lipschitz continuous on H.

### Algorithm 3.1

Given $$\sigma >0$$, $$\theta \in (0,1)$$, $$\gamma \in (0,2)$$, and $$\delta \in (0,\frac{1}{6})$$, let $$x^{0}\in H$$ and set $$k:=0$$.

Step 1. Calculate

$$y^{k}=\mathrm{prox}_{\alpha _{k}g} \bigl(x^{k}-\alpha _{k}\nabla f\bigl(x^{k} \bigr)\bigr)$$

and

$$z^{k}=\mathrm{prox}_{\alpha _{k}g} \bigl(y^{k}-\alpha _{k}\nabla f\bigl(y^{k} \bigr)\bigr),$$

where $$\alpha _{k}=\sigma \theta ^{m_{k}}$$ with $$m_{k}$$ the smallest nonnegative integer such that

$$\alpha _{k}\cdot \max \bigl\{ \bigl\Vert \nabla f \bigl(x^{k}\bigr)-\nabla f\bigl(y^{k}\bigr) \bigr\Vert , \bigl\Vert \nabla f\bigl(z^{k}\bigr)- \nabla f\bigl(y^{k} \bigr) \bigr\Vert \bigr\} \leq \delta \bigl( \bigl\Vert x^{k}-y^{k} \bigr\Vert + \bigl\Vert z^{k}-y^{k} \bigr\Vert \bigr).$$
(3.1)

Step 2. Calculate

$$x^{k+1}=x^{k}-\gamma \eta _{k}d_{k},$$

where

$$d_{k}=x^{k}-z^{k}- \alpha _{k}\bigl(\nabla f\bigl(x^{k}\bigr)-\nabla f \bigl(z^{k}\bigr)\bigr) \quad \mbox{and} \quad \eta _{k}= \frac{(\frac{1}{2}-3\delta )( \Vert x^{k}-y^{k} \Vert ^{2}+ \Vert z^{k}-y^{k} \Vert ^{2})}{ \Vert d_{k} \Vert ^{2}}.$$

Set $$k:=k+1$$, and go to Step 1.
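The two steps of Algorithm 3.1 can be sketched in Python as follows. This is a minimal illustration under assumptions rather than a reference implementation: vectors are plain lists, the stopping test on the residual $$\Vert x^{k}-y^{k} \Vert + \Vert z^{k}-y^{k} \Vert$$, the iteration cap, and the backtracking cap are practical safeguards that are not part of the algorithm, and the test problem $$f(x)=\frac{1}{2}\Vert x-b \Vert ^{2}$$, $$g=\lambda \Vert x \Vert _{1}$$ is a hypothetical one with known minimizer.

```python
import math

def dot(u, v):
    return sum(ui * vi for ui, vi in zip(u, v))

def norm(v):
    return math.sqrt(dot(v, v))

def diff(u, v):
    return [ui - vi for ui, vi in zip(u, v)]

def fb_map(x, gx, prox_g, alpha):
    """prox_{alpha g}(x - alpha * grad f(x)), with gx = grad f(x) precomputed."""
    return prox_g([xi - alpha * gi for xi, gi in zip(x, gx)], alpha)

def algorithm_3_1(x0, grad_f, prox_g, sigma=1.0, theta=0.5, gamma=1.5,
                  delta=0.1, max_iter=2000, tol=1e-10, max_backtracks=60):
    x = list(x0)
    for _ in range(max_iter):
        gx = grad_f(x)
        alpha = sigma
        for _ in range(max_backtracks):  # Step 1: linesearch (3.1)
            y = fb_map(x, gx, prox_g, alpha)
            gy = grad_f(y)
            z = fb_map(y, gy, prox_g, alpha)
            gz = grad_f(z)
            dxy, dzy = norm(diff(x, y)), norm(diff(z, y))
            if dxy + dzy <= tol:  # stopping heuristic: x is numerically a fixed point
                return x
            lhs = alpha * max(norm(diff(gx, gy)), norm(diff(gz, gy)))
            if lhs <= delta * (dxy + dzy):
                break
            alpha *= theta
        # Step 2: direction d_k and stepsize eta_k
        d = [xi - zi - alpha * (gxi - gzi) for xi, zi, gxi, gzi in zip(x, z, gx, gz)]
        dd = dot(d, d)
        if dd == 0.0:  # safeguard against division by zero
            return x
        eta = (0.5 - 3.0 * delta) * (dxy ** 2 + dzy ** 2) / dd
        x = [xi - gamma * eta * di for xi, di in zip(x, d)]
    return x

# Hypothetical test problem (an assumption, not from the paper).
b = [3.0, -0.1, 1.0]
lam = 0.5

def grad_f(x):
    return diff(x, b)

def prox_g(v, alpha):
    # soft-thresholding: prox of alpha * lam * ||.||_1
    return [math.copysign(max(abs(vi) - alpha * lam, 0.0), vi) for vi in v]

x_alg = algorithm_3_1([0.0, 0.0, 0.0], grad_f, prox_g)
x_exact = [2.5, 0.0, 0.5]  # soft-thresholding of b at level lam
```

Each trial stepsize costs two proximal evaluations and two additional gradient evaluations, which is the price paid for not needing the Lipschitz constant. The parameter values above are arbitrary admissible choices; they are not the tuned values from the numerical experiments.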

### Remark 3.2

For variational inequality problems, this kind of method first appeared in the work of Noor [18, 19, 22].

### Lemma 3.3


Linesearch (3.1) stops after finitely many steps.

### Theorem 3.4

Let $$(x^{k} )$$ and $$(\alpha _{k} )$$ be generated by Algorithm 3.1. Assume that there is $$\alpha > 0$$ such that $$\alpha _{k}\geq \alpha > 0$$ for all $$k \in \mathbb{N}$$. Then $$(x^{k} )$$ weakly converges to an element of $$S_{*}$$.

### Proof

Let $$x_{*}$$ be a solution in $$S_{*}$$. Then we obtain

\begin{aligned} \bigl\Vert x^{k+1}-x_{*} \bigr\Vert ^{2} =& \bigl\Vert x^{k}-\gamma \eta _{k}d_{k}-x_{*} \bigr\Vert ^{2} \\ =& \bigl\Vert x^{k}-x_{*} \bigr\Vert ^{2}-2\gamma \eta _{k}\bigl\langle x^{k}-x_{*},d_{k} \bigr\rangle +\gamma ^{2}\eta _{k}^{2} \Vert d_{k} \Vert ^{2}. \end{aligned}
(3.2)

Since $$y^{k}=\mathrm{prox}_{\alpha _{k}g}(x^{k}-\alpha _{k}\nabla f(x^{k}))$$, we have $$(\mathrm{Id}-\alpha _{k}\nabla f)(x^{k})\in (\mathrm{Id}+\alpha _{k}\partial g)(y^{k})$$. Moreover, ∂g is maximal monotone, so there is $$u^{k}\in \partial g(y^{k})$$ such that

$$(\mathrm{Id}-\alpha _{k}\nabla f) \bigl(x^{k} \bigr)=y^{k}+\alpha _{k}u^{k}.$$

So we have

$$u^{k}=\frac{1}{\alpha _{k}}\bigl(x^{k}-y^{k}- \alpha _{k}\nabla f\bigl(x^{k}\bigr)\bigr).$$
(3.3)

Note that $$0\in \nabla f(x_{*})+\partial g(x_{*})\subseteq \partial (f+g)(x_{*})$$ and $$\nabla f(y^{k})+u^{k}\in \partial (f+g)(y^{k})$$. Therefore, by the monotonicity of $$\partial (f+g)$$, we obtain

$$\bigl\langle \nabla f\bigl(y^{k} \bigr)+u^{k},y^{k}-x_{*}\bigr\rangle \geq 0.$$
(3.4)

Using (3.3) and (3.4), we have

$$\frac{1}{\alpha _{k}}\bigl\langle x^{k}-y^{k}-\alpha _{k}\nabla f \bigl(x^{k}\bigr)+ \alpha _{k}\nabla f \bigl(y^{k}\bigr),y^{k}-x_{*}\bigr\rangle \geq 0.$$

It follows that

$$\bigl\langle x^{k}-y^{k}-\alpha _{k}\bigl(\nabla f\bigl(x^{k}\bigr)-\nabla f \bigl(y^{k}\bigr)\bigr),y^{k}-x_{*} \bigr\rangle \geq 0.$$
(3.5)

From $$z^{k}=\mathrm{prox}_{\alpha _{k}g}(y^{k}-\alpha _{k}\nabla f(y^{k}))$$ we get $$(\mathrm{Id}-\alpha _{k}\nabla f)(y^{k})\in (\mathrm{Id}+\alpha _{k}\partial g)(z^{k})$$. Since ∂g is maximal monotone, there is $$v^{k}\in \partial g(z^{k})$$ such that

$$(\mathrm{Id}-\alpha _{k}\nabla f) \bigl(y^{k} \bigr)=z^{k}+\alpha _{k}v^{k}.$$

This shows that

$$v^{k}=\frac{1}{\alpha _{k}}\bigl(y^{k}-z^{k}- \alpha _{k}\nabla f\bigl(y^{k}\bigr)\bigr).$$
(3.6)

Similarly to $$y^{k}$$, we can show that

$$\bigl\langle y^{k}-z^{k}-\alpha _{k}\bigl(\nabla f\bigl(y^{k}\bigr)-\nabla f \bigl(z^{k}\bigr)\bigr),z^{k}-x_{*} \bigr\rangle \geq 0.$$
(3.7)

Combining (3.5) and (3.7), we have

\begin{aligned} 0 \leq &\bigl\langle x^{k}-y^{k}-\alpha _{k}\bigl(\nabla f\bigl(x^{k}\bigr)-\nabla f \bigl(y^{k}\bigr)\bigr),y^{k}-x_{*} \bigr\rangle +\bigl\langle y^{k}-z^{k}-\alpha _{k} \bigl(\nabla f\bigl(y^{k}\bigr)-\nabla f\bigl(z^{k}\bigr) \bigr),z^{k}-x_{*} \bigr\rangle \\ =&\bigl\langle x^{k}-y^{k}-\alpha _{k} \bigl(\nabla f\bigl(x^{k}\bigr)-\nabla f\bigl(y^{k}\bigr) \bigr),y^{k}-z^{k} \bigr\rangle +\bigl\langle x^{k}-y^{k}-\alpha _{k}\bigl(\nabla f \bigl(x^{k}\bigr)-\nabla f\bigl(y^{k}\bigr) \bigr),z^{k}-x_{*} \bigr\rangle \\ &{}+\bigl\langle y^{k}-z^{k}-\alpha _{k} \bigl(\nabla f\bigl(y^{k}\bigr)-\nabla f\bigl(z^{k}\bigr) \bigr),z^{k}-x_{*} \bigr\rangle \\ =&\bigl\langle x^{k}-y^{k}-\alpha _{k} \bigl(\nabla f\bigl(x^{k}\bigr)-\nabla f\bigl(y^{k}\bigr) \bigr),y^{k}-z^{k} \bigr\rangle \\ &{}+\bigl\langle x^{k}-z^{k}-\alpha _{k}\bigl(\nabla f \bigl(x^{k}\bigr)-\nabla f\bigl(z^{k}\bigr) \bigr),z^{k}-x_{*} \bigr\rangle . \end{aligned}
(3.8)

We consider

\begin{aligned}& \bigl\langle x^{k}-y^{k}-\alpha _{k}\bigl(\nabla f\bigl(x^{k}\bigr)-\nabla f \bigl(y^{k}\bigr)\bigr),y^{k}-z^{k} \bigr\rangle \\& \quad = \bigl\langle x^{k}-y^{k},y^{k}-z^{k} \bigr\rangle +\alpha _{k}\bigl\langle \nabla f\bigl(y^{k} \bigr)-\nabla f\bigl(x^{k}\bigr),y^{k}-z^{k} \bigr\rangle \\& \quad = \bigl\langle x^{k}-y^{k},y^{k}-z^{k} \bigr\rangle +\alpha _{k}\bigl[\bigl\langle \nabla f \bigl(y^{k}\bigr)-\nabla f\bigl(z^{k}\bigr),y^{k}-z^{k} \bigr\rangle +\bigl\langle \nabla f\bigl(z^{k}\bigr)- \nabla f \bigl(x^{k}\bigr),y^{k}-z^{k}\bigr\rangle \bigr] \\& \quad = \bigl\langle x^{k}-y^{k},y^{k}-z^{k} \bigr\rangle +\alpha _{k}\bigl[\bigl\langle \nabla f \bigl(y^{k}\bigr)-\nabla f\bigl(z^{k}\bigr),y^{k}-z^{k} \bigr\rangle \\& \qquad {}+\bigl\langle \nabla f\bigl(z^{k}\bigr),y^{k}-z^{k} \bigr\rangle +\bigl\langle \nabla f\bigl(x^{k}\bigr),z^{k}-y^{k} \bigr\rangle \bigr] \\& \quad = \bigl\langle x^{k}-y^{k},y^{k}-z^{k} \bigr\rangle +\alpha _{k}\bigl[\bigl\langle \nabla f \bigl(y^{k}\bigr)-\nabla f\bigl(z^{k}\bigr),y^{k}-z^{k} \bigr\rangle +\bigl\langle \nabla f\bigl(z^{k}\bigr),y^{k}-z^{k} \bigr\rangle \\& \qquad {} +\bigl\langle \nabla f\bigl(x^{k}\bigr),z^{k}-x^{k} \bigr\rangle +\bigl\langle \nabla f\bigl(x^{k}\bigr),x^{k}-y^{k} \bigr\rangle \bigr] \\& \quad = \bigl\langle x^{k}-y^{k},y^{k}-z^{k} \bigr\rangle +\alpha _{k}\bigl[\bigl\langle \nabla f \bigl(y^{k}\bigr)-\nabla f\bigl(z^{k}\bigr),y^{k}-z^{k} \bigr\rangle \\& \qquad {}+\bigl\langle \nabla f\bigl(z^{k}\bigr),y^{k}-z^{k} \bigr\rangle +\bigl\langle \nabla f\bigl(x^{k}\bigr),z^{k}-x^{k} \bigr\rangle \\& \qquad {} +\bigl\langle \nabla f\bigl(x^{k}\bigr)-\nabla f \bigl(y^{k}\bigr),x^{k}-y^{k}\bigr\rangle + \bigl\langle \nabla f\bigl(y^{k}\bigr),x^{k}-y^{k} \bigr\rangle \bigr]. \end{aligned}

By the convexity of f we have

\begin{aligned}& \bigl\langle x^{k}-y^{k}-\alpha _{k}\bigl( \nabla f\bigl(x^{k}\bigr)-\nabla f\bigl(y^{k}\bigr) \bigr),y^{k}-z^{k} \bigr\rangle \\& \quad \leq \bigl\langle x^{k}-y^{k},y^{k}-z^{k} \bigr\rangle +\alpha _{k}\bigl[\bigl\langle \nabla f \bigl(y^{k}\bigr)-\nabla f\bigl(z^{k}\bigr),y^{k}-z^{k} \bigr\rangle +f\bigl(y^{k}\bigr)-f\bigl(z^{k}\bigr)+f \bigl(z^{k}\bigr)-f\bigl(x^{k}\bigr) \\& \qquad {} +\bigl\langle \nabla f\bigl(x^{k}\bigr)-\nabla f \bigl(y^{k}\bigr),x^{k}-y^{k}\bigr\rangle +f \bigl(x^{k}\bigr)-f\bigl(y^{k}\bigr)\bigr] \\& \quad = \bigl\langle x^{k}-y^{k},y^{k}-z^{k} \bigr\rangle +\alpha _{k}\bigl[\bigl\langle \nabla f \bigl(y^{k}\bigr)-\nabla f\bigl(z^{k}\bigr),y^{k}-z^{k} \bigr\rangle \\& \qquad {}+\bigl\langle \nabla f\bigl(x^{k}\bigr)- \nabla f \bigl(y^{k}\bigr),x^{k}-y^{k}\bigr\rangle \bigr]. \end{aligned}
(3.9)

Using $$2\langle x^{k}-y^{k},y^{k}-z^{k}\rangle =\|x^{k}-z^{k}\|^{2}-\|x^{k}-y^{k} \|^{2}-\|y^{k}-z^{k}\|^{2}$$, (3.1), (3.8), and (3.9), we see that

\begin{aligned}& -\bigl\langle x^{k}-z^{k}-\alpha _{k}\bigl(\nabla f\bigl(x^{k}\bigr)-\nabla f \bigl(z^{k}\bigr)\bigr),z^{k}-x_{*} \bigr\rangle \\& \quad \leq \frac{1}{2}\bigl[ \bigl\Vert x^{k}-z^{k} \bigr\Vert ^{2}- \bigl\Vert x^{k}-y^{k} \bigr\Vert ^{2}- \bigl\Vert y^{k}-z^{k} \bigr\Vert ^{2}\bigr]+\alpha _{k}\bigl[\bigl\langle \nabla f\bigl(y^{k}\bigr)-\nabla f\bigl(z^{k} \bigr),y^{k}-z^{k} \bigr\rangle \\& \qquad {} +\bigl\langle \nabla f\bigl(x^{k}\bigr)-\nabla f \bigl(y^{k}\bigr),x^{k}-y^{k}\bigr\rangle \bigr] \\& \quad \leq \frac{1}{2}\bigl[ \bigl\Vert x^{k}-z^{k} \bigr\Vert ^{2}- \bigl\Vert x^{k}-y^{k} \bigr\Vert ^{2}- \bigl\Vert y^{k}-z^{k} \bigr\Vert ^{2}\bigr]+\alpha _{k}\bigl[ \bigl\Vert \nabla f\bigl(y^{k}\bigr)-\nabla f\bigl(z^{k}\bigr) \bigr\Vert \bigl\Vert y^{k}-z^{k} \bigr\Vert \\& \qquad {} + \bigl\Vert \nabla f\bigl(x^{k}\bigr)-\nabla f \bigl(y^{k}\bigr) \bigr\Vert \bigl\Vert x^{k}-y^{k} \bigr\Vert \bigr] \\& \quad \leq \frac{1}{2}\bigl[ \bigl\Vert x^{k}-z^{k} \bigr\Vert ^{2}- \bigl\Vert x^{k}-y^{k} \bigr\Vert ^{2}- \bigl\Vert y^{k}-z^{k} \bigr\Vert ^{2}\bigr]+\delta \bigl[\bigl( \bigl\Vert x^{k}-y^{k} \bigr\Vert + \bigl\Vert z^{k}-y^{k} \bigr\Vert \bigr) \bigl\Vert y^{k}-z^{k} \bigr\Vert \\& \qquad {} +\bigl( \bigl\Vert x^{k}-y^{k} \bigr\Vert + \bigl\Vert z^{k}-y^{k} \bigr\Vert \bigr) \bigl\Vert x^{k}-y^{k} \bigr\Vert \bigr] \\& \quad \leq \frac{1}{2}\bigl[ \bigl\Vert x^{k}-z^{k} \bigr\Vert ^{2}- \bigl\Vert x^{k}-y^{k} \bigr\Vert ^{2}- \bigl\Vert y^{k}-z^{k} \bigr\Vert ^{2}\bigr]+\delta \bigl[ \bigl\Vert x^{k}-y^{k} \bigr\Vert \bigl\Vert y^{k}-z^{k} \bigr\Vert + \bigl\Vert z^{k}-y^{k} \bigr\Vert ^{2} \\& \qquad {} + \bigl\Vert x^{k}-y^{k} \bigr\Vert ^{2}+ \bigl\Vert z^{k}-y^{k} \bigr\Vert \bigl\Vert x^{k}-y^{k} \bigr\Vert \bigr] \\& \quad \leq \frac{1}{2}\bigl[ \bigl\Vert x^{k}-z^{k} \bigr\Vert ^{2}- \bigl\Vert x^{k}-y^{k} \bigr\Vert ^{2}- \bigl\Vert y^{k}-z^{k} \bigr\Vert ^{2}\bigr] \\& \qquad {}+\delta \bigl[2 \bigl\Vert x^{k}-y^{k} \bigr\Vert 
\bigl\Vert y^{k}-z^{k} \bigr\Vert + \bigl\Vert z^{k}-y^{k} \bigr\Vert ^{2}+ \bigl\Vert x^{k}-y^{k} \bigr\Vert ^{2}\bigr] \\& \quad \leq \frac{1}{2}\bigl[ \bigl\Vert x^{k}-z^{k} \bigr\Vert ^{2}- \bigl\Vert x^{k}-y^{k} \bigr\Vert ^{2}- \bigl\Vert y^{k}-z^{k} \bigr\Vert ^{2}\bigr]+2\delta \bigl[ \bigl\Vert z^{k}-y^{k} \bigr\Vert ^{2}+ \bigl\Vert x^{k}-y^{k} \bigr\Vert ^{2}\bigr] \\& \quad \leq \frac{1}{2} \bigl\Vert x^{k}-z^{k} \bigr\Vert ^{2}-\biggl(\frac{1}{2}-2\delta \biggr) \bigl\Vert x^{k}-y^{k} \bigr\Vert ^{2}-\biggl( \frac{1}{2}-2\delta \biggr) \bigl\Vert y^{k}-z^{k} \bigr\Vert ^{2}. \end{aligned}

So we have

$$\bigl\langle d_{k},z^{k}-x_{*} \bigr\rangle \geq -\frac{1}{2} \bigl\Vert x^{k}-z^{k} \bigr\Vert ^{2}+\biggl( \frac{1}{2}-2\delta \biggr) \bigl\Vert x^{k}-y^{k} \bigr\Vert ^{2}+ \biggl(\frac{1}{2}-2\delta \biggr) \bigl\Vert y^{k}-z^{k} \bigr\Vert ^{2}.$$
(3.10)

Using the definition of $$d_{k}$$ and Linesearch (3.1), we have

\begin{aligned}& \bigl\langle d_{k}, x^{k}-x_{*} \bigr\rangle \\& \quad = \bigl\langle x^{k}-z^{k}-\alpha _{k} \bigl(\nabla f\bigl(x^{k}\bigr)-\nabla f\bigl(z^{k}\bigr) \bigr),x^{k}-z^{k} \bigr\rangle +\bigl\langle d_{k},z^{k}-x_{*}\bigr\rangle \\& \quad = \bigl\Vert x^{k}-z^{k} \bigr\Vert ^{2}-\alpha _{k}\bigl\langle x^{k}-z^{k}, \nabla f\bigl(x^{k}\bigr)- \nabla f\bigl(z^{k}\bigr)\bigr\rangle +\bigl\langle d_{k},z^{k}-x_{*}\bigr\rangle \\& \quad = \bigl\Vert x^{k}-z^{k} \bigr\Vert ^{2}-\alpha _{k}\bigl\langle x^{k}-z^{k}, \nabla f\bigl(x^{k}\bigr)- \nabla f\bigl(y^{k}\bigr)\bigr\rangle -\alpha _{k}\bigl\langle x^{k}-z^{k}, \nabla f\bigl(y^{k}\bigr)- \nabla f\bigl(z^{k}\bigr)\bigr\rangle \\& \qquad {}+\bigl\langle d_{k},z^{k}-x_{*}\bigr\rangle \\& \quad \geq \bigl\Vert x^{k}-z^{k} \bigr\Vert ^{2}-\alpha _{k} \bigl\Vert x^{k}-z^{k} \bigr\Vert \bigl\Vert \nabla f\bigl(x^{k}\bigr)- \nabla f \bigl(y^{k}\bigr) \bigr\Vert -\alpha _{k} \bigl\Vert x^{k}-z^{k} \bigr\Vert \bigl\Vert \nabla f \bigl(y^{k}\bigr)- \nabla f\bigl(z^{k}\bigr) \bigr\Vert \\& \qquad {} +\bigl\langle d_{k},z^{k}-x_{*}\bigr\rangle \\& \quad \geq \bigl\Vert x^{k}-z^{k} \bigr\Vert ^{2}-\delta \bigl\Vert x^{k}-z^{k} \bigr\Vert \bigl( \bigl\Vert x^{k}-y^{k} \bigr\Vert + \bigl\Vert z^{k}-y^{k} \bigr\Vert \bigr) \\& \qquad {}-\delta \bigl\Vert x^{k}-z^{k} \bigr\Vert \bigl( \bigl\Vert x^{k}-y^{k} \bigr\Vert + \bigl\Vert z^{k}-y^{k} \bigr\Vert \bigr) \\& \qquad {} +\bigl\langle d_{k},z^{k}-x_{*}\bigr\rangle \\& \quad = \bigl\Vert x^{k}-z^{k} \bigr\Vert ^{2}-\delta \bigl( \bigl\Vert x^{k}-z^{k} \bigr\Vert \bigl\Vert x^{k}-y^{k} \bigr\Vert + \bigl\Vert x^{k}-z^{k} \bigr\Vert \bigl\Vert z^{k}-y^{k} \bigr\Vert \bigr) \\& \qquad {} -\delta \bigl( \bigl\Vert x^{k}-z^{k} \bigr\Vert \bigl\Vert x^{k}-y^{k} \bigr\Vert + \bigl\Vert x^{k}-z^{k} \bigr\Vert \bigl\Vert z^{k}-y^{k} \bigr\Vert \bigr)+\bigl\langle d_{k},z^{k}-x_{*}\bigr\rangle \\& \quad = \bigl\Vert x^{k}-z^{k} \bigr\Vert ^{2}-\delta \bigl(2 \bigl\Vert 
x^{k}-z^{k} \bigr\Vert \bigl\Vert x^{k}-y^{k} \bigr\Vert +2 \bigl\Vert x^{k}-z^{k} \bigr\Vert \bigl\Vert z^{k}-y^{k} \bigr\Vert \bigr)+\bigl\langle d_{k},z^{k}-x_{*}\bigr\rangle \\& \quad \geq \bigl\Vert x^{k}-z^{k} \bigr\Vert ^{2}-\delta \bigl( \bigl\Vert x^{k}-z^{k} \bigr\Vert ^{2}+ \bigl\Vert x^{k}-y^{k} \bigr\Vert ^{2}+ \bigl\Vert x^{k}-z^{k} \bigr\Vert ^{2}+ \bigl\Vert z^{k}-y^{k} \bigr\Vert ^{2}\bigr)+\bigl\langle d_{k},z^{k}-x_{*} \bigr\rangle \\& \quad = (1-2\delta ) \bigl\Vert x^{k}-z^{k} \bigr\Vert ^{2}-\delta \bigl( \bigl\Vert x^{k}-y^{k} \bigr\Vert ^{2}+ \bigl\Vert z^{k}-y^{k} \bigr\Vert ^{2}\bigr)+\bigl\langle d_{k},z^{k}-x_{*} \bigr\rangle . \end{aligned}
(3.11)

From (3.10) and (3.11) we have

\begin{aligned} \bigl\langle d_{k}, x^{k}-x_{*} \bigr\rangle \geq &(1-2\delta ) \bigl\Vert x^{k}-z^{k} \bigr\Vert ^{2}- \delta \bigl( \bigl\Vert x^{k}-y^{k} \bigr\Vert ^{2}+ \bigl\Vert z^{k}-y^{k} \bigr\Vert ^{2}\bigr)-\frac{1}{2} \bigl\Vert x^{k}-z^{k} \bigr\Vert ^{2} \\ &{}+\biggl(\frac{1}{2}-2\delta \biggr) \bigl\Vert x^{k}-y^{k} \bigr\Vert ^{2}+\biggl( \frac{1}{2}-2\delta \biggr) \bigl\Vert y^{k}-z^{k} \bigr\Vert ^{2} \\ =& \biggl(\frac{1}{2}-2\delta \biggr) \bigl\Vert x^{k}-z^{k} \bigr\Vert ^{2}+\biggl( \frac{1}{2}-3\delta \biggr) \bigl( \bigl\Vert x^{k}-y^{k} \bigr\Vert ^{2}+ \bigl\Vert y^{k}-z^{k} \bigr\Vert ^{2}\bigr). \end{aligned}
(3.12)

Since $$\eta _{k}= \frac{(\frac{1}{2}-3\delta )(\|x^{k}-y^{k}\|^{2}+\|z^{k}-y^{k}\|^{2})}{\|d_{k}\|^{2}}$$, we have $$\eta _{k}\|d_{k}\|^{2}=(\frac{1}{2}-3\delta )(\|x^{k}-y^{k}\|^{2}+\|z^{k}-y^{k} \|^{2})$$. So

$$\bigl\langle d_{k}, x^{k}-x_{*} \bigr\rangle \geq \biggl(\frac{1}{2}-2\delta \biggr) \bigl\Vert x^{k}-z^{k} \bigr\Vert ^{2}+\eta _{k} \Vert d_{k} \Vert ^{2}.$$
(3.13)

This gives

$$-2\gamma \eta _{k}\bigl\langle d_{k}, x^{k}-x_{*}\bigr\rangle \leq -2\gamma \eta _{k}\biggl(\frac{1}{2}-2\delta \biggr) \bigl\Vert x^{k}-z^{k} \bigr\Vert ^{2}-2\gamma \eta _{k}^{2} \Vert d_{k} \Vert ^{2}.$$
(3.14)

Therefore from (3.2) and the above we obtain

\begin{aligned} \bigl\Vert x^{k+1}-x_{*} \bigr\Vert ^{2} \leq & \bigl\Vert x^{k}-x_{*} \bigr\Vert ^{2}-2\gamma \eta _{k}\biggl( \frac{1}{2}-2\delta \biggr) \bigl\Vert x^{k}-z^{k} \bigr\Vert ^{2}-2\gamma \eta _{k}^{2} \Vert d_{k} \Vert ^{2}+\gamma ^{2}\eta _{k}^{2} \Vert d_{k} \Vert ^{2} \\ =& \bigl\Vert x^{k}-x_{*} \bigr\Vert ^{2}-2\gamma \eta _{k}\biggl(\frac{1}{2}-2 \delta \biggr) \bigl\Vert x^{k}-z^{k} \bigr\Vert ^{2}-\frac{2-\gamma }{\gamma } \Vert \gamma \eta _{k}d_{k} \Vert ^{2}. \end{aligned}
(3.15)
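For readers who want the last equality in (3.15) spelled out, the two terms containing $$\eta _{k}^{2} \Vert d_{k} \Vert ^{2}$$ combine as follows; this uses only the quantities already defined above:

$$\begin{aligned} -2\gamma \eta _{k}^{2} \Vert d_{k} \Vert ^{2}+\gamma ^{2}\eta _{k}^{2} \Vert d_{k} \Vert ^{2} =& -(2-\gamma )\gamma \eta _{k}^{2} \Vert d_{k} \Vert ^{2} \\ =& -\frac{2-\gamma }{\gamma }\gamma ^{2}\eta _{k}^{2} \Vert d_{k} \Vert ^{2} \\ =& -\frac{2-\gamma }{\gamma } \Vert \gamma \eta _{k}d_{k} \Vert ^{2}. \end{aligned}$$

Since $$\gamma \in (0,2)$$, the coefficient $$\frac{2-\gamma }{\gamma }$$ is positive, so this term can only decrease the right-hand side of (3.15).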

By the monotonicity of $$\nabla f$$ and Linesearch (3.1) we get

\begin{aligned} \Vert d_{k} \Vert ^{2} =& \bigl\Vert x^{k}-z^{k}-\alpha _{k}\bigl(\nabla f \bigl(x^{k}\bigr)-\nabla f\bigl(z^{k}\bigr)\bigr) \bigr\Vert ^{2} \\ =& \bigl\Vert x^{k}-z^{k} \bigr\Vert ^{2}+\alpha _{k}^{2} \bigl\Vert \nabla f \bigl(x^{k}\bigr)-\nabla f\bigl(z^{k}\bigr) \bigr\Vert ^{2}-2\alpha _{k}\bigl\langle x^{k}-z^{k}, \nabla f\bigl(x^{k}\bigr)-\nabla f\bigl(z^{k}\bigr) \bigr\rangle \\ \leq & \bigl\Vert x^{k}-z^{k} \bigr\Vert ^{2}+\alpha _{k}^{2} \bigl\Vert \nabla f \bigl(x^{k}\bigr)-\nabla f\bigl(y^{k}\bigr)+ \nabla f \bigl(y^{k}\bigr)-\nabla f\bigl(z^{k}\bigr) \bigr\Vert ^{2} \\ \leq & \bigl\Vert x^{k}-z^{k} \bigr\Vert ^{2}+2\alpha _{k}^{2}\bigl[ \bigl\Vert \nabla f\bigl(x^{k}\bigr)- \nabla f\bigl(y^{k}\bigr) \bigr\Vert ^{2}+ \bigl\Vert \nabla f\bigl(y^{k}\bigr)-\nabla f\bigl(z^{k}\bigr) \bigr\Vert ^{2}\bigr] \\ \leq & \bigl\Vert x^{k}-z^{k} \bigr\Vert ^{2}+2\delta ^{2}\bigl[\bigl( \bigl\Vert x^{k}-y^{k} \bigr\Vert + \bigl\Vert z^{k}-y^{k} \bigr\Vert \bigr)^{2}+\bigl( \bigl\Vert x^{k}-y^{k} \bigr\Vert + \bigl\Vert z^{k}-y^{k} \bigr\Vert \bigr)^{2}\bigr] \\ =& \bigl\Vert x^{k}-z^{k} \bigr\Vert ^{2}+4\delta ^{2}\bigl( \bigl\Vert x^{k}-y^{k} \bigr\Vert ^{2}+2 \bigl\Vert x^{k}-y^{k} \bigr\Vert \bigl\Vert z^{k}-y^{k} \bigr\Vert + \bigl\Vert z^{k}-y^{k} \bigr\Vert ^{2}\bigr) \\ \leq & \bigl\Vert x^{k}-y^{k} \bigr\Vert ^{2}+2 \bigl\Vert x^{k}-y^{k} \bigr\Vert \bigl\Vert y^{k}-z^{k} \bigr\Vert + \bigl\Vert y^{k}-z^{k} \bigr\Vert ^{2}+8\delta ^{2}\bigl( \bigl\Vert x^{k}-y^{k} \bigr\Vert ^{2}+ \bigl\Vert z^{k}-y^{k} \bigr\Vert ^{2}\bigr) \\ \leq & 2\bigl( \bigl\Vert x^{k}-y^{k} \bigr\Vert ^{2}+ \bigl\Vert y^{k}-z^{k} \bigr\Vert ^{2}\bigr)+8\delta ^{2}\bigl( \bigl\Vert x^{k}-y^{k} \bigr\Vert ^{2}+ \bigl\Vert z^{k}-y^{k} \bigr\Vert ^{2}\bigr) \\ =&\bigl(2+8\delta ^{2}\bigr) \bigl( \bigl\Vert x^{k}-y^{k} \bigr\Vert ^{2}+ \bigl\Vert y^{k}-z^{k} \bigr\Vert ^{2}\bigr) \end{aligned}
(3.16)

and, equivalently,

$$\frac{1}{ \Vert d_{k} \Vert ^{2}} \geq \frac{1}{(2+8\delta ^{2})( \Vert x^{k}-y^{k} \Vert ^{2}+ \Vert y^{k}-z^{k} \Vert ^{2})}.$$
(3.17)

Therefore we have

$$\eta _{k}= \frac{(\frac{1}{2}-3\delta )( \Vert x^{k}-y^{k} \Vert ^{2}+ \Vert z^{k}-y^{k} \Vert ^{2})}{ \Vert d_{k} \Vert ^{2}} \geq \frac{(\frac{1}{2}-3\delta )}{(2+8\delta ^{2})}>0.$$

On the other hand, we have

$$\eta _{k} \Vert d_{k} \Vert ^{2}=\biggl(\frac{1}{2}-3\delta \biggr) \bigl( \bigl\Vert x^{k}-y^{k} \bigr\Vert ^{2}+ \bigl\Vert z^{k}-y^{k} \bigr\Vert ^{2}\bigr).$$
(3.18)

Thus it follows that

\begin{aligned} \bigl\Vert x^{k}-y^{k} \bigr\Vert ^{2}+ \bigl\Vert z^{k}-y^{k} \bigr\Vert ^{2} =&\frac{1}{(\frac{1}{2}-3\delta )} \eta _{k} \Vert d_{k} \Vert ^{2} \\ =&\frac{1}{(\frac{1}{2}-3\delta )(\gamma ^{2}\eta _{k})} \Vert \gamma \eta _{k}d_{k} \Vert ^{2}. \end{aligned}
(3.19)

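From (3.19) and the lower bound $$\eta _{k}\geq \frac{(\frac{1}{2}-3\delta )}{(2+8\delta ^{2})}$$ obtained above we get

$$\bigl\Vert x^{k}-y^{k} \bigr\Vert ^{2}+ \bigl\Vert z^{k}-y^{k} \bigr\Vert ^{2} \leq \frac{(2+8\delta ^{2})}{\gamma ^{2}(\frac{1}{2}-3\delta )^{2}} \Vert \gamma \eta _{k}d_{k} \Vert ^{2}.$$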

Since $$x^{k+1}=x^{k}-\gamma \eta _{k}d_{k}$$, it follows that $$\gamma \eta _{k}d_{k}=x^{k}-x^{k+1}$$. This implies that

$$\bigl\Vert x^{k+1}-x_{*} \bigr\Vert ^{2} \leq \bigl\Vert x^{k}-x_{*} \bigr\Vert ^{2}-2\gamma \eta _{k}\biggl( \frac{1}{2}-2\delta \biggr) \bigl\Vert x^{k}-z^{k} \bigr\Vert ^{2}-\frac{2-\gamma }{\gamma } \bigl\Vert x^{k}-x^{k+1} \bigr\Vert ^{2}.$$
(3.20)

Thus $$\lim_{k\rightarrow \infty }\|x^{k}-x_{*}\|$$ exists, and $$(x^{k})$$ is bounded. Note that by (3.20)

$$\frac{2-\gamma }{\gamma } \bigl\Vert x^{k}-x^{k+1} \bigr\Vert ^{2}+2\gamma \eta _{k}\biggl( \frac{1}{2}-2\delta \biggr) \bigl\Vert x^{k}-z^{k} \bigr\Vert ^{2} \leq \bigl\Vert x^{k}-x_{*} \bigr\Vert ^{2}- \bigl\Vert x^{k+1}-x_{*} \bigr\Vert ^{2}.$$

Hence $$\|x^{k}-z^{k}\|\rightarrow 0$$ and $$\|x^{k+1}-x^{k}\|\rightarrow 0$$ as $$k\rightarrow \infty$$. It follows that $$\|x^{k}-y^{k}\|\rightarrow 0$$ and $$\|y^{k}-z^{k}\|\rightarrow 0$$ as $$k\rightarrow \infty$$. By the boundedness of $$(x^{k} )$$ we know that the set of its weak limit points is nonempty. Let $$x^{\infty }\in \omega _{w}(x^{k})$$. Then there is a subsequence $$(x^{k_{n}})$$ of $$(x^{k})$$ such that $$x^{k_{n}}\rightharpoonup x^{\infty }$$. Next, we show that $$x^{\infty }\in S_{*}$$. Let $$(v,u)\in \operatorname{Gph}(\nabla f+\partial g)$$, that is, $$u-\nabla f(v)\in \partial g(v)$$. Since $$y^{k_{n}}=(\mathrm{Id}+\alpha _{k_{n}}\partial g)^{-1}(\mathrm{Id}-\alpha _{k_{n}} \nabla f)x^{k_{n}}$$, we have

$$(\mathrm{Id}-\alpha _{k_{n}}\nabla f)x^{k_{n}} \in (\mathrm{Id}+\alpha _{k_{n}}\partial g)y^{k_{n}},$$

which gives

$$\frac{1}{\alpha _{k_{n}}}\bigl(x^{k_{n}}-y^{k_{n}}- \alpha _{k_{n}}\nabla f\bigl(x^{k_{n}}\bigr)\bigr) \in \partial g \bigl(y^{k_{n}}\bigr).$$

Since ∂g is maximal monotone, it follows that

$$\biggl\langle v-y^{k_{n}},u-\nabla f(v)-\frac{1}{\alpha _{k_{n}}} \bigl(x^{k_{n}}-y^{k_{n}}- \alpha _{k_{n}}\nabla f \bigl(x^{k_{n}}\bigr)\bigr)\biggr\rangle \geq 0.$$

This shows that

\begin{aligned} \bigl\langle v-y^{k_{n}},u\bigr\rangle \geq &\biggl\langle v-y^{k_{n}},\nabla f(v)+ \frac{1}{\alpha _{k_{n}}} \bigl(x^{k_{n}}-y^{k_{n}}-\alpha _{k_{n}}\nabla f \bigl(x^{k_{n}}\bigr)\bigr) \biggr\rangle \\ =&\bigl\langle v-y^{k_{n}},\nabla f(v)-\nabla f\bigl(x^{k_{n}} \bigr)\bigr\rangle + \biggl\langle v-y^{k_{n}},\frac{1}{\alpha _{k_{n}}} \bigl(x^{k_{n}}-y^{k_{n}}\bigr) \biggr\rangle \\ =&\bigl\langle v-y^{k_{n}},\nabla f(v)-\nabla f\bigl(y^{k_{n}} \bigr)\bigr\rangle + \bigl\langle v-y^{k_{n}},\nabla f \bigl(y^{k_{n}}\bigr)-\nabla f\bigl(x^{k_{n}}\bigr)\bigr\rangle \\ &{}+\biggl\langle v-y^{k_{n}},\frac{1}{\alpha _{k_{n}}}\bigl(x^{k_{n}}-y^{k_{n}} \bigr) \biggr\rangle \\ \geq &\bigl\langle v-y^{k_{n}},\nabla f\bigl(y^{k_{n}}\bigr)- \nabla f\bigl(x^{k_{n}}\bigr) \bigr\rangle +\biggl\langle v-y^{k_{n}},\frac{1}{\alpha _{k_{n}}}\bigl(x^{k_{n}}-y^{k_{n}} \bigr) \biggr\rangle . \end{aligned}
(3.21)

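Since $$\lim_{k\rightarrow \infty }\|x^{k}-y^{k}\|=0$$, by the assumption we have $$\lim_{k\rightarrow \infty }\|\nabla f(x^{k})-\nabla f(y^{k}) \|=0$$. Moreover, $$y^{k_{n}}\rightharpoonup x^{\infty }$$ because $$x^{k_{n}}\rightharpoonup x^{\infty }$$ and $$\|x^{k_{n}}-y^{k_{n}}\|\rightarrow 0$$. Taking the limit as $$n\to \infty$$ in (3.21) and using the condition on $$\alpha _{k}$$, we have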

$$\bigl\langle v-x^{\infty },u\bigr\rangle \geq 0.$$

Since $$\nabla f+\partial g$$ is maximal monotone, this yields $$0\in (\nabla f+\partial g)x^{\infty }$$, and consequently $$x^{\infty }\in S_{*}$$. By Lemma 2.4 we conclude that $$(x^{k})$$ converges weakly to an element of $$S_{*}$$. This completes the proof. □

### Remark 3.5

If ∇f is L-Lipschitz continuous, then the condition on $$\alpha _{k}$$ in Theorem 3.4 can be removed since $$\alpha _{k}\geq \min \{\sigma ,\delta \theta /L\}>0$$; see .

Next, we introduce a new projected forward–backward algorithm and its convergence analysis. We denote by $$\Omega \cap \operatorname{argmin}(f+g)$$ the solution set of (1.5) and assume that this solution set is nonempty.

### Algorithm 3.6

Given $$\sigma >0$$, $$\theta \in (0,1)$$, $$\gamma \in (0,2)$$, and $$\delta \in (0,\frac{1}{6})$$. Let $$w^{0}\in H$$.

Step 1. Calculate

$$x^{k}=\mathrm{prox}_{\alpha _{k}g} \bigl(w^{k}-\alpha _{k}\nabla f\bigl(w^{k} \bigr)\bigr)$$

and

$$y^{k}=\mathrm{prox}_{\alpha _{k}g} \bigl(x^{k}-\alpha _{k}\nabla f\bigl(x^{k} \bigr)\bigr),$$

where $$\alpha _{k}=\sigma \theta ^{m_{k}}$$ with $$m_{k}$$ the smallest nonnegative integer such that

$$\alpha _{k}\cdot \max \bigl\{ \bigl\Vert \nabla f \bigl(w^{k}\bigr)-\nabla f\bigl(x^{k}\bigr) \bigr\Vert , \bigl\Vert \nabla f\bigl(y^{k}\bigr)- \nabla f\bigl(x^{k} \bigr) \bigr\Vert \bigr\} \leq \delta \bigl( \bigl\Vert w^{k}-x^{k} \bigr\Vert + \bigl\Vert y^{k}-x^{k} \bigr\Vert \bigr).$$
(3.22)

Step 2. Calculate

$$z^{k}=w^{k}-\gamma \eta _{k}d_{k},$$

where

$$d_{k}=w^{k}-y^{k}- \alpha _{k}\bigl(\nabla f\bigl(w^{k}\bigr)-\nabla f \bigl(y^{k}\bigr)\bigr) \quad \mbox{and} \quad \eta _{k}= \frac{(\frac{1}{2}-3\delta )( \Vert w^{k}-x^{k} \Vert ^{2}+ \Vert y^{k}-x^{k} \Vert ^{2})}{ \Vert d_{k} \Vert ^{2}}.$$

Step 3. Calculate

$$w^{k+1}=P_{\Omega } \bigl(z^{k}\bigr).$$

Set $$k:=k+1$$, and go to Step 1.
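The three steps of Algorithm 3.6 can be sketched in plain Python as follows. This is only an illustrative sketch under stated assumptions, not the authors' Matlab implementation: the function name `projected_forward_backward`, the callables `grad_f`, `prox_g`, and `proj`, the guard `alpha_min`, the stopping rule for $$d_{k}=0$$, and the small test problem (with $$f(x)=\frac{1}{2}\|x-b\|^{2}$$, $$g=\lambda \|x\|_{1}$$, and Ω a box containing the minimizer) are all hypothetical choices made for the example.

```python
import math

def _sub(a, b):
    return [p - q for p, q in zip(a, b)]

def _norm(a):
    return math.sqrt(sum(p * p for p in a))

def projected_forward_backward(grad_f, prox_g, proj, w, sigma=1.0, theta=0.5,
                               gamma=1.5, delta=0.1, iters=500, alpha_min=1e-12):
    """Sketch of Algorithm 3.6; prox_g(v, alpha) must return prox_{alpha g}(v)."""
    for _ in range(iters):
        gw = grad_f(w)
        alpha = sigma
        while True:
            # Step 1: two forward-backward evaluations with the trial stepsize.
            x = prox_g([wi - alpha * gi for wi, gi in zip(w, gw)], alpha)
            gx = grad_f(x)
            y = prox_g([xi - alpha * gi for xi, gi in zip(x, gx)], alpha)
            gy = grad_f(y)
            lhs = alpha * max(_norm(_sub(gw, gx)), _norm(_sub(gy, gx)))
            rhs = delta * (_norm(_sub(w, x)) + _norm(_sub(y, x)))
            # Linesearch test (3.22); alpha_min is a numerical guard (assumption).
            if lhs <= rhs or alpha < alpha_min:
                break
            alpha *= theta  # backtrack, so that alpha = sigma * theta**m
        # Step 2: relaxed step along d_k with stepsize eta_k.
        d = [wi - yi - alpha * (a - b) for wi, yi, a, b in zip(w, y, gw, gy)]
        dd = sum(t * t for t in d)
        if dd == 0.0:  # hypothetical stopping rule: d_k = 0
            break
        s = _norm(_sub(w, x)) ** 2 + _norm(_sub(y, x)) ** 2
        eta = (0.5 - 3.0 * delta) * s / dd
        z = [wi - gamma * eta * di for wi, di in zip(w, d)]
        # Step 3: project onto the constraint set Omega.
        w = proj(z)
    return w

# Toy problem (assumption): f(x) = 0.5*||x - b||^2, g = lam*||x||_1, Omega = [-1, 3]^3.
b, lam = [3.0, -1.0, 0.2], 0.5
grad = lambda x: _sub(x, b)
prox = lambda v, a: [math.copysign(max(abs(t) - a * lam, 0.0), t) for t in v]
box = lambda z: [min(max(t, -1.0), 3.0) for t in z]
w_hat = projected_forward_backward(grad, prox, box, [0.0, 0.0, 0.0])
```

For this toy problem the unconstrained minimizer is the soft-thresholded vector (2.5, −0.5, 0), which lies in the box, so `w_hat` should approach it. The sketch returns the iterate $$w^{k}$$; since $$\|w^{k}-x^{k}\|\rightarrow 0$$ in the proof of Theorem 3.7, this is a reasonable proxy for $$x^{k}$$.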

### Theorem 3.7

Let $$(x^{k} )$$ and $$(\alpha _{k} )$$ be generated by Algorithm 3.6. Assume that there is $$\alpha > 0$$ such that $$\alpha _{k}\geq \alpha > 0$$ for all $$k \in \mathbb{N}$$. Then $$(x^{k} )$$ weakly converges to an element of $$\Omega \cap \operatorname{argmin}(f+g)$$.

### Proof

Let $$w_{*}$$ be a solution in $$\Omega \cap \operatorname{argmin}(f+g)$$. Then using Lemma 2.1(ii), we have

\begin{aligned} \bigl\Vert w^{k+1}-w_{*} \bigr\Vert ^{2} =& \bigl\Vert P_{\Omega }\bigl(z^{k} \bigr)-w_{*} \bigr\Vert ^{2} \\ \leq & \bigl\Vert z^{k}-w_{*} \bigr\Vert ^{2}- \bigl\Vert P_{\Omega }\bigl(z^{k} \bigr)-z^{k} \bigr\Vert ^{2}. \end{aligned}
(3.23)

Since $$z^{k}=w^{k}-\gamma \eta _{k}d_{k}$$, we have $$\gamma \eta _{k}d_{k}=w^{k}-z^{k}$$. Similarly to Theorem 3.4, we can show that

$$\bigl\Vert z^{k}-w_{*} \bigr\Vert ^{2} \leq \bigl\Vert w^{k}-w_{*} \bigr\Vert ^{2}-2\gamma \eta _{k}\biggl( \frac{1}{2}-2\delta \biggr) \bigl\Vert w^{k}-y^{k} \bigr\Vert ^{2}-\frac{2-\gamma }{\gamma } \bigl\Vert w^{k}-z^{k} \bigr\Vert ^{2}.$$
(3.24)

From (3.23) and (3.24) we obtain

\begin{aligned} \bigl\Vert w^{k+1}-w_{*} \bigr\Vert ^{2} \leq & \bigl\Vert w^{k}-w_{*} \bigr\Vert ^{2}-2\gamma \eta _{k}\biggl( \frac{1}{2}-2\delta \biggr) \bigl\Vert w^{k}-y^{k} \bigr\Vert ^{2}-\frac{2-\gamma }{\gamma } \bigl\Vert w^{k}-z^{k} \bigr\Vert ^{2} \\ &{}- \bigl\Vert P_{\Omega }\bigl(z^{k}\bigr)-z^{k} \bigr\Vert ^{2}. \end{aligned}
(3.25)

Thus $$\lim_{k\rightarrow \infty }\|w^{k}-w_{*}\|$$ exists, and $$(w^{k})$$ is bounded. From (3.25) we see that

\begin{aligned}& 2\gamma \eta _{k}\biggl( \frac{1}{2}-2\delta \biggr) \bigl\Vert w^{k}-y^{k} \bigr\Vert ^{2}+ \frac{2-\gamma }{\gamma } \bigl\Vert w^{k}-z^{k} \bigr\Vert ^{2}+ \bigl\Vert P_{\Omega }\bigl(z^{k}\bigr)-z^{k} \bigr\Vert ^{2} \\& \quad \leq \bigl\Vert w^{k}-w_{*} \bigr\Vert ^{2}- \bigl\Vert w^{k+1}-w_{*} \bigr\Vert ^{2}. \end{aligned}

Thus $$\|w^{k}-y^{k}\|\rightarrow 0$$, $$\|w^{k}-z^{k}\|\rightarrow 0$$, and $$\|P_{\Omega }(z^{k})-z^{k}\|\rightarrow 0$$ as $$k\rightarrow \infty$$. Also, we can show that $$\|w^{k}-x^{k}\|\rightarrow 0$$ and $$\|y^{k}-x^{k}\|\rightarrow 0$$ as $$k\rightarrow \infty$$. Let $$w^{\infty }\in \omega _{w}(w^{k})$$. As in Theorem 3.4, we can show that $$w^{\infty }\in \operatorname{argmin}(f+g)$$. On the other hand, since $$\lim_{k\rightarrow \infty }\|P_{\Omega }(z^{k})-z^{k}\|=0$$ and $$z^{k}\rightharpoonup w^{\infty }$$ along the corresponding subsequence, by Lemma 2.3 we have $$w^{\infty }\in \Omega$$. Therefore $$w^{\infty }\in \Omega \cap \operatorname{argmin}(f+g)$$. Using Lemma 2.4, we can conclude that Theorem 3.7 holds. □

## Numerical experiments

In this section, we apply our results to signal recovery in compressive sensing and to image inpainting. We compare the performance of our algorithms with those of Combettes and Wajs [9] and Bello Cruz and Nghia [3].

The numerical experiments are performed in Matlab 2020b on a 64-bit MacBook Pro with an Apple M1 chip and 8 GB of RAM.

We consider the following LASSO problem:

$$\min_{x\in \mathbb{R}^{N}}\biggl(\frac{1}{2} \Vert Ax-y \Vert _{2}^{2}+ \lambda \Vert x \Vert _{1}\biggr),$$
(4.1)

where $$A:\mathbb{R}^{N}\rightarrow \mathbb{R}^{M} (M< N)$$ is a bounded linear operator, $$y\in \mathbb{R}^{M}$$ is the observed data, and $$\lambda >0$$. Rewriting (4.1) as problem (1.1), we can set

$$f(x)=\frac{1}{2} \Vert y-Ax \Vert _{2}^{2},\qquad g(x)=\lambda \Vert x \Vert _{1}.$$
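With this splitting, the gradient is $$\nabla f(x)=A^{\top }(Ax-y)$$ and the proximal operator of $$\alpha g$$ is componentwise soft-thresholding with threshold $$\alpha \lambda$$. A minimal Python sketch of one forward–backward step (1.2) for this choice is given below; the function names and the list-of-rows representation of A are assumptions made for illustration only.

```python
def soft_threshold(v, tau):
    """prox of tau*||.||_1: shrink each entry toward zero by tau."""
    return [max(abs(t) - tau, 0.0) * (1.0 if t > 0 else -1.0) for t in v]

def grad_f(A, x, y):
    """Gradient A^T (A x - y) of f(x) = 0.5*||y - A x||^2, with A a list of rows."""
    r = [sum(a * xi for a, xi in zip(row, x)) - yi for row, yi in zip(A, y)]
    return [sum(A[i][j] * r[i] for i in range(len(A))) for j in range(len(x))]

def forward_backward_step(A, x, y, lam, alpha):
    """One step x -> prox_{alpha g}(x - alpha * grad f(x)) with g = lam*||.||_1."""
    g = grad_f(A, x, y)
    return soft_threshold([xi - alpha * gi for xi, gi in zip(x, g)], alpha * lam)
```

For instance, with A the 2×2 identity, `forward_backward_step([[1.0, 0.0], [0.0, 1.0]], [0.0, 0.0], [3.0, 0.5], 1.0, 1.0)` soft-thresholds the data (3, 0.5) by 1.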

In the experiments, the observation y is corrupted by Gaussian noise with $$\mathrm{SNR}=40$$, the entries of A are drawn from the normal distribution with mean zero and variance one, and the signal $$x\in \mathbb{R}^{N}$$ has m nonzero components drawn from the uniform distribution on [–2,2]. The stopping criterion is defined by

$$\mathrm{MSE}=\frac{1}{N} \bigl\Vert x^{k}-x_{*} \bigr\Vert ^{2}< 10^{-4},$$

where $$x^{k}$$ is an estimated signal of $$x_{*}$$.
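As a small illustration of this setup, the following Python sketch generates a sparse test signal and evaluates the MSE stopping criterion. It is a sketch under assumptions: the helper names, the fixed seed, and the random choice of the support are hypothetical, and the noise model for y is not reproduced here.

```python
import random

def sparse_signal(n, m, seed=0):
    """Length-n signal with at most m nonzero entries drawn uniformly from [-2, 2]."""
    rng = random.Random(seed)
    x = [0.0] * n
    for i in rng.sample(range(n), m):  # random support of size m (assumption)
        x[i] = rng.uniform(-2.0, 2.0)
    return x

def mse(x_k, x_star):
    """Mean squared error (1/N) * ||x_k - x_star||^2."""
    return sum((a - b) ** 2 for a, b in zip(x_k, x_star)) / len(x_star)

def should_stop(x_k, x_star, tol=1e-4):
    """Stopping criterion MSE < tol used in the experiments."""
    return mse(x_k, x_star) < tol

x_star = sparse_signal(1024, 70)
```

Note that the criterion uses the true signal $$x_{*}$$, so it is available only in synthetic experiments such as this one.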

The initial point $$x^{0}$$ is chosen to be zero. Let $$\alpha =\frac{1}{\|A\|^{2}}$$ and $$\lambda _{k}=0.82$$ in Algorithm 1.1. Let $$\sigma =7$$, $$\delta =0.02$$, $$\theta =0.15$$, and $$\gamma =1.85$$ in Algorithms 1.2 and 3.1. We now present the corresponding numerical results for different numbers of nonzero components m, where Iter denotes the number of iterations and CPU denotes the CPU time. The numerical results are shown in Table 1.

From Table 1 we see that Algorithm 3.1 outperforms Algorithms 1.1 and 1.2 in terms of both CPU time and number of iterations in all cases.

Next, Fig. 1 shows the convergence of each algorithm by plotting the MSE against the number of iterations, and Fig. 2 shows the signal recovery in compressed sensing when $$N=1024$$, $$M=512$$, and $$m=70$$.

Next, we analyze the convergence and the effects of the stepsizes depending on the parameters σ, δ, θ, and γ in Algorithm 3.1.

In the first experiment, we investigate the effect of the parameter γ in the proposed algorithm. We intend to vary this parameter and study its convergence behavior. The numerical results are shown in Table 2.

From Table 2 we see that the CPU time and the number of iterations of Algorithm 3.1 decrease when the parameter γ approaches 2. We show numerical results for each case of γ in Fig. 3.

In the second experiment, we compare the performance of Algorithm 3.1 with different parameters θ in Theorem 3.4. Numerical results are shown in Table 3.

From Table 3 we observe that the CPU time of Algorithm 3.1 increases, but the number of iterations decreases when the parameter θ approaches 1. Figure 4 shows numerical results for each θ.

Next, we compare the performance of Algorithm 3.1 with different parameters σ in Theorem 3.4. Numerical results are reported in Table 4.

From Table 4 we observe that the CPU time increases as σ increases, whereas the number of iterations is unaffected.

Similarly, we obtain numerical results of Algorithm 3.1 with different δ in Table 5.

From Table 5 we see that the parameter δ has no effect in terms of the number of iterations and CPU time for both cases.

Next, we aim to apply our result for solving an image inpainting problem described by the following mathematical model:

$$\min_{x\in \mathbb{R}^{M\times N}}\frac{1}{2} \bigl\Vert A(x-x_{0}) \bigr\Vert _{F}^{2}+\mu \Vert x \Vert _{*},$$
(4.2)

where $$x_{0}\in \mathbb{R}^{M\times N} (M< N)$$ is a matrix whose entries lie in the interval $$[l,u]$$, A is a linear map that selects a subset of the entries of an $$M\times N$$ matrix by setting each unknown entry in the matrix to 0, x is the matrix of known entries $$A(x_{0})$$, and $$\mu >0$$ is a regularization parameter.

In particular, we consider the following image inpainting problem [10, 11]:

$$\min_{x}\frac{1}{2} \bigl\Vert P_{\Omega }(x)-P_{\Omega }(x_{0}) \bigr\Vert _{F}^{2}+ \mu \Vert x \Vert _{*},$$
(4.3)

where $$\|\cdot \|_{F}$$ is the Frobenius norm, and $$\|\cdot \|_{*}$$ is the nuclear norm. Here we define $$P_{\Omega }$$ by

$$P_{\Omega }(x)= \textstyle\begin{cases} x_{ij},& (i,j)\in \Omega , \\ 0, & \mbox{otherwise}. \end{cases}$$
(4.4)
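The sampling operator (4.4) simply keeps the observed entries and zeroes out the rest. A minimal Python sketch follows, assuming the matrix is stored as a list of rows and Ω as a set of zero-based index pairs; both representations and the function name are choices made for this example.

```python
def project_entries(x, omega):
    """Apply P_Omega from (4.4): keep x[i][j] for (i, j) in omega, zero elsewhere."""
    return [[x[i][j] if (i, j) in omega else 0.0 for j in range(len(x[0]))]
            for i in range(len(x))]

# Example: only the diagonal entries of a 2x2 matrix are observed.
observed = project_entries([[1.0, 2.0], [3.0, 4.0]], {(0, 0), (1, 1)})
```

Applying `project_entries` twice gives the same result as applying it once, which reflects that $$P_{\Omega }$$ is a linear projection.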

The nuclear norm, which is a convex relaxation of the low-rank constraint, has been widely used in image inpainting and matrix completion problems. It is obvious that the optimization problem (4.3) is related to (1.5). Indeed, let $$f(x)=\frac{1}{2}\|P_{\Omega }(x)-P_{\Omega }(x_{0})\|_{F}^{2}$$ and $$g(x)=\mu \|x\|_{*}$$. Then $$\nabla f(x)=P_{\Omega }(x)-P_{\Omega }(x_{0})$$ is 1-Lipschitz continuous. The proximity operator of $$g(x)$$ can be computed by the singular value decomposition (SVD) [5].
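For concreteness, the SVD-based computation can be written as follows. This is the standard singular value thresholding formula, stated under the assumption that $$x=U\operatorname{diag}(s_{1},\ldots ,s_{r})V^{\top }$$ is a singular value decomposition of x and that $$\alpha >0$$ is the stepsize; the symbols U, V, and $$s_{i}$$ are introduced here only for illustration:

$$\mathrm{prox}_{\alpha g}(x)=U\operatorname{diag}\bigl(\max \{s_{1}-\alpha \mu ,0\},\ldots ,\max \{s_{r}-\alpha \mu ,0\}\bigr)V^{\top }.$$

In other words, the proximity operator soft-thresholds the singular values by $$\alpha \mu$$, in the same way that the proximity operator of the $$\ell _{1}$$-norm soft-thresholds the entries of a vector.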

To evaluate the quality of the restored images, we use the peak signal-to-noise ratio (PSNR) and the structural similarity index (SSIM) [24] defined by

$$\mathrm{PSNR}=20\log \frac{ \Vert x \Vert _{F}}{ \Vert x-x_{r} \Vert _{F}}$$
(4.5)

and

$$\mathrm{SSIM}= \frac{(2u_{x}u_{x_{r}}+c_{1})(2\sigma _{xx_{r}}+c_{2})}{(u_{x}^{2}+u_{x_{r}}^{2}+c_{1})(\sigma _{x}^{2}+\sigma _{x_{r}}^{2}+c_{2})},$$
(4.6)

where x is the original image, $$x_{r}$$ is the restored image, $$u_{x}$$ and $$u_{x_{r}}$$ are the mean values of the original image x and restored image $$x_{r}$$, respectively, $$\sigma _{x}^{2}$$ and $$\sigma _{x_{r}}^{2}$$ are the variances, $$\sigma _{xx_{r}}$$ is the covariance of the two images, $$c_{1} = (K_{1}L)^{2}$$ and $$c_{2} = (K_{2}L)^{2}$$ with $$K_{1} = 0.01$$ and $$K_{2} = 0.03$$, and L is the dynamic range of pixel values. SSIM ranges from 0 to 1, with 1 meaning perfect recovery. The initial point $$x^{0}$$ is chosen to be zero. Let $$\alpha =\frac{1}{\|A\|^{2}}$$ and $$\lambda _{k}=0.82$$ in Algorithm 1.1. Let $$\sigma =7$$, $$\delta =0.02$$, $$\theta =0.15$$, and $$\gamma =1.85$$ in Algorithms 1.2 and 3.1, respectively. We obtain the following results.

From Table 6 we see that the experiment results of Algorithm 3.6 are better than those of Algorithms 1.1 and 1.2 in terms of PSNR and SSIM in all cases.

The original images are given in Fig. 5. The inpainted images at the 250th and 350th iterations are shown in Figs. 6 and 7. The PSNR values and iterations are plotted in Fig. 8.
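The quality measures (4.5) and (4.6) can be sketched in Python as follows. This is a sketch under assumptions: images are passed as flat lists of pixel values, the logarithm in (4.5) is taken to base 10, the variances and covariance are population statistics over the whole image (rather than local windows), and the default dynamic range of 255 is a hypothetical choice.

```python
import math

def psnr(x, x_r):
    """PSNR as in (4.5): 20*log10(||x||_F / ||x - x_r||_F); base 10 is an assumption."""
    num = math.sqrt(sum(a * a for a in x))
    den = math.sqrt(sum((a - b) * (a - b) for a, b in zip(x, x_r)))
    return math.inf if den == 0.0 else 20.0 * math.log10(num / den)

def ssim(x, x_r, dynamic_range=255.0, k1=0.01, k2=0.03):
    """Global SSIM as in (4.6), computed from whole-image statistics."""
    n = len(x)
    u_x, u_r = sum(x) / n, sum(x_r) / n
    var_x = sum((a - u_x) * (a - u_x) for a in x) / n
    var_r = sum((b - u_r) * (b - u_r) for b in x_r) / n
    cov = sum((a - u_x) * (b - u_r) for a, b in zip(x, x_r)) / n
    c1, c2 = (k1 * dynamic_range) ** 2, (k2 * dynamic_range) ** 2
    return ((2 * u_x * u_r + c1) * (2 * cov + c2)) / (
        (u_x * u_x + u_r * u_r + c1) * (var_x + var_r + c2))
```

Identical images give an SSIM of 1 and an infinite PSNR in this sketch, since the denominator in (4.5) vanishes.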

Next, we analyze the convergence and the effects of the stepsizes depending on the parameters σ, γ, θ, and δ in Algorithm 3.6.

First, we study the effect of the parameter σ in the proposed algorithm. The numerical results are shown in Table 7.

From Table 7 we observe that the PSNR and the SSIM of Algorithm 3.6 increase when the parameter σ increases. Figure 9 shows numerical results for various σ.

Next, we investigate the effect of the parameter γ in the proposed algorithm. We intend to vary this parameter and study its convergence behavior. The numerical results are shown in Table 8.

From Table 8 we observe that the PSNR and the SSIM of Algorithm 3.6 increase when the parameter γ approaches 2. Moreover, we see that CPU time decreases when the parameter γ approaches 2. Figure 10 shows numerical results for various γ.

Next, we study the effect of the parameter θ. The numerical results are shown in Table 9.

From Table 9 we observe that the PSNR, SSIM, and CPU time of Algorithm 3.6 increase when the parameter θ approaches 1. Figure 11 shows numerical results for various θ.

We next study the effect of the parameter δ. The results are shown in Table 10.

From Table 10 we observe that the PSNR and SSIM of Algorithm 3.6 increase when the parameter δ approaches $$1/6$$. Moreover, we see that CPU time increases when the parameter δ approaches 0. Figure 12 shows numerical results for each δ.

## Conclusion

In this work, we proposed new forward–backward algorithms for solving convex minimization problems. We proved weak convergence theorems under weakened assumptions on the stepsize. Our algorithms do not require the Lipschitz constant of the gradient of the functions. Moreover, we proposed a new projected forward–backward splitting algorithm that uses a new linesearch to solve constrained convex minimization problems. As a result, it can be applied effectively to signal recovery and image inpainting. Our algorithms perform well in terms of the number of iterations and CPU time. We have also discussed the effects of all parameters in our algorithms.

## Availability of data and materials

Contact the author for data requests.

## References

1. Bauschke, H.H., Borwein, J.M.: Dykstra's alternating projection algorithm for two sets. J. Approx. Theory 79(3), 418–443 (1994)

2. Bauschke, H.H., Combettes, P.L.: Convex Analysis and Monotone Operator Theory in Hilbert Spaces, vol. 408. Springer, New York (2011)

3. Bello Cruz, J.Y., Nghia, T.T.: On the convergence of the forward–backward splitting method with linesearches. Optim. Methods Softw. 31(6), 1209–1238 (2016)

4. Burachik, R.S., Iusem, A.N.: Set-Valued Mappings and Enlargements of Monotone Operators, vol. 8. Springer, Berlin (2007)

5. Cai, J.F., Candès, E.J., Shen, Z.: A singular value thresholding algorithm for matrix completion. SIAM J. Optim. 20(4), 1956–1982 (2010)

6. Chen, G.H., Rockafellar, R.T.: Convergence rates in forward–backward splitting. SIAM J. Optim. 7(2), 421–444 (1997)

7. Cholamjiak, W., Cholamjiak, P., Suantai, S.: An inertial forward–backward splitting method for solving inclusion problems in Hilbert spaces. J. Fixed Point Theory Appl. 20(1), 42 (2018)

8. Combettes, P.L., Pesquet, J.C.: Proximal splitting methods in signal processing. In: Fixed-Point Algorithms for Inverse Problems in Science and Engineering, pp. 185–212. Springer, New York (2011)

9. Combettes, P.L., Wajs, V.R.: Signal recovery by proximal forward–backward splitting. Multiscale Model. Simul. 4(4), 1168–1200 (2005)

10. Cui, F., Tang, Y., Yang, Y.: An inertial three-operator splitting algorithm with applications to image inpainting. arXiv preprint (2019). arXiv:1904.11684

11. Davis, D., Yin, W.: A three-operator splitting scheme and its optimization applications. Set-Valued Var. Anal. 25(4), 829–858 (2017)

12. Dong, Q., Jiang, D., Cholamjiak, P., Shehu, Y.: A strong convergence result involving an inertial forward–backward algorithm for monotone inclusions. J. Fixed Point Theory Appl. 19(4), 3097–3118 (2017)

13. Hussain, A., Ali, D., Karapinar, E.: Stability data dependency and errors estimation for a general iteration method. Alex. Eng. J. 60(1), 703–710 (2021)

14. Kankam, K., Pholasa, N., Cholamjiak, P.: Hybrid forward–backward algorithms using linesearch rule for minimization problem. Thai J. Math. 17(3), 607–625 (2019)

15. Li, L., Wu, D.: The convergence of Ishikawa iteration for generalized Φ-contractive mappings. Results Nonlinear Anal. 4(1), 47–56 (2021)

16. Marino, G., Scardamaglia, B., Karapinar, E.: Strong convergence theorem for strict pseudo-contractions in Hilbert spaces. J. Inequal. Appl. 2016(1), 134 (2016)

17. Noor, M.A.: New approximation schemes for general variational inequalities. J. Math. Anal. Appl. 251(1), 217–229 (2000)

18. Noor, M.A.: Some developments in general variational inequalities. Appl. Math. Comput. 152(1), 199–277 (2004)

19. Noor, M.A., Noor, K.I., Rassias, M.T.: New trends in general variational inequalities. Acta Appl. Math. 170(1), 981–1064 (2020)

20. Opial, Z.: Weak convergence of the sequence of successive approximations for nonexpansive mappings. Bull. Am. Math. Soc. 73(4), 591–597 (1967)

21. Rus, I.A.: Some problems in the fixed point theory. Adv. Theory Nonlinear Anal. Appl. 2, 1–10 (2018)

22. Suantai, S., Kesornprom, S., Pholasa, N., Cho, Y.J., Cholamjiak, P.: A relaxed projection method using a new linesearch for the split feasibility problem. AIMS Math. 6(3), 2690–2703 (2021)

23. Tseng, P.: A modified forward–backward splitting method for maximal monotone mappings. SIAM J. Control Optim. 38(2), 431–446 (2000)

24. Wang, Z., Bovik, A.C., Sheikh, H.R., Simoncelli, E.P.: Image quality assessment: from error visibility to structural similarity. IEEE Trans. Image Process. 13(4), 600–612 (2004)

## Acknowledgements

The authors would like to thank the editor and reviewers for their valuable comments. The authors thank Chiang Mai University and University of Phayao for their support.

## Funding

This work was supported by Thailand Science Research and Innovation under the project IRN62W0007, the revenue budget in 2021, School of Science, University of Phayao, and Thailand Research Fund under project RSA6180084.

## Author information

### Contributions

All the authors contributed equally to this manuscript. All authors read and approved the final manuscript.

### Corresponding author

Correspondence to Prasit Cholamjiak.

## Ethics declarations

### Competing interests

The authors declare that they have no competing interests.

## Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.