Novel forward–backward algorithms for optimization and applications to compressive sensing and image inpainting

The forward–backward algorithm is a splitting method for solving convex minimization problems of the sum of two objective functions. It has attracted great attention in optimization due to its broad applications to many disciplines, such as image and signal processing, optimal control, regression, and classification problems. In this work, we introduce new forward–backward algorithms for solving both unconstrained and constrained convex minimization problems by using a linesearch technique. We discuss the convergence under mild conditions that do not require Lipschitz continuity of the gradient. Finally, we provide some applications to solving compressive sensing and image inpainting problems. Numerical results show that the proposed algorithm is more efficient than some algorithms in the literature. We also discuss the optimal choice of parameters in the algorithms via numerical experiments.


Introduction
In a real Hilbert space H, the unconstrained minimization problem of the sum of two convex functions is modeled in the following form:

min_{x∈H} f(x) + g(x),   (1.1)

where f, g : H → R ∪ {+∞} are proper lower semicontinuous convex functions. It is well known that (1.1) is equivalent to the problem of finding a zero of the subdifferential of f + g at x. This problem is called the variational inclusion problem; see [19]. We denote by argmin(f + g) the solution set of (1.1). If f is differentiable on H, then (1.1) can be described by the fixed point equation

x = prox_{αg}(x − α∇f(x)),   (1.2)

where α > 0, and prox_g is the proximal operator of g defined by prox_g = (Id + ∂g)^{−1}, where Id denotes the identity operator in H and ∂g is the subdifferential of g. In this connection, we can define a simple splitting method

x_{k+1} = prox_{α_k g}(x_k − α_k ∇f(x_k)),  k ≥ 0,   (1.3)

where α_k is a suitable stepsize. This method is often called the forward-backward algorithm. Due to its simplicity and efficiency, there have been many modifications of (1.3) in the literature; see, for example, [1, 6–8, 12, 14]. The relaxed version of (1.3) was proposed by Combettes and Wajs [9] as follows.

Algorithm 1.1 Given α > 0 and relaxation parameters λ_k ∈ (0, 1], let x_0 ∈ H and, for k ≥ 0, calculate

x_{k+1} = x_k + λ_k (prox_{αg}(x_k − α∇f(x_k)) − x_k).
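To make the forward-backward iteration (1.3) concrete, the following short Python/NumPy sketch implements it for user-supplied grad_f and prox_g; the function and argument names are our own illustrative choices, not notation from the paper.

import numpy as np

def forward_backward(x0, grad_f, prox_g, alpha, n_iter=100):
    # Forward-backward (proximal gradient) iteration (1.3):
    #   x_{k+1} = prox_{alpha g}(x_k - alpha * grad_f(x_k)).
    # grad_f : callable x -> gradient of f at x
    # prox_g : callable (x, alpha) -> prox_{alpha g}(x)
    # alpha  : fixed stepsize (e.g. 1/L when grad_f is L-Lipschitz)
    x = np.asarray(x0, dtype=float).copy()
    for _ in range(n_iter):
        x = prox_g(x - alpha * grad_f(x), alpha)
    return x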
Based on the fixed point concept, many optimization and fixed point algorithms have been proposed for solving such problems; see [13, 15–17, 21]. In 2016, Cruz and Nghia [3] introduced a new forward-backward method using a linesearch technique. This method does not require the Lipschitz constant in computation.

Algorithm 1.2 Given σ > 0, θ ∈ (0, 1), and δ ∈ (0, 1/2), let x_0 ∈ dom g and, for k ≥ 0, calculate

x_{k+1} = prox_{α_k g}(x_k − α_k ∇f(x_k)),

where α_k = σθ^{m_k} with m_k the smallest nonnegative integer satisfying the following linesearch rule:

α_k ‖∇f(x_{k+1}) − ∇f(x_k)‖ ≤ δ ‖x_{k+1} − x_k‖.

It was proved that the sequence (x_k) converges weakly to a minimizer of f + g under suitable conditions.
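A minimal sketch of one step of a backtracking linesearch of this type, assuming the acceptance rule α‖∇f(y) − ∇f(x)‖ ≤ δ‖y − x‖ on the trial point y = prox_{αg}(x − α∇f(x)); this is our reading of Algorithm 1.2, and the helper names are illustrative.

import numpy as np

def fb_linesearch_step(x, grad_f, prox_g, sigma=1.0, theta=0.5, delta=0.4):
    # One forward-backward step with a backtracking linesearch of
    # Cruz-Nghia type (sketch). Starting from alpha = sigma, the stepsize
    # is shrunk by theta until
    #   alpha * ||grad_f(y) - grad_f(x)|| <= delta * ||y - x||
    # holds for the trial point y = prox_{alpha g}(x - alpha * grad_f(x)).
    alpha = sigma
    gx = grad_f(x)
    while True:
        y = prox_g(x - alpha * gx, alpha)
        if alpha * np.linalg.norm(grad_f(y) - gx) <= delta * np.linalg.norm(y - x):
            return y, alpha        # accepted trial point and stepsize
        alpha *= theta             # otherwise shrink the stepsize and retry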
In practical applications, many real-world problems such as image inpainting can be modeled as a subproblem of this type. To investigate them, we suggest a projected forward-backward algorithm for solving the constrained convex minimization problem modeled as follows:

min_{x∈Ω} f(x) + g(x),   (1.5)

where Ω is a nonempty closed convex subset of H, f and g are convex functions on H, and f is differentiable on H.
In variational theory, Tseng [23] introduced a modified forward-backward splitting algorithm for finding zeros of the sum of two monotone operators A and B. Let X ⊂ dom A be a closed convex set.

Algorithm 1.3 Given x_0 ∈ dom A and α_k ∈ (0, +∞), calculate

y_k = (Id + α_k B)^{−1}(x_k − α_k A x_k),
x_{k+1} = P_X(y_k − α_k (A y_k − A x_k)),

where A is L-Lipschitz continuous on X ∪ dom B, and α_k ∈ (0, 1/L). It was proved that (x_k) converges weakly to a zero of A + B contained in X.
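In the convex setting A = ∇f and B = ∂g, one iteration of Tseng's scheme can be sketched as follows; proj_X stands for the metric projection onto X and, like the other names, is an assumption of this illustration.

def tseng_step(x, grad_f, prox_g, proj_X, alpha):
    # One step of Tseng's modified forward-backward splitting (sketch) with
    # A = grad f and B = the subdifferential of g:
    #   y      = prox_{alpha g}(x - alpha * grad_f(x))        (forward-backward step)
    #   x_next = proj_X(y - alpha * (grad_f(y) - grad_f(x)))  (correction + projection)
    # alpha should satisfy 0 < alpha < 1/L, where L is the Lipschitz constant of grad f.
    gx = grad_f(x)
    y = prox_g(x - alpha * gx, alpha)
    return proj_X(y - alpha * (grad_f(y) - gx))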
Most works related to the minimization of the sum of two convex functions assume the Lipschitz condition on the gradient of f. This restriction can be relaxed by using a linesearch technique. We therefore suggest new forward-backward algorithms for solving the unconstrained and constrained convex minimization problems, based on a new linesearch technique [14], and we prove weak convergence theorems for the proposed algorithms. As applications, we apply our main results to compressed sensing and image inpainting problems and compare the performance of our algorithms with Algorithms 1.1 and 1.2. Moreover, we discuss the numerical results of a comparative analysis to identify the optimal choice of parameters.
The content is organized as follows. In Sect. 2, we recall some useful concepts. In Sect. 3, we establish the main theorems for our algorithms. In Sect. 4, we give numerical experiments supporting the convergence of our algorithms. Finally, in Sect. 5, we conclude the paper.

Preliminaries
In this section, we give some definitions and lemmas that play an essential role in our analysis. Let H be a real Hilbert space equipped with inner product ⟨·, ·⟩ and norm ‖·‖. Let h : H → R ∪ {+∞} be a proper lower semicontinuous convex function. We use the following notations:
• x_k ⇀ x denotes the weak convergence of (x_k) to x.
• ω_w(x_k) = {x ∈ H : x_{k_n} ⇀ x for some subsequence (x_{k_n}) of (x_k)} denotes the set of all weak limit points of (x_k).
• F(T) = {x ∈ C : x = Tx} denotes the set of fixed points of T : C → C.
We recall the following definitions:
(1) A mapping T : H → H is said to be nonexpansive if ‖Tx − Ty‖ ≤ ‖x − y‖ for all x, y ∈ H.
(2) The subdifferential of h at x is defined by ∂h(x) = {u ∈ H : h(y) ≥ h(x) + ⟨u, y − x⟩ for all y ∈ H}.
(3) The proximal operator prox_g : H → H of g is defined by prox_g(x) = argmin_{y∈H} {g(y) + (1/2)‖y − x‖²}.
We know that the proximal operator is single-valued with full domain. Moreover, from [3] we have, for all x ∈ H and α > 0,
(x − prox_{αg}(x))/α ∈ ∂g(prox_{αg}(x)).
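As a concrete example of a proximal operator (our illustration, not taken from the paper), the prox of g(x) = λ‖x‖₁ is the componentwise soft-thresholding map, which also appears later in the compressive sensing experiments.

import numpy as np

def prox_l1(x, lam):
    # Proximal operator of g(x) = lam * ||x||_1 (soft thresholding):
    #   prox(x)_i = sign(x_i) * max(|x_i| - lam, 0)
    return np.sign(x) * np.maximum(np.abs(x) - lam, 0.0)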

Lemma 2.2 ([4])
The subdifferential operator ∂h of a proper lower semicontinuous convex function h is maximal monotone. Moreover, the graph of ∂h, Gph(∂h) = {(x, u) ∈ H × H : u ∈ ∂h(x)}, is demiclosed, that is, if (x_k, u_k) ∈ Gph(∂h) with x_k ⇀ x and u_k → u, then (x, u) ∈ Gph(∂h).

Lemma 2.3 Let Ω be a nonempty closed convex subset of H, and let P_Ω denote the metric projection onto Ω. If (z_k) is a sequence in H such that z_k ⇀ z and P_Ω(z_k) − z_k → 0, then z ∈ Ω.

Lemma 2.4 Let S be a nonempty subset of H, and let (x_k) be a sequence in H such that lim_{k→∞} ‖x_k − x*‖ exists for every x* ∈ S and every weak limit point of (x_k) belongs to S. Then (x_k) weakly converges to an element of S.

Main results
In this section, we assume that the set S* of all solutions of problem (1.1) is nonempty. We propose new algorithms based on a new linesearch technique and prove weak convergence theorems. We assume that (1) f, g : H → R ∪ {+∞} are proper lower semicontinuous convex functions, and f is differentiable on H; and (2) the gradient ∇f is uniformly continuous and bounded on bounded subsets of H. Note that the latter condition holds if ∇f is Lipschitz continuous on H.

Algorithm 3.1 Given σ > 0, θ ∈ (0, 1), γ ∈ (0, 2), and δ ∈ (0, 1/6), let x_0 ∈ H.
Step 1. Calculate
y_k = prox_{α_k g}(x_k − α_k ∇f(x_k)),
where α_k = σθ^{m_k} with m_k the smallest nonnegative integer satisfying the linesearch rule (3.1).
Step 2. Calculate z_k, d_k, and η_k, and compute
x_{k+1} = x_k − γ η_k d_k.
Set k := k + 1, and go to Step 1.
Theorem 3.4 Let (x_k) and (α_k) be generated by Algorithm 3.1, and assume that there is α > 0 such that α_k ≥ α for all k ∈ N. Then (x_k) weakly converges to an element of S*.
Proof Let x* be a solution in S*. We first derive the basic estimate (3.5) for y_k and, similarly to y_k, the corresponding estimate (3.7) for z_k. Combining (3.5) and (3.7) and using the convexity of f together with (3.8) and (3.9), we obtain (3.10). Using the definition of d_k and Linesearch (3.1), we have (3.11), and from (3.10) and (3.11) we get (3.14). Therefore, from (3.2), the above estimates, and the monotonicity of ∇f, we get

‖x_k − y_k‖² + 2⟨x_k − y_k, z_k − y_k⟩ + ‖z_k − y_k‖² ≤ ‖x_k − y_k‖² + 2‖x_k − y_k‖ ‖y_k − z_k‖ + ‖y_k − z_k‖² + 8α_k²(‖x_k − y_k‖² + ‖z_k − y_k‖²) ≤ 2(‖x_k − y_k‖² + ‖y_k − z_k‖²) + 8α_k²(‖x_k − y_k‖² + ‖z_k − y_k‖²) = (2 + 8δ²)(‖x_k − y_k‖² + ‖y_k − z_k‖²)   (3.16)

and, equivalently, (3.17). Therefore we obtain (3.18); on the other hand, we have (3.19). From (3.18) and (3.19) we get (3.20). Since x_{k+1} = x_k − γη_k d_k, it follows that γη_k d_k = x_k − x_{k+1}. This implies that lim_{k→∞} ‖x_k − x*‖ exists, and (x_k) is bounded. Note that by (3.20) we get ‖x_k − z_k‖ → 0 and ‖x_{k+1} − x_k‖ → 0 as k → ∞. It follows that ‖x_k − y_k‖ → 0 and ‖y_k − z_k‖ → 0 as k → ∞. By the boundedness of (x_k), the set of its weak limit points is nonempty. Let x_∞ ∈ ω_w(x_k). Then there is a subsequence (x_{k_n}) of (x_k) such that x_{k_n} ⇀ x_∞. Next, we show that x_∞ ∈ S*. Let (v, u) ∈ Gph(∇f + ∂g), that is, u − ∇f(v) ∈ ∂g(v). Since y_{k_n} = (Id + α_{k_n}∂g)^{−1}(Id − α_{k_n}∇f)x_{k_n}, we have (Id − α_{k_n}∇f)x_{k_n} ∈ (Id + α_{k_n}∂g)y_{k_n}, which gives

(1/α_{k_n})(x_{k_n} − y_{k_n} − α_{k_n}∇f(x_{k_n})) ∈ ∂g(y_{k_n}).
Since ∂g is maximal monotone, it follows that

⟨v − y_{k_n}, u − ∇f(v) − (1/α_{k_n})(x_{k_n} − y_{k_n} − α_{k_n}∇f(x_{k_n}))⟩ ≥ 0.

This shows inequality (3.21).
Since lim_{k→∞} ‖x_k − y_k‖ = 0, by the assumption on ∇f we have lim_{k→∞} ‖∇f(x_k) − ∇f(y_k)‖ = 0. Taking the limit as n → ∞ in (3.21), we obtain ⟨v − x_∞, u⟩ ≥ 0.
Thus 0 ∈ (∇f + ∂g)(x_∞), and consequently x_∞ ∈ S*. By Lemma 2.4 we conclude that (x_k) converges weakly to an element of S*. This completes the proof.
Algorithm 3.6 (projected forward-backward algorithm) is obtained from Algorithm 3.1 by incorporating the metric projection P_Ω onto the constraint set Ω of problem (1.5) into each iteration; at the end of every iteration, set k := k + 1, and go to Step 1.

Theorem 3.7
Let (x_k) and (α_k) be generated by Algorithm 3.6, and assume that there is α > 0 such that α_k ≥ α for all k ∈ N. Then (x_k) weakly converges to an element of Ω ∩ argmin(f + g).
Proof Let w* be a solution in Ω ∩ argmin(f + g). Then, using Lemma 2.1(ii), we obtain (3.23). Similarly to Theorem 3.4, we can show (3.24).

From (3.23) and (3.24) we obtain (3.25).
Thus lim_{k→∞} ‖w_k − w*‖ exists, and (w_k) is bounded. From (3.25) we see that ‖w_k − y_k‖ → 0, ‖w_k − z_k‖ → 0, and ‖P_Ω(z_k) − z_k‖ → 0 as k → ∞. Also, we can show that ‖w_k − x_k‖ → 0 and ‖y_k − x_k‖ → 0 as k → ∞. Let w_∞ ∈ ω_w(w_k). As in Theorem 3.4, we can show that w_∞ ∈ argmin(f + g). On the other hand, since lim_{k→∞} ‖P_Ω(z_k) − z_k‖ = 0 and z_k ⇀ w_∞, by Lemma 2.3 we have w_∞ ∈ Ω. Therefore w_∞ ∈ Ω ∩ argmin(f + g). Using Lemma 2.4, we conclude that Theorem 3.7 holds.

Numerical experiments
In this section, we apply our results to signal recovery in compressive sensing and to image inpainting. We compare the performance of our algorithms with the algorithms of Combettes and Wajs [9] and Cruz and Nghia [3].
The numerical experiments are performed in MATLAB R2020b on a MacBook Pro with an Apple M1 chip and 8 GB of RAM.
We consider the following LASSO problem:

min_{x∈R^N} (1/2)‖y − Ax‖₂² + λ‖x‖₁,   (4.1)

where A : R^N → R^M (M < N) is a bounded linear operator, y ∈ R^M is the observed data, and λ > 0 is a regularization parameter. Rewriting (4.1) in the form of problem (1.1), we set f(x) = (1/2)‖y − Ax‖₂² and g(x) = λ‖x‖₁. In the experiments, y is generated by adding Gaussian noise with SNR = 40, A is generated from the normal distribution with mean zero and variance one, and the original signal x ∈ R^N is generated from the uniform distribution on [−2, 2] and contains m nonzero components. The stopping criterion is defined in terms of the mean squared error (MSE) of the estimated signal x_k. The initial point x_0 is chosen to be zero. We set α = 1/‖A‖² and λ_k = 0.82 in Algorithm 1.1, and σ = 7, δ = 0.02, and θ = 0.15 in Algorithms 1.2 and 3.1, with γ = 1.85 in Algorithm 3.1. We now present the corresponding numerical results (the number of iterations is denoted by Iter, and CPU denotes the CPU time) for different numbers m of nonzero components. The numerical results are shown in Table 1.
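The following Python/NumPy sketch shows how the LASSO ingredients f, ∇f, prox_g, and the stepsize 1/‖A‖² can be assembled; the problem sizes, noise level, and λ below are illustrative assumptions rather than the exact experimental configuration.

import numpy as np

rng = np.random.default_rng(0)
M, N, m = 512, 1024, 70                        # illustrative problem sizes
A = rng.standard_normal((M, N))                # sensing matrix with N(0, 1) entries
x_true = np.zeros(N)
support = rng.choice(N, size=m, replace=False)
x_true[support] = rng.uniform(-2.0, 2.0, size=m)    # m nonzero components in [-2, 2]
y = A @ x_true + 0.01 * rng.standard_normal(M)      # noisy observation (assumed noise level)

lam = 0.1                                      # regularization parameter (assumed)
grad_f = lambda x: A.T @ (A @ x - y)           # gradient of f(x) = 0.5 * ||Ax - y||_2^2
prox_g = lambda x, a: np.sign(x) * np.maximum(np.abs(x) - a * lam, 0.0)  # prox of a*lam*||.||_1
alpha = 1.0 / np.linalg.norm(A, 2) ** 2        # stepsize 1/||A||^2 as used for Algorithm 1.1

# These pieces can be plugged into any of the iterations sketched earlier,
# e.g. the plain forward-backward loop from the introduction:
# x_hat = forward_backward(np.zeros(N), grad_f, prox_g, alpha, n_iter=500)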
From Table 1 we see that Algorithm 3.1 outperforms Algorithms 1.1 and 1.2 in terms of CPU time and number of iterations in all cases.
Next, Fig. 1 shows the convergence of each algorithm via the MSE value versus the number of iterations, and Fig. 2 shows the signal recovery in compressed sensing when N = 1024, M = 512, and m = 70.
Next, we analyze the convergence and the effects of the stepsizes depending on the parameters σ, δ, θ, and γ in Algorithm 3.1. In the first experiment, we investigate the effect of the parameter γ in the proposed algorithm. We intend to vary this parameter and study its convergence behavior. The numerical results are shown in Table 2.
From Table 2 we see that the CPU time and the number of iterations of Algorithm 3.1 decrease when the parameter γ approaches 2. We show numerical results for each case of γ in Fig. 3.
In the second experiment, we compare the performance of Algorithm 3.1 with different parameters θ in Theorem 3.4. Numerical results are shown in Table 3.
From Table 3 we observe that the CPU time of Algorithm 3.1 increases, but the number of iterations decreases when the parameter θ approaches 1. Figure 4 shows numerical results for each θ .
Next, we compare the performance of Algorithm 3.1 with different parameters σ in Theorem 3.4. Numerical results are reported in Table 4.
From Table 4 we observe that the CPU time increases when σ increases. However, there is no effect in terms of the number of iterations.
Similarly, we obtain numerical results of Algorithm 3.1 with different δ in Table 5. From Table 5 we see that the parameter δ has no effect in terms of the number of iterations and CPU time in both cases.
Next, we aim to apply our results to an image inpainting problem described by the following mathematical model: the goal is to recover a matrix x_0 ∈ R^{M×N} (M < N), whose entries lie in the interval [l, u], from its known entries A(x_0), where A is a linear map that selects a subset of the entries of an M × N matrix by setting each unknown entry to 0. The regularization parameter is denoted by μ > 0.
In particular, we consider the following image inpainting problem [10, 11]:

min_x (1/2)‖P(x) − P(x_0)‖²_F + μ‖x‖_*,   (4.3)

where ‖·‖_F is the Frobenius norm and ‖·‖_* is the nuclear norm. Here we define P as the linear map that keeps the known entries of a matrix and sets the unknown entries to 0. The nuclear norm, a convex relaxation of the low-rank constraint, has been widely used in image inpainting and matrix completion problems. It is obvious that the optimization problem (4.3) is related to (1.5). Indeed, let f(x) = (1/2)‖P(x) − P(x_0)‖²_F and g(x) = μ‖x‖_*. Then ∇f(x) = P(x) − P(x_0) is 1-Lipschitz continuous. The proximal operator of g can be computed by the singular value decomposition (SVD) [5].
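For reference, a sketch of the proximal operator of g(x) = μ‖x‖_* computed via the SVD (singular value thresholding); the function name is our own.

import numpy as np

def prox_nuclear(X, mu):
    # Proximal operator of mu * ||X||_* : soft-threshold the singular values.
    U, s, Vt = np.linalg.svd(X, full_matrices=False)
    s_shrunk = np.maximum(s - mu, 0.0)
    return (U * s_shrunk) @ Vt    # reassemble with the thresholded singular values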
To evaluate the quality of the restored images, we use the peak signal-to-noise ratio (PSNR) and the structural similarity index (SSIM) [24] defined by

PSNR = 20 log₁₀(‖x‖_F / ‖x − x_r‖_F)   (4.5)

and

SSIM = ((2u_x u_{x_r} + c_1)(2σ_{x x_r} + c_2)) / ((u_x² + u_{x_r}² + c_1)(σ_x² + σ_{x_r}² + c_2)),   (4.6)

where x is the original image, x_r is the restored image, u_x and u_{x_r} are the mean values of the original image x and the restored image x_r, respectively, σ_x² and σ_{x_r}² are the corresponding variances, σ_{x x_r} is the covariance of the two images, c_1 = (K_1 L)² and c_2 = (K_2 L)² with K_1 = 0.01 and K_2 = 0.03, and L is the dynamic range of the pixel values. SSIM ranges from 0 to 1, with 1 meaning perfect recovery. The initial point is chosen to be the zero matrix. We set α = 1/‖A‖² and λ_k = 0.82 in Algorithm 1.1, and σ = 7, δ = 0.02, and θ = 0.15 in Algorithms 1.2 and 3.6, with γ = 1.85 in Algorithm 3.6. We obtain the following results.
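A simple sketch of the quality measures: psnr follows formula (4.5) directly, while ssim_global evaluates (4.6) once over the whole image (practical SSIM implementations average it over local windows); L = 255 is an assumed dynamic range for 8-bit images, and both function names are our own.

import numpy as np

def psnr(x, x_r):
    # PSNR as in (4.5): 20 * log10( ||x||_F / ||x - x_r||_F )
    return 20.0 * np.log10(np.linalg.norm(x, 'fro') / np.linalg.norm(x - x_r, 'fro'))

def ssim_global(x, x_r, L=255.0, K1=0.01, K2=0.03):
    # SSIM as in (4.6), evaluated globally, with c1 = (K1*L)^2 and c2 = (K2*L)^2.
    c1, c2 = (K1 * L) ** 2, (K2 * L) ** 2
    mu_x, mu_r = x.mean(), x_r.mean()
    var_x, var_r = x.var(), x_r.var()
    cov = ((x - mu_x) * (x_r - mu_r)).mean()
    return ((2 * mu_x * mu_r + c1) * (2 * cov + c2)) / \
           ((mu_x ** 2 + mu_r ** 2 + c1) * (var_x + var_r + c2))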
From Table 6 we see that the results of Algorithm 3.6 are better than those of Algorithms 1.1 and 1.2 in terms of PSNR and SSIM in all cases.
The original images are given in Fig. 5. The inpainted images at the 250th and 350th iterations are shown in Figs. 6 and 7. The PSNR values versus iterations are plotted in Fig. 8.
Next, we analyze the convergence and the effects of the stepsizes depending on the parameters σ, γ, θ, and δ in Algorithm 3.6. First, we study the effect of the parameter σ in the proposed algorithm. The numerical results are shown in Table 7.
From Table 7 we observe that the PSNR and the SSIM of Algorithm 3.6 increase when the parameter σ increases. Figure 9 shows numerical results for various σ. Next, we investigate the effect of the parameter γ in the proposed algorithm. We intend to vary this parameter and study its convergence behavior. The numerical results are shown in Table 8.
From Table 8 we observe that the PSNR and the SSIM of Algorithm 3.6 increase when the parameter γ approaches 2. Moreover, we see that CPU time decreases when the parameter γ approaches 2. Figure 10 shows numerical results for various γ .
Next, we study the effect of the parameter θ . The numerical results are shown in Table 9.
From Table 9 we observe that the PSNR, SSIM, and CPU time of Algorithm 3.6 increase when the parameter θ approaches 1. Figure 11 shows numerical results for various θ.
We next study the effect of the parameter δ. The results are shown in Table 10.
From Table 10 we observe that the PSNR and SSIM of Algorithm 3.6 increase when the parameter δ approaches 1/6. Moreover, we see that CPU time increases when the parameter δ approaches 0. Figure 12 shows numerical results for each δ.

Conclusion
In this work, we proposed new forward-backward algorithms for solving convex minimization problems. We proved weak convergence theorems under weakened assumptions on the stepsize. Our algorithms do not require the Lipschitz constant of the gradient. Moreover, we proposed a new projected forward-backward splitting algorithm using a new linesearch to solve the constrained convex minimization problem. As a result, it can be applied effectively to signal recovery and image inpainting. Our algorithms show good performance in terms of the number of iterations and CPU time. We also discussed the effects of all parameters in our algorithms.