Approximated least-squares solutions of a generalized Sylvester-transpose matrix equation via gradient-descent iterative algorithm
Advances in Difference Equations volume 2021, Article number: 266 (2021)
Abstract
This paper proposes an effective gradient-descent iterative algorithm for solving a generalized Sylvester-transpose equation with rectangular matrix coefficients. The algorithm is applicable for the equation and its interesting special cases when the associated matrix has full column-rank. The main idea of the algorithm is to have a minimum error at each iteration. The algorithm produces a sequence of approximated solutions converging to either the unique solution, or the unique least-squares solution when the problem has no solution. The convergence analysis points out that the algorithm converges fast for a small condition number of the associated matrix. Numerical examples demonstrate the efficiency and effectiveness of the algorithm compared to renowned and recent iterative methods.
Introduction
In differential equations and control engineering, much attention has been paid to the following linear matrix equations:
These equations are special cases of a generalized Sylvester-transpose matrix equation:
\[ \sum_{t=1}^{p} A_{t}XB_{t}+\sum_{s=1}^{q} C_{s}X^{T}D_{s} = E, \tag{8} \]
where, for each \(t=1,\dots ,p\), \(A_{t}\in \mathbb{R}^{l\times m}\), \(B_{t}\in \mathbb{R}^{n\times r}\), for each \(s=1,\dots ,q\), \(C_{s}\in \mathbb{R}^{l\times n}\), \(D_{s}\in \mathbb{R}^{m\times r}\), \(E\in \mathbb{R}^{l\times r}\) are known matrices whereas \(X\in \mathbb{R}^{m\times n}\) is the matrix to be determined. These equations play important roles in control and system theory, robust simulation, neural networks, and statistics; see e.g. [1–4].
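For orientation, the block below restates the three special cases whose form can be read off from the residuals \(R_{k}\) used in the numerical section, together with one choice of p, q and coefficients in the generalized equation that yields each of them. The coefficient assignments are an illustrative reading, not taken verbatim from the source; I denotes an identity matrix of compatible size.

```latex
\begin{align*}
% Lyapunov equation (2): p = 2, q = 0, (A_1,B_1) = (A,I), (A_2,B_2) = (I,A^T), right-hand side B
AX + XA^{T} &= B, \\
% Sylvester equation (3): p = 2, q = 0, (A_1,B_1) = (A,I), (A_2,B_2) = (I,B), right-hand side C
AX + XB &= C, \\
% Sylvester-transpose equation (5): p = q = 1, (A_1,B_1) = (A,B), (C_1,D_1) = (C,D)
AXB + CX^{T}D &= E.
\end{align*}
```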
A traditional method of finding their exact solutions is to use the Kronecker product and the vectorization to reduce the matrix equation to a linear system; see e.g. [5, Ch. 4]. However, the dimension of the linear system can be very large due to the Kronecker multiplication, so that inverting the associated matrix requires excessive computer memory. For that reason, iterative approaches have received much attention. The conjugate gradient (CG) idea leads to finite-step iterative procedures that obtain the exact solution at the final step. There are variants of the CG method for solving linear matrix equations, namely, the generalized conjugate direction method (GCD) [6], the conjugate gradient least-squares method (CGLS) [7], and generalized product-type methods based on a bi-conjugate gradient (GPBi) [8]. Another interesting idea to create an iterative method is to use Hermitian and skew-Hermitian splitting (HSS); see e.g. [9].
A group of methods, called gradient-based iterative methods, aim to construct a sequence of approximated solutions that converges to the exact solution for any given initial matrices. These methods are derived from the minimization of associated norm-error functions using gradients, and the hierarchical identification principle. Such techniques have stimulated and played a role in many pieces of research over the past few decades. In 2005, Ding and Chen [10] proposed a gradient-based iterative (GI) method for solving Eqs. (3), (4), and (6). Ding et al. [11] proposed the GI and the least-squares iterative (LSI) methods for solving \(\sum_{j=1}^{p}A_{j}XB_{j}=F\), which includes Eqs. (1) and (4). Niu et al. [12] developed a relaxed gradient-based iterative (RGI) method for solving Eq. (3) by introducing a weighted factor. The MGI method, developed by Wang et al. [13], is a half-step-update modification of the GI method. Tian et al. [14] presented two methods for solving Eq. (3). The first method is based on the GI method and is called the Jacobi gradient iterative (JGI) method. Furthermore, they introduced relaxation factors to accelerate the speed of convergence, obtaining the accelerated Jacobi gradient iterative (AJGI) method. Recently, Sun et al. [15] proposed two modified least-squares iterative algorithms, namely, LSIA1 [15, Theorem 2.3] and LSIA2 [15, Theorem 3.1], for the Lyapunov equation (2). See more algorithms in [16–24]. The developed iterative methods can be applied to state-space models [25], controlled autoregressive systems [26], and parameter estimation in signal processing [27].
Let us focus on gradient-based iterative methods for solving Eqs. (5) and (8). A recent gradient iterative method for Eq. (5) is the AGBI method, developed in [28]. The following two methods were proposed to produce the sequence \(X(k)\) of approximated solutions converging to the exact solution \(X^{*}\) of Eq. (8).
Method 1.1 ([29]) Gradient iterative (GI) method.
A conservative choice of the convergence factor μ is
Method 1.2 ([29]) Least-squares iterative (LSI) method.
In this work, we introduce a new iterative algorithm based on gradient descent for solving Eq. (8). The techniques of gradient and steepest descent let us obtain the search direction and the step sizes. Indeed, our varied step sizes are the optimal convergence factors, which guarantee that the algorithm has a minimum error at each iteration. Our convergence analysis proves that, when Eq. (8) has a unique solution, the algorithm constructs a sequence of approximated solutions converging to the exact solution. On the other hand, when Eq. (8) has no solution, the generated sequence converges to the unique least-squares solution. We provide the convergence rate to show that the speed of convergence depends on the condition number of a certain associated matrix. In addition, we give an error analysis that estimates the error of the current iteration in terms of the preceding and the initial iterations. Finally, we provide numerical simulations to illustrate the efficiency and effectiveness of our algorithm. The illustrative examples show that our algorithm is applicable to both Eq. (8) and certain interesting special cases of it.
The organization of this paper is as follows. In Sect. 2, we recall the criterion for the matrix equation (8) to have a unique solution or a unique least-squares solution, via the Kronecker linearization. We propose the gradient-descent algorithm to solve Eq. (8) in Sect. 3. The proofs of convergence criteria, convergence rates, and error estimation for the proposed algorithm are provided in Sect. 4. In Sect. 5, we compare the efficiency of our proposed algorithm with well-known and recent iterative algorithms.
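In the remainder of this paper, all vectors and matrices are real. Denote the set of n-dimensional column vectors by \(\mathbb{R}^{n}\) and the set of \(m \times n\) matrices by \(\mathbb{R}^{m \times n}\). The \((i,j)\)th entry of a matrix A is denoted by \(A(i,j)\) or \(a_{ij}\). To perform a convergence analysis, we use the Frobenius norm, the spectral norm, and the (spectral) condition number of \(A\in \mathbb{R}^{m \times n}\), which are, respectively, defined by
\[ \Vert A\Vert _{F} = \sqrt{\operatorname{tr}\bigl(A^{T}A\bigr)}, \qquad \Vert A\Vert _{2} = \sqrt{\lambda _{\max }\bigl(A^{T}A\bigr)}, \qquad \kappa (A) = \sqrt{\frac{\lambda _{\max }(A^{T}A)}{\lambda _{\min }(A^{T}A)}}. \]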
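As a quick numerical illustration of these three quantities, a minimal NumPy sketch follows. It assumes the usual definitions (Frobenius norm \(\sqrt{\operatorname{tr}(A^{T}A)}\), spectral norm \(\sqrt{\lambda_{\max}(A^{T}A)}\), and condition number \(\sqrt{\lambda_{\max}(A^{T}A)/\lambda_{\min}(A^{T}A)}\)); the matrix A is an arbitrary example, not taken from the paper.

```python
import numpy as np

# Illustrative 3 x 2 matrix of full column rank (an assumption for the demo).
A = np.array([[3.0, 0.0],
              [0.0, 1.0],
              [0.0, 0.0]])

gram = A.T @ A                        # A^T A, symmetric positive definite here
eigs = np.linalg.eigvalsh(gram)       # eigenvalues in ascending order

fro = np.sqrt(np.trace(gram))         # Frobenius norm
spec = np.sqrt(eigs[-1])              # spectral norm
kappa = np.sqrt(eigs[-1] / eigs[0])   # spectral condition number

print(fro, spec, kappa)
```

The values can be cross-checked against `np.linalg.norm(A, "fro")`, `np.linalg.norm(A, 2)`, and `np.linalg.cond(A)`.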
Exact and least-squares solutions of the matrix equation by the Kronecker linearization
In this section, we explain how to solve the generalized Sylvester-transpose matrix equation (8) directly using the Kronecker linearization.
Recall that the Kronecker product of \(A=[a_{ij}]\in \mathbb{R}^{m \times n}\) and \(B\in \mathbb{R}^{p \times q}\) is defined by \(A \otimes B = [a_{ij} B] \in \mathbb{R}^{mp\times nq}\). The vector operator \(\operatorname{Vec}(\cdot )\) turns each matrix \(A=[a_{ij}]\in \mathbb{R}^{m\times n}\) into the vector obtained by stacking its columns,
\[ \operatorname{Vec}(A) = [ a_{11} \ \cdots \ a_{m1} \ \ a_{12} \ \cdots \ a_{m2} \ \ \cdots \ \ a_{1n} \ \cdots \ a_{mn} ]^{T} \in \mathbb{R}^{mn}. \]
Lemma 2.1
(e.g. [5])
For compatible matrices A, B, and C, we have the following properties of the Kronecker product and the vector operator.

(i)
\((A\otimes B)^{T} = A^{T} \otimes B^{T}\),

(ii)
\(\operatorname{Vec}(ABC) = (C^{T}\otimes A)\operatorname{Vec}(B)\).
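Both identities of Lemma 2.1 are easy to check numerically. The sketch below uses random matrices of compatible (arbitrarily chosen) sizes and assumes that Vec stacks columns, which corresponds to column-major (`order="F"`) reshaping in NumPy.

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((3, 4))
B = rng.standard_normal((4, 2))
C = rng.standard_normal((2, 5))

def vec(M):
    """Stack the columns of M into one vector (column-major order)."""
    return M.reshape(-1, order="F")

# (i) transpose of a Kronecker product
lhs_i = np.kron(A, B).T
rhs_i = np.kron(A.T, B.T)

# (ii) Vec(ABC) = (C^T kron A) Vec(B)
lhs_ii = vec(A @ B @ C)
rhs_ii = np.kron(C.T, A) @ vec(B)

print(np.allclose(lhs_i, rhs_i), np.allclose(lhs_ii, rhs_ii))
```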
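Recall also that there is a permutation matrix \(P(m,n)\in \mathbb{R}^{mn\times mn}\) such that
\[ \operatorname{Vec}\bigl(X^{T}\bigr) = P(m,n)\operatorname{Vec}(X) \quad \text{for all } X\in \mathbb{R}^{m\times n}. \tag{9} \]
This matrix depends only on the dimensions m and n and is given by
\[ P(m,n) = \sum_{i=1}^{m}\sum_{j=1}^{n} E_{ij}\otimes E_{ij}^{T}, \]
where \(E_{ij}\in \mathbb{R}^{m\times n}\) has entry 1 in the \((i,j)\)th position and all other entries are 0.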
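The permutation matrix can be built directly from the unit matrices \(E_{ij}\). The following sketch assumes the standard formula \(P(m,n)=\sum_{i,j}E_{ij}\otimes E_{ij}^{T}\) with \(E_{ij}\in \mathbb{R}^{m\times n}\) and a column-stacking Vec; the function names are hypothetical.

```python
import numpy as np

def vec(M):
    """Column-stacking vectorization."""
    return M.reshape(-1, order="F")

def perm_matrix(m, n):
    """P(m, n) = sum over i, j of kron(E_ij, E_ij^T), with E_ij of size m x n."""
    P = np.zeros((m * n, m * n))
    for i in range(m):
        for j in range(n):
            E = np.zeros((m, n))
            E[i, j] = 1.0
            P += np.kron(E, E.T)
    return P

X = np.arange(6.0).reshape(2, 3)     # arbitrary 2 x 3 test matrix
P = perm_matrix(2, 3)
print(np.allclose(P @ vec(X), vec(X.T)))
```

Since P(m,n) is a permutation matrix, it is orthogonal, so its transpose undoes the reordering.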
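Now, we can transform Eq. (8) to an equivalent linear system by applying the vector operator and utilizing Lemma 2.1(ii) and the property (9). Indeed, we get the linear system
\[ Qx = \operatorname{Vec}(E), \tag{10} \]
where \(x=\operatorname{Vec}(X)\) and
\[ Q = \sum_{t=1}^{p} \bigl(B_{t}^{T}\otimes A_{t}\bigr) + \sum_{s=1}^{q} \bigl(D_{s}^{T}\otimes C_{s}\bigr)P(m,n) \in \mathbb{R}^{lr\times mn}. \tag{11} \]
Thus Eq. (8) has a (unique) solution if and only if Eq. (10) does. We impose the assumption that Q is of full column-rank, or equivalently, \(Q^{T} Q\) is invertible.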
If Eq. (8) has a solution, then we obtain the exact (vector) solution to be
\[ \operatorname{Vec}\bigl(X^{*}\bigr) = \bigl(Q^{T}Q\bigr)^{-1}Q^{T}\operatorname{Vec}(E). \tag{12} \]
If Eq. (8) has no solution, then we can seek a least-squares solution, i.e. a matrix \(X^{*}\) that minimizes the squared Frobenius norm \(\Vert Q\operatorname{Vec}(X)-\operatorname{Vec}(E) \Vert _{F}^{2}\). The assumption on Q implies that the least-squares solution of Eq. (8) is uniquely determined by the solution of the associated normal equation, and it is also given by Eq. (12). In this case, the least-squares error is given by
We denote both the exact and the leastsquares solutions of Eq. (8) by \(X^{*}\).
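The direct approach of this section can be sketched in a few lines of NumPy. The code assumes that the linearized matrix has the form \(Q=\sum_{t}(B_{t}^{T}\otimes A_{t})+\sum_{s}(D_{s}^{T}\otimes C_{s})P(m,n)\), which follows from Lemma 2.1(ii) and the permutation property; the helper names and the random test data are hypothetical. `np.linalg.lstsq` is used in place of forming \((Q^{T}Q)^{-1}Q^{T}\) explicitly, which returns the same solution when Q has full column rank.

```python
import numpy as np

def vec(M):
    """Column-stacking vectorization Vec(M)."""
    return M.reshape(-1, order="F")

def unvec(x, m, n):
    """Inverse of vec for an m x n matrix."""
    return x.reshape((m, n), order="F")

def perm_matrix(m, n):
    """Permutation P with P @ vec(X) == vec(X.T) for X of size m x n."""
    P = np.zeros((m * n, m * n))
    for i in range(m):
        for j in range(n):
            P[j + i * n, i + j * m] = 1.0
    return P

def build_Q(As, Bs, Cs, Ds, m, n):
    """Kronecker linearization of X -> sum A_t X B_t + sum C_s X^T D_s."""
    l, r = As[0].shape[0], Bs[0].shape[1]
    Q = np.zeros((l * r, m * n))
    for A, B in zip(As, Bs):
        Q += np.kron(B.T, A)
    P = perm_matrix(m, n)
    for C, D in zip(Cs, Ds):
        Q += np.kron(D.T, C) @ P
    return Q

# Hypothetical rectangular test problem with a known solution.
rng = np.random.default_rng(1)
l, m, n, r = 3, 2, 2, 3
As, Bs = [rng.standard_normal((l, m))], [rng.standard_normal((n, r))]
Cs, Ds = [rng.standard_normal((l, n))], [rng.standard_normal((m, r))]
X_true = rng.standard_normal((m, n))
E = As[0] @ X_true @ Bs[0] + Cs[0] @ X_true.T @ Ds[0]

Q = build_Q(As, Bs, Cs, Ds, m, n)                  # size (l*r) x (m*n) = 9 x 4
x_star, *_ = np.linalg.lstsq(Q, vec(E), rcond=None)
X_star = unvec(x_star, m, n)
print(np.linalg.norm(X_star - X_true))
```

Note that Q has \(lr\cdot mn\) entries, so for \(100\times 100\) coefficient matrices it is already \(10^{4}\times 10^{4}\); this growth is the motivation for the iterative method of the next section.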
Gradient-descent iterative solutions for the matrix equation
This section proposes a new iterative algorithm for creating a sequence \(\{X_{k}\}\) of well-approximated solutions of Eq. (8) that converges to the exact or least-squares solution \(X^{*}\). This algorithm is applicable if the matrix Q is of full column-rank, whether or not Eq. (8) has a solution.
Our aim is to generate a sequence \(\{ x_{k}\}\), starting from an initial vector \(x_{0}\), using the recurrence
\[ x_{k+1} = x_{k} + \tau _{k+1}d_{k}, \]
where \(x_{k}\) is the kth approximation, \(\tau _{k+1}>0\) is the step size, and \(d_{k}\) is the search direction. To obtain the search direction, we consider the Frobenius-norm error \(\Vert \sum_{t=1}^{p} A_{t}XB_{t}+\sum_{s=1}^{q} C_{s}X^{T}D_{s} - E \Vert _{F}\), which is then transformed into \(\Vert Qx-\operatorname{Vec}(E)\Vert _{F}\) via Lemma 2.1(ii) and \(x=\operatorname{Vec}(X)\). Let \(f: \mathbb{R}^{mn}\rightarrow \mathbb{R}\) be the norm-error function defined by
\[ f(x) = \frac{1}{2}\bigl\Vert Qx-\operatorname{Vec}(E)\bigr\Vert _{F}^{2}. \]
It is easily seen that f is convex. Hence, the gradient-descent iterative method can be written as the following recursive equation:
\[ x_{k+1} = x_{k} - \tau _{k+1}\nabla f(x_{k}). \]
To find the gradient of the function f, the following properties of the matrix trace will be used:
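By letting \(\tilde{e}=\operatorname{Vec}(E)\), we compute the derivative of f as follows:
\[ \nabla f(x) = \frac{d}{dx}\,\frac{1}{2}\operatorname{tr}\bigl((Qx-\tilde{e})^{T}(Qx-\tilde{e})\bigr) = Q^{T}Qx - Q^{T}\tilde{e} = -Q^{T}(\tilde{e}-Qx). \]
Thus, we have the new form of the iterative equation as follows:
\[ x_{k+1} = x_{k} + \tau _{k+1}Q^{T}(\tilde{e}-Qx_{k}). \]
The above equation can be transformed into matrix form via Lemma 2.1 and the property (9), i.e.,
\[ X_{k+1} = X_{k} + \tau _{k+1} \Biggl( \sum_{t=1}^{p} A_{t}^{T}R_{k}B_{t}^{T}+\sum_{s=1}^{q} D_{s}R_{k}^{T}C_{s} \Biggr), \]
where \(R_{k} = E - \sum_{t=1}^{p}A_{t}X_{k}B_{t}-\sum_{s=1}^{q}C_{s}X_{k}^{T}D_{s}\).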
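The passage from the vector form to the matrix form can be verified numerically. The sketch below assumes the linearization \(Q=\sum_{t}(B_{t}^{T}\otimes A_{t})+\sum_{s}(D_{s}^{T}\otimes C_{s})P(m,n)\) and checks that \(Q^{T}\operatorname{Vec}(R)\) is the vectorization of \(\sum_{t}A_{t}^{T}RB_{t}^{T}+\sum_{s}D_{s}R^{T}C_{s}\). The sizes are deliberately rectangular with \(m\neq n\), so that only an \(m\times n\) search direction is dimensionally compatible with X; all names and data are hypothetical.

```python
import numpy as np

def vec(M):
    return M.reshape(-1, order="F")

def perm_matrix(m, n):
    """Permutation P with P @ vec(X) == vec(X.T) for X of size m x n."""
    P = np.zeros((m * n, m * n))
    for i in range(m):
        for j in range(n):
            P[j + i * n, i + j * m] = 1.0
    return P

rng = np.random.default_rng(3)
l, m, n, r = 3, 2, 4, 3                  # X is m x n with m != n
A = rng.standard_normal((l, m)); B = rng.standard_normal((n, r))
C = rng.standard_normal((l, n)); D = rng.standard_normal((m, r))
R = rng.standard_normal((l, r))          # stands in for a residual R_k

Q = np.kron(B.T, A) + np.kron(D.T, C) @ perm_matrix(m, n)

# Matrix form of the negative gradient: both terms are m x n.
W = A.T @ R @ B.T + D @ R.T @ C

print(W.shape, np.allclose(vec(W), Q.T @ vec(R)))
```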
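To choose a step size, we define, for each \(k \in \mathbb{N}\cup \{ 0 \}\), the function \(\phi _{k+1}:[0,\infty )\rightarrow \mathbb{R}\) by
\[ \phi _{k+1}(\tau ) = f\bigl(x_{k}+\tau Q^{T}(\tilde{e}-Qx_{k})\bigr) = \frac{1}{2}\bigl\Vert Q\bigl(x_{k}+\tau Q^{T}(\tilde{e}-Qx_{k})\bigr)-\tilde{e}\bigr\Vert _{F}^{2}. \]
We differentiate \(\phi _{k+1}\) by using the properties of a matrix trace and obtain
\[ \frac{d}{d\tau }\phi _{k+1}(\tau ) = \tau \bigl\Vert QQ^{T}(\tilde{e}-Qx_{k})\bigr\Vert _{F}^{2} - \bigl\Vert Q^{T}(\tilde{e}-Qx_{k})\bigr\Vert _{F}^{2}. \]
The second-order derivative of \(\phi _{k+1}\) is \(\Vert QQ^{T}(\tilde{e}-Qx_{k})\Vert _{F}^{2}\), which is a positive constant whenever \(Q^{T}(\tilde{e}-Qx_{k})\neq 0\). So when \(\frac{d}{d\tau }\phi _{k+1}(\tau ) = 0\), we get the minimizer of \(\phi _{k+1}\), i.e.
\[ \tau _{k+1} = \frac{\Vert Q^{T}(\tilde{e}-Qx_{k})\Vert _{F}^{2}}{\Vert QQ^{T}(\tilde{e}-Qx_{k})\Vert _{F}^{2}} = \frac{\Vert W_{k}\Vert _{F}^{2}}{\Vert \sum_{t=1}^{p} A_{t}W_{k}B_{t}+\sum_{s=1}^{q} C_{s}W_{k}^{T}D_{s}\Vert _{F}^{2}}. \]
Here \(W_{k} = \sum_{t=1}^{p} A_{t}^{T}R_{k}B_{t}^{T}+\sum_{s=1}^{q} D_{s}R_{k}^{T}C_{s}\in \mathbb{R}^{m\times n}\) is the matrix form of \(Q^{T}(\tilde{e}-Qx_{k})\); each term \(D_{s}R_{k}^{T}C_{s}\) is the transpose of \(C_{s}^{T}R_{k}D_{s}^{T}\), so that \(W_{k}\) has the same size as \(X_{k}\).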
An implementation of the gradient-descent iterative algorithm for solving Eq. (8) is given by the following algorithm where the search direction and the step size are taken into account. To terminate the algorithm, one can alternatively set the stopping rule to be \(\Vert R_{k}\Vert _{F} - \delta <\epsilon \) where \(\epsilon >0\) is a small error and δ is the least-squares error described in Eq. (13).
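The iteration derived in this section can be sketched as follows. This is a minimal NumPy sketch under stated assumptions, not the authors' MATLAB implementation: the function name, the tolerance, and the test data are hypothetical; the search direction is taken as \(W_{k}=\sum_{t}A_{t}^{T}R_{k}B_{t}^{T}+\sum_{s}D_{s}R_{k}^{T}C_{s}\) (the matrix form of the negative gradient); and the loop stops when \(\Vert W_{k}\Vert _{F}\) is small, a criterion chosen here because it works whether or not the equation is consistent.

```python
import numpy as np

def gradient_descent_solve(As, Bs, Cs, Ds, E, X0, tol=1e-10, max_iter=10000):
    """Steepest descent with exact line search for
    sum_t A_t X B_t + sum_s C_s X^T D_s = E (exact or least-squares sense)."""
    X = X0.copy()
    for k in range(max_iter):
        # Residual R_k of the matrix equation at the current iterate.
        R = (E - sum(A @ X @ B for A, B in zip(As, Bs))
               - sum(C @ X.T @ D for C, D in zip(Cs, Ds)))
        # Search direction W_k (negative gradient of the norm-error function).
        W = (sum(A.T @ R @ B.T for A, B in zip(As, Bs))
             + sum(D @ R.T @ C for C, D in zip(Cs, Ds)))
        w2 = np.sum(W * W)
        if np.sqrt(w2) <= tol:          # assumed stopping rule: small gradient
            break
        # Image of W_k under the linear operator of the equation.
        LW = (sum(A @ W @ B for A, B in zip(As, Bs))
              + sum(C @ W.T @ D for C, D in zip(Cs, Ds)))
        tau = w2 / np.sum(LW * LW)      # optimal step size
        X = X + tau * W
    return X, k

# Hypothetical well-conditioned 3 x 3 test problem with a known solution.
rng = np.random.default_rng(0)
n = 3
A = 2.0 * np.eye(n) + 0.05 * rng.standard_normal((n, n))
B = np.eye(n) + 0.05 * rng.standard_normal((n, n))
C = 0.05 * rng.standard_normal((n, n))
D = np.eye(n)
X_true = rng.standard_normal((n, n))
E = A @ X_true @ B + C @ X_true.T @ D

X_hat, iters = gradient_descent_solve([A], [B], [C], [D], E, np.zeros((n, n)))
print(iters, np.linalg.norm(X_hat - X_true))
```

Each iteration uses only products of the original coefficient matrices, so the \(lr\times mn\) matrix Q is never formed.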
Convergence analysis of the proposed algorithm
In this section, Algorithm 1 will be proved to converge to the exact solution or the unique least-squares solution. Recall the next lemma.
Lemma 4.1
([30])
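Let \(f:\mathbb{R}^{n}\rightarrow \mathbb{R}\) be a twice-differentiable strongly convex function, i.e. there exist two positive constants ψ, Ψ such that \(\psi I\leqslant \nabla ^{2} f(x)\leqslant \Psi I\) for all \(x\in \mathbb{R}^{n}\). Then, for any \(x,y\in \mathbb{R}^{n}\),
\[ f(y) \geqslant f(x) + \nabla f(x)^{T}(y-x) + \frac{\psi }{2}\Vert y-x\Vert _{F}^{2}, \tag{15} \]
\[ f(y) \leqslant f(x) + \nabla f(x)^{T}(y-x) + \frac{\Psi }{2}\Vert y-x\Vert _{F}^{2}. \tag{16} \]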
The following definition is an extension of the Frobenius norm and will be used in the convergence analysis.
Definition 4.2
Given a full-column-rank matrix \(P \in \mathbb{R}^{k\times n}\), we define the P-weighted Frobenius norm of \(A\in \mathbb{R}^{m\times n}\) by
Theorem 4.3
Consider Eq. (8). Assume that Q is of full column-rank.

(i)
Suppose Eq. (8) has a solution (and thus, the solution is unique). Then, for any initial matrix \(X_{0}\), the sequence \(X_{k}\) of approximated solutions generated by Algorithm 1 converges to the exact solution \(X^{*}\).

(ii)
Suppose Eq. (8) has no solution (and thus, it has the unique least-squares solution \(X^{*}\)). Then \(\Vert X_{k}\Vert _{Q} \to \Vert X^{*}\Vert _{Q}\) for any initial matrix \(X_{0}\). Here, \(\Vert \cdot \Vert _{Q}\) is the Q-weighted Frobenius norm defined by Eq. (17).
Proof
Since \(x^{*} = \operatorname{Vec}(X^{*})\) is the optimal solution of \(\min_{x\in \mathbb{R}^{mn}}f(x)\), we denote the minimum value \(\inf_{x\in \mathbb{R}^{mn}}f(x) = f(x^{*})\) by δ. Note that δ is equal to the least-squares error determined by Eq. (13) and is zero if \(X^{*}\) is the unique exact solution. If there exists \(k\in \mathbb{N}\) such that \(\nabla f(x_{k})= 0\), then \(X_{k}=X^{*}\) and the result holds. To investigate the convergence of the algorithm, we assume that \(\nabla f(x_{k})\neq 0\) for all k. Considering the strong convexity of f, we have from Eq. (14) \(\nabla ^{2}f(x_{k}) = Q^{T}Q\). Let \(\lambda _{\min }\) (\(\lambda _{ \max }\)) be the minimum (maximum) eigenvalue of \(Q^{T}Q\), respectively. Since \(Q^{T}Q\) is symmetric, we have
\[ \lambda _{\min }I \leqslant Q^{T}Q \leqslant \lambda _{\max }I. \]
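Thus, f is strongly convex. From (15) with \(x = x_{k}\), we have, for all \(y\in \mathbb{R}^{mn}\),
\[ f(y) \geqslant f(x_{k}) + \nabla f(x_{k})^{T}(y-x_{k}) + \frac{\lambda _{\min }}{2}\Vert y-x_{k}\Vert _{F}^{2}. \]
The RHS is minimized over y at \(y = x_{k}-\tau \nabla f(x_{k})\) with \(\tau =1/\lambda _{\min }\), so that
\[ f(y) \geqslant f(x_{k}) - \frac{1}{2\lambda _{\min }}\bigl\Vert \nabla f(x_{k})\bigr\Vert _{F}^{2}. \]
Since the above inequality is true for all \(y\in \mathbb{R}^{mn}\), we have
\[ \delta \geqslant f(x_{k}) - \frac{1}{2\lambda _{\min }}\bigl\Vert \nabla f(x_{k})\bigr\Vert _{F}^{2}. \tag{18} \]
Similarly, from (16) with \(x = x_{k}\) and \(y = x_{k}-\tau \nabla f(x_{k})\), we have
\[ f\bigl(x_{k}-\tau \nabla f(x_{k})\bigr) \leqslant f(x_{k}) - \tau \bigl\Vert \nabla f(x_{k})\bigr\Vert _{F}^{2} + \frac{\lambda _{\max }\tau ^{2}}{2}\bigl\Vert \nabla f(x_{k})\bigr\Vert _{F}^{2}. \]
Minimizing the RHS by taking \(\tau = 1/\lambda _{\max }\), and noting that the step size \(\tau _{k+1}\) minimizes the LHS over τ, yields
\[ f(x_{k+1}) \leqslant f(x_{k}) - \frac{1}{2\lambda _{\max }}\bigl\Vert \nabla f(x_{k})\bigr\Vert _{F}^{2}. \tag{19} \]
Subtracting δ from each side of (19) and combining with \(\Vert \nabla f(x_{k})\Vert ^{2}_{F}\geqslant 2\lambda _{\min }(f(x_{k})- \delta )\) (from (18)), we get
\[ f(x_{k+1}) - \delta \leqslant \biggl(1-\frac{\lambda _{\min }}{\lambda _{\max }}\biggr) \bigl(f(x_{k})-\delta \bigr). \]
Putting \(\alpha :=1-\lambda _{\min }/\lambda _{\max }\), we have
\[ f(x_{k+1}) - \delta \leqslant \alpha \bigl(f(x_{k})-\delta \bigr). \tag{20} \]
By induction, we obtain
\[ f(x_{k}) - \delta \leqslant \alpha ^{k} \bigl(f(x_{0})-\delta \bigr). \tag{21} \]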
Since \(Q^{T}Q\) is assumed to be invertible, \(Q^{T}Q>0\), it follows that \(\lambda _{\min }>0\) and hence \(0<\alpha <1\). Thus, \(f(x_{k})-\delta \to 0\), or equivalently, \(f(x_{k}) \to \delta \) as \(k \to \infty \).
Consider the case where \(X^{*}\) is the unique exact solution, i.e., \(\delta =0\). We have \(f(x_{k})\to 0\), or equivalently \(Q x_{k} - \operatorname{Vec}(E) \to 0\) as \(k\rightarrow \infty \). Now, the assumption that Q is of full column-rank implies that
\[ x_{k} = \bigl(Q^{T}Q\bigr)^{-1}Q^{T}(Qx_{k}) \to \bigl(Q^{T}Q\bigr)^{-1}Q^{T}\operatorname{Vec}(E) = x^{*}. \]
Therefore, \(X_{k} = \operatorname{Vec}^{-1}(x_{k}) \to X^{*}\) as \(k \to \infty \).
The other case is that \(X^{*}\) is the unique least-squares solution, i.e., \(\delta >0\). We have \(f(x_{k}) \to \delta \) or \(\frac{1}{2}\Vert Qx_{k}-\operatorname{Vec}(E)\Vert ^{2}_{F} \to \Vert \operatorname{Vec}(E)\Vert ^{2}_{F}-\operatorname{Vec}(E)^{T}Qx^{*}\). Then
We omit some algebraic operations and hence immediately write
Therefore, \(\Vert X_{k}\Vert _{Q}\to \Vert X^{*}\Vert _{Q}\) as \(k\to \infty \). □
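We denote the condition number of Q by \(\kappa = \kappa (Q)\). Observe that \(\alpha = 1-\kappa ^{-2}\). The relation between the quadratic norm-error \(f(x_{k})\) and the norm of the residual error \(\Vert R_{k}\Vert _{F}\) is given by
\[ f(x_{k}) = \frac{1}{2}\bigl\Vert Qx_{k}-\operatorname{Vec}(E)\bigr\Vert _{F}^{2} = \frac{1}{2}\Vert R_{k}\Vert _{F}^{2}. \]
Making use of Lemma 2.1(ii), the inequalities (20) and (21) become the following estimations:
\[ \Vert R_{k}\Vert _{F}^{2} \leqslant \bigl(1-\kappa ^{-2}\bigr)\Vert R_{k-1}\Vert _{F}^{2} + 2\delta \kappa ^{-2}, \tag{22} \]
\[ \Vert R_{k}\Vert _{F}^{2} \leqslant \alpha ^{k}\Vert R_{0}\Vert _{F}^{2} + 2\delta \bigl(1-\alpha ^{k}\bigr). \tag{23} \]
In the case of Eq. (8) having a unique exact solution \((\delta = 0)\), the error estimations (22) and (23) reduce to
\[ \Vert R_{k}\Vert _{F} \leqslant \sqrt{1-\kappa ^{-2}}\,\Vert R_{k-1}\Vert _{F}, \tag{24} \]
\[ \Vert R_{k}\Vert _{F} \leqslant \bigl(1-\kappa ^{-2}\bigr)^{\frac{k}{2}}\Vert R_{0}\Vert _{F}, \tag{25} \]
respectively. Since \(0<\alpha <1\), it follows that, if \(\Vert R_{k-1}\Vert _{F}\) is nonzero, then
\[ \Vert R_{k}\Vert _{F} < \Vert R_{k-1}\Vert _{F}. \tag{26} \]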
The above discussion is summarized in the following theorem.
Theorem 4.4
Assume that Q is of full column-rank.

(i)
Suppose Eq. (8) has a unique solution. The estimates of the error \(\Vert R_{k}\Vert _{F}\) in terms of \(\Vert R_{k-1}\Vert _{F}\) (the preceding iteration) and \(\Vert R_{0}\Vert _{F}\) (the initial iteration) are given by (24) and (25), respectively. In particular, the relative error \(\Vert R_{k}\Vert _{F}\) gets smaller than the preceding (nonzero) error, as in (26).

(ii)
When Eq. (8) has a unique least-squares solution, the error estimates (22) and (23) hold.
In both cases, the convergence rate of Algorithm 1 (regarding the error \(\Vert R_{k}\Vert _{F}\)) is governed by \(\sqrt{1-\kappa ^{-2}}\).
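To get a feeling for what this rate means in practice, the bound \(\Vert R_{k}\Vert _{F}\leqslant (1-\kappa ^{-2})^{k/2}\Vert R_{0}\Vert _{F}\) for the consistent case can be inverted to give a worst-case iteration count for a prescribed reduction of the residual. The helper below is a hypothetical illustration of that arithmetic; the bound is conservative, so observed iteration counts may be much smaller.

```python
import math

def iterations_for_reduction(kappa, eps):
    """Smallest k with (1 - kappa**-2) ** (k / 2) <= eps, for kappa >= 1, 0 < eps < 1."""
    rate = math.sqrt(1.0 - kappa ** -2)      # per-iteration contraction factor
    if rate == 0.0:                          # kappa == 1: one step suffices
        return 1
    return math.ceil(math.log(eps) / math.log(rate))

for kappa in (1.0, 2.0, 10.0, 100.0):
    print(kappa, iterations_for_reduction(kappa, 1e-6))
```

The count grows roughly like \(\kappa ^{2}\) for large κ, which is consistent with the remark that the algorithm converges fast when the condition number is close to 1.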
Remark 4.5
The relative errors (22) and (23) do not seem to decrease at every step of the iteration since the terms \(2\delta \kappa ^{-2}\) and \(2\delta (1-\alpha ^{k})\) are positive. However, the inequality (19) implies that \(\{\Vert R_{k}\Vert _{F}\}_{k=1}^{\infty }\) is a strictly decreasing sequence converging to δ.
We recall the following properties.
Lemma 4.6
(e.g. [5])
For any compatible matrices A and B, we have

(i)
\(\Vert A^{T}A\Vert _{2}=\Vert A\Vert ^{2}_{2}\),

(ii)
\(\Vert A^{T}\Vert _{2} = \Vert A\Vert _{2}\),

(iii)
\(\Vert AB\Vert _{F}\leqslant \Vert A\Vert _{2}\Vert B\Vert _{F}\).
Theorem 4.7
Suppose that Q is of full column-rank and Eq. (8) has a unique exact solution. The estimates of the error \(\Vert X_{k}-X^{*}\Vert _{F}\) in terms of the preceding iteration and the initial iteration of Algorithm 1 are provided by
In particular, the convergence rate of the algorithm is governed by \(\sqrt{1-\kappa ^{-2}}\).
Proof
Utilizing (25) and Lemma 4.6, we have
As the limiting behavior of \(\Vert X_{k}-X^{*}\Vert _{F}\) depends on \(( 1-\kappa ^{-2})^{\frac{k}{2}}\), the convergence rate for Algorithm 1 is governed by \(\sqrt{1-\kappa ^{-2}}\). Similarly, using (24), it follows that
and hence (28) is obtained. □
Theorem 4.8
Suppose Q is of full column-rank and Eq. (8) has a unique least-squares solution. The estimates of the error \(\Vert X_{k}-X^{*}\Vert ^{2}_{F}\) in terms of the preceding iteration and the initial iteration of Algorithm 1 are provided by
Proof
The proof is similar to that of Theorem 4.7 and carried out by (22) and (23). We, therefore, omit the proof. □
Consequently, our convergence analysis indicates that the proposed algorithm always converges to the unique (exact or least-squares) solution for any initial matrix, provided that Q is of full column-rank. Moreover, the algorithm converges fast when the condition number κ is close to 1.
Numerical experiments for the generalized Sylvester-transpose matrix equation and its special cases
In this section, we provide numerical results to show the efficiency and effectiveness of Algorithm 1. We perform the experiments in the following cases:

a large-scale square generalized Sylvester-transpose equation,

a small-scale rectangular generalized Sylvester-transpose equation,

a small-scale square Sylvester-transpose equation,

a large-scale square Sylvester equation,

a moderate-scale square Lyapunov equation.
Each example contains some comparisons of the proposed algorithm (denoted by TauOpt) with the mentioned existing algorithms as well as the direct method Eq. (12). CT stands for the computational time (in seconds) and is measured by the tic and toc functions in MATLAB. The relative error \(\Vert R_{k}\Vert _{F}\) is used to measure the error at the kth step of the iteration. All iterations have been evaluated by MATLAB R2020b, on a PC (2.60 GHz Intel(R) Core(TM) i7 processor, 8 GB RAM).
Example 5.1
Consider a generalized Sylvester-transpose matrix equation
with \(100 \times 100\) coefficient matrices:
We choose an initial matrix \(X_{0} = \operatorname{zero}(100)\), where \(\operatorname{zero}(n)\) is the \(n\times n\) zero matrix. In fact, this equation has the unique solution
Table 1 shows that the direct method consumes a large amount of time to get the exact solution, while Algorithm 1 produces a small-error solution in a short time (0.1726 seconds after 100 iterations). We compare the efficiency of Algorithm 1 with other existing gradient-based iterative algorithms, namely, GI (Method 1.1) and LSI (Method 1.2). Figure 1 displays the error plot, which supports the theoretical results, i.e., the sequence of errors generated by Algorithm 1 is monotone decreasing. Table 1 indicates that our algorithm performs well in computational time.
Example 5.2
Consider the equation
with the rectangular coefficient matrices as follows:
We find that \(4=\operatorname{rank}Q \neq \operatorname{rank}[Q \; \operatorname{Vec}(E)] = 5\), i.e., the matrix equation does not have an exact solution. However, Q is of size \(9\times 4\) and has rank 4, i.e., Q is of full column-rank. Hence, according to Theorem 4.3, Algorithm 1 will converge to the least-squares solution, for which the least-squares error (13) is equal to 0.0231. We choose an initial matrix \(X_{0} =\operatorname{zero}(2)\). Algorithm 1 is compared with GI (Method 1.1), LSI (Method 1.2) and the direct method Eq. (12). In this case, we consider the error \(\Vert X^{*}-X_{k}\Vert _{F}\) where \(X^{*}\) is the least-squares solution. Figure 2 displays the error plot, and Table 2 shows the errors and CTs for TauOpt, GI, LSI and the direct method. We see that the errors converge monotonically to zero, i.e., the approximate solutions \(X_{k}\) generated by Algorithm 1 converge to \(X^{*}\). Moreover, Algorithm 1 consumes less computational time than the other methods.
Next, we consider the Sylvester-transpose equation (5), which is a special case of the generalized Sylvester-transpose equation (8). From Algorithm 1, the optimal step size τ is described by
\[ \tau _{k+1} = \frac{\Vert W_{k}\Vert _{F}^{2}}{\Vert AW_{k}B+CW_{k}^{T}D\Vert _{F}^{2}}, \]
where \(W_{k} = A^{T}R_{k}B^{T}+DR_{k}^{T}C\) and \(R_{k}=E-AX_{k}B-CX_{k}^{T}D\).
Example 5.3
Let us consider the Sylvestertranspose equation (5) with
Choosing \(X_{0}=\text{zero}(4)\), then the sequence of numerical solutions generated by Algorithm 1 converges to the exact solution,
We report the comparison of Algorithm 1 with GI (Method 1.1), LSI (Method 1.2), AGBI ([28]) and the direct method Eq. (12) in Fig. 3 and Table 3. Both of them indicate that Algorithm 1 outperforms the other algorithms.
Next, we consider the Sylvester equation (3), which is also a special case of Eq. (8). For this equation, the optimal step size τ is described by
\[ \tau _{k+1} = \frac{\Vert W_{k}\Vert _{F}^{2}}{\Vert AW_{k}+W_{k}B\Vert _{F}^{2}}, \]
where \(W_{k} = A^{T}R_{k}+R_{k}B^{T}\) and \(R_{k} = C-AX_{k}-X_{k}B\).
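For the Sylvester equation the iteration specializes to a few matrix products per step. The sketch below assumes the step size \(\tau =\Vert W_{k}\Vert _{F}^{2}/\Vert AW_{k}+W_{k}B\Vert _{F}^{2}\), which is what exact line search gives for this equation. The small tridiagonal coefficient matrices, the function name, and the gradient-norm stopping rule are hypothetical choices for illustration, not the data of the example that follows.

```python
import numpy as np

def tridiag(n, lower, diag, upper):
    """n x n tridiagonal matrix with constant bands."""
    return (np.diag(np.full(n, float(diag)))
            + np.diag(np.full(n - 1, float(lower)), -1)
            + np.diag(np.full(n - 1, float(upper)), 1))

def sylvester_gd(A, B, C, X0, tol=1e-10, max_iter=5000):
    """Steepest descent with optimal step size for A X + X B = C."""
    X = X0.copy()
    for k in range(max_iter):
        R = C - A @ X - X @ B            # residual R_k
        W = A.T @ R + R @ B.T            # search direction W_k
        w2 = np.sum(W * W)
        if np.sqrt(w2) <= tol:           # assumed stopping rule
            break
        tau = w2 / np.sum((A @ W + W @ B) ** 2)
        X = X + tau * W
    return X, k

n = 6
A = tridiag(n, 1, 4, 1)                  # hypothetical coefficients
B = tridiag(n, -1, 3, -1)
X_true = tridiag(n, 1, 5, 1)
C = A @ X_true + X_true @ B              # right-hand side with known solution

X_hat, iters = sylvester_gd(A, B, C, np.zeros((n, n)))
print(iters, np.linalg.norm(X_hat - X_true))
```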
Example 5.4
Suppose that the Sylvester equation (3) has largescaled tridiagonal coefficient matrices, i.e.,
where \(A,B,C\in \mathbb{R}^{100\times 100}\). We choose an initial matrix \(X_{0}=\operatorname{zero}(100)\). Here, the symmetric exact solution is given by \(X^{*} = \operatorname{tridiag}(1,5,1)\), so that the AGBI algorithm is applicable. We compare Algorithm 1 with GI (Method 1.1), AGBI ([28]), RGI [12], MGI [13], JGI [14], and AJGI [14]. Although Table 4 tells us that our algorithm takes slightly more time than some other algorithms, Fig. 4 illustrates that Algorithm 1 achieves the fastest convergence.
The last example presents another special case of Eq. (8), namely the Lyapunov equation (2). The optimal step size τ is described by
\[ \tau _{k+1} = \frac{\Vert W_{k}\Vert _{F}^{2}}{\Vert AW_{k}+W_{k}A^{T}\Vert _{F}^{2}}, \]
where \(W_{k} = A^{T}R_{k}+R_{k}A\) and \(R_{k} = B-AX_{k}-X_{k}A^{T}\).
Example 5.5
We consider the Lyapunov equation (2) with medium-scale coefficient matrices
We choose \(n=20\) and set \(X_{0}=\operatorname{zero}(20)\). Algorithm 1 is compared with the GI, RGI, MGI, AGBI, JGI, AJGI, LSIA1, and LSIA2 methods. We report the results in Fig. 5 and Table 5. In conclusion, Algorithm 1 takes slightly more computational time than some other algorithms but clearly outperforms them in speed of convergence.
Concluding remarks
We establish a gradient-descent iterative algorithm for solving the generalized Sylvester-transpose matrix equation (8). We show that the proposed algorithm is useful and applicable to a wide range of problems, even when the problem has no solution, as long as the associated matrix Q, defined by Eq. (11), is of full column-rank. If the problem has the unique exact solution, then the approximate solutions converge to the exact solution. In the case of a no-solution problem, we have \(\Vert X_{k}\Vert _{Q} \to \Vert X^{*}\Vert _{Q}\) where \(X^{*}\) is the unique least-squares solution. The convergence rate is described in terms of κ, the condition number of Q, that is, \(\sqrt{1-\kappa ^{-2}}\). Moreover, the analysis shows that the sequence of errors generated by our algorithm is monotone decreasing. Numerical examples are provided to verify our theoretical findings.
Availability of data and materials
Not applicable.
References
 1.
Dullerud, G.E., Paganini, F.: A Course in Robust Control Theory: A Convex Approach. Springer, New York (1999)
 2.
Varga, A.: Robust pole assignment via Sylvester equation based state feedback parametrization. In: Proceedings of the 2000 IEEE International Symposium on Computer-Aided Control System Design, pp. 13–18. Alaska (2000)
 3.
Magnus, J.R., Neudecker, H.: Matrix Differential Calculus with Applications in Statistics and Econometrics, 3rd edn. Wiley, Chichester (2007)
 4.
Nouri, K., Beik, S.P.A., et al.: An iterative algorithm for robust simulation of the Sylvester matrix differential equations. Adv. Differ. Equ. 2020(1), Article ID 287 (2020). https://doi.org/10.1186/s1366202002757z
 5.
Horn, R., Johnson, C.: Topics in Matrix Analysis. Cambridge University Press, New York (1991)
 6.
Hajarian, M.: Generalized conjugate direction algorithm for solving the general coupled matrix equations over symmetric matrices. Numer. Algorithms 73(3), 591–609 (2016)
 7.
Hajarian, M.: Extending the CGLS algorithm for least squares solutions of the generalized Sylvester-transpose matrix equations. J. Franklin Inst. 353(5), 1168–1185 (2016)
 8.
Dehghan, M., Mohammadi-Arani, R.: Generalized product-type methods based on bi-conjugate gradient (GPBiCG) for solving shifted linear systems. Comput. Appl. Math. 36(4), 1591–1606 (2017)
 9.
Bai, Z.: On Hermitian and skew-Hermitian splitting iteration methods for continuous Sylvester equation. J. Comput. Math. 29(2), 185–198 (2011). https://doi.org/10.4208/jcm.1009m3152
 10.
Ding, F., Chen, T.: Gradient based iterative algorithms for solving a class of matrix equations. IEEE Trans. Autom. Control 50(8), 1216–1221 (2005). https://doi.org/10.1109/TAC.2005.852558
 11.
Ding, F., Liu, X.P., Ding, J.: Iterative solutions of the generalized Sylvester matrix equations by using the hierarchical identification principle. Appl. Math. Comput. 197(1), 41–50 (2008). https://doi.org/10.1016/j.amc.2007.07.040
 12.
Niu, Q., Wang, X., Lu, L.Z.: A relaxed gradient based algorithm for solving Sylvester equation. Asian J. Control 13(3), 461–464 (2011). https://doi.org/10.1002/asjc.328
 13.
Wang, X., Dai, L., Liao, D.: A modified gradient based algorithm for solving Sylvester equation. Appl. Math. Comput. 218(9), 5620–5628 (2012). https://doi.org/10.1016/j.amc.2011.11.055
 14.
Tian, Z., Tian, M., et al.: An accelerated Jacobigradient based iterative algorithm for solving Sylvester matrix equations. Filomat 31(8), 2381–2390 (2017). https://doi.org/10.2298/FIL1708381T
 15.
Sun, M., Wang, Y., Liu, J.: Two modified least-squares iterative algorithms for the Lyapunov matrix equations. Adv. Differ. Equ. 2019, 305 (2019). https://doi.org/10.1186/s1366201922537
 16.
Ding, F., Chen, T.: Hierarchical gradient-based identification of multivariable discrete-time systems. Automatica 41(2), 315–325 (2005). https://doi.org/10.1016/j.automatica.2004.10.010
 17.
Ding, F., Chen, T.: Hierarchical least squares identification methods for multivariable systems. IEEE Trans. Autom. Control 50(3), 397–402 (2005). https://doi.org/10.1109/TAC.2005.843856
 18.
Wu, A., Duan, G., Zhou, B.: Solution to generalized Sylvester matrix equations. IEEE Trans. Autom. Control 53(3), 811–815 (2008). https://doi.org/10.1109/TAC.2008.919562
 19.
Xie, L., Liu, Y., Yang, H.: Gradient based and least squares based iterative algorithms for matrix equations \(AXB+CX^{T}D=F\). Appl. Math. Comput. 217(5), 2191–2199 (2010). https://doi.org/10.1016/j.amc.2010.07.019
 20.
Zhang, X., Sheng, X.: The relaxed gradient based iterative algorithm for the symmetric (skew symmetric) solution of the Sylvester equation \(AX+XB=C\). Math. Probl. Eng. 2017, 1–8 (2017). https://doi.org/10.1155/2017/1624969
 21.
Kittisopaporn, A., Chansangiam, P.: The steepest descent of gradient-based iterative method for solving rectangular linear systems with an application to Poisson’s equation. Adv. Differ. Equ. 2020(1), Article ID 259 (2020). https://doi.org/10.1186/s13662020027159
 22.
Boonruangkan, N., Chansangiam, P.: Gradient iterative method with optimal convergent factor for solving a generalized Sylvester matrix equation with applications to diffusion equations. Symmetry 12(10), Article ID 1732 (2020). https://doi.org/10.3390/sym12101732
 23.
Sasaki, N., Chansangiam, P.: Modified Jacobi–gradient iterative method for generalized Sylvester matrix equation. Symmetry 12(11), Article ID 1831 (2020). https://doi.org/10.3390/sym12111831
 24.
Kittisopaporn, A., Chansangiam, P., Lewkeeratiyutkul, W.: Convergence analysis of gradient–based iterative algorithms for a class of rectangular Sylvester matrix equations based on Banach contraction principle. Adv. Differ. Equ. 2021(1), Article ID 17 (2021). https://doi.org/10.1186/s13662020031859
 25.
Ding, F., Zhang, X., Xu, L.: The innovation algorithms for multivariable state-space models. Int. J. Adapt. Control Signal Process. 33, 1601–1618 (2019). https://doi.org/10.1002/acs.3053
 26.
Ding, F., Lv, L., Pan, J., et al.: Two-stage gradient-based iterative estimation methods for controlled autoregressive systems using the measurement data. Int. J. Control. Autom. Syst. 18, 886–896 (2020). https://doi.org/10.1007/s1255501901403
 27.
Ding, F., Xu, L., Meng, D., et al.: Gradient estimation algorithms for the parameter identification of bilinear systems using the auxiliary model. J. Comput. Appl. Math. 369, 112575 (2020). https://doi.org/10.1016/j.cam.2019.112575
 28.
Xie, Y.J., Ma, C.F.: The accelerated gradient based iterative algorithm for solving a class of generalized Sylvester-transpose matrix equation. Appl. Math. Comput. 273, 1257–1269 (2016). https://doi.org/10.1016/j.amc.2015.07.022
 29.
Xie, L., Ding, J., Ding, F.: Gradient based iterative solutions for general linear matrix equations. Comput. Math. Appl. 58(7), 1441–1448 (2009). https://doi.org/10.1016/j.camwa.2009.06.047
 30.
Boyd, S.P., Vandenberghe, L.: Convex Optimization. Cambridge University Press, Cambridge (2004)
Acknowledgements
The first author received financial support from KMITL Doctoral Scholarships, grant no. KDS 2019/022 during his Ph.D. study.
Funding
Not applicable.
Author information
Contributions
Writing–original draft preparation, A.K.; writing–review and editing, P.C.; data curation, A.K.; supervision, P.C. All authors contributed equally and significantly in writing this article. All authors have read and approved the manuscript.
Ethics declarations
Competing interests
The authors declare that they have no competing interests.
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.
About this article
Cite this article
Kittisopaporn, A., Chansangiam, P. Approximated least-squares solutions of a generalized Sylvester-transpose matrix equation via gradient-descent iterative algorithm. Adv Differ Equ 2021, 266 (2021). https://doi.org/10.1186/s13662021034274
MSC
 15A60
 15A69
 26B25
 65F45
Keywords
 Generalized Sylvester-transpose matrix equation
 Gradient descent
 Iterative method
 Least-squares solution