

Infinite horizon linear quadratic optimal control for stochastic difference time-delay systems

Abstract

The aim of this paper is to investigate the infinite horizon linear quadratic (LQ) optimal control of stochastic time-delay difference systems with both state and control dependent noise. To this end, the notion of exact observability of a stochastic time-delay difference system is introduced, and a PBH-type criterion for it is given in terms of the spectrum of an operator associated with stochastic time-delay difference systems. Under the assumptions of stabilizability and exact observability, it is shown that the optimal control law and the optimal value exist, and the properties of the associated general algebraic Riccati equation (GARE) are also discussed.

1 Introduction

As is well known, the optimal linear quadratic regulation (LQR) problem was initiated by Kalman in [1] and is one of the most important optimal control problems. In [2, 3], the authors further investigated the LQR problem in the deterministic case. In [4], Wonham first studied stochastic linear quadratic (LQ) control for Itô systems. In [5], the authors investigated LQ optimal control when the state and control weighting matrices Q and R are indefinite, and they proved that the stochastic LQ optimal control problem may still be well posed. The discrete-time stochastic LQ problem involving state and control dependent noises was introduced in [6]. Most studies of optimal control for time-delay systems consider only delays in the state. By exploiting the dynamic programming approach, the authors of [7] presented a solution to the stochastic LQR problem for systems with input delay and stochastic parameter uncertainties. This paper discusses the infinite horizon linear quadratic regulation problem for discrete-time stochastic systems with input delay and state delay. In order to guarantee the well posedness of the quadratic performance index and the existence of a stabilizing feedback control law, we introduce concepts such as stabilizability and exact observability, whose counterparts for stochastic Itô systems were defined in [8]. By exact observability, we are able to discuss the infinite horizon stochastic LQ problem as well as the properties of the related generalized algebraic Riccati equation (GARE). It is worth pointing out that, similarly to the continuous-time setting [9], stabilizability and exact observability play an important role in other problems, such as stochastic time-delay difference \(H_{2}/H_{\infty}\) control.

For stochastic time-delay difference systems, we concentrate our attention on infinite horizon linear quadratic optimal control. This paper is organized as follows. In Section 2, by means of a Lyapunov equation and the H-representation technique, we give an equivalent condition for the stabilizability of stochastic time-delay systems. We also introduce the definition of exact observability of time-delay systems and, under exact observability, give an equivalent condition for asymptotic mean square stability. In Section 3, under the assumptions of stabilizability and exact observability, we prove that the optimal control law and the optimal value exist for stochastic time-delay difference systems.

To avoid confusion, we fix the following traditional notation. \(R^{n\times n}\): the set of all \(n\times n\) real matrices; \(S^{n}\): the set of all \(n\times n\) real symmetric matrices; \(N=\{0,1,2,\ldots\}\); \(A'\) (\(\operatorname{Ker}(A)\)): the transpose (kernel space) of a matrix A; \(A\geq0\) (\(A>0\)): A is a positive semidefinite (positive definite) symmetric matrix; I: identity matrix; \(\sigma(L)\): spectral set of the operator or matrix L; \(D(0,\alpha)=\{\lambda\mid\|\lambda\|<\alpha\}\); \(\|\cdot\|\): the \(l_{2}\)-norm; \(L_{\mathcal{F}_{t}}^{2}(R^{+},R^{n_{v}})\): space of nonanticipative stochastic processes \(x(t)\in R^{n_{v}}\) with respect to an increasing σ-algebra \(\{\mathcal{F}_{t}\}_{t\geq0}\) satisfying \(E\|x(t)\|^{2}<\infty\). Finally, we assume throughout this paper that all systems have real coefficients.

2 Stabilizability and exact observability

In this section, we introduce a general Lyapunov operator and the notion of exact observability of stochastic time-delay difference systems. By means of the spectrum of the general Lyapunov operator, we present a PBH-type criterion for exact observability of stochastic time-delay systems. By the Lyapunov functional approach and the H-representation of [10], some necessary and sufficient conditions for asymptotic mean square stabilizability of stochastic time-delay systems are given.

Consider the initial-value problem for the following linear difference time-delay system:

$$ \left \{ \begin{array}{@{}l} x(t+1)=F_{0}x(t)+M_{0}u(t)+(G_{0}x(t)+N_{0}u(t))w(t)+\sum_{j=1}^{m}[F_{j}x(t-j)\\ \hphantom{x(t+1)=}{}+M_{j}u(t-j)+(G_{j}x(t-j)+N_{j}u(t-j))w(t)], \\ x(\theta) = \varphi(\theta)\in R^{n},\quad \theta=0,-1,\ldots,-m, t\in N, \end{array} \right . $$
(2.1)

where \(x\in R^{n}\) is a column vector, \(F_{j}, G_{j},M_{j},N_{j}\in R^{n\times n}\), \(j=0,1,\ldots,m\), are constant coefficient matrices, \(u(t)\in R^{n}\) is the control input, and \(\{w(t) \in R, t \in N\}\) is a sequence of real random variables defined on a complete probability space \(\{\Omega, \mathcal{F},\mathcal{F}_{t},\mu\}\) that forms a wide-sense stationary, second-order process with \(E(w(t))=0\) and \(E(w(t)w(s))=\delta_{s,t}\), where \(\delta_{s,t}\) is the Kronecker delta and \(\mathcal{F}_{t}=\sigma\{w(s): 0\leq s \leq t\}\). The initial values \(\varphi(\theta)\in R^{n}\), \(\theta=0,-1,\ldots,-m\), are deterministic column vectors.
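
Although not needed for the theory, the dynamics (2.1) are easy to simulate. The sketch below is illustrative only: it assumes numpy, takes \(\{w(t)\}\) to be i.i.d. standard normal (a particular case of the wide-sense stationary noise above), uses the convention \(u(\theta)=K\varphi(\theta)\) for \(\theta<0\), and all names are ours rather than the paper's.

```python
import numpy as np

def simulate(F, G, M, N, K, phi, T, rng=None):
    """Simulate one sample path of system (2.1) under u(t) = K x(t).

    F, G, M, N: lists [A_0, ..., A_m] of n x n coefficient matrices.
    phi: array of shape (m+1, n) holding x(0), x(-1), ..., x(-m).
    """
    rng = np.random.default_rng() if rng is None else rng
    m, n = len(F) - 1, phi.shape[1]
    x_hist = [phi[j].copy() for j in range(m + 1)]   # x_hist[j] = x(t-j)
    u_hist = [K @ x for x in x_hist]                 # convention u(theta) = K phi(theta)
    path = [x_hist[0].copy()]
    for _ in range(T):
        w = rng.standard_normal()                    # E w = 0, E w^2 = 1
        x_next = np.zeros(n)
        for j in range(m + 1):
            x_next += (F[j] @ x_hist[j] + M[j] @ u_hist[j]
                       + (G[j] @ x_hist[j] + N[j] @ u_hist[j]) * w)
        x_hist = [x_next] + x_hist[:-1]              # shift delayed states
        u_hist = [K @ x_next] + u_hist[:-1]          # shift delayed controls
        path.append(x_next.copy())
    return np.array(path)
```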

Definition 2.1

The trivial stationary solution \(x=0\) of system (2.1) is called mean square stabilizable if there exists a state feedback gain K such that, for any arbitrarily small number \(\varepsilon>0\), one can find a number \(\delta>0\) such that \(\sup_{\theta\in[-m,0]}\|\varphi(\theta)\|<\delta\) implies

$$E\| x(t)\|^{2}< \varepsilon,\quad t\in N, $$

for the solution \(x(t)=x(t,\varphi(\theta))\) satisfying the initial values \(x(\theta)=\varphi(\theta)\in R^{n}\), \(\theta=0,-1,\ldots,-m\).

Definition 2.2

The trivial stationary solution \(x=0\) of system (2.1) is called asymptotically mean square stabilizable if it is mean square stabilizable in the sense of Definition 2.1 and

$$\lim_{t\rightarrow+\infty}E\| x(t)\|^{2}=0. $$

For a state feedback control law \(u(t)=Kx(t)\), we introduce a linear operator \(\mathcal{L}_{\overline {K}}\) associated with the closed-loop system

$$ \left \{ \begin{array}{@{}l} x(t+1)=(F_{0}+M_{0}K)x(t)+(G_{0}+N_{0}K)x(t)w(t)+\sum_{j=1}^{m}[(F_{j}x(t-j)\\ \hphantom{x(t+1)=}{}+M_{j}Kx(t-j))+(G_{j}x(t-j)+N_{j}Kx(t-j))w(t)],\\ x(\theta) = \varphi(\theta)\in R^{n}, \quad \theta=0,-1,\ldots,-m, t\in N. \end{array} \right . $$
(2.2)

Let \(\overline{x}(t)=[x'(t),x'(t-1),\ldots,x'(t-m)]'\) and \(\overline {u}(t)=[u'(t),u'(t-1),\ldots,u'(t-m)]'\). System (2.1) can now be written in the form of an equivalent stochastic system of dimension \(n(m + 1)\), namely,

$$ \overline{x}(t+1)=\overline{F}\overline{x}(t)+\overline{M}\overline {u}(t)+ \bigl(\overline{G}\overline{x}(t)+\overline{N}\overline{u}(t) \bigr)\omega (t), $$
(2.3)

where

$$\begin{aligned}& \overline{F}= \begin{pmatrix} F_{0}& F_{1}&\cdots&F_{m-1}&F_{m}\\ I &0&\cdots&0 &0\\ \vdots& \vdots&\cdots&\vdots&\vdots\\ 0 &0&\cdots&I &0 \end{pmatrix},\qquad \overline{G}= \begin{pmatrix} G_{0}& G_{1}&\cdots&G_{m-1}&G_{m}\\ 0 &0&\cdots&0 &0\\ \vdots& \vdots&\cdots&\vdots&\vdots\\ 0 &0&\cdots&0 &0 \end{pmatrix}, \\& \overline{M}= \begin{pmatrix} M_{0}&M_{1}&\cdots&M_{m-1}&M_{m}\\ 0 &0&\cdots&0 &0\\ \vdots& \vdots&\cdots&\vdots&\vdots\\ 0 &0&\cdots&0 &0 \end{pmatrix},\qquad \overline{N}= \begin{pmatrix} N_{0}& N_{1}&\cdots&N_{m-1}&N_{m}\\ 0&0&\cdots&0 &0\\ \vdots& \vdots&\cdots&\vdots&\vdots\\ 0 &0&\cdots&0 &0 \end{pmatrix}. \end{aligned}$$
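
The augmentation step above is purely mechanical; the following sketch (assuming numpy, with the lists Fs, Gs, Ms, Ns holding \(F_{0},\ldots,F_{m}\), etc.) builds \(\overline{F}\), \(\overline{G}\), \(\overline{M}\), \(\overline{N}\) exactly as displayed. Names are ours.

```python
import numpy as np

def augment(Fs, Gs, Ms, Ns):
    """Build the n(m+1) x n(m+1) matrices Fbar, Gbar, Mbar, Nbar of (2.3)
    from the coefficient lists Fs = [F_0, ..., F_m], etc., of (2.1)."""
    m, n = len(Fs) - 1, Fs[0].shape[0]
    d = n * (m + 1)
    Fbar, Gbar, Mbar, Nbar = (np.zeros((d, d)) for _ in range(4))
    for j in range(m + 1):                        # first block row
        cols = slice(j * n, (j + 1) * n)
        Fbar[:n, cols], Gbar[:n, cols] = Fs[j], Gs[j]
        Mbar[:n, cols], Nbar[:n, cols] = Ms[j], Ns[j]
    for j in range(m):                            # shift identities in Fbar
        Fbar[(j + 1) * n:(j + 2) * n, j * n:(j + 1) * n] = np.eye(n)
    return Fbar, Gbar, Mbar, Nbar
```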

Take a control input \(\overline{u}(t)=\overline{K}\overline{x}(t)\) with

$$\overline{K}= \begin{pmatrix} K&0&\cdots&0&0\\ 0 &K&\cdots&0 &0\\ \vdots& \vdots&\cdots&\vdots&\vdots\\ 0 &0&\cdots&0 &K \end{pmatrix}, $$

and let \(X(t)=E\overline{x}(t)\overline{x}'(t)\); then system (2.2) can be written in the following form:

$$ X(t+1)=(\overline{F}+\overline{M}\overline{K})X(t) (\overline {F}+\overline{M} \overline{K})'+(\overline{G}+\overline{N}\overline {K})X(t) ( \overline{G}+\overline{N}\overline{K})'. $$
(2.4)

Now we introduce an operator

$$\mathcal{L}_{\overline{K}}:X\in S^{n(m+1)}\rightarrow(\overline {F}+ \overline{M}\overline{K})X(\overline{F}+\overline{M}\overline{K})' +( \overline{G}+\overline{N}\overline{K})X(\overline{G}+\overline {N} \overline{K})'\in S^{n(m+1)}. $$

With the Kronecker matrix product, (2.4) can be rewritten in the following form:

$$ \overrightarrow{X}(t+1)=\widehat{A}\overrightarrow{X}(t), $$
(2.5)

where \(\overrightarrow{X}(t)\) denotes the \(n^{2}(m+1)^{2}\)-dimensional column vector

$$\overrightarrow{X}(t)= \bigl[X_{1,1}(t),\ldots,X_{1,n}(t), \ldots ,X_{1,n(m+1)}(t),\ldots,X_{n(m+1),n(m+1)}(t) \bigr]' $$

and \(\widehat{A}\in R^{n^{2}(m + 1)^{2}\times n^{2}(m+1)^{2}}\) has the form \(\widehat{A}=(\overline{F}+\overline{M}\overline{K})\otimes (\overline{F}+\overline{M}\overline{K})+(\overline{G}+\overline {N}\overline{K})\otimes(\overline{G}+\overline{N}\overline{K})\).
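
Since (2.5) is a deterministic linear recursion, mean square stability of the closed loop can be tested numerically by checking that the spectral radius of \(\widehat{A}\) is less than one. A minimal sketch, assuming numpy and the block matrices constructed as above:

```python
import numpy as np

def mean_square_stable(Fbar, Gbar, Mbar, Nbar, Kbar):
    """Test asymptotic stability of the moment recursion (2.5): spectral
    radius of A_hat = Fc kron Fc + Gc kron Gc below one, where
    Fc = Fbar + Mbar Kbar and Gc = Gbar + Nbar Kbar."""
    Fc = Fbar + Mbar @ Kbar
    Gc = Gbar + Nbar @ Kbar
    A_hat = np.kron(Fc, Fc) + np.kron(Gc, Gc)
    return np.max(np.abs(np.linalg.eigvals(A_hat))) < 1.0
```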

Lemma 2.1

Let \(H_{n,m}\) be an \(n^{2}(m+1)^{2}\times\frac{n(m+1)[n(m+1)+1]}{2}\) matrix with

$$\operatorname{rank}(H_{n,m})=\frac {n(m+1)[n(m+1)+1]}{2}. $$

Then \(H'_{n,m}H_{n,m}\) is invertible.

Theorem 2.1

The trivial solution \(\overrightarrow{X}(t)=0\) of system (2.5) is asymptotically stable if and only if, for any \(Q>0\), there exists a positive definite matrix \(P\in S^{\frac{n(m+1)[n(m+1)+1]}{2}}\) such that P is a solution of the following Lyapunov equation:

$$P-\theta(H_{n,m})'P\theta(H_{n,m})=Q, $$

where \(\theta(H_{n,m})=[H'_{n,m}H_{n,m}]^{-1}H'_{n,m}[(\overline {F}+\overline{M}\overline{K})\otimes(\overline{F}+\overline {M}\overline{K})+(\overline{G}+\overline{N}\overline{K})\otimes (\overline{G}+\overline{N}\overline{K})]H_{n,m}\).

Proof

If we set \(X(t)=E\overline{x}(t)\overline{x}'(t)\), \(X(t)\) satisfies

$$ \left \{ \begin{array}{@{}l} X(t+1)=(\overline{F}+\overline{M}\overline{K})X(t)(\overline {F}+\overline{M}\overline{K})'+(\overline{G} +\overline{N}\overline{K})X(t)(\overline{G}+\overline{N}\overline {K})',\\ X(0)=\overline{x}(0)\overline{x}'(0) \in R^{n(m+1)\times n(m+1)},\quad t\in N. \end{array} \right . $$
(2.6)

Since the matrix \(X(\cdot)\) is real symmetric, (2.6) is a linear matrix equation with \(\frac{n(m+1)[n(m+1)+1]}{2}\) different variables, i.e., it is in fact an \(\frac{n(m+1)[n(m+1)+1]}{2}\)th-order linear system. We define a map \(\widetilde{\mathcal{L}}\) from \(S^{n(m+1)}\) to \(R^{\frac{n(m+1)[n(m+1)+1]}{2}}\) as follows:

  • for any \(Y=(Y_{ij})_{n(m+1)\times n(m+1)} \in S^{n(m+1)}\), set

    $$\widetilde{Y} =\widetilde{\mathcal{L}}(Y)=(Y_{11},\ldots ,Y_{1n(m+1)},\ldots, Y_{n(m+1)-1,n(m+1)-1}, Y_{n(m+1)-1,n(m+1)}, Y_{n(m+1)n(m+1)})'. $$

Then, by Lemma 2.1 and the H-representation of [10], there exists a unique matrix \(\theta(H_{n,m})\in R^{\frac {n(m+1)[n(m+1)+1]}{2}\times\frac{n(m+1)[n(m+1)+1]}{2}}\) such that (2.6) is equivalent to

$$ \left \{ \begin{array}{@{}l} \widetilde{X}(t+1)=\widetilde{\mathcal{L}}(\mathcal {L}_{\overline{K}}(X))=\theta(H_{n,m})\widetilde{X}(t),\\ \widetilde{X}(0)=[H'_{n,m}H_{n,m}]^{-1}H'_{n,m}\overrightarrow{X}(0), \end{array} \right . $$
(2.7)

where \(\theta(H_{n,m})=[H'_{n,m}H_{n,m}]^{-1}H'_{n,m}[(\overline {F}+\overline{M}\overline{K})\otimes(\overline{F}+\overline {M}\overline{K})+(\overline{G}+\overline{N}\overline{K})\otimes (\overline{G}+\overline{N}\overline{K})]H_{n,m}\) and \(\widetilde{X}(t)\in R^{\frac{n(m+1)[n(m+1)+1]}{2}}\), since \(X(t)=E\overline{x}(t)\overline {x}'(t)\) is real symmetric and positive semidefinite. Since system (2.7) is deterministic, the statement of the theorem can be established in the way that is standard for the method of Lyapunov functions for deterministic difference equations, namely, by considering a Lyapunov function of the \(\frac{n(m+1)[n(m+1)+1]}{2}\)-dimensional vector \(\widetilde{X}\) in the quadratic form

$$ V \bigl(\widetilde{X}(t) \bigr)=\widetilde{X}' P \widetilde{X}, \quad P\in S^{\frac {n(m+1)[n(m+1)+1]}{2}}. $$
(2.8)

The role of the parameters is played by the \(\frac {n(m+1)[n(m+1)+1][n(m+1)[n(m+1)+1]+2]}{8}\) distinct elements of the positive definite matrix P, which are to be determined. □
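
For readers who wish to compute \(\theta(H_{n,m})\), the sketch below assumes that \(H_{n,m}\) is the duplication-type matrix of [10], i.e., its columns are the vectorizations of the elementary symmetric matrices, so that \(\operatorname{vec}(X)=H_{n,m}\widetilde{X}\) for every symmetric X; this choice satisfies the rank condition of Lemma 2.1. The names and the exact ordering of \(\widetilde{X}\) are our assumptions, not prescribed by the paper.

```python
import numpy as np

def h_matrix(d):
    """Duplication-type matrix H with vec(X) = H xtilde for symmetric d x d X,
    where xtilde lists the upper-triangular entries row by row (our assumed
    realization of H_{n,m}, with d = n(m+1))."""
    cols = []
    for i in range(d):
        for j in range(i, d):
            E = np.zeros((d, d))
            E[i, j] = E[j, i] = 1.0
            cols.append(E.reshape(-1, order="F"))
    return np.column_stack(cols)              # d^2 x d(d+1)/2, full column rank

def theta_of_h(H, Fc, Gc):
    """theta(H_{n,m}) = (H'H)^{-1} H' [Fc kron Fc + Gc kron Gc] H,
    with Fc = Fbar + Mbar Kbar and Gc = Gbar + Nbar Kbar (Theorem 2.1)."""
    A_hat = np.kron(Fc, Fc) + np.kron(Gc, Gc)
    return np.linalg.solve(H.T @ H, H.T @ A_hat @ H)   # H'H invertible by Lemma 2.1
```

One can then test the condition of Theorem 2.1 numerically by solving \(P-\theta(H_{n,m})'P\theta(H_{n,m})=Q\) for some \(Q>0\), for instance with scipy.linalg.solve_discrete_lyapunov applied to \(\theta(H_{n,m})'\), and checking that the solution is positive definite.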

From the proof of Theorem 2.1, we easily get the following result.

Corollary 2.1

Asymptotic mean square stabilizability of the trivial solution \(x(t)=0\) of system (2.1) is equivalent to each of the following statements:

  1. (1)

    The trivial solution \(\overline{x}(t)=0\) of system (2.3) is asymptotically mean square stabilizable.

  2. (2)

    The trivial solution \(\overrightarrow{X}(t)=0\) of system (2.5) is asymptotically stabilizable.

Similar to Definition 5 of [8], we define ‘exact observability’ for stochastic time-delay difference systems as follows, which will be used in Section 3.

Definition 2.3

Consider the following linear difference system:

$$ \left \{ \begin{array}{@{}l} x(t+1)=F_{0}x(t)+G_{0}x(t)w(t) +\sum_{j=1}^{m}[F_{j}x(t-j)+G_{j}x(t-j)w(t)],\\ y(t)=C(x'(t),x'(t-1),\ldots,x'(t-m))', \quad t\in N. \end{array} \right . $$
(2.9)

We call (2.9) or \((\sum_{j=0}^{m}F_{j},\sum_{j=0}^{m}G_{j}\mid C)\) exactly observable, if \(y(t)=0\), a.s., \(t\in N \Rightarrow\overline{x}(0)=0\).

The following lemma extends Theorem 6 of [8] to a time-delay version by the H-representation approach in [10].

Lemma 2.2

(PBH Criterion)

\((\sum_{j=0}^{m}F_{j},\sum_{j=0}^{m}G_{j}\mid C)\) is exactly observable if and only if there does not exist \(0\neq Z\in S^{n(m+1)}\) such that

$$ \overline{F}'Z\overline{F}+\overline{G}'Z\overline{G}= \lambda Z, \qquad CZ=0. $$
(2.10)

Proof

If we set \(X(t)=E\overline{x}(t)\overline{x}'(t)\), \(X(t)\) satisfies the following difference equation:

$$ \left \{ \begin{array}{@{}l} X(t+1)=\overline{F}X(t)\overline{F}'+\overline{G}X(t)\overline{G}',\quad t\in N,\\ X(0)=\overline{x}(0)\overline{x}'(0) \in R^{n(m+1)\times n(m+1)}. \end{array} \right . $$
(2.11)

Since \(X(\cdot)\) is real symmetric, (2.11) is a linear matrix equation with \(\frac{n(m+1)[n(m+1)+1]}{2}\) different variables, i.e., it is in fact an \(\frac{n(m+1)[n(m+1)+1]}{2}\)th-order linear system. On the other hand, from Definition 2.3, \((\sum_{j=0}^{m}F_{j},\sum_{j=0}^{m}G_{j}\mid C)\) is exactly observable if and only if for any arbitrary \(X(0)\neq0\), there exists a \(k_{0}\in N\) such that

$$ Y(k_{0})=E \bigl[y(k_{0})y'(k_{0}) \bigr]=CX(k_{0})C'\neq0. $$
(2.12)

In addition, since \(X(k)\geq0\) for any \(k\in N\), (2.12) is equivalent to

$$ 0\neq Y^{*}(k_{0})=CX(k_{0}), $$
(2.13)

which is equivalent to

$$0\neq\widetilde{Y}^{*}(k_{0})=(I\otimes C)H_{n,m} \widetilde {X}(k_{0}). $$

So (2.9) is exactly observable if and only if the deterministic system

$$ \left \{ \begin{array}{l}\widetilde{X}(t+1) =\widetilde{\mathcal{L}}(\mathcal {L}_{\overline{F},\overline{G}}(X)) =\Theta_{2}(H_{n,m})\widetilde{X}(t),\\ \widetilde{Y}(t)= (I\otimes C)H_{n,m}\widetilde{X}(t), \quad t\in N, \end{array} \right . $$
(2.14)

is completely observable, where

$$\Theta_{2}(H_{n,m})= \bigl[H'_{n,m}H_{n,m} \bigr]^{-1}H'_{n,m}(\overline {F}\otimes \overline{F}+\overline{G}\otimes\overline{G})H_{n,m} $$

and

$$\mathcal{L}_{\overline{F},\overline{G}}:X\in S^{n(m+1)}\rightarrow \overline{F}X \overline{F}' +\overline{G}X\overline{G}'\in S^{n(m+1)}. $$

By the PBH criterion for complete observability, (2.14) is completely observable if and only if there does not exist a nonzero vector \(\xi\in R^{\frac{n(m+1)[n(m+1)+1]}{2}}\) such that

$$ \Theta_{2}(H_{n,m}) \xi=\lambda \xi, \quad (I\otimes C)H_{n,m}\xi=0. $$
(2.15)

Obviously, (2.15) is equivalent to the nonexistence of \(0\neq Z\in S^{n(m+1)}\) satisfying (2.10). □
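
Because the proof reduces exact observability of (2.9) to complete observability of the deterministic pair in (2.14), it can be tested numerically by the ordinary rank condition on the observability matrix. A rough sketch, under the same duplication-type choice of \(H_{n,m}\) as above and assuming C has \(n(m+1)\) columns; names are ours.

```python
import numpy as np

def exactly_observable(Fbar, Gbar, C, H):
    """Rank test for exact observability via the deterministic system (2.14):
    A  = Theta_2(H) = (H'H)^{-1} H' (Fbar kron Fbar + Gbar kron Gbar) H,
    Cd = (I kron C) H; observable iff the observability matrix has full rank."""
    d = Fbar.shape[0]                                    # d = n(m+1)
    A = np.linalg.solve(H.T @ H,
                        H.T @ (np.kron(Fbar, Fbar) + np.kron(Gbar, Gbar)) @ H)
    Cd = np.kron(np.eye(d), C) @ H
    blocks, D = [Cd], A.shape[0]
    for _ in range(D - 1):
        blocks.append(blocks[-1] @ A)
    return np.linalg.matrix_rank(np.vstack(blocks)) == D
```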

Theorem 2.2

A nonsingular state transformation does not change the exact observability of the original system.

Proof

Assume \((\sum_{j=0}^{m}F_{j},\sum_{j=0}^{m}G_{j}\mid C)\) is exactly observable. Choose an arbitrary nonsingular matrix T and let \(x(t)=T\xi(t)\); the transformed system then takes the following form:

$$ \left \{ \begin{array}{@{}l} \xi(t+1)=T^{-1}F_{0}T\xi(t)+T^{-1}G_{0}T\xi(t)w(t) \\ \hphantom{\xi(t+1)=}{}+\sum_{j=1}^{m}[T^{-1}F_{j}T\xi(t-j)+T^{-1}G_{j}T\xi(t-j)w(t)],\\ y(t)=C\overline{T}(\xi'(t),\xi'(t-1),\ldots,\xi'(t-m))',\quad t\in N. \end{array} \right . $$
(2.16)

Here

$$\overline{T}= \begin{pmatrix} T& 0&\cdots&0&0\\ 0 &T&\cdots&0 &0\\ \vdots& \vdots&\cdots&\vdots&\vdots\\ 0 &0&\cdots&0 &T \end{pmatrix}. $$

If we let

$$\overline{F}= \begin{pmatrix} F_{0}& F_{1}&\cdots&F_{m-1}&F_{m}\\ I &0&\cdots&0 &0\\ \vdots& \vdots&\cdots&\vdots&\vdots\\ 0 &0&\cdots&I &0 \end{pmatrix},\qquad \overline{G}= \begin{pmatrix} G_{0}& G_{1}&\cdots&G_{m-1}&G_{m}\\ 0 &0&\cdots&0 &0\\ \vdots& \vdots&\cdots&\vdots&\vdots\\ 0 &0&\cdots&0 &0 \end{pmatrix}, $$

then system (2.16) becomes

$$ \left \{ \begin{array}{@{}l} \overline{\xi}(t+1)=\overline{T}^{-1}\overline {F}\overline{T}\overline{\xi}(t)+\overline{T}^{-1}\overline{G}\overline {T}\overline{\xi}(t)w(t),\\ y(t)=C\overline{T}(\xi'(t),\xi'(t-1),\ldots,\xi'(t-m))', \quad t\in N. \end{array} \right . $$
(2.17)

We shall show that (2.17) is also exactly observable. Otherwise, by Lemma 2.2, there exists \(0\neq Z\in S^{n(m+1)}\) such that

$$ \bigl(\overline{T}^{-1}\overline{F}\overline{T} \bigr)Z \bigl( \overline{T}^{-1}\overline {F}\overline{T} \bigr)'+ \bigl( \overline{T}^{-1} \overline{G}\overline{T} \bigr)Z \bigl( \overline{T}^{-1}\overline{G}\overline {T} \bigr)'=\lambda Z,\qquad C\overline{T}Z=0. $$
(2.18)

Pre- and post-multiplying (2.18) by \(\overline{T}\) and \(\overline{T}'\), respectively, yields

$$ \overline{F}\overline{T}Z\overline{T}'\overline{F}'+ \overline{G}\overline{T}Z\overline{T}'\overline{G}'= \lambda \overline{T}Z\overline{T}',\qquad C\overline{T}Z\overline{T}'=0. $$
(2.19)

If we set \(Z_{1}=\overline{T}Z\overline{T}'\neq0\), then from Lemma 2.2 we see that (2.19) contradicts the exact observability of \((\sum_{j=0}^{m}F_{j},\sum_{j=0}^{m}G_{j}\mid C)\). □

Lemma 2.3

If \((\sum_{j=0}^{m}F_{j},\sum_{j=0}^{m}G_{j}\mid C)\) is exactly observable, then \((\sum_{j=0}^{m}F_{j},\sum_{j=0}^{m}G_{j})\) is asymptotically mean square stable if and only if the Lyapunov-type equation

$$ -P+\overline{F}'P\overline{F}+\overline{G}'P \overline{G}+C'C =0 $$
(2.20)

has a solution \(P>0\).

Proof

Necessity part. If \((\sum_{j=0}^{m}F_{j},\sum_{j=0}^{m}G_{j})\) is asymptotically mean square stable, from the method of Lyapunov functions for linear stochastic difference equations, (2.20) has a unique solution \(P\geq0\). Now we show \(P>0\). Otherwise, there exists \(\overline{x}_{0}\neq0\) such that \(P\overline{x}_{0}=0\). We obtain, for \(T\in N\),

$$\begin{aligned} 0\leq{}&\sum^{T}_{t=0}E\bigl\| y(t)\bigr\| ^{2}=\sum^{T}_{t=0}E \bigl\| C\overline {x}(t)\bigr\| ^{2} \\ ={}&\sum^{T}_{t=0}E \bigl[ \overline{x}'(t) \bigl(-P+\overline {F}'P\overline{F}+ \overline{G}'P \overline{G}+C'C \bigr)\overline{x}(t) \bigr] \\ &{}+\overline{x}'_{0}P\overline{x}_{0}-E \overline {x}'(T+1)P\overline{x}(T+1) \\ ={}&{-}E\overline{x}'(T+1)P\overline{x}(T+1)\leq0, \end{aligned}$$

from which it follows that \(y(t)=C(x'(t),x'(t-1),\ldots ,x'(t-m))'=C\overline{x}(t)=0\), a.s., for \(t=0,1,\ldots,T\). Since \(T\in N\) is arbitrary, together with the exact observability of \((\sum_{j=0}^{m}F_{j},\sum_{j=0}^{m}G_{j}\mid C)\), we obtain \(\overline{x}_{0}=0\), which contradicts \(\overline{x}_{0}\neq 0\). So \(P>0\).

Sufficiency part. Assume \(P>0\) is a solution to (2.20). Let \(V(\overline{x}(t))= E\overline{x}'(t)P\overline{x}(t)\), then we have

$$V \bigl(\overline{x}(t) \bigr)=\overline{x}'(0)P\overline{x}(0)- \sum^{t-1}_{j=0}E\bigl\| C\overline{x}(j) \bigr\| ^{2}, $$

which indicates that \(V(\overline{x}(t))\) is monotonically decreasing and bounded from below with respect to t, so \(\lim_{t\rightarrow+\infty} V(\overline{x}(t))\) exists. The rest of the proof proceeds along the lines of Theorem 6 of [8] and is omitted. □
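
Equation (2.20) is linear in P, so whenever the inverse below exists (in particular, under asymptotic mean square stability) it can be solved directly by vectorization, using \(\operatorname{vec}(\overline{F}'P\overline{F})=(\overline{F}'\otimes\overline{F}')\operatorname{vec}(P)\). A minimal numpy sketch; names are ours.

```python
import numpy as np

def solve_lyapunov(Fbar, Gbar, C):
    """Solve -P + Fbar' P Fbar + Gbar' P Gbar + C'C = 0 (equation (2.20))
    by vectorization: (I - Fbar' kron Fbar' - Gbar' kron Gbar') vec(P) = vec(C'C)."""
    d = Fbar.shape[0]
    lhs = np.eye(d * d) - np.kron(Fbar.T, Fbar.T) - np.kron(Gbar.T, Gbar.T)
    vecP = np.linalg.solve(lhs, (C.T @ C).reshape(-1, order="F"))
    P = vecP.reshape(d, d, order="F")
    return 0.5 * (P + P.T)                     # symmetrize against round-off
```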

Corollary 2.2

If \((\sum_{j=0}^{m}F_{j},\sum_{j=0}^{m}G_{j}\mid C)\) is exactly observable, then the Lyapunov-type equation (2.20) has at most one positive definite solution.

Proof

If (2.20) has a positive semidefinite solution \(P_{1}\geq0\), from the proof of Lemma 2.3, we find that under the condition of exact observability, \(P_{1}>0\). So \((\sum_{j=0}^{m}F_{j},\sum_{j=0}^{m}G_{j}\mid C)\) is asymptotically mean square stable. By Lemma 2 in [11], (2.20) admits a unique positive semidefinite solution \(P_{1}\). □

3 Stochastic time-delay LQ control

In this section, under the assumptions of stabilization and exact observability, we investigate the problem of the existence of the optimal control law and optimal value of stochastic time-delay difference systems.

Consider the following linear stochastic system with time-delays:

$$ \left \{ \begin{array}{@{}l} x(t+1)=F_{0}x(t)+M_{0}u(t)+(G_{0}x(t)+N_{0}u(t))w(t)+\sum_{j=1}^{m}[F_{j}x(t-j)\\ \hphantom{x(t+1)=}{}+M_{j}u(t-j)+(G_{j}x(t-j)+N_{j}u(t-j))w(t)], \quad t\in N,\\ y(t)=C(x'(t),x'(t-1),\ldots,x'(t-m))', \qquad x(k)=\varphi(k)\in R^{n},\quad k\in[-m, 0]. \end{array} \right . $$
(3.1)

For the linear stochastic time-delay controlled system (3.1), we define the admissible control input set

$$u_{ad}= \bigl\{ u(t)\in l^{2}_{w} \bigl(N,R^{l} \bigr):u(t)\mbox{ is an asymptotically mean square stabilizing control} \bigr\} $$

with the associated cost

$$ J(\overline{x}_{0},\overline{u})=\sum_{t=0}^{\infty}E \bigl[\overline {x}'(t)Q\overline{x}(t)+\overline{u}'(t)R \overline{u}(t) \bigr], $$
(3.2)

where \(Q\geq0\), \(R>0\). The LQ optimal control problem is to find a control \(\overline{u}^{*}\in u_{ad}\), called the optimal control, such that

$$J \bigl(\overline{x}_{0}, \overline{u}^{*} \bigr)=V( \overline{x}_{0})= \inf_{u\in u_{ad}}J(\overline{x}_{0}, \overline{u}). $$

We call the trajectory \(x(t)\) corresponding to \(\overline{u}^{*}(t)\) the optimal trajectory and \(V(\overline{x}_{0})\) the optimal cost value.
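
For a concrete stabilizing feedback \(\overline{u}(t)=\overline{K}\overline{x}(t)\), the cost (3.2) can also be estimated empirically by truncating the infinite sum at a horizon T and averaging over sample paths of (2.3). The sketch below is illustrative only; it assumes numpy and i.i.d. standard normal noise, and all names are ours.

```python
import numpy as np

def mc_cost(Fbar, Gbar, Mbar, Nbar, Kbar, Q, R, xbar0, T=2000, n_paths=200,
            rng=None):
    """Monte Carlo estimate of the cost (3.2) for the closed loop
    ubar(t) = Kbar xbar(t) in (2.3); the horizon T truncates the infinite
    sum, which is justified when the closed loop is mean square stabilizing."""
    rng = np.random.default_rng() if rng is None else rng
    total = 0.0
    for _ in range(n_paths):
        x = xbar0.astype(float).copy()
        for _ in range(T):
            u = Kbar @ x
            total += x @ Q @ x + u @ R @ u
            w = rng.standard_normal()
            x = Fbar @ x + Mbar @ u + (Gbar @ x + Nbar @ u) * w
    return total / n_paths
```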

Theorem 3.1

Assume that \((\sum_{j=0}^{m}F_{j},\sum_{j=0}^{m}M_{j},\sum_{j=0}^{m}G_{j},\sum_{j=0}^{m}N_{j})\) is asymptotically mean square stabilizable, and \((\sum_{j=0}^{m}F_{j},\sum_{j=0}^{m}G_{j}\mid Q^{\frac{1}{2}})\) is exactly observable. Then the GARE

$$ \left \{ \begin{array}{@{}l} P=\overline{F}'P\overline{F}+\overline{G}'P\overline {G}+Q\\ \hphantom{P=}{}-(\overline{F}'P\overline{M} +\overline{G}'P\overline{N})(R+\overline{M}'P\overline{M}+\overline {N}'P\overline{N})^{-1}(\overline{F}'P\overline{M}+\overline {G}'P\overline{N})',\\ R+\overline{M}'P\overline{M}+\overline{N}'P\overline{N}>0 \end{array} \right . $$
(3.3)

has a solution \(P>0\), which is the unique nonnegative definite solution of (3.3).

Proof

Let \(\overline{x}(t)=[x'(t),x'(t-1),\ldots ,x'(t-m)]'\), \(\overline{u}(t)=[u'(t),u'(t-1),\ldots,u'(t-m)]'\). \((\sum_{j=0}^{m}F_{j}, \sum_{j=0}^{m}M_{j},\sum_{j=0}^{m}G_{j},\sum_{j=0}^{m}N_{j})\) can be written in the form of an equivalent stochastic system of dimension \(n(m+1)\), namely,

$$\overline{x}(t+1)=\overline{F}\overline{x}(t)+\overline{M}\overline {u}(t)+ \bigl(\overline{G}\overline{x}(t)+\overline{N}\overline{u}(t) \bigr)\omega(t), $$

where

$$\begin{aligned}& \overline{F}= \begin{pmatrix} F_{0}& F_{1}&\cdots&F_{m-1}&F_{m}\\ I &0&\cdots&0 &0\\ \vdots& \vdots&\cdots&\vdots&\vdots\\ 0 &0&\cdots&I &0 \end{pmatrix},\qquad \overline{G}= \begin{pmatrix} G_{0}& G_{1}&\cdots&G_{m-1}&G_{m}\\ 0 &0&\cdots&0 &0\\ \vdots& \vdots&\cdots&\vdots&\vdots\\ 0 &0&\cdots&0 &0 \end{pmatrix}, \\& \overline{M}= \begin{pmatrix} M_{0}&M_{1}&\cdots&M_{m-1}&M_{m}\\ 0 &0&\cdots&0 &0\\ \vdots& \vdots&\cdots&\vdots&\vdots\\ 0 &0&\cdots&0 &0 \end{pmatrix},\qquad \overline{N}= \begin{pmatrix} N_{0}& N_{1}&\cdots&N_{m-1}&N_{m}\\ 0&0&\cdots&0 &0\\ \vdots& \vdots&\cdots&\vdots&\vdots\\ 0 &0&\cdots&0 &0 \end{pmatrix}. \end{aligned}$$

For a control input \(\overline{u}(t)=\overline{K}\overline{x}(t)\), \((\sum_{j=0}^{m}F_{j},\sum_{j=0}^{m}M_{j},\sum_{j=0}^{m}G_{j},\sum_{j=0}^{m}N_{j})\) becomes

$$\overline{x}(t+1)=(\overline{F}+\overline{M}\overline{K})\overline {x}(t)+( \overline{G}+\overline{N}\overline{K})\overline{x}(t)\omega(t), $$

where

$$\overline{K}= \begin{pmatrix} K&0&\cdots&0&0\\ 0 &K&\cdots&0 &0\\ \vdots& \vdots&\cdots&\vdots&\vdots\\ 0 &0&\cdots&0 &K \end{pmatrix}. $$

Since \((\sum_{j=0}^{m}F_{j},\sum_{j=0}^{m}M_{j},\sum_{j=0}^{m}G_{j},\sum_{j=0}^{m}N_{j})\) is asymptotically mean square stabilizable, (3.3) has a stabilizing solution \(P\geq0\). Since \((\sum_{j=0}^{m}F_{j},\sum_{j=0}^{m}G_{j}\mid Q^{\frac{1}{2}})\) is exactly observable, by Lemma 2.3, \(P>0\). By the uniqueness of the stabilizing solution of (3.3), we know that (3.3) has only one positive definite solution. □
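
Numerically, one way to obtain the stabilizing solution of GARE (3.3) is the fixed-point (value-iteration) scheme sketched below. This is not the method used later in the paper (which solves an SDP in the example of this section); convergence from \(P=0\) is assumed to hold under the hypotheses of Theorem 3.1 rather than proved here. Assumes numpy; names are ours.

```python
import numpy as np

def solve_gare(Fbar, Gbar, Mbar, Nbar, Q, R, tol=1e-10, max_iter=10000):
    """Fixed-point iteration for GARE (3.3), started from P = 0:
    P <- Fbar'P Fbar + Gbar'P Gbar + Q - S W^{-1} S',
    with S = Fbar'P Mbar + Gbar'P Nbar and W = R + Mbar'P Mbar + Nbar'P Nbar.
    Returns P and the feedback gain K = -W^{-1} S'."""
    P = np.zeros_like(Q, dtype=float)
    for _ in range(max_iter):
        S = Fbar.T @ P @ Mbar + Gbar.T @ P @ Nbar
        W = R + Mbar.T @ P @ Mbar + Nbar.T @ P @ Nbar
        P_next = (Fbar.T @ P @ Fbar + Gbar.T @ P @ Gbar + Q
                  - S @ np.linalg.solve(W, S.T))
        if np.max(np.abs(P_next - P)) < tol:
            P = P_next
            break
        P = P_next
    S = Fbar.T @ P @ Mbar + Gbar.T @ P @ Nbar
    W = R + Mbar.T @ P @ Mbar + Nbar.T @ P @ Nbar
    return P, -np.linalg.solve(W, S.T)
```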

Corollary 3.1

Assume that \((\sum_{j=0}^{m}F_{j},\sum_{j=0}^{m}G_{j}\mid Q^{\frac{1}{2}})\) is exactly observable. Then system \((\sum_{j=0}^{m}F_{j},\sum_{j=0}^{m}G_{j})\) is asymptotically mean square stable if and only if the Lyapunov equation

$$P=\overline{F}'P\overline{F}+\overline{G}'P \overline{G}+Q $$

has a solution \(P>0\).

Lemma 3.1

For system (2.3), any \(T \in N\), \(P \in S^{n(m+1)}\), and \(\overline{x}(0) \in R^{n(m+1)}\), we have

$$ \sum_{t=0}^{T}E \left[ \begin{pmatrix} \overline{x}(t)\\ \overline{u}(t) \end{pmatrix}'Q(P) \begin{pmatrix} \overline{x}(t)\\ \overline{u}(t) \end{pmatrix} \right]=E \bigl[ \overline{x}'(T+1)P\overline{x}(T+1) \bigr]-\overline {x}'(0)P\overline{x}(0), $$
(3.4)

where

$$Q(P)= \begin{pmatrix} -P+\overline{F}'P\overline{F}+\overline {G}'P\overline{G}& \overline{F}'P\overline{M}+\overline{G}'P\overline {N}\\ \overline{M}'P\overline{F}+\overline{N}'P\overline{G}& \overline {M}'P\overline{M}+ \overline{N}'P\overline{N} \end{pmatrix}. $$

Proof

It can easily be derived from the identity

$$E \bigl[\overline{x}'(T+1)P\overline{x}(T+1) \bigr]- \overline{x}'(0)P\overline {x}(0)=\sum_{t=0}^{T}E \bigl[\overline{x}'(t+1)P\overline{x}(t+1)-\overline {x}'(t)P\overline{x}(t) \bigr] $$

and the fact that \(\overline{F}\overline{x}(t)+\overline {M}\overline{u}(t)\) and \(\overline{G}\overline{x}(t)+\overline{N}\overline{u}(t)\) are independent of \(w(t)\). □

Theorem 3.2

Assume that \((\sum_{j=0}^{m}F_{j},\sum_{j=0}^{m}M_{j},\sum_{j=0}^{m}G_{j},\sum_{j=0}^{m}N_{j})\) is asymptotically mean square stabilizable, and \((\sum_{j=0}^{m}F_{j},\sum_{j=0}^{m}G_{j}\mid Q^{\frac{1}{2}})\) is exactly observable. Then the optimal cost value is given by \(V(\overline{x}_{0})=\overline{x}'_{0}P_{1}\overline{x}_{0}\), where \(P_{1}>0\) is the unique stabilizing solution of (3.3), and the optimal control is uniquely determined by \(\overline{u}(t)=\overline{K}\overline{x}(t)\), where \(\overline {K}=-(R+\overline{M}'P_{1}\overline{M}+\overline{N}'P_{1}\overline {N})^{-1}(\overline{F}'P_{1}\overline{M}+\overline{G}'P_{1}\overline{N})'\).

Proof

Note that GARE (3.3) can be written as

$$ -P+(\overline{F}+\overline{M}\overline{K})'P(\overline{F}+\overline {M}\overline{K})+(\overline{G}+\overline{N}\overline{K})'P( \overline{G}+\overline{N}\overline{K})+Q+\overline{K}'R\overline{K}=0. $$
(3.5)

From Theorem 3.1, (3.3), and hence (3.5), has a solution \(P_{1}>0\); by Lemma 2.3 applied to the closed-loop system, \(\lim_{t\rightarrow+\infty}E\|\overline{x}(t)\|^{2}=0\) when \(\overline{u}(t)=\overline{K}\overline{x}(t)\) with \(\overline{K}=-(R+\overline{M}'P_{1}\overline{M}+\overline {N}'P_{1}\overline{N})^{-1}(\overline{F}'P_{1}\overline{M}+\overline {G}'P_{1}\overline{N})'\).

From Lemma 3.1,

$$\begin{aligned} &\sum_{t=0}^{\infty} E \bigl[ \overline{x}'(t)Q\overline {x}(t)+\overline{u}'(t)R \overline{u}(t) \bigr] \\ &\quad=\lim_{T\rightarrow\infty} \Biggl\{ \overline{x}'(0)P_{1}\overline{x}(0)-E \bigl[\overline{x}'(T+1)P_{1} \overline{x}(T+1) \bigr] +\sum_{t=0}^{T}E \left[ \begin{pmatrix} \overline{x}(t)\\ \overline{u}(t) \end{pmatrix}'H(P_{1}) \begin{pmatrix} \overline{x}(t)\\ \overline{u}(t) \end{pmatrix} \right] \Biggr\} \\ &\quad=\overline{x}'_{0}P_{1} \overline{x}_{0}-\lim_{T\rightarrow\infty}E \bigl[ \overline{x}'(T+1)P_{1}\overline{x}(T+1) \bigr] \\ &\qquad{}+\sum _{t=0}^{\infty}E \bigl[\overline{u}(t)- \overline{K}\overline {x}(t) \bigr]' \bigl(R+\overline{M}'P_{1}\overline{M}+\overline{N}'P_{1}\overline{N} \bigr) \bigl[\overline{u}(t)- \overline{K}\overline{x}(t) \bigr] +\sum_{t=0}^{\infty} E \bigl[ \overline{x}'(t)\widetilde {H}\overline{x}(t) \bigr], \end{aligned}$$

where

$$H(P_{1}) =Q(P_{1})+ \begin{pmatrix} Q& 0\\ 0 & R \end{pmatrix} = \begin{pmatrix} -P_{1}+\overline{F}'P_{1}\overline{F}+\overline {G}'P_{1}\overline{G}+Q& \overline{F}'P_{1}\overline{M}+\overline{G}'P_{1}\overline {N}\\ \overline{M}'P_{1}\overline{F}+\overline{N}'P_{1}\overline{G}& R+\overline {M}'P_{1}\overline{M}+ \overline{N}'P_{1}\overline{N} \end{pmatrix} $$

and, by GARE (3.3),

$$\widetilde{H}= -P_{1}+\overline{F}'P_{1} \overline{F}+\overline {G}'P_{1}\overline{G}+Q- \bigl(\overline{F}'P_{1}\overline{M}+\overline{G}'P_{1}\overline{N} \bigr) \bigl(R+\overline {M}'P_{1}\overline{M}+\overline{N}'P_{1}\overline{N} \bigr)^{-1} \bigl(\overline{F}'P_{1}\overline{M}+\overline{G}'P_{1}\overline{N} \bigr)'=0. $$

Moreover, since \(\overline{u}\in u_{ad}\) is mean square stabilizing, \(\lim_{T\rightarrow\infty}E [\overline{x}'(T+1)P_{1}\overline{x}(T+1) ]=0\).

Hence we have

$$\min_{u\in l^{2}_{w}(N,R^{n_{u}} )}\sum_{t=0}^{\infty} E \bigl[\overline{x}'(t)Q\overline{x}(t)+\overline{u}'(t)R \overline {u}(t) \bigr]=\overline{x}_{0}'P_{1} \overline{x}_{0}, $$

and the optimal control is uniquely determined by \(\overline{u}(t)=\overline{K}\overline{x}(t)\), where \(\overline {K}=-(R+\overline{M}'P_{1}\overline{M}+\overline{N}'P_{1}\overline {N})^{-1}(\overline{F}'P_{1}\overline{M}+\overline{G}'P_{1}\overline{N})'\). □

Example 3.1

Consider the following stochastic time-delay system:

$$\left \{ \begin{array}{@{}l} x(t+1)=a_{0}x(t)+m_{0}u(t)+a_{1}x(t-1)+m_{1}u(t-1)+gx(t)\omega (t)+nu(t)\omega(t),\\ y(t)=C(x(t),x(t-1))',\qquad x(k)=\varphi(k), \quad k=0,-1,\ldots,-m, t\in N. \end{array} \right . $$

Let \(\overline{x}(t)=(x(t), x(t-1))'\) and \(\overline{u}(t)=(u(t), u(t-1))'\). The above system can be written in the form of an equivalent stochastic system

$$\overline{x}(t+1)= \begin{pmatrix} a_{0}&a_{1}\\ 1&0 \end{pmatrix} \overline{x}(t)+ \begin{pmatrix} m_{0}&m_{1}\\ 0&0 \end{pmatrix} \overline{u}(t) + \begin{pmatrix} g&0\\ 0&0 \end{pmatrix} \overline{x}(t) \omega(t)+ \begin{pmatrix} n&0\\ 0&0 \end{pmatrix} \overline{u}(t)\omega(t). $$

Here

$$\overline{F}= \begin{pmatrix} a_{0}&a_{1}\\ 1&0 \end{pmatrix},\qquad \overline{G}= \begin{pmatrix} g&0\\ 0&0 \end{pmatrix}, \qquad \overline{M}= \begin{pmatrix} m_{0}&m_{1}\\ 0&0 \end{pmatrix},\qquad \overline{N}= \begin{pmatrix} n&0\\ 0&0 \end{pmatrix}. $$

Now we solve the following general algebraic Riccati equation:

$$\left \{ \begin{array}{@{}l} P=\overline{F}'P\overline{F}+\overline{G}'P\overline {G}+CC'-(\overline{F}'P\overline{M} +\overline{G}'P\overline{N}) (R+\overline{M}'P\overline{M}+\overline {N}'P\overline{N})^{-1}(\overline{F}'P\overline{M}+\overline {G}'P\overline{N})',\\ R+\overline{M}'P\overline{M}+\overline{N}'P\overline{N}>0. \end{array} \right . $$

Using Matlab, we compute the stabilizing solution of the above GARE by solving the following semidefinite programming (SDP) problem:

$$\begin{aligned} &\max \operatorname{tr}(P) \\ &\mbox{subject to } \begin{pmatrix} -P+\overline{F}'P\overline{F}+\overline{G}'P\overline{G}+CC' & \overline {F}'P\overline{M}+\overline{G}'P\overline{N}\\ \overline{M}'P\overline{F}+\overline{N}'P\overline{G}& R+\overline {M}'P\overline{M}+\overline{N}'P\overline{N} \end{pmatrix}\geq0,\quad P\geq0, \end{aligned}$$

we obtain the optimal solution (in fact, by Theorem 10 in [6], the optimal solution of this SDP is the stabilizing solution P of the above GARE), together with the optimal control:

$$\overline{u}(t)=- \bigl(R+\overline{M}'P\overline{M}+ \overline{N}'P\overline {N} \bigr)^{-1} \bigl( \overline{F}'P\overline{M}+\overline{G}'P\overline {N} \bigr)'\overline{x}(t). $$

In particular, taking \(a_{0}=0\), \(a_{1}=1\), \(m_{0}=0\), \(m_{1}=g=n=1\), \(R=I\), and \(C=\bigl ( {\scriptsize\begin{matrix} 1&0\cr 0&1 \end{matrix}} \bigr )\), we get the stabilizing solution of the above GARE

$$P= \begin{pmatrix} 3.5615&0\\ 0&1.7808 \end{pmatrix} $$

and the optimal control

$$\overline{u}(t)= \begin{pmatrix} -0.7808&0\\ 0&-0.7808 \end{pmatrix}\overline{x}(t). $$
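
As a check on these numbers, with the above data and \(Q=CC'=I\), a diagonal ansatz \(P=\operatorname{diag}(p_{1},p_{2})\) (an assumption suggested by the numerical solution and verified a posteriori) gives \(\overline{F}'P\overline{M}+\overline{G}'P\overline{N}=p_{1}I\) and \(R+\overline{M}'P\overline{M}+\overline{N}'P\overline{N}=(1+p_{1})I\), so the GARE reduces to the two scalar equations

$$p_{1}=p_{2}+p_{1}+1-\frac{p_{1}^{2}}{1+p_{1}},\qquad p_{2}=p_{1}+1-\frac{p_{1}^{2}}{1+p_{1}}, $$

whence \(p_{1}^{2}-3p_{1}-2=0\), i.e., \(p_{1}=\frac{3+\sqrt{17}}{2}\), \(p_{2}=\frac{2p_{1}+1}{1+p_{1}}\), and the feedback gain is \(-\frac{p_{1}}{1+p_{1}}I\), which reproduces the values of P and \(\overline{u}(t)\) above up to rounding.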

References

  1. Kalman, RE: Contributions to the theory of optimal control. Bol. Soc. Mat. Mexicana 5, 102-119 (1960)


  2. Anderson, BDO, Moore, JB: Optimal Control: Linear Quadratic Methods. Prentice-Hall, New York (1989)


  3. Lewis, FL: Optimal Control. Wiley, New York (1986)


  4. Wonham, WM: On a matrix Riccati equation of stochastic control. SIAM J. Control 6, 312-326 (1968)


  5. Chen, S, Li, X, Zhou, XY: Stochastic linear quadratic regulators with indefinite control weight costs. SIAM J. Control Optim. 36, 1685-1702 (1998)


  6. Ait Rami, M, Chen, X, Zhou, XY: Discrete time indefinite LQ control with state and control dependent noises. J. Glob. Optim. 23, 245-265 (2002)


  7. Song, XM, Zhang, HS, Xie, LH: Stochastic linear quadratic regulation for discrete-time linear systems with input delay. Automatica 45, 2067-2073 (2009)


  8. Zhang, W, Chen, BS: On stabilizability and exact observability of stochastic systems with their applications. Automatica 40, 87-94 (2004)


  9. Chen, BS, Zhang, W: Stochastic \(H_{2}/H_{\infty}\) control with state-dependent noise. IEEE Trans. Autom. Control 49, 45-57 (2004)


  10. Zhang, W, Chen, BS: H-Representation and applications to generalized Lyapunov equations and linear stochastic systems. IEEE Trans. Autom. Control 57, 3009-3022 (2012)


  11. Huang, Y, Zhang, WH, Zhang, HS: The infinite horizon linear quadratic optimal control for discrete-time stochastic systems. Asian J. Control 10, 608-615 (2008)



Acknowledgements

This work was supported by the National Natural Science Foundation of China under Grants 61174078 and 61201430.

Author information


Corresponding author

Correspondence to Ming Chen.

Additional information

Competing interests

The authors declare that they have no competing interests.

Authors’ contributions

GL performed all the steps of proof in this research and also wrote the paper. MC suggested many good ideas that made this paper possible and helped to draft the first manuscript. All authors read and approved the final manuscript.

Rights and permissions

Open Access This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly credited.


About this article


Cite this article

Li, G., Chen, M. Infinite horizon linear quadratic optimal control for stochastic difference time-delay systems. Adv Differ Equ 2015, 14 (2015). https://doi.org/10.1186/s13662-014-0342-1



  • DOI: https://doi.org/10.1186/s13662-014-0342-1
