Theory and Modern Applications

Backward-forward linear-quadratic mean-field Stackelberg games

Abstract

This paper studies a controlled backward-forward linear-quadratic-Gaussian (LQG) large-population system in a Stackelberg game. The leader agent has a backward state equation, and the follower agents have forward state equations. The leader is dominating in the sense that its state enters the state equations of the followers. On the other hand, the state-average of all follower agents affects the cost functional of the leader. In practice, the leader and the followers may represent two typical types of participants involved in market price formation: the supplier and the producers. This setting differs from the standard MFG literature, mainly because of the Stackelberg structure. By variational analysis, the consistency condition system can be represented by fully coupled backward-forward stochastic differential equations (BFSDEs) with a high-dimensional block structure in an open-loop sense. Next, we discuss the well-posedness of this BFSDE system by virtue of the contraction mapping method. Consequently, we obtain decentralized strategies for the leader and follower agents, which are proved to satisfy the ε-Nash equilibrium property.

1 Introduction

Recently, the dynamic optimization of (linear) large-population systems has attracted extensive research attention. The most significant feature of such a system is the existence of numerous individually insignificant agents, denoted by \(\{\mathcal{A}_{i}\}_{i=1}^{N}\), whose dynamics and (or) cost functionals are coupled via their state-average. To design low-complexity strategies for a large-population system, one efficient approach is the mean-field game (MFG) method, which enables us to derive decentralized strategies. There is a large body of related work on MFG. Since the independent works by Huang, Caines, and Malhamé [11, 12] and Lasry and Lions [13–15], MFG theory and its applications have enjoyed rapid growth. Further developments of MFG theory include Bardi [1], Bensoussan, Frehse, and Yam [4], Carmona and Delarue [6], Garnier, Papanicolaou, and Yang [8], Guéant, Lasry, and Lions [9], and the references therein.

(Single leader-follower game) In the case \(N=1\), with only a single follower and one leader, our problem reduces to the classical single-leader and single-follower game. The leader-follower (Stackelberg) game was proposed in 1934 by H. von Stackelberg [23], who defined the concept of a hierarchical solution for markets in which some firms have more power than others and thus dominate them. This solution concept is termed the Stackelberg equilibrium. An early study of stochastic Stackelberg differential games (SSDGs) was conducted by Basar [2]. Another relevant study was performed by Yong [26], where an LQ leader-follower stochastic differential game (SDG) was introduced and studied in the open-loop information case. The setting in [26] is general: the coefficients of the system and of the cost functionals may be random, the controls enter the diffusion term of the state dynamics, and the weight matrices for the controls in the cost functionals are not necessarily positive definite. In a similar but nonlinear setting, Bensoussan, Chen, and Sethi [3] obtained global maximum principles for both open-loop (OL) and closed-loop (CL) SSDGs, but the diffusion term did not contain the controls, which simplifies the related analysis to a certain extent. In the special LQ setting, the solvability of the related Riccati equations is also discussed there, and the state feedback Stackelberg equilibrium is thus obtained.

So far, almost all related studies of mean-field Stackelberg games have been based on state systems described by SDEs. To the best of our knowledge, the first paper to study a state system described by BSDEs is that by Huang, Wang, and Wu [10]. The present paper can be regarded as follow-up work to that one: we formulate more general LQ mean-field Stackelberg games with a BSDE state system. Unlike a forward SDE with a given initial condition, a BSDE has its terminal condition specified a priori, and its solution is a pair of adapted processes. Linear BSDEs were first introduced by Bismut [5], and the general nonlinear BSDE was first studied by Pardoux and Peng [18]. BSDEs have been applied broadly in many fields such as mathematical economics and finance, decision making, and management science. One example is the representation of stochastic differential recursive utility by a class of BSDEs (Wang and Wu [24], etc.). A BSDE coupled with an SDE through the terminal condition forms a forward-backward stochastic differential equation (FBSDE). FBSDEs have also been well studied, and interested readers may refer to [7, 25–28].

The modeling of the leader agent by a BSDE and the follower agents by forward SDEs is well motivated and can be illustrated by the following example. Today, the government announces its target for interest adjustment over the next five years. The related banks and individuals then try to find their optimal investment plans based on the announcement. The government, in turn, anticipates that the banks and individuals will carry out their own investment plans according to its announcement, so it can adjust the announcement to optimize its own goal. This is a typical mean-field Stackelberg game with the leader agent modeled by a BSDE and the follower agents modeled by forward SDEs. The model setting has its own strengths in applications. In practice, the leader often sets a goal or target for the group, and the followers in the group find their optimal plans to achieve the goal. The cost functionals they consider may differ; the dynamics of the leader is a BSDE, and the dynamics of the followers is a family of SDEs. The existing literature on leader-follower problems is based on SDE dynamics and cannot represent this kind of situation.

The modeling of a backward leader and forward followers yields a large-population system described by a backward-forward stochastic differential equation (BFSDE), which is structurally different from an FBSDE in the following aspects. First, the forward and backward equations are coupled through their initial rather than terminal conditions. Second, unlike the FBSDE case, there is no feasible decoupling structure via standard Riccati equations, as addressed in Lim and Zhou (2001) [16]. This is mainly because some implicit constraints on the initial conditions must be satisfied in any possible decoupling.

The introduction of the BFSDE also brings some technical differences to the MFG analysis: it leads to a more complicated coupling structure in the consistency condition derived in our backward-leader and forward-followers setup. The standard procedure of MFG mainly consists of the following steps:

Step 1: Fix the state and control of the leader, denoted by \((x_{0}, u_{0})\). Given the fixed pair \((x_{0}, u_{0})\), introduce and solve the mean-field subgame faced by all followers, who also compete with one another within their own interaction. For this subgame, an auxiliary problem can be constructed and decentralized responses of the followers can be derived; the related mass-limit response of the followers is denoted by \(\bar{x}=\bar{x}(x_{0}, u_{0})\).

Step 2: Given the response functional \(\bar{x}\) of the followers, solve the decentralized stochastic control problem of the leader \(\mathcal{A}_{0}\), and denote the optimal solution pair by \((\bar{x}_{0},\bar{u}_{0})=(\bar{x}_{0}(\bar{x}),\bar{u}_{0}(\bar{x}))\).

Step 3: Derive the consistency condition (CC) system to specify the mean-field limit \(\bar{x}\); then, all decentralized strategies for the leader and followers can be designed sequentially, and an approximate Nash equilibrium can be obtained.

The main contributions of this paper can be summarized as follows:

  • We formulate a general backward-leader and forward-followers LQ mean-field game, which has potential applications in practice.

  • We derive the CC system which is represented using a fully coupled mean-field-type backward-forward stochastic differential equation (BFSDE) in an open-loop case.

  • The existence and uniqueness of solutions to the related CC system are investigated in the global solvability case.

The rest of this paper is organized as follows. Section 2 provides the problem formulation and some preliminaries. In Sect. 3, we introduce the auxiliary limiting LQG optimization problems for the MFG analysis. In Sect. 4, we discuss the open-loop strategies of the Stackelberg game. In Sect. 5, we determine the CC system in the open-loop case, which yields fully coupled BFSDEs. Section 6 is devoted to verifying the approximate equilibrium property of the open-loop decentralized strategies.

2 Preliminaries and problem formulation

The following notations are used throughout this paper. Let \(\mathbb{R}^{n}\) denote the n-dimensional Euclidean space, \(\mathbb{R}^{n\times m}\) be the set of all \((n\times m)\) matrices, and let \(\mathcal{S}^{n}\) be the set of all \((n\times n)\) symmetric matrices. We denote the transpose by subscript , the inner product by \(\langle \cdot ,\cdot \rangle \), and the norm by \(|\cdot |\). For \(t\in [0,T]\) and Euclidean space \(\mathbb{H}\), we introduce the following function spaces:

$$ \begin{aligned} &L^{p}(t,T;\mathbb{H})= \biggl\{ \psi :[t,T] \rightarrow \mathbb{H}\Bigm| \int _{t}^{T} \bigl\vert \psi (s) \bigr\vert ^{p}\,\mathrm{d}s< \infty \biggr\} ,\quad 1\leq p< \infty , \\ &L^{\infty }(t,T;\mathbb{H})= \Bigl\{ \psi :[t,T]\rightarrow \mathbb{H}\bigm| \mathop{\operatorname{esssup}}_{s\in [t,T]} \bigl\vert \psi (s) \bigr\vert < \infty \Bigr\} , \\ &C \bigl([t,T];\mathbb{H}\bigr)= \bigl\{ \psi :[t,T]\rightarrow \mathbb{H}\mid \psi ( \cdot ) \mbox{ is continuous} \bigr\} , \end{aligned}$$

and the following spaces of processes and random variables on a given filtered probability space:

$$ \begin{aligned} &L^{2}_{\mathcal{F}_{t}}(\Omega ;\mathbb{H})= \bigl\{ \xi :\Omega \rightarrow \mathbb{H}\mid \xi \mbox{ is } \mathcal{F}_{t} \mbox{-measurable}, \mathbb{E}\bigl[ \vert \xi \vert ^{2} \bigr]< \infty \bigr\} , \\ &L^{2}_{\mathcal{F}}(t,T;\mathbb{H})= \biggl\{ \psi :[t,T]\times \Omega \rightarrow \mathbb{H}\Bigm| \psi (\cdot ) \mbox{ is } \mathcal{F}_{t} \mbox{-progressively measurable}, \\ &\hphantom{L^{2}_{\mathcal{F}}(t,T;\mathbb{H})={}\ }\mathbb{E}\biggl[ \int _{t}^{T} \bigl\vert \psi (s) \bigr\vert ^{2}\,\mathrm{d}s \biggr]< \infty \biggr\} . \end{aligned}$$

On a given finite decision horizon \([0,T]\), let \((\Omega ,\mathcal{F},\{\mathcal{F}_{t}\}_{0\leqslant t\leqslant T}, \mathbb{P})\) be a complete filtered probability space on which a \((1+N)\)-dimensional standard Brownian motion \(\{W_{0}(t),W_{i}(t); 1\leqslant i\leqslant N\}_{0 \leqslant t \leqslant T}\) is defined. We denote by \(\{\mathcal{F}_{t}\}_{0 \leqslant t \leqslant T}\) the natural filtration generated by \(\{W_{0}(\cdot ), W_{i}(\cdot ), x_{i0}; 1\leqslant i\leqslant N\}\) and augmented by all the \(\mathbb{P}\)-null sets in \(\mathcal{F}\); it captures the full information of all agents. Similarly, \(\{\mathcal{F}_{t}^{w_{0}}\}_{0\leqslant t\leqslant T}\) is the natural filtration generated by \(\{W_{0}(\cdot )\}\) and augmented by all the \(\mathbb{P}\)-null sets in \(\mathcal{F}\); it captures the information of the leader agent. The filtration \(\{\mathcal{F}_{t}^{w_{i}}\}_{0 \leqslant t \leqslant T}\) is the natural filtration generated by \(\{W_{i}(\cdot )\}\) and augmented by all the \(\mathbb{P}\)-null sets in \(\mathcal{F}\); it captures the information of the ith follower agent. Finally, \(\{\mathcal{F}_{t}^{i}\}_{0 \leqslant t \leqslant T}\) is the natural filtration generated by \(\{W_{0}(\cdot ), W_{i}(\cdot )\}\) and augmented by all the \(\mathbb{P}\)-null sets in \(\mathcal{F}\). In this paper, we consider a large-population system involving \((1+N)\) individual agents (where N is sufficiently large) of two types: the leader agent \(\mathcal{A}_{0}\) and the follower agents \(\{\mathcal{A}_{i}\}_{i=1}^{N}\). The dynamics of \(\mathcal{A}_{0}\) and \(\{\mathcal{A}_{i}\}_{i=1}^{N}\) are given, respectively, by the following controlled linear backward stochastic differential equation (BSDE, for short) and controlled linear forward stochastic differential equations (SDE or FSDE, for short):

$$ \mathcal{A}_{0}: \textstyle\begin{cases} \mathrm{d} x_{0}(t)= \{A_{0}x_{0}(t)+B_{0}u_{0}(t)+C_{0}z_{0}(t) \}\,\mathrm{d}t+z_{0}(t) \,\mathrm{d}W_{0}(t), \\ x_{0}(T)=\xi , \end{cases} $$
(2.1)

and

$$ \mathcal{A}_{i}: \textstyle\begin{cases} \mathrm{d} x_{i}(t)= \{Ax_{i}(t)+Bu_{i}(t)+Ex^{(N)}(t)+ \alpha x_{0}(t) \}\,\mathrm{d}t \\ \hphantom{\mathrm{d} x_{i}(t)={}}{} + \{Cx_{i}(t)+Du_{i}(t)+Fx^{(N)}(t)+ \beta x_{0}(t) \}\,\mathrm{d}W_{i}(t), \\ x_{i}(0)=x_{i0},\quad i=1,2,\ldots ,N, \end{cases} $$
(2.2)

where \(\xi \in L^{2}_{\mathcal{F}_{T}^{w_{0}}}(\Omega ;\mathbb{R})\), and \(x^{(N)}(t)=\frac{1}{N}\sum_{i=1}^{N}x_{i}(t)\) is called the state average or mean-field term of all follower agents; \(x_{i0}\) is the initial value of \(\mathcal{A}_{i}\). In this paper, for simplicity, we assume that the state and control processes are both one-dimensional. Here, \(A_{0}\), \(B_{0}\), \(C_{0}\), A, B, C, D, E, F, α, β are scalar constants. The admissible controls satisfy \(u_{0}\in \mathcal{U}_{0}[0,T]\), \(u_{i}\in \mathcal{U}_{i}[0,T]\), where

$$ \begin{aligned} &\mathcal{U}_{0}[0,T] \triangleq L^{2}_{\mathcal{F}^{w_{0}}}(0,T;\mathbb{R}), \\ &\mathcal{U}_{i}[0,T]\triangleq L^{2}_{\mathcal{F}}(0,T; \mathbb{R}), \quad i=1,2,\ldots ,N. \end{aligned}$$
(2.3)
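To make the coupled dynamics (2.1)–(2.2) concrete, the following is a minimal simulation sketch; it is not part of the paper's analysis. It assumes a deterministic terminal value ξ and a deterministic leader control \(u_{0}\), in which case the unique adapted solution of (2.1) has \(z_{0}\equiv 0\) and \(x_{0}\) solves a backward ODE. The function names `leader_path` and `simulate_followers`, the coefficient values, the zero follower controls, and the Euler discretization are all illustrative assumptions.

```python
import numpy as np

def leader_path(xi, u0, A0, B0, T, n_steps):
    """Explicit Euler sweep, run backward in time, for
    x0' = A0*x0 + B0*u0 with x0(T) = xi (deterministic case, z0 = 0)."""
    dt = T / n_steps
    x0 = np.empty(n_steps + 1)
    x0[-1] = xi
    for k in range(n_steps, 0, -1):
        # step from t_k back to t_{k-1}
        x0[k - 1] = x0[k] - (A0 * x0[k] + B0 * u0[k]) * dt
    return x0

def simulate_followers(x_init, x0, u, coef, T, rng):
    """Euler-Maruyama scheme for (2.2); returns an array of shape (n_steps + 1, N).

    u has shape (n_steps, N): u[k] holds the follower controls on [t_k, t_{k+1}).
    """
    A, B, C, D, E, F, alpha, beta = coef
    n_steps = len(x0) - 1
    dt = T / n_steps
    N = len(x_init)
    x = np.empty((n_steps + 1, N))
    x[0] = x_init
    for k in range(n_steps):
        avg = x[k].mean()                      # state average x^(N)(t_k)
        dW = rng.normal(0.0, np.sqrt(dt), N)   # independent increments of W_1, ..., W_N
        drift = A * x[k] + B * u[k] + E * avg + alpha * x0[k]
        diffusion = C * x[k] + D * u[k] + F * avg + beta * x0[k]
        x[k + 1] = x[k] + drift * dt + diffusion * dW
    return x

# Illustrative parameter values (assumptions, not taken from the paper).
T, n_steps, N = 1.0, 200, 500
rng = np.random.default_rng(0)
x0 = leader_path(xi=1.0, u0=np.zeros(n_steps + 1), A0=0.5, B0=1.0, T=T, n_steps=n_steps)
coef = (-0.5, 1.0, 0.2, 0.1, 0.3, 0.1, 0.4, 0.2)   # A, B, C, D, E, F, alpha, beta
x_init = rng.normal(1.0, 0.5, N)                   # iid initial states, as in (H1)
x = simulate_followers(x_init, x0, np.zeros((n_steps, N)), coef, T, rng)
state_average = x.mean(axis=1)                     # path of x^(N)
```

For a random ξ the leader equation is a genuine BSDE and would need a dedicated backward scheme, which this sketch does not attempt.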

Let \(\mathbf{u}=(u_{0},u_{1},\ldots ,u_{N})\) denote the set of all strategies of all \((1+N)\) agents; \(\mathbf{u}_{-0}=(u_{1},\ldots ,u_{N})\) the strategies except \(\mathcal{A}_{0}\); \(\mathbf{u}_{-i}=(u_{0},u_{1},\ldots ,u_{i-1},u_{i+1}, \ldots ,u_{N})\) the strategies except the ith agent \(\mathcal{A}_{i}\). Moreover, agents \(\mathcal{A}_{0}\) and \(\{\mathcal{A}_{i}\}_{1\leqslant i\leqslant N}\) are further coupled via their cost functionals \(\mathcal{J}_{0}\) and \(\mathcal{J}_{i}\) as follows:

$$ \mathcal{J}_{0}(u_{0}, \mathbf{u}_{-0})=\frac{1}{2}\mathbb{E} \biggl\{ \int _{0}^{T} \bigl[Q_{0} \bigl(x_{0}(t)-x^{(N)}(t) \bigr)^{2}+ \tilde{Q}x_{0}^{2}(t)+R_{0} u_{0}^{2}(t) \bigr]\,\mathrm{d}t+H_{0} x_{0}^{2}(0) \biggr\} $$
(2.4)

for \(\mathcal{A}_{0}\), where \(Q_{0}\geqslant 0\), \(\tilde{Q}\geqslant 0\), \(R_{0}>0\), \(H_{0}\geqslant 0\); and

$$ \mathcal{J}_{i}(u_{i}, \mathbf{u}_{-i})=\frac{1}{2}\mathbb{E} \biggl\{ \int _{0}^{T} \bigl[Q \bigl(x_{i}(t)-x^{(N)}(t) \bigr)^{2}+R u_{i}^{2}(t) \bigr]\,\mathrm{d}t+H x_{i}^{2}(T) \biggr\} , $$
(2.5)

for \(\mathcal{A}_{i}\), \(1\leqslant i\leqslant N\), where \(Q\geqslant 0\), \(R>0\), \(H\geqslant 0\). We introduce the following assumption:

  1. (H1)

    The initial states \(x_{i0}\) are independent and identically distributed (iid, for short) with \(\mathbb{E}[x_{i0}]=x\), \(\mathbb{E}[|x_{i0}|^{2}]<+\infty \) for each \(i=1,\ldots ,N\), and also independent of \(\{W_{0}(t),W_{i}(t); 1\leqslant i\leqslant N\}\).

It follows that (2.1) admits a unique adapted solution for all \(u_{0}\in \mathcal{U}_{0}[0,T]\) (refer to Pardoux and Peng [18]). It is also well known that under (H1), (2.2) admits a unique adapted solution for all \(u_{i}\in \mathcal{U}_{i}[0,T]\), \(1\leqslant i\leqslant N\). Now, we can formulate the large population dynamic optimization problem.

Problem (I)

Find the optimal strategies \(\bar{\mathbf{u}}=(\bar{u}_{0},\bar{u}_{1},\ldots ,\bar{u}_{N})\), which satisfy

$$ \mathcal{J}_{i}(\bar{u}_{i},\bar{\mathbf{u}}_{-i})= \inf_{u_{i}\in \mathcal{U}_{i}} \mathcal{J}_{i}(u_{i},\bar{ \mathbf{u}}_{-i}), \quad 0\leqslant i\leqslant N, $$

where \(\bar{\mathbf{u}}_{-0}=(\bar{u}_{1},\ldots ,\bar{u}_{N})\), \(\bar{\mathbf{u}}_{-i}=(\bar{u}_{0},\bar{u}_{1},\ldots ,\bar{u}_{i-1}, \bar{u}_{i+1}, \ldots ,\bar{u}_{N})\) for \(1\leqslant i\leqslant N\).

We notice that the agents are coupled not only through their state processes but also through their cost functionals, via the state average. Roughly speaking, the game to be studied is carried out as follows. First, the leader \(\mathcal{A}_{0}\) announces his strategy \(u_{0}(\cdot )\) and commits to fulfilling it. Next, the followers \(\mathcal{A}_{i}\) choose their best responses accordingly to minimize their cost functionals \(\mathcal{J}_{i}(u_{i}(\cdot ),\mathbf{u}_{-i}(\cdot ))\). This induces best response functionals for the followers that depend on the control law of the leader. With these functionals in mind, before the announcement, the agent \(\mathcal{A}_{0}\) designs his strategy to minimize his own cost functional \(\mathcal{J}_{0}(u_{0}(\cdot ),\mathbf{u}_{-0}(\cdot ))\). Because of the weak coupling among the agents in a large-population system, the above game problem is essentially a high-dimensional Stackelberg–Nash differential game. The influence of an individual follower on the population is averaged out as the population size tends to infinity.
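The averaging-out effect can be checked numerically. The sketch below is an illustration under assumed parameter values, not a result from the paper: it shifts the control of a single follower by a constant and measures the resulting change in the state average \(x^{(N)}\). By linearity of (2.2), the difference between the perturbed and unperturbed states solves the same equation without the \(x_{0}\) terms and with zero initial value, so only that difference needs to be simulated. The function name `average_shift` is hypothetical.

```python
import numpy as np

def average_shift(N, bump=1.0, T=1.0, n_steps=200, seed=0):
    """Largest change over the time grid in x^(N) caused by shifting
    follower 1's control by the constant `bump`.

    delta_i denotes perturbed minus unperturbed state of follower i;
    it is driven by the control shift `bump` for i = 1 and 0 otherwise.
    """
    A, B, C, D, E, F = -0.5, 1.0, 0.2, 0.1, 0.3, 0.1   # illustrative values
    rng = np.random.default_rng(seed)
    dt = T / n_steps
    du = np.zeros(N)
    du[0] = bump
    delta = np.zeros(N)
    worst = 0.0
    for _ in range(n_steps):
        avg = delta.mean()
        dW = rng.normal(0.0, np.sqrt(dt), N)
        delta = (delta
                 + (A * delta + B * du + E * avg) * dt
                 + (C * delta + D * du + F * avg) * dW)
        worst = max(worst, abs(delta.mean()))
    return worst

for N in (10, 100, 1000):
    print(N, average_shift(N))
```

In this setup the printed values are expected to shrink roughly in proportion to 1/N, which is the sense in which a single follower is "insignificant".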

3 The limiting optimal control problem

Let us introduce the auxiliary limiting LQG optimization problems. Firstly, as \(N\rightarrow +\infty \), we suppose that \(x^{(N)}(\cdot )\) can be approximated by an \(\mathcal{F}^{w_{0}}_{t}\)-measurable function \(\bar{x}(\cdot )\). Then the state process of the follower becomes

$$ \textstyle\begin{cases} \mathrm{d} x_{i}(t)= \{Ax_{i}(t)+Bu_{i}(t)+E \bar{x}(t)+\alpha x_{0}(t) \}\,\mathrm{d}t \\ \hphantom{\mathrm{d} x_{i}(t)={}}{} + \{Cx_{i}(t)+Du_{i}(t)+F\bar{x}(t)+\beta x_{0}(t) \}\,\mathrm{d}W_{i}(t), \\ x_{i}(0)=x_{i0}, \quad i=1,2,\ldots ,N, \end{cases} $$
(3.1)

with the following auxiliary cost functionals:

$$ J_{i}(u_{i})= \frac{1}{2}\mathbb{E} \biggl\{ \int _{0}^{T} \bigl[Q \bigl(x_{i}(t)- \bar{x}(t) \bigr)^{2}+R u_{i}^{2}(t) \bigr] \,\mathrm{d}t+H x_{i}^{2}(T) \biggr\} $$
(3.2)

for \(\mathcal{A}_{i}\), \(1\leqslant i\leqslant N\). Then, introduce the following auxiliary limiting LQG optimization problems for followers.

Problem (II)

For given \(x_{i0}\), \(\mathcal{F}^{w_{0}}_{t}\)-measurable functions \(\bar{x}(\cdot )\), and the control \(u_{0}(\cdot )\) of the leader agent \(\mathcal{A}_{0}\), find the optimal response functional \(\bar{u}_{i}[\cdot ]:\mathcal{U}_{0}[0,T]\times L^{2}_{\mathcal{F}^{w_{0}}}(0,T;\mathbb{R}) \rightarrow \mathcal{U}_{i}[0,T]\) of the following differential games among followers:

$$ J_{i} \bigl(x_{i0},\bar{x}(\cdot ),u_{0}( \cdot );\bar{u}_{i} \bigl[u_{0}(\cdot ), \bar{x}(\cdot ) \bigr] \bigr)= \inf_{u_{i}(\cdot )\in \mathcal{U}_{i}[0,T]}J_{i} \bigl(x_{i0}, \bar{x}(\cdot ),u_{0}(\cdot );u_{i}(\cdot ) \bigr). $$

The analysis of Problem (II) can be further decomposed into substeps using MFG theory.

Step 1 (SOC-F): Consider the Nash equilibrium response functional of Problem (II) for the representative follower agent denoted by \(\bar{u}_{i}[\cdot ,\cdot ]\). For given \(x_{i0}\), \(\mathcal{F}^{w_{0}}_{t}\)-measurable functions \(\bar{x}(\cdot )\), and the control \(u_{0}(\cdot )\) of the leader \(\mathcal{A}_{0}\), find an open-loop strategy \(\bar{u}_{i}(\cdot )=\bar{u}_{i}[u_{0}(\cdot ),\bar{x}(\cdot )]\in \mathcal{U}_{i}[0,T]\), \(1\leq i\leq N\). In other words, find the Nash equilibrium response functional \(\bar{u}_{i}[\cdot ,\cdot ]:\mathcal{U}_{0}[0,T]\times L^{2}_{\mathcal{F}^{w_{0}}}(0,T; \mathbb{R})\rightarrow \mathcal{U}_{i} [0,T]\) of the following Nash differential games among followers:

$$ J_{i} \bigl(x_{i0},\bar{x}(\cdot ),u_{0}( \cdot );\bar{u}_{i} \bigl[u_{0}(\cdot ), \bar{x}(\cdot ) \bigr] \bigr)= \inf_{u_{i}(\cdot )\in \mathcal{U}_{i}[0,T]}J_{i} \bigl(x_{i0}, \bar{x}(\cdot ),u_{0}(\cdot );u_{i}(\cdot ) \bigr). $$

Step 2 (CC-F): Apply the state-aggregation method to determine the state-average limit by the following consistency condition qualification:

$$ \mathbb{E}\bigl[\bar{x}_{i} \bigl(\bar{u}_{i} \bigl[u_{0}(\cdot ),\bar{x}(\cdot ) \bigr] \bigr)\big|\mathcal{F}_{t}^{w_{0}} \bigr]=\bar{x}. $$

By virtue of these steps, the Nash equilibrium response functional of the followers and \(\bar{x}=\bar{x}(u_{0})\) can be specified for any admissible strategy announced by the leader. Given the optimal response of all followers, we can then turn to the problem of the leader.

4 Optimal strategy of auxiliary problems

From now on, we suppress the time variable t when no confusion arises. As mentioned before, we first focus on the auxiliary limiting LQG optimization problem, i.e., Problem (II).

4.1 Optimal strategy of the follower

The main result of this section can be stated as follows.

Theorem 4.1

Under assumption (H1), let \(u_{0}(\cdot )\in \mathcal{U}_{0}[0,T]\), \(\bar{x}(\cdot )\in L^{2}(0,T;\mathbb{R})\) be given. Then, for the initial value \(x_{i0}\), Problem (II) admits an optimal control \(\bar{u}_{i}(\cdot )\in \mathcal{U}_{i}[0,T]\) if and only if the following two conditions hold:

  1. (i)

    For \(i=1,2,\ldots ,N\), the adapted solution \((\bar{x}_{i}(\cdot ),\bar{y}_{i}(\cdot ),\bar{z}_{i}(\cdot ))\) to the FBSDE on \([0,T]\)

    $$ \textstyle\begin{cases} \mathrm{d} \bar{x}_{i}=\{A\bar{x}_{i}+B\bar{u}_{i}+E \bar{x}+\alpha x_{0}\}\,\mathrm{d}t+ \{C\bar{x}_{i}+D \bar{u}_{i}+F\bar{x}+\beta x_{0}\}\,\mathrm{d}W_{i}(t), \\ \mathrm{d} \bar{y}_{i}=- \{A\bar{y}_{i}+C \bar{z}_{i}+Q(\bar{x}_{i}-\bar{x}) \}\,\mathrm{d}t+ \bar{z}_{i}\,\mathrm{d}W_{i}(t), \\ \bar{x}_{i}(0)=x_{i0},\qquad \bar{y}_{i}(T)=H \bar{x}_{i}(T), \end{cases} $$
    (4.1)

    satisfies the following stationarity condition:

    $$ B\bar{y}_{i}+R\bar{u}_{i}+D \bar{z}_{i}=0,\quad \textit{a.e. } t\in [0,T], \textit{a.s.} $$
    (4.2)
  2. (ii)

    For \(i=1,2,\ldots ,N\), the following convexity condition holds:

    $$ \mathbb{E}\biggl\{ \int _{0}^{T} \bigl(Q\tilde{x}_{i}^{2}+Ru_{i}^{2} \bigr)\,\mathrm{d}t+H \tilde{x}_{i}^{2}(T) \biggr\} \geqslant 0,\quad \forall u_{i}(\cdot )\in \mathcal{U}_{i}[0,T], $$
    (4.3)

    where \(\tilde{x}_{i}(\cdot )\) is the solution of

    $$ \textstyle\begin{cases} \mathrm{d} \tilde{x}_{i}=\{A\tilde{x}_{i}+Bu_{i}\} \,\mathrm{d}t+\{C\tilde{x}_{i}+Du_{i} \}\,\mathrm{d}W_{i}(t),\quad t\in [0,T], \\ \tilde{x}_{i}(0)=0. \end{cases} $$
    (4.4)

    Or, equivalently, the mapping \(u_{i}(\cdot )\mapsto J_{i}(x_{i0},\bar{x}(\cdot ),u_{0}(\cdot );u_{i}( \cdot ))\), defined by (3.2), is convex (for \(i=1,2,\ldots ,N\)).

Proof

For given \(u_{0}(\cdot )\in \mathcal{U}_{0}[0,T]\), \(\bar{x}(\cdot )\in L^{2}(0,T;\mathbb{R})\) and \(\bar{u}_{i}(\cdot )\in \mathcal{U}_{i}[0,T]\), let \((\bar{x}_{i}(\cdot ), \bar{y}_{i}(\cdot ), \bar{z}_{i}(\cdot ))\) be an adapted solution to FBSDE (4.1). For any \(u_{i}(\cdot )\in \mathcal{U}_{i}[0,T]\) and \(\varepsilon \in \mathbb{R}\), let \(x_{i}^{\varepsilon }(\cdot )\) be the solution to the following perturbed state equation on \([0,T]\):

$$ \textstyle\begin{cases} \mathrm{d} x_{i}^{\varepsilon }= \{Ax_{i}^{\varepsilon }+B(\bar{u}_{i}+\varepsilon u_{i})+E\bar{x}+\alpha x_{0} \}\,\mathrm{d}t+ \{Cx_{i}^{\varepsilon }+D(\bar{u}_{i}+\varepsilon u_{i})+F\bar{x}+\beta x_{0} \}\,\mathrm{d}W_{i}(t), \\ x_{i}^{\varepsilon }(0)=x_{i0}. \end{cases} $$

Then, denoting by \(\tilde{x}_{i}(\cdot )\) the solution of (4.4), we have \(x_{i}^{\varepsilon }(\cdot )=\bar{x}_{i}(\cdot )+\varepsilon \tilde{x}_{i}(\cdot )\) and

$$ \begin{aligned} &J_{i} \bigl(x_{i0},\bar{x}( \cdot ),u_{0}(\cdot );\bar{u}_{i}(\cdot )+\varepsilon u_{i}( \cdot ) \bigr)-J_{i} \bigl(x_{i0}, \bar{x}(\cdot ),u_{0}(\cdot );\bar{u}_{i}(\cdot ) \bigr) \\ &\quad =\frac{\varepsilon }{2}\mathbb{E} \biggl\{ \int _{0}^{T} \bigl(2Q(\bar{x}_{i}- \bar{x}) \tilde{x}_{i}+\varepsilon Q\tilde{x}_{i}^{2}+2R \bar{u}_{i}u_{i}+\varepsilon Ru_{i}^{2} \bigr) \,\mathrm{d}t \\ &\qquad {} +2H\bar{x}_{i}(T)\tilde{x}_{i}(T)+\varepsilon H \tilde{x}_{i}^{2}(T) \biggr\} \\ &\quad =\varepsilon \mathbb{E} \biggl\{ \int _{0}^{T} \bigl(Q(\bar{x}_{i}- \bar{x})\tilde{x}_{i}+R \bar{u}_{i}u_{i} \bigr)\,\mathrm{d}t+H\bar{x}_{i}(T)\tilde{x}_{i}(T) \biggr\} \\ &\qquad {} +\frac{\varepsilon ^{2}}{2}\mathbb{E} \biggl\{ \int _{0}^{T} \bigl(Q\tilde{x}_{i}^{2}+Ru_{i}^{2} \bigr) \,\mathrm{d}t+H\tilde{x}_{i}^{2}(T) \biggr\} . \end{aligned}$$

On the other hand, applying Itô’s formula to \(\tilde{x}_{i}\bar{y}_{i}\) and taking expectation, we obtain

$$ \mathbb{E}\bigl[H\bar{x}_{i}(T)\tilde{x}_{i}(T) \bigr]=\mathbb{E}\biggl[ \int _{0}^{T} \bigl\{ (B\bar{y}_{i}+D \bar{z}_{i})u_{i}-Q(\bar{x}_{i}-\bar{x}) \tilde{x}_{i} \bigr\} \,\mathrm{d}t \biggr]. $$

Hence,

$$ \begin{aligned} &J_{i} \bigl(x_{i0},\bar{x}( \cdot ),u_{0}(\cdot );\bar{u}_{i}(\cdot )+\varepsilon u_{i}( \cdot ) \bigr)-J_{i} \bigl(x_{i0}, \bar{x}(\cdot ),u_{0}(\cdot );\bar{u}_{i}(\cdot ) \bigr) \\ &\quad =\varepsilon \mathbb{E} \biggl\{ \int _{0}^{T} (B\bar{y}_{i}+R \bar{u}_{i}+D \bar{z}_{i})u_{i}\,\mathrm{d}t \biggr\} +\frac{\varepsilon ^{2}}{2}\mathbb{E} \biggl\{ \int _{0}^{T} \bigl(Q\tilde{x}_{i}^{2}+Ru_{i}^{2} \bigr)\,\mathrm{d}t+H\tilde{x}_{i}^{2}(T) \biggr\} . \end{aligned}$$

It follows that

$$ \begin{aligned} &J_{i} \bigl(x_{i0}, \bar{x}( \cdot ),u_{0}(\cdot );\bar{u}_{i}(\cdot ) \bigr) \leqslant J_{i} \bigl(x_{i0},\bar{x}(\cdot ),u_{0}(\cdot );\bar{u}_{i}( \cdot )+\varepsilon u_{i}(\cdot ) \bigr), \\ &\quad \forall u_{i}(\cdot )\in \mathcal{U}_{i}[0,T], \forall \varepsilon \in \mathbb{R}, \end{aligned} $$

if and only if (4.2) and (4.3) hold. □
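The last equivalence rests on an elementary fact about quadratic polynomials, spelled out here for completeness. The shorthand \(a\) and \(b\) is introduced only for this remark and does not appear elsewhere. For fixed \(u_{i}(\cdot )\), the expansion above reads

$$ J_{i} \bigl(\bar{u}_{i}+\varepsilon u_{i} \bigr)-J_{i}(\bar{u}_{i})=a\varepsilon +\frac{b}{2}\varepsilon ^{2},\qquad a=\mathbb{E} \biggl\{ \int _{0}^{T} (B\bar{y}_{i}+R\bar{u}_{i}+D\bar{z}_{i})u_{i}\,\mathrm{d}t \biggr\} ,\qquad b=\mathbb{E} \biggl\{ \int _{0}^{T} \bigl(Q\tilde{x}_{i}^{2}+Ru_{i}^{2} \bigr)\,\mathrm{d}t+H\tilde{x}_{i}^{2}(T) \biggr\} . $$

The right-hand side is nonnegative for every \(\varepsilon \in \mathbb{R}\) if and only if \(a=0\) and \(b\geqslant 0\): if \(a\neq 0\), a small \(\varepsilon \) with sign opposite to \(a\) makes it negative, and if \(a=0\) the sign is that of \(b\). Requiring \(a=0\) for every \(u_{i}(\cdot )\in \mathcal{U}_{i}[0,T]\) gives (4.2), and \(b\geqslant 0\) is (4.3). Under the standing assumptions \(Q\geqslant 0\), \(R>0\), \(H\geqslant 0\), the integrand in \(b\) is nonnegative, so (4.3) holds automatically in the present setting.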

Since \(R>0\), the stationarity condition (4.2) can be solved for the control, and the optimal response is

$$ \bar{u}_{i}=-R^{-1}(B \bar{y}_{i}+D\bar{z}_{i}), $$
(4.5)

so the related Hamiltonian system can be represented by

$$ \textstyle\begin{cases} \mathrm{d} \bar{x}_{i}= \{A \bar{x}_{i}-BR^{-1}(B\bar{y}_{i}+D \bar{z}_{i})+E \bar{x}+\alpha x_{0} \}\,\mathrm{d}t \\ \hphantom{\mathrm{d} \bar{x}_{i}={}}{} + \{C\bar{x}_{i}-DR^{-1}(B\bar{y}_{i}+D \bar{z}_{i})+F\bar{x}+ \beta x_{0} \}\,\mathrm{d}W_{i}(t), \\ \mathrm{d} \bar{y}_{i}=- \{A\bar{y}_{i}+C \bar{z}_{i}+Q(\bar{x}_{i}-\bar{x}) \}\,\mathrm{d}t+ \bar{z}_{i}\,\mathrm{d}W_{i}(t), \\ \bar{x}_{i}(0)=x_{i0}, \qquad \bar{y}_{i}(T)=H \bar{x}_{i}(T),\quad i=1,2, \ldots ,N. \end{cases} $$

Based on the above analysis, we have

$$ \bar{x}(\cdot )=\lim_{N\rightarrow +\infty } \frac{1}{N}\sum_{i=1}^{N} \bar{x}_{i}(\cdot )=\mathbb{E}\bigl[\bar{x}_{i}(\cdot ) \bigr]. $$
(4.6)

Here, the first equality of (4.6) is due to the consistency condition, by which the frozen term \(\bar{x}(\cdot )\) should equal the average limit of all realized states \(\bar{x}_{i}(\cdot )\); the second equality is due to the law of large numbers on common noise. Thus, by replacing \(\bar{x}\) with \(\mathbb{E}[\bar{x}_{i}]\), we get the following system:

$$ \textstyle\begin{cases} \mathrm{d} \bar{x}_{i}= \{A\bar{x}_{i}-BR^{-1}(B \bar{y}_{i}+D\bar{z}_{i})+E \mathbb{E}[\bar{x}_{i}]+ \alpha x_{0} \}\,\mathrm{d}t \\ \hphantom{ \mathrm{d} \bar{x}_{i}={}}{} + \{C\bar{x}_{i}-DR^{-1}(B\bar{y}_{i}+D \bar{z}_{i})+F\mathbb{E}[ \bar{x}_{i}]+\beta x_{0} \}\,\mathrm{d}W_{i}(t), \\ \mathrm{d} \bar{y}_{i}=- \{A\bar{y}_{i}+C \bar{z}_{i}+Q (\bar{x}_{i}-\mathbb{E}[ \bar{x}_{i}] ) \}\,\mathrm{d}t+\bar{z}_{i}\,\mathrm{d}W_{i}(t), \\ \bar{x}_{i}(0)=x_{i0}, \qquad \bar{y}_{i}(T)=H \bar{x}_{i}(T), \quad i=1,2, \ldots ,N. \end{cases} $$
(4.7)
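The averaging mechanism behind (4.6) can be illustrated with a small Monte Carlo experiment. The sketch below is a simplified check under stated assumptions rather than a simulation of the optimal system (4.7): the follower controls are set to zero, the leader state is frozen at a constant, and all numerical values are illustrative. Because the leader state is deterministic in this sketch, conditioning on \(\mathcal{F}^{w_{0}}_{t}\) has no effect and the plain expectation is the relevant limit. Taking expectations in the limiting state equation with \(u_{i}=0\) gives the ODE \(m'=(A+E)m+\alpha x_{0}\) for the mean \(m\), which is compared with the empirical state average. The function name `average_vs_mean` is hypothetical.

```python
import numpy as np

def average_vs_mean(N, T=1.0, n_steps=200, seed=1):
    """Return (x^(N)(T), m(T)) for zero controls and a constant leader state.

    The N followers are coupled through their empirical average as in (2.2);
    m solves m' = (A + E) * m + alpha * x0 with m(0) = E[x_i0].
    """
    A, C, E, F, alpha, beta = -0.5, 0.2, 0.3, 0.1, 0.4, 0.2   # illustrative values
    x0, mean0, std0 = 1.0, 1.0, 0.5                           # illustrative values
    rng = np.random.default_rng(seed)
    dt = T / n_steps
    x = rng.normal(mean0, std0, N)   # iid initial states, as in (H1)
    m = mean0
    for _ in range(n_steps):
        avg = x.mean()
        dW = rng.normal(0.0, np.sqrt(dt), N)
        x = (x
             + (A * x + E * avg + alpha * x0) * dt
             + (C * x + F * avg + beta * x0) * dW)
        m = m + ((A + E) * m + alpha * x0) * dt   # same Euler step for the mean ODE
    return x.mean(), m

for N in (100, 10_000):
    avg, m = average_vs_mean(N)
    print(N, abs(avg - m))
```

The gap between the two quantities is pure sampling error here, since the Euler scheme for the mean is the expectation of the Euler-Maruyama scheme for the linear states.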

As all agents are statistically identical, we may suppress the subscript “i”, and the following consistency condition system arises for a “representative” agent:

$$ \textstyle\begin{cases} \mathrm{d} \bar{x}= \{A \bar{x}-BR^{-1}(B\bar{y}+D\bar{z})+E\mathbb{E}[\bar{x}]+ \alpha x_{0} \}\,\mathrm{d}t \\ \hphantom{\mathrm{d} \bar{x}= {}}{} + \{C\bar{x}-DR^{-1}(B\bar{y}+D\bar{z})+F\mathbb{E}[\bar{x}]+\beta x_{0} \}\,\mathrm{d}W(t), \\ \mathrm{d} \bar{y}=- \{A\bar{y}+C\bar{z}+Q (\bar{x}-\mathbb{E}[\bar{x}] ) \}\,\mathrm{d}t+ \bar{z}\,\mathrm{d}W(t), \\ \bar{x}(0)=x,\qquad \bar{y}(T)=H\bar{x}(T), \end{cases} $$
(4.8)

where \(W(\cdot )\) stands for a generic Brownian motion on \((\Omega ,\mathcal{F},\mathbb{P})\) that is independent of \(W_{0}\), the initial value x is a representative element of \(\{x_{i0}\}_{1\leqslant i\leqslant N}\), and \(x_{0}(\cdot )\) is a quantity to be determined by the further consistency condition analysis given later.
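As a quick consistency check on (4.8), one can take expectations on both sides; this step is not carried out in the text and is included only as an illustration. Assuming the stochastic integrals have zero mean (which holds for square-integrable adapted integrands) and using \(\mathbb{E}[\bar{x}-\mathbb{E}[\bar{x}]]=0\), the Q-term drops out and one obtains

$$ \textstyle\begin{cases} \mathrm{d} \mathbb{E}[\bar{x}]= \{(A+E)\mathbb{E}[\bar{x}]-BR^{-1}(B\mathbb{E}[\bar{y}]+D\mathbb{E}[\bar{z}])+\alpha \mathbb{E}[x_{0}] \}\,\mathrm{d}t, \\ \mathrm{d} \mathbb{E}[\bar{y}]=- \{A\mathbb{E}[\bar{y}]+C\mathbb{E}[\bar{z}] \}\,\mathrm{d}t, \\ \mathbb{E}[\bar{x}](0)=\mathbb{E}[x_{i0}],\qquad \mathbb{E}[\bar{y}](T)=H\mathbb{E}[\bar{x}](T). \end{cases} $$

This system is not closed in general because it involves \(\mathbb{E}[\bar{z}]\). In the special case \(C=D=0\), it reduces to a two-point boundary value problem for the pair \((\mathbb{E}[\bar{x}],\mathbb{E}[\bar{y}])\) driven by \(\mathbb{E}[x_{0}]\).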

4.2 Optimal strategy of the leader

Once Problem (II) is solved, we turn to finding the optimal control of the leader (agent \(\mathcal{A}_{0}\)). Note that when the followers take their optimal response \(\bar{u}_{i}(\cdot )\) given by (4.5), the leader ends up with the following state equation system:

$$ \textstyle\begin{cases} \mathrm{d} x_{0}=\{A_{0}x_{0}+B_{0}u_{0}+C_{0}z_{0} \}\,\mathrm{d}t+z_{0}\,\mathrm{d}W_{0}(t), \\ \mathrm{d} \bar{x}= \{A\bar{x}-BR^{-1}(B\bar{y}+D\bar{z})+E\mathbb{E}[ \bar{x}]+ \alpha x_{0} \}\,\mathrm{d}t \\ \hphantom{\mathrm{d} \bar{x}={}}{} + \{C\bar{x}-DR^{-1}(B\bar{y}+D\bar{z})+F\mathbb{E}[\bar{x}]+\beta x_{0} \}\,\mathrm{d}W(t), \\ \mathrm{d} \bar{y}=- \{A\bar{y}+C\bar{z}+Q (\bar{x}-\mathbb{E}[\bar{x}] ) \}\,\mathrm{d}t+ \bar{z}\,\mathrm{d}W(t), \\ x_{0}(T)=\xi ,\qquad \bar{x}(0)=x,\qquad \bar{y}(T)=H\bar{x}(T). \end{cases} $$
(4.9)

And its corresponding cost functional is

$$ J_{0}(u_{0})=\frac{1}{2} \mathbb{E} \biggl\{ \int _{0}^{T} \bigl[Q_{0} \bigl(x_{0}(t)- \bar{x}(t) \bigr)^{2}+ \tilde{Q}x_{0}^{2}(t)+R_{0} u_{0}^{2}(t) \bigr]\,\mathrm{d}t+H_{0} x_{0}^{2}(0) \biggr\} . $$
(4.10)

We present the optimal control problem for the leader as follows.

Problem (III)

When the followers take their optimal response \(\bar{u}_{i}(\cdot )\) given by (4.5), find the optimal control \(\bar{u}_{0}(\cdot )\in \mathcal{U}_{0}[0,T]\) such that

$$ J_{0} \bigl(\bar{u}_{0}(\cdot ) \bigr)= \inf _{u_{0}(\cdot )\in \mathcal{U}_{0}[0,T]}J_{0} \bigl(u_{0}( \cdot ) \bigr). $$

The main result of this section can be stated as follows.

Theorem 4.2

Under assumption (H1), suppose that the followers take their optimal response \(\bar{u}_{i}(\cdot )\) given by (4.5). Then, for the terminal value \(\xi \in L^{2}_{\mathcal{F}_{T}^{w_{0}}}(\Omega ;\mathbb{R})\), Problem (III) admits an optimal control \(\bar{u}_{0}(\cdot )\in \mathcal{U}_{0}[0,T]\) if and only if the following two conditions hold:

  1. (i)

    The adapted solution \((\bar{x}_{0}(\cdot ),\bar{z}_{0}(\cdot ),\bar{x}(\cdot ),\bar{y}(\cdot ), \bar{z}(\cdot ),\bar{y}_{0}(\cdot ),\bar{p}(\cdot ),\bar{q}(\cdot ),\bar{k}( \cdot ))\) to the FBSDE on \([0,T]\)

    $$ \textstyle\begin{cases} \mathrm{d} \bar{x}_{0}=\{A_{0}\bar{x}_{0}+B_{0} \bar{u}_{0}+C_{0}\bar{z}_{0} \}\,\mathrm{d}t+ \bar{z}_{0}\,\mathrm{d}W_{0}(t), \\ \mathrm{d} \bar{x}= \{A\bar{x}-BR^{-1}(B\bar{y}+D\bar{z})+E\mathbb{E}[ \bar{x}]+ \alpha \bar{x}_{0} \}\,\mathrm{d}t \\ \hphantom{\mathrm{d} \bar{x}={}}{} + \{C\bar{x}-DR^{-1}(B\bar{y}+D\bar{z})+F\mathbb{E}[\bar{x}]+\beta \bar{x}_{0} \}\,\mathrm{d}W(t), \\ \mathrm{d} \bar{y}=- \{A\bar{y}+C\bar{z}+Q (\bar{x}-\mathbb{E}[\bar{x}] ) \}\,\mathrm{d}t+ \bar{z}\,\mathrm{d}W(t), \\ \mathrm{d} \bar{y}_{0}=- \{A_{0} \bar{y}_{0}+\alpha \bar{p}+\beta \bar{q}+Q_{0}( \bar{x}_{0}-\bar{x})+\tilde{Q}\bar{x}_{0} \}\,\mathrm{d}t-C_{0}\bar{y}_{0}\,\mathrm{d}W_{0}(t), \\ \mathrm{d} \bar{p}=- \{A\bar{p}+C\bar{q}+E\mathbb{E}[\bar{p}]+F\mathbb{E}[\bar{q}]+Q \mathbb{E}[\bar{k}]-Q_{0}(\bar{x}_{0}-\bar{x})-Q\bar{k} \}\,\mathrm{d}t \\ \hphantom{ \mathrm{d} \bar{p}={}}{}+\bar{q}\,\mathrm{d}W(t), \\ \mathrm{d} \bar{k}= \{B^{2}R^{-1}\bar{p}+BDR^{-1} \bar{q}+A\bar{k} \}\,\mathrm{d}t+ \{BDR^{-1} \bar{p}+D^{2}R^{-1} \bar{q}+C\bar{k} \}\,\mathrm{d}W(t), \\ \bar{x}_{0}(T)=\xi ,\qquad \bar{x}(0)=x,\qquad \bar{y}(T)=H\bar{x}(T), \\ \bar{y}_{0}(0)=-H_{0}\bar{x}_{0}(0),\qquad \bar{p}(T)=-H\bar{k}(T),\qquad \bar{k}(0)=0, \end{cases} $$
    (4.11)

    satisfies the following stationarity condition:

    $$ B_{0}\bar{y}_{0}+R_{0} \bar{u}_{0}=0,\quad \textit{a.e. } t\in [0,T], \textit{a.s.} $$
    (4.12)
  2. (ii)

    The following convexity condition holds:

    $$ \mathbb{E}\biggl\{ \int _{0}^{T} \bigl(Q_{0}( \tilde{x}_{0}-\tilde{x})^{2}+ \tilde{Q} \tilde{x}_{0}^{2}+R_{0}u_{0}^{2} \bigr)\,\mathrm{d}t+H_{0}\tilde{x}_{0}^{2}(0) \biggr\} \geqslant 0, \quad \forall u_{0}(\cdot )\in \mathcal{U}_{0}[0,T], $$
    (4.13)

    where \((\tilde{x}_{0}(\cdot ),\tilde{z}_{0}(\cdot ),\tilde{x}(\cdot ),\tilde{y}( \cdot ),\tilde{z}(\cdot ))\) is the solution to the BFSDE

    $$ \textstyle\begin{cases} \mathrm{d} \tilde{x}_{0}=\{A_{0}\tilde{x}_{0}+B_{0}u_{0}+C_{0} \tilde{z}_{0} \}\,\mathrm{d}t+\tilde{z}_{0}\,\mathrm{d}W_{0}(t),\quad t\in [0,T], \\ \mathrm{d} \tilde{x}= \{A\tilde{x}-BR^{-1}(B\tilde{y}+D \tilde{z})+E\mathbb{E}[ \tilde{x}]+\alpha \tilde{x}_{0} \}\,\mathrm{d}t \\ \hphantom{\mathrm{d} \tilde{x}={}}{} + \{C\tilde{x}-DR^{-1}(B\tilde{y}+D\tilde{z})+F\mathbb{E}[ \tilde{x}]+ \beta \tilde{x}_{0} \}\,\mathrm{d}W(t), \\ \mathrm{d} \tilde{y}=- \{A\tilde{y}+C\tilde{z}+Q (\tilde{x}-\mathbb{E}[ \tilde{x}] ) \}\,\mathrm{d}t+\tilde{z}\,\mathrm{d}W(t), \\ \tilde{x}_{0}(T)=0,\qquad \tilde{x}(0)=0,\qquad \tilde{y}(T)=H \tilde{x}(T). \end{cases} $$
    (4.14)

    Or, equivalently, the mapping \(u_{0}(\cdot )\mapsto J_{0}(u_{0}(\cdot ))\), defined by (4.10), is convex.

Proof

For given \(\xi \in L^{2}_{\mathcal{F}_{T}^{w_{0}}}(\Omega ;\mathbb{R})\) and \(\bar{u}_{0}(\cdot )\in \mathcal{U}_{0}[0,T]\), let \((\bar{x}_{0}(\cdot ),\bar{z}_{0}(\cdot ),\bar{x}(\cdot ),\bar{y}(\cdot ), \bar{z}(\cdot ),\bar{y}_{0}(\cdot ),\bar{p}(\cdot ), \bar{q}(\cdot ),\bar{k}( \cdot ))\) be an adapted solution to FBSDE (4.11). For any \(u_{0}(\cdot )\in \mathcal{U}_{0}[0,T]\) and \(\varepsilon \in \mathbb{R}\), let \((x_{0}^{\varepsilon }(\cdot ),z_{0}^{\varepsilon }(\cdot ),x^{\varepsilon }(\cdot ), y^{\varepsilon }(\cdot ),z^{\varepsilon }(\cdot ))\) be the solution to the following perturbed state equation on \([0,T]\):

$$ \textstyle\begin{cases} \mathrm{d} x_{0}^{\varepsilon }= \{A_{0}x_{0}^{\varepsilon }+B_{0}(\bar{u}_{0}+\varepsilon u_{0})+C_{0}z_{0}^{\varepsilon }\}\,\mathrm{d}t+z_{0}^{\varepsilon }\,\mathrm{d}W_{0}(t), \\ \mathrm{d} x^{\varepsilon }= \{Ax^{\varepsilon }-BR^{-1} (By^{\varepsilon }+Dz^{\varepsilon })+E\mathbb{E}[x^{\varepsilon }]+\alpha x_{0}^{\varepsilon }\}\,\mathrm{d}t \\ \hphantom{\mathrm{d} x^{\varepsilon }={}}{} + \{Cx^{\varepsilon }-DR^{-1} (By^{\varepsilon }+Dz^{\varepsilon })+F\mathbb{E}[x^{\varepsilon }]+\beta x_{0}^{\varepsilon }\} \,\mathrm{d}W(t), \\ \mathrm{d} y^{\varepsilon }=- \{Ay^{\varepsilon }+Cz^{\varepsilon }+Q (x^{\varepsilon }-\mathbb{E}[x^{\varepsilon }] ) \}\,\mathrm{d}t+z^{\varepsilon }\,\mathrm{d}W(t), \\ x_{0}^{\varepsilon }(T)=\xi ,\qquad x^{\varepsilon }(0)=x,\qquad y^{\varepsilon }(T)=Hx^{\varepsilon }(T). \end{cases} $$

Then, denoting by \((\tilde{x}_{0}(\cdot ),\tilde{z}_{0}(\cdot ),\tilde{x},\tilde{y}, \tilde{z})\) the solution of (4.14), we have \(x_{0}^{\varepsilon }(\cdot )=\bar{x}_{0}(\cdot )+\varepsilon \tilde{x}_{0}(\cdot )\), \(z_{0}^{\varepsilon }(\cdot )=\bar{z}_{0}(\cdot )+\varepsilon \tilde{z}_{0}(\cdot )\), \(x^{\varepsilon }(\cdot )=\bar{x}(\cdot )+\varepsilon \tilde{x}(\cdot )\), \(y^{\varepsilon }(\cdot )=\bar{y}(\cdot )+\varepsilon \tilde{y}(\cdot )\), \(z^{\varepsilon }(\cdot )=\bar{z}(\cdot )+\varepsilon \tilde{z}(\cdot )\), and

$$ \begin{aligned} &J_{0} \bigl(\bar{u}_{0}( \cdot )+\varepsilon u_{0}(\cdot ) \bigr)-J_{0} \bigl( \bar{u}_{0}(\cdot ) \bigr) \\ &\quad =\varepsilon \mathbb{E}\biggl\{ \int _{0}^{T} \bigl(Q_{0}( \bar{x}_{0}-\bar{x}) (\tilde{x}_{0}- \tilde{x})+ \tilde{Q}\bar{x}_{0}\tilde{x}_{0}+R_{0} \bar{u}_{0}u_{0} \bigr) \,\mathrm{d}t+H_{0} \bar{x}_{0}(0)\tilde{x}_{0}(0) \biggr\} \\ &\qquad {} +\frac{\varepsilon ^{2}}{2}\mathbb{E}\biggl\{ \int _{0}^{T} \bigl(Q_{0}( \tilde{x}_{0}-\tilde{x})^{2}+\tilde{Q} \tilde{x}_{0}^{2}+R_{0}u_{0}^{2} \bigr)\,\mathrm{d}t+H_{0}\tilde{x}_{0}^{2}(0) \biggr\} . \end{aligned}$$

On the other hand, applying Itô’s formula to \(\tilde{x}_{0}\bar{y}_{0}+\tilde{x}\bar{p}+\tilde{y}\bar{k}\) and taking expectation, we obtain

$$ \mathbb{E}\bigl[H_{0}\bar{x}_{0}(0)\tilde{x}_{0}(0) \bigr]=\mathbb{E}\biggl[ \int _{0}^{T} \bigl(B_{0} \bar{y}_{0}u_{0}-Q_{0}( \bar{x}_{0}-\bar{x}) (\tilde{x}_{0}- \tilde{x})- \tilde{Q}\bar{x}_{0}\tilde{x}_{0} \bigr)\,\mathrm{d}t \biggr]. $$

Hence,

$$ \begin{aligned} &J_{0} \bigl(\bar{u}_{0}( \cdot )+\varepsilon u_{0}(\cdot ) \bigr)-J_{0} \bigl( \bar{u}_{0}(\cdot ) \bigr) \\ &\quad =\varepsilon \mathbb{E} \biggl\{ \int _{0}^{T} (B_{0} \bar{y}_{0}+R_{0}\bar{u}_{0})u_{0} \,\mathrm{d}t \biggr\} \\ &\qquad {}+\frac{\varepsilon ^{2}}{2}\mathbb{E}\biggl\{ \int _{0}^{T} \bigl(Q_{0}( \tilde{x}_{0}-\tilde{x})^{2}+\tilde{Q} \tilde{x}_{0}^{2}+R_{0}u_{0}^{2} \bigr)\,\mathrm{d}t+H_{0}\tilde{x}_{0}^{2}(0) \biggr\} . \end{aligned}$$

It follows that

$$ J_{0} \bigl(\bar{u}_{0}(\cdot ) \bigr)\leqslant J_{0} \bigl(\bar{u}_{0}(\cdot )+\varepsilon u_{0}( \cdot ) \bigr),\quad \forall u_{0}(\cdot )\in \mathcal{U}_{0}[0,T], \forall \varepsilon \in \mathbb{R}, $$

if and only if (4.12) and (4.13) hold. □
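To spell out the last equivalence, one may read the expansion above as a quadratic polynomial in \(\varepsilon \). The shorthands \(a(u_{0})\) and \(b(u_{0})\) below are introduced here only for illustration and do not appear in the original argument:

$$ J_{0} \bigl(\bar{u}_{0}+\varepsilon u_{0} \bigr)-J_{0}(\bar{u}_{0})=\varepsilon a(u_{0})+\frac{\varepsilon ^{2}}{2}b(u_{0}),\qquad a(u_{0}):=\mathbb{E}\int _{0}^{T}(B_{0}\bar{y}_{0}+R_{0}\bar{u}_{0})u_{0}\,\mathrm{d}t, $$

where \(b(u_{0})\) denotes the left-hand side of (4.13). If \(a(u_{0})\neq 0\) for some \(u_{0}\), then taking \(\vert \varepsilon \vert \) small with the sign of \(\varepsilon \) opposite to that of \(a(u_{0})\) makes the right-hand side negative. If \(a(u_{0})=0\), the sign of the right-hand side is that of \(b(u_{0})\). Hence the difference is non-negative for all \(\varepsilon \in \mathbb{R}\) and all \(u_{0}(\cdot )\) exactly when \(a(u_{0})=0\) for every \(u_{0}(\cdot )\), which is (4.12), and \(b(u_{0})\geqslant 0\), which is (4.13).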

Since \(R_{0}>0\), the stationarity condition (4.12) yields the optimal control of the leader agent \(\mathcal{A}_{0}\):

$$ \bar{u}_{0}=-R_{0}^{-1}B_{0} \bar{y}_{0}, $$
(4.15)

so that, substituting (4.15) into (4.11), we obtain the consistency condition system for the auxiliary problems:

$$ \textstyle\begin{cases} \mathrm{d} \bar{x}_{0}= \{A_{0}\bar{x}_{0}-B_{0}^{2} R_{0}^{-1}\bar{y}_{0}+C_{0} \bar{z}_{0} \}\,\mathrm{d}t+\bar{z}_{0}\,\mathrm{d}W_{0}(t), \\ \mathrm{d} \bar{x}= \{A\bar{x}-BR^{-1}(B\bar{y}+D\bar{z})+E\mathbb{E}[ \bar{x}]+ \alpha \bar{x}_{0} \}\,\mathrm{d}t \\ \hphantom{\mathrm{d} \bar{x}={}}{} + \{C\bar{x}-DR^{-1}(B\bar{y}+D\bar{z})+F\mathbb{E}[\bar{x}]+\beta \bar{x}_{0} \}\,\mathrm{d}W(t), \\ \mathrm{d} \bar{y}=- \{A\bar{y}+C\bar{z}+Q (\bar{x}-\mathbb{E}[\bar{x}] ) \}\,\mathrm{d}t+ \bar{z}\,\mathrm{d}W(t), \\ \mathrm{d} \bar{y}_{0}=- \{A_{0} \bar{y}_{0}+\alpha \bar{p}+\beta \bar{q}+Q_{0}( \bar{x}_{0}-\bar{x})+\tilde{Q}\bar{x}_{0} \}\,\mathrm{d}t-C_{0}\bar{y}_{0}\,\mathrm{d}W_{0}(t), \\ \mathrm{d} \bar{p}=- \{A\bar{p}+C\bar{q}+E\mathbb{E}[\bar{p}]+F\mathbb{E}[\bar{q}]+Q \mathbb{E}[\bar{k}]-Q_{0}(\bar{x}_{0}-\bar{x})-Q\bar{k} \}\,\mathrm{d}t \\ \hphantom{\mathrm{d} \bar{p}={}}{}+\bar{q}\,\mathrm{d}W(t), \\ \mathrm{d} \bar{k}= \{B^{2}R^{-1}\bar{p}+BDR^{-1} \bar{q}+A\bar{k} \}\,\mathrm{d}t+ \{BDR^{-1} \bar{p}+D^{2}R^{-1} \bar{q}+C\bar{k} \}\,\mathrm{d}W(t), \\ \bar{x}_{0}(T)=\xi , \qquad \bar{x}(0)=x, \qquad \bar{y}(T)=H\bar{x}(T), \\ \bar{y}_{0}(0)=-H_{0}\bar{x}_{0}(0),\qquad \bar{p}(T)=-H\bar{k}(T),\qquad \bar{k}(0)=0. \end{cases} $$
(4.16)

5 The consistency condition system

By the results of the previous section, the optimal response of the followers and the optimal control of the leader are determined once the coupled BFSDE (4.16) is shown to be well-posed. In this section, we verify this well-posedness (see [19] for the method), which is essential for the decentralized strategy design. To this end, we impose the following assumption:

  1. (H2)

    \(B_{0}\neq 0\), \(H_{0}>0\), \(\tilde{Q}>0\).

Theorem 5.1

Under assumption (H2), FBSDE (4.16) is uniquely solvable.

Proof

Uniqueness. For the sake of notational convenience, in (4.16) we denote by \(b(\phi )\), \(\sigma (\phi )\) the coefficients of drift and diffusion terms, respectively, for \(\phi =\bar{y}_{0},\bar{x},\bar{k}\); denote by \(f(\psi )\) the generator for \(\psi =\bar{x}_{0},\bar{p},\bar{y}\).

Define \(\Delta :=(\bar{y}_{0},\bar{x},\bar{k},\bar{x}_{0},\bar{p},\bar{y}, \bar{z}_{0},\bar{q},\bar{z})\). Following the notation of Peng and Wu [19], we set

$$ \mathbb{A}(t,\Delta ):= \bigl(-f(\bar{x}_{0}),-f(\bar{p}),-f(\bar{y}), b( \bar{y}_{0}),b(\bar{x}),b(\bar{k}),\sigma (\bar{y}_{0}), \sigma (\bar{x}), \sigma (\bar{k}) \bigr), $$

which implies \(\mathbb{A}(t,\Delta )= (A_{0}\bar{x}_{0}-B_{0}^{2} R_{0}^{-1}\bar{y}_{0}+C_{0} \bar{z}_{0}, - (A\bar{p}+C\bar{q}+E\mathbb{E}[\bar{p}]+F\mathbb{E}[\bar{q}]+Q \mathbb{E}[\bar{k}]-Q_{0}(\bar{x}_{0}-\bar{x})-Q\bar{k} ), - (A \bar{y}+C\bar{z}+Q(\bar{x}-\mathbb{E}[\bar{x}]) ), - (A_{0}\bar{y}_{0}+ \alpha \bar{p}+\beta \bar{q}+Q_{0}(\bar{x}_{0}-\bar{x})+\tilde{Q}\bar{x}_{0} ), A\bar{x}-B^{2}R^{-1}\bar{y}-BDR^{-1}\bar{z}+E\mathbb{E}[\bar{x}]+ \alpha \bar{x}_{0}, B^{2}R^{-1}\bar{p}+BDR^{-1}\bar{q}+A\bar{k}, -C_{0} \bar{y}_{0}, C\bar{x}-BDR^{-1}\bar{y}-D^{2}R^{-1}\bar{z}+F\mathbb{E}[ \bar{x}]+\beta \bar{x}_{0}, BDR^{-1}\bar{p}+D^{2}R^{-1}\bar{q}+C\bar{k} )\).

Then, for any \(\Delta ^{i}=(\bar{y}_{0}^{i},\bar{x}^{i},\bar{k}^{i},\bar{x}_{0}^{i}, \bar{p}^{i},\bar{y}^{i},\bar{z}_{0}^{i},\bar{q}^{i}, \bar{z}^{i})\), \(i=1,2\), we have

$$ \begin{aligned} &\mathbb{E}{ \bigl\langle }\mathbb{A}\bigl(t,\Delta ^{1} \bigr)-\mathbb{A}\bigl(t,\Delta ^{2} \bigr),\Delta ^{1}-\Delta ^{2} { \bigr\rangle } \\ &\quad =\mathbb{E}\bigl[-B_{0}^{2}R_{0}^{-1} \bigl(\bar{y}_{0}^{1}-\bar{y}_{0}^{2} \bigr)^{2}-Q_{0} \bigl[ \bigl(\bar{x}^{1}- \bar{x}^{2} \bigr)- \bigl(\bar{x}_{0}^{1}- \bar{x}_{0}^{2} \bigr) \bigr]^{2}-\tilde{Q} \bigl(\bar{x}_{0}^{1}-\bar{x}_{0}^{2} \bigr)^{2} \bigr] \\ &\quad \leqslant \mathbb{E}\bigl[-B_{0}^{2}R_{0}^{-1} \bigl(\bar{y}_{0}^{1}-\bar{y}_{0}^{2} \bigr)^{2}- \tilde{Q} \bigl(\bar{x}_{0}^{1}- \bar{x}_{0}^{2} \bigr)^{2} \bigr] \\ &\quad :=\mathbb{E}\bigl[-\beta _{1} \bigl(\bar{y}_{0}^{1}- \bar{y}_{0}^{2} \bigr)^{2}-\beta _{2} \bigl( \bar{x}_{0}^{1}- \bar{x}_{0}^{2} \bigr)^{2} \bigr]. \end{aligned} $$
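The cancellations behind this identity can be checked numerically in a degenerate special case. The sketch below is not part of the original proof: it assumes scalar coefficients and a deterministic difference \(\Delta ^{1}-\Delta ^{2}\), so that every expectation \(\mathbb{E}[\cdot ]\) reduces to the identity. The function names and the dictionary layout are hypothetical.

```python
import random

COEFFS = ["A0", "B0", "C0", "R0", "A", "B", "C", "D", "E", "F",
          "Q", "Q0", "Qt", "R", "alpha", "beta"]
COMPONENTS = ["y0", "x", "k", "x0", "p", "y", "z0", "q", "z"]


def monotonicity_gap(c, d):
    """Return <A(d), d> minus the claimed quadratic form.

    c: scalar coefficients of (4.16), with Qt standing for Q-tilde.
    d: the nine components of Delta^1 - Delta^2, taken deterministic,
       so E[.] acts as the identity (an assumption of this sketch).
    """
    A0, B0, C0, R0 = c["A0"], c["B0"], c["C0"], c["R0"]
    A, B, C, D, E, F = c["A"], c["B"], c["C"], c["D"], c["E"], c["F"]
    Q, Q0, Qt, R = c["Q"], c["Q0"], c["Qt"], c["R"]
    al, be = c["alpha"], c["beta"]
    y0, x, k, x0, p, y, z0, q, z = (d[name] for name in COMPONENTS)
    # Components of A(t, Delta), in the order in which they are paired
    # with (y0, x, k, x0, p, y, z0, q, z).
    comps = [
        A0 * x0 - B0**2 / R0 * y0 + C0 * z0,
        -(A * p + C * q + E * p + F * q + Q * k - Q0 * (x0 - x) - Q * k),
        -(A * y + C * z + Q * (x - x)),
        -(A0 * y0 + al * p + be * q + Q0 * (x0 - x) + Qt * x0),
        A * x - B**2 / R * y - B * D / R * z + E * x + al * x0,
        B**2 / R * p + B * D / R * q + A * k,
        -C0 * y0,
        C * x - B * D / R * y - D**2 / R * z + F * x + be * x0,
        B * D / R * p + D**2 / R * q + C * k,
    ]
    delta = [y0, x, k, x0, p, y, z0, q, z]
    inner = sum(a * b for a, b in zip(comps, delta))
    claimed = -B0**2 / R0 * y0**2 - Q0 * (x - x0)**2 - Qt * x0**2
    return inner - claimed


def random_instance(rng):
    """Draw positive coefficients and an arbitrary difference vector."""
    c = {name: rng.uniform(0.5, 2.0) for name in COEFFS}
    d = {name: rng.uniform(-1.0, 1.0) for name in COMPONENTS}
    return c, d


rng = random.Random(0)
c, d = random_instance(rng)
print(abs(monotonicity_gap(c, d)))  # should be at the level of rounding error
```

In the stochastic case the same cancellations hold after taking expectations, since the mean-field terms pair up as \(\mathbb{E}[\widehat {x}\,\mathbb{E}[\widehat {p}]]=\mathbb{E}[\widehat {p}\,\mathbb{E}[\widehat {x}]]\) and similarly for the other products.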

We first show that (4.16) admits at most one adapted solution. Suppose that \(\Delta ^{i}\), \(i=1,2\), are two solutions of (4.16). Setting \(\widehat {\Delta }=(\widehat {y}_{0},\widehat {x},\widehat {k},\widehat {x}_{0},\widehat {p}, \widehat {y},\widehat {z}_{0},\widehat {q}, \widehat {z})= (\bar{y}_{0}^{1}-\bar{y}_{0}^{2},\bar{x}^{1}-\bar{x}^{2}, \bar{k}^{1}-\bar{k}^{2},\bar{x}_{0}^{1}-\bar{x}_{0}^{2}, \bar{p}^{1}- \bar{p}^{2},\bar{y}^{1}-\bar{y}^{2},\bar{z}_{0}^{1}-\bar{z}_{0}^{2}, \bar{q}^{1}-\bar{q}^{2},\bar{z}^{1}-\bar{z}^{2})\) and applying Itô’s formula to \(\langle \widehat {y}_{0},\widehat {x}_{0}\rangle +\langle \widehat {x},\widehat {p}\rangle +\langle \widehat {k},\widehat {y}\rangle \), we have

$$ \begin{aligned} -\mathbb{E}\bigl\langle \widehat {y}_{0}(0),\widehat {x}_{0}(0) \bigr\rangle &=\mathbb{E}\biggl[ \int _{0}^{T} { \bigl\langle }\mathbb{A}\bigl(s, \Delta ^{1} \bigr)-\mathbb{A}\bigl(s,\Delta ^{2} \bigr), \widehat { \Delta } { \bigr\rangle }\,\mathrm{d}s \biggr] \\ &\leqslant -\beta _{1}\mathbb{E}\biggl[ \int _{0}^{T} \bigl(\bar{y}_{0}^{1}- \bar{y}_{0}^{2} \bigr)^{2} \,\mathrm{d}s \biggr]- \beta _{2}\mathbb{E}\biggl[ \int _{0}^{T} \bigl(\bar{x}_{0}^{1}- \bar{x}_{0}^{2} \bigr)^{2} \,\mathrm{d}s \biggr]. \end{aligned} $$

It follows that

$$ \beta _{1}\mathbb{E}\biggl[ \int _{0}^{T} \bigl\vert \widehat {y}_{0}(s) \bigr\vert ^{2}\,\mathrm{d}s \biggr]+\beta _{2} \mathbb{E}\biggl[ \int _{0}^{T} \bigl\vert \widehat {x}_{0}(s) \bigr\vert ^{2}\,\mathrm{d}s \biggr]+H_{0}\mathbb{E}\bigl\vert \widehat {x}_{0}(0) \bigr\vert ^{2} \leqslant 0. $$

By (H2) and \(R_{0}>0\), we have \(\beta _{1}=B_{0}^{2}R_{0}^{-1}>0\) and \(\beta _{2}=\tilde{Q}>0\). Hence \(\widehat {y}_{0}(s)\equiv 0\) and \(\widehat {x}_{0}(s)\equiv 0\), and consequently \(\widehat {z}_{0}(s)\equiv 0\). Applying standard SDE and BSDE estimates to \(\widehat {x}(s)\) and \(\widehat {y}(s)\) and using Gronwall’s inequality, we obtain \(\widehat {x}(s)\equiv 0\), \(\widehat {y}(s)\equiv 0\), and \(\widehat {z}(s)\equiv 0\). Similarly, we have \(\widehat {k}(s)\equiv 0\), \(\widehat {p}(s)\equiv 0\), and \(\widehat {q}(s)\equiv 0\). Therefore, (4.16) admits at most one adapted solution.

Existence. In order to prove the existence of the solution, we first consider the following family of FBSDEs parameterized by \(\gamma \in [0,1]\):

$$ \textstyle\begin{cases} \mathrm{d} \bar{y}_{0}^{\gamma }= [-(1-\gamma ) \bar{x}_{0}^{\gamma }\beta _{2}+\gamma b ( \bar{y}_{0}^{\gamma })+\varphi _{t}^{1} ]\,\mathrm{d}t+ [\gamma \sigma (\bar{y}_{0}^{\gamma })+\lambda _{t}^{1} ]\,\mathrm{d}W_{0}(t), \\ \mathrm{d} \bar{x}_{0}^{\gamma }= [-(1-\gamma ) \bar{y}_{0}^{\gamma }\beta _{1}-\gamma f ( \bar{x}_{0}^{\gamma })+\kappa _{t}^{1} ]\,\mathrm{d}t+\bar{z}_{0}^{\gamma }\,\mathrm{d}W_{0}(t), \\ \mathrm{d} \bar{x}^{\gamma }= [\gamma b (\bar{x}^{\gamma })+\varphi _{t}^{2} ]\,\mathrm{d}t+ [\gamma \sigma (\bar{x}^{\gamma })+\lambda _{t}^{2} ]\,\mathrm{d}W(t), \\ \mathrm{d} \bar{p}^{\gamma }= [-\gamma f (\bar{p}^{\gamma })+\kappa _{t}^{2} ]\,\mathrm{d}t+ \bar{q}^{\gamma }\,\mathrm{d}W(t), \\ \mathrm{d} \bar{k}^{\gamma }= [\gamma b (\bar{k}^{\gamma })+\varphi _{t}^{3} ]\,\mathrm{d}t+ [\gamma \sigma (\bar{k}^{\gamma })+\lambda _{t}^{3} ]\,\mathrm{d}W(t), \\ \mathrm{d} \bar{y}^{\gamma }= [-\gamma f (\bar{y}^{\gamma })+\kappa _{t}^{3} ]\,\mathrm{d}t+ \bar{z}^{\gamma }\,\mathrm{d}W(t), \\ \bar{y}_{0}^{\gamma }(0)=-(1-\gamma )\bar{x}_{0}^{\gamma }(0)-\gamma H_{0}\bar{x}_{0}^{\gamma }(0)+a,\qquad \bar{x}_{0}^{\gamma }(T)=\gamma \xi , \\ \bar{x}^{\gamma }(0)=\gamma x,\qquad \bar{p}^{\gamma }(T)=-\gamma H \bar{k}^{\gamma }(T), \qquad \bar{k}^{\gamma }(0)=0,\qquad \bar{y}^{\gamma }(T)=\gamma H\bar{x}^{\gamma }(T), \end{cases} $$
(5.1)

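where \((\varphi ^{1},\varphi ^{2},\varphi ^{3},\lambda ^{1},\lambda ^{2},\lambda ^{3},\kappa ^{1}, \kappa ^{2},\kappa ^{3})\in L^{2}_{\mathcal{F}}(0,T;\mathbb{R}^{9})\), \(a\in L^{2}_{\mathcal{F}^{w_{0}}}(\Omega ;\mathbb{R})\). Clearly, for \(\gamma =1\), the existence of a solution to (5.1) implies that of a solution to (4.16). For \(\gamma =0\), the equations for \(\bar{x}^{\gamma }\), \(\bar{p}^{\gamma }\), \(\bar{k}^{\gamma }\), \(\bar{y}^{\gamma }\) no longer depend on the unknowns, and it is easy to see that (5.1) admits a unique solution. Indeed, the remaining 2-dimensional FBSDE for \((\bar{y}_{0}^{\gamma },\bar{x}_{0}^{\gamma })\) is very similar to the Hamiltonian system of Lim and Zhou (2001) [16].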

Suppose that, for a certain number \(\gamma _{0}\in [0,1)\) and for each \((\varphi ^{1},\varphi ^{2},\varphi ^{3},\lambda ^{1},\lambda ^{2},\lambda ^{3},\kappa ^{1}, \kappa ^{2},\kappa ^{3})\in L^{2}_{\mathcal{F}}(0,T;\mathbb{R}^{9})\), there exists a unique solution \((\bar{y}_{0}^{\gamma },\bar{x}^{\gamma },\bar{k}^{\gamma },\bar{x}_{0}^{\gamma },\bar{p}^{\gamma }, \bar{y}^{\gamma },\bar{z}_{0}^{\gamma },\bar{q}^{\gamma },\bar{z}^{\gamma })\) of (5.1) with \(\gamma =\gamma _{0}\). Then, for each \(u_{s}= (\bar{y}_{0}(s),\bar{x}(s),\bar{k}(s),\bar{x}_{0}(s), \bar{p}(s),\bar{y}(s),\bar{z}_{0}(s),\bar{q}(s),\bar{z}(s) ) \in L^{2}_{ \mathcal{F}}(0,T;\mathbb{R}^{9})\) and each parameter \(\delta \geqslant 0\) (to be chosen below), there exists a unique tuple \(U_{s}= (\bar{Y}_{0}(s),\bar{X}(s),\bar{K}(s),\bar{X}_{0}(s), \bar{P}(s),\bar{Y}(s),\bar{Z}_{0}(s),\bar{Q}(s),\bar{Z}(s) ) \in L^{2}_{ \mathcal{F}}(0,T;\mathbb{R}^{9})\) satisfying the following FBSDEs:

$$ \textstyle\begin{cases} \mathrm{d} \bar{Y}_{0}= [-(1-\gamma _{0})\bar{X}_{0} \beta _{2}+\gamma _{0} b( \bar{Y}_{0})+ \delta (\bar{x}_{0}\beta _{2}+b( \bar{y}_{0}) )+\varphi _{t}^{1} ] \,\mathrm{d}t \\ \hphantom{\mathrm{d} \bar{Y}_{0}={}}{}+ [\gamma _{0}\sigma (\bar{Y}_{0})+\lambda _{t}^{1} ]\,\mathrm{d}W_{0}(t), \\ \mathrm{d} \bar{X}_{0}= [-(1-\gamma _{0}) \bar{Y}_{0}\beta _{1}-\gamma _{0} f( \bar{X}_{0})+\delta (\bar{y}_{0}\beta _{1}-f(\bar{x}_{0}) )+\kappa _{t}^{1} ] \,\mathrm{d}t+\bar{Z}_{0}\,\mathrm{d}W_{0}(t), \\ \mathrm{d} \bar{X}= [\gamma _{0} b(\bar{X})+\delta b(\bar{x})+ \varphi _{t}^{2} ]\,\mathrm{d}t+ [\gamma _{0}\sigma (\bar{X})+\lambda _{t}^{2} ]\,\mathrm{d}W(t), \\ \mathrm{d} \bar{P}= [-\gamma _{0} f(\bar{P})-\delta f(\bar{p})+ \kappa _{t}^{2} ]\,\mathrm{d}t+\bar{Q}\,\mathrm{d}W(t), \\ \mathrm{d} \bar{K}= [\gamma _{0} b(\bar{K})+\delta b(\bar{k})+ \varphi _{t}^{3} ]\,\mathrm{d}t+ [\gamma _{0}\sigma (\bar{K})+\lambda _{t}^{3} ]\,\mathrm{d}W(t), \\ \mathrm{d} \bar{Y}= [-\gamma _{0} f(\bar{Y})-\delta f(\bar{y})+ \kappa _{t}^{3} ]\,\mathrm{d}t+\bar{Z}\,\mathrm{d}W(t), \\ \bar{Y}_{0}(0)=-(1-\gamma _{0})\bar{X}_{0}(0)- \gamma _{0} H_{0}\bar{X}_{0}(0)+ \delta (1-H_{0})\bar{x}_{0}(0)+a, \\ \bar{X}_{0}(T)= \gamma _{0}\xi +\delta \xi , \\ \bar{X}(0)=\gamma _{0} x+\delta x, \qquad \bar{P}(T)=-\gamma _{0} H\bar{K}(T),\qquad \bar{K}(0)=0,\qquad \bar{Y}(T)=\gamma _{0} H \bar{X}(T). \end{cases} $$
(5.2)

In the following, we aim to prove that the mapping defined by

$$ I_{\gamma _{0}+\delta } \bigl(u\times \bar{x}_{0}(0) \bigr)=U\times \bar{X}_{0}(0):L^{2}_{ \mathcal{F}} \bigl(0,T;\mathbb{R}^{9} \bigr)\times L^{2}_{\mathcal{F}}(\Omega ;\mathbb{R}) \rightarrow L^{2}_{ \mathcal{F}} \bigl(0,T;\mathbb{R}^{9} \bigr)\times L^{2}_{\mathcal{F}}(\Omega ;\mathbb{R}) $$

is a contraction.

Introduce \(u'= (\bar{y}'_{0},\bar{x}',\bar{k}',\bar{x}'_{0},\bar{p}', \bar{y}',\bar{z}'_{0},\bar{q}',\bar{z}' )\in L^{2}_{\mathcal{F}}(0,T; \mathbb{R}^{9})\), \(U'\times \bar{X}'_{0}(0)=I_{\gamma _{0}+\delta }(u'\times \bar{x}'_{0}(0))\) and set

$$\begin{aligned}& \begin{aligned} \widehat {u}&=(\widehat {y}_{0},\widehat {x}, \widehat {k},\widehat {x}_{0},\widehat {p},\widehat {y},\widehat {z}_{0},\widehat {q}, \widehat {z}) \\ &= \bigl(\bar{y}_{0}-\bar{y}'_{0}, \bar{x}-\bar{x}',\bar{k}-\bar{k}', \bar{x}_{0}- \bar{x}'_{0}, \bar{p}- \bar{p}',\bar{y}-\bar{y}',\bar{z}_{0}- \bar{z}'_{0}, \bar{q}-\bar{q}',\bar{z}- \bar{z}' \bigr), \end{aligned} \\& \begin{aligned} \widehat {U}&=(\widehat {Y}_{0},\widehat {X},\widehat {K},\widehat {X}_{0},\widehat {P},\widehat {Y},\widehat {Z}_{0},\widehat {Q}, \widehat {Z}) \\ &= \bigl(\bar{Y}_{0}-\bar{Y}'_{0}, \bar{X}-\bar{X}',\bar{K}-\bar{K}', \bar{X}_{0}- \bar{X}'_{0}, \bar{P}- \bar{P}',\bar{Y}-\bar{Y}',\bar{Z}_{0}- \bar{Z}'_{0}, \bar{Q}-\bar{Q}',\bar{Z}- \bar{Z}' \bigr). \end{aligned} \end{aligned}$$

Applying Itô’s formula to \(\langle \widehat {Y}_{0},\widehat {X}_{0}\rangle +\langle \widehat {X},\widehat {P}\rangle +\langle \widehat {K},\widehat {Y}\rangle \), we have

$$ \begin{aligned} & \bigl(\gamma _{0}H_{0}+(1- \gamma _{0}) \bigr)\mathbb{E}\bigl\vert \widehat {X}_{0}(0) \bigr\vert ^{2}+\mathbb{E}\biggl[ \int _{0}^{T} \bigl(\beta _{1} \bigl\vert \widehat {Y}_{0}(s) \bigr\vert ^{2}+\beta _{2} \bigl\vert \widehat {X}_{0}(s) \bigr\vert ^{2} \bigr)\,\mathrm{d}s \biggr] \\ &\quad \leqslant \delta C_{1}\mathbb{E}\biggl[ \int _{0}^{T} \bigl( \vert \widehat {u}_{s} \vert ^{2}+ \vert \widehat {U}_{s} \vert ^{2} \bigr)\,\mathrm{d}s \biggr]+\delta C_{1}\mathbb{E}\bigl\vert \widehat {X}_{0}(0) \bigr\vert ^{2}. \end{aligned} $$
(5.3)

On the other hand, since \(\bar{Y}_{0}\) and \(\bar{Y}'_{0}\) solve Itô-type SDEs, standard SDE estimates for the difference \(\widehat {Y}_{0}=\bar{Y}_{0}-\bar{Y}'_{0}\) yield

$$ \begin{aligned} \mathbb{E}\biggl[ \int _{0}^{T} \bigl\vert \widehat {Y}_{0}(s) \bigr\vert ^{2}\,\mathrm{d}s \biggr]\leqslant{} &C_{1}T \delta \mathbb{E}\biggl[ \int _{0}^{T} \vert \widehat {u}_{s} \vert ^{2}\,\mathrm{d}s \biggr]+C_{1}T \mathbb{E}\bigl\vert \widehat {X}_{0}(0) \bigr\vert ^{2}+ C_{1}T\delta \mathbb{E}\bigl\vert \widehat {x}_{0}(0) \bigr\vert ^{2} \\ &{}+C_{1}T\mathbb{E}\biggl[ \int _{0}^{T} \bigl( \bigl\vert \widehat {X}_{0}(s) \bigr\vert ^{2}+ \bigl\vert \widehat {X}(s) \bigr\vert ^{2}+ \bigl\vert \widehat {P}(s) \bigr\vert ^{2}+ \bigl\vert \widehat {K}(s) \bigr\vert ^{2} \bigr) \,\mathrm{d}s \biggr]. \end{aligned} $$
(5.4)

Similarly, estimates for the difference \(\widehat {X}=\bar{X}-\bar{X}'\) and \(\widehat {K}=\bar{K}-\bar{K}'\) are given by

$$ \sup_{0\leqslant s\leqslant r}\mathbb{E}\bigl\vert \widehat {X}(s) \bigr\vert ^{2}\leqslant C_{1}\delta \mathbb{E}\biggl[ \int _{0}^{r} \vert \widehat {u}_{s} \vert ^{2}\,\mathrm{d}s \biggr]+C_{1} \mathbb{E}\biggl[ \int _{0}^{r} \bigl( \bigl\vert \widehat {Y}(s) \bigr\vert ^{2}+ \bigl\vert \widehat {X}_{0}(s) \bigr\vert ^{2} \bigr)\,\mathrm{d}s \biggr] $$
(5.5)

and

$$ \sup_{0\leqslant s\leqslant r}\mathbb{E}\bigl\vert \widehat {K}(s) \bigr\vert ^{2}\leqslant C_{1}\delta \mathbb{E}\biggl[ \int _{0}^{r} \vert \widehat {u}_{s} \vert ^{2}\,\mathrm{d}s \biggr]+C_{1} \mathbb{E}\biggl[ \int _{0}^{r} \bigl( \bigl\vert \widehat {Y}(s) \bigr\vert ^{2}+ \bigl\vert \widehat {P}(s) \bigr\vert ^{2} \bigr)\,\mathrm{d}s \biggr], $$
(5.6)

respectively, for \(0\leqslant r\leqslant T\). In the same way, for the differences \((\widehat {X}_{0},\widehat {Z}_{0})=(\bar{X}_{0}-\bar{X}'_{0},\bar{Z}_{0}-\bar{Z}'_{0})\), \((\widehat {P},\widehat {Q})=(\bar{P}-\bar{P}',\bar{Q}-\bar{Q}')\), and \((\widehat {Y},\widehat {Z})=(\bar{Y}-\bar{Y}',\bar{Z}-\bar{Z}')\), standard BSDE estimates give

$$\begin{aligned}& \mathbb{E}\biggl[ \int _{0}^{T} \bigl( \bigl\vert \widehat {X}_{0}(s) \bigr\vert ^{2}+ \bigl\vert \widehat {Z}_{0}(s) \bigr\vert ^{2} \bigr)\,\mathrm{d}s \biggr] \leqslant C_{1}\delta \mathbb{E}\biggl[ \int _{0}^{T} \vert \widehat {u}_{s} \vert ^{2} \,\mathrm{d}s \biggr]+ C_{1}\mathbb{E}\biggl[ \int _{0}^{T} \bigl\vert \widehat {Y}_{0}(s) \bigr\vert ^{2}\,\mathrm{d}s \biggr], \end{aligned}$$
(5.7)
$$\begin{aligned}& \begin{aligned} &\mathbb{E}\biggl[ \int _{0}^{r} \bigl( \bigl\vert \widehat {P}(s) \bigr\vert ^{2}+ \bigl\vert \widehat {Q}(s) \bigr\vert ^{2} \bigr)\,\mathrm{d}s \biggr] \\ &\quad \leqslant C_{1}\delta \mathbb{E}\biggl[ \int _{0}^{r} \vert \widehat {u}_{s} \vert ^{2}\,\mathrm{d}s \biggr]+ C_{1}\mathbb{E}\biggl[ \int _{0}^{r} \bigl( \bigl\vert \widehat {X}_{0}(s) \bigr\vert ^{2}+ \bigl\vert \widehat {X}(s) \bigr\vert ^{2}+ \bigl\vert \widehat {K}(s) \bigr\vert ^{2} \bigr)\,\mathrm{d}s \biggr], \end{aligned} \end{aligned}$$
(5.8)

and

$$ \begin{aligned} &\mathbb{E}\biggl[ \int _{0}^{r} \bigl( \bigl\vert \widehat {Y}(s) \bigr\vert ^{2}+ \bigl\vert \widehat {Z}(s) \bigr\vert ^{2} \bigr)\,\mathrm{d}s \biggr] \\ &\quad \leqslant C_{1}\delta \mathbb{E}\biggl[ \int _{0}^{r} \vert \widehat {u}_{s} \vert ^{2}\,\mathrm{d}s \biggr]+ C_{1}\mathbb{E}\biggl[ \int _{0}^{r} \bigl( \bigl\vert \widehat {X}_{0}(s) \bigr\vert ^{2}+ \bigl\vert \widehat {X}(s) \bigr\vert ^{2} \bigr)\,\mathrm{d}s \biggr] \end{aligned} $$
(5.9)

for all \(0\leqslant r\leqslant T\). Here, the constant \(C_{1}\) depends on the coefficients of (2.1)–(2.2), \(\beta _{1}\), \(\beta _{2}\), and T. Moreover, \(\gamma _{0}H_{0}+(1-\gamma _{0})\geqslant \mu \), where \(\mu =\min (1,H_{0})>0\).

Under (H2), combining (5.3), (5.5)–(5.6), (5.8)–(5.9) and applying Gronwall’s inequality, we obtain

$$ \mathbb{E}\biggl[ \int _{0}^{T} \vert \widehat {U}_{s} \vert ^{2}\,\mathrm{d}s \biggr]+\mathbb{E}\bigl\vert \widehat {X}_{0}(0) \bigr\vert ^{2}\leqslant C_{2}\delta \biggl(\mathbb{E}\int _{0}^{T} \vert \widehat {u}_{s} \vert ^{2} \,\mathrm{d}s+\mathbb{E}\bigl\vert \widehat {x}_{0}(0) \bigr\vert ^{2} \biggr), $$

where \(C_{2}\) depends on \(C_{1}\), μ, and T. Choosing \(\delta _{0}=\frac{1}{2C_{2}}\), we get that, for each fixed \(\delta \in [0,\delta _{0}]\), the mapping \(I_{\gamma _{0}+\delta }\) is a contraction in the sense that

$$ \mathbb{E}\biggl[ \int _{0}^{T} \vert \widehat {U}_{s} \vert ^{2}\,\mathrm{d}s \biggr]+\mathbb{E}\bigl\vert \widehat {X}_{0}(0) \bigr\vert ^{2}\leqslant \frac{1}{2} \biggl(\mathbb{E}\int _{0}^{T} \vert \widehat {u}_{s} \vert ^{2} \,\mathrm{d}s+\mathbb{E}\bigl\vert \widehat {x}_{0}(0) \bigr\vert ^{2} \biggr). $$

Then it follows that there exists a unique fixed point

$$ U^{\gamma _{0}+\delta }= \bigl(\bar{Y}_{0}^{\gamma _{0}+\delta }, \bar{X}^{\gamma _{0}+\delta }, \bar{K}^{\gamma _{0}+\delta },\bar{X}_{0}^{\gamma _{0}+\delta }, \bar{P}^{\gamma _{0}+\delta }, \bar{Y}^{\gamma _{0}+\delta },\bar{Z}_{0}^{\gamma _{0}+\delta }, \bar{Q}^{\gamma _{0}+\delta }, \bar{Z}^{\gamma _{0}+\delta } \bigr), $$

which is the solution of (5.1) for \(\gamma =\gamma _{0}+\delta \). Since \(\delta _{0}\) depends only on \((C_{1},\mu ,T)\), we can repeat this process N times with \(1\leqslant N\delta _{0}<1+\delta _{0}\).
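The bookkeeping of this continuation argument can be sketched as follows. The function below is purely illustrative: `continuation_schedule` is a hypothetical name, and the numerical value passed for \(\delta _{0}\) is an arbitrary assumption, since the proof only guarantees that some \(\delta _{0}>0\) exists. The number of steps is the smallest integer N with \(N\delta _{0}\geqslant 1\), which automatically satisfies \(N\delta _{0}<1+\delta _{0}\).

```python
import math


def continuation_schedule(delta0):
    """Return the values of gamma visited when moving from 0 to 1.

    delta0 is the admissible step size (a hypothetical numeric value here).
    The number of steps is the smallest integer N with N * delta0 >= 1.
    """
    if delta0 <= 0:
        raise ValueError("delta0 must be positive")
    n_steps = math.ceil(1.0 / delta0)
    # Each step raises gamma by at most delta0; the last step is truncated at 1.
    return [min(1.0, k * delta0) for k in range(n_steps + 1)]


schedule = continuation_schedule(0.3)
print(schedule)
```

At each value in the schedule, solvability of (5.1) is inherited from the previous value through the contraction \(I_{\gamma _{0}+\delta }\), starting from the directly solvable case \(\gamma =0\).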

In particular, for \(\gamma =1\) with \(\varphi _{t}^{i}\equiv 0\), \(\lambda _{t}^{i}\equiv 0\), \(\kappa _{t}^{i}\equiv 0\), \(a=0\) (\(i=1,2,3\)), (5.1) admits a unique solution, which implies the well-posedness of (4.16). The proof is complete. □

6 ε-Nash equilibrium for Problem (I)

We characterized the decentralized strategies \(\{\bar{u}_{i}\}_{0\leqslant i\leqslant N}\) of Problem (I) through the auxiliary Problem (II) and the consistency condition system. Now, we turn to verify the ε-Nash equilibrium of these decentralized strategies. We first present the definition of ε-Nash equilibrium.

Definition 6.1

A set of controls \((\bar{u}_{0},\bar{u}_{1},\ldots ,\bar{u}_{N})\in \mathcal{U}_{0}\times \mathcal{U}_{1} \times \cdots \times \mathcal{U}_{N}\) for \((1+N)\) agents is said to be an ε-Nash equilibrium with respect to the costs \((\mathcal{J}_{0},\mathcal{J}_{1},\ldots ,\mathcal{J}_{N})\) if there exists \(\varepsilon =\varepsilon (N)\geqslant 0\) with \(\lim_{N\rightarrow \infty }\varepsilon (N)=0\) such that, for any fixed \(i=1,2,\ldots ,N\), we have

$$ \textstyle\begin{cases} \mathcal{J}_{0}(\bar{u}_{0},\bar{\mathbf{u}}_{-0}) \leqslant \mathcal{J}_{0}(u_{0}, \bar{\mathbf{u}}_{-0})+ \varepsilon , \\ \mathcal{J}_{i}(\bar{u}_{i},\bar{\mathbf{u}}_{-i}) \leqslant \mathcal{J}_{i}(u_{i}, \bar{\mathbf{u}}_{-i})+ \varepsilon , \end{cases} $$
(6.1)

when any alternative control \((u_{0},u_{i})\in \mathcal{U}_{0}\times \mathcal{U}_{i}\) is applied by \((\mathcal{A}_{0},\mathcal{A}_{i})\).

We first present the main result of this section and defer its proof to a later part.

Theorem 6.2

Under assumptions (H1), (H2) and those of Theorems 4.1, 4.2, \(\{\bar{u}_{i}\}_{0\leqslant i\leqslant N}\) is an ε-Nash equilibrium of Problem (I) for the leader agent \(\mathcal{A}_{0}\) and each of the follower agents \(\mathcal{A}_{i}\), \(i=1,2,\ldots ,N\), where \(\{\bar{u}_{i}\}_{0\leqslant i\leqslant N}\) is given by

$$ \textstyle\begin{cases} \bar{u}_{0}(t)= -R_{0}^{-1}B_{0} \bar{y}_{0}(t), \\ \bar{u}_{i}(t)= -R^{-1} (B\bar{y}_{i}(t)+D \bar{z}_{i}(t) ) \end{cases} $$
(6.2)

with \(\bar{y}_{0}(\cdot )\) and \((\bar{y}_{i}(\cdot ),\bar{z}_{i}(\cdot ))\) determined by (4.16).
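As a minimal illustration of how (6.2) is evaluated once the adjoint processes are available, the sketch below maps sampled values of \(\bar{y}_{0}\), \(\bar{y}_{i}\), \(\bar{z}_{i}\) on a time grid to the corresponding controls. It assumes scalar coefficients and that the samples have already been computed by some solver for (4.16), which is not shown; the function name and argument layout are hypothetical.

```python
def decentralized_controls(y0, y_i, z_i, B0, R0, B, D, R):
    """Evaluate the feedback formulas (6.2) pointwise on a time grid.

    y0, y_i, z_i: lists of sampled values of the adjoint processes along one
    path (assumed to be given). All coefficients are scalars in this sketch.
    """
    if R0 == 0 or R == 0:
        raise ValueError("R0 and R must be invertible")
    if len(y_i) != len(z_i):
        raise ValueError("y_i and z_i must be sampled on the same grid")
    # Leader: u0 = -R0^{-1} B0 y0.
    u0 = [-B0 * v / R0 for v in y0]
    # Follower i: u_i = -R^{-1} (B y_i + D z_i).
    u_i = [-(B * yv + D * zv) / R for yv, zv in zip(y_i, z_i)]
    return u0, u_i


u0, u_i = decentralized_controls(
    [4.0, -2.0], [2.0, 0.0], [2.0, 1.0],
    B0=1.0, R0=2.0, B=1.0, D=3.0, R=2.0,
)
print(u0, u_i)  # [-2.0, 1.0] [-4.0, -1.5]
```

The controls are decentralized in the sense that each follower's control depends only on its own adjoint pair \((\bar{y}_{i},\bar{z}_{i})\), not on the states of the other followers.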

For the leader \(\mathcal{A}_{0}\) and the followers \(\mathcal{A}_{i}\), the decentralized states \((\bar{x}_{0}(\cdot ),\bar{z}_{0}(\cdot ))\) and \(\bar{x}_{i}(\cdot )\) are given respectively by

$$ \textstyle\begin{cases} \mathrm{d} \bar{x}_{0}(t)= \{A_{0}\bar{x}_{0}(t)-R_{0}^{-1}B_{0}^{2} \bar{y}_{0}(t)+C_{0} \bar{z}_{0}(t) \} \,\mathrm{d}t+\bar{z}_{0}(t)\,\mathrm{d}W_{0}(t), \\ \mathrm{d} \bar{x}_{i}(t)= \{A\bar{x}_{i}(t)-R^{-1}B^{2} \bar{y}_{i}(t)-R^{-1}BD \bar{z}_{i}(t)+E \bar{x}^{(N)}(t)+\alpha \bar{x}_{0}(t) \}\,\mathrm{d}t \\ \hphantom{\mathrm{d} \bar{x}_{i}(t)={}}{} + \{C\bar{x}_{i}(t)-R^{-1}BD\bar{y}_{i}(t)-R^{-1}D^{2} \bar{z}_{i}(t)+F\bar{x}^{(N)}(t)+\beta \bar{x}_{0}(t) \}\,\mathrm{d}W_{i}(t), \\ \bar{x}_{0}(T)=\xi ,\qquad \bar{x}_{i}(0)=x_{i0},\quad i=1,2,\ldots ,N, \end{cases} $$
(6.3)

where the processes \(\bar{y}_{0}(\cdot )\), \((\bar{y}_{i}(\cdot ),\bar{z}_{i}(\cdot ))\) are determined by (4.16). We first present several lemmas to be used later. Here, with a slight abuse of notation, we write \(| \cdot |^{2}\) for the inner product \(\langle \cdot , \cdot \rangle \).

Lemma 6.3

Under assumptions (H1), (H2) and those of Theorems 4.1, 4.2, there exists a constant M independent of N such that

$$ \sup_{0\leqslant i\leqslant N}\mathbb{E}\Bigl[\sup_{0\leqslant t \leqslant T} \bigl\vert \bar{x}_{i}(t) \bigr\vert ^{2} \Bigr]< M. $$

Proof

By Theorems 4.1, 4.2, FBSDEs (4.11) and (4.1) have unique solutions \(((\bar{x}_{0},\bar{z}_{0}), \bar{y}_{0})\in L^{2}_{\mathcal{F}}(0,T;\mathbb{R}^{3})\) and \((\bar{x}_{i},(\bar{y}_{i},\bar{z}_{i}))\in L^{2}_{\mathcal{F}}(0,T;\mathbb{R}^{3N})\), \(1\leqslant i\leqslant N\). Thus, the BFSDE system (6.3) also has a unique solution

$$ \bigl((\bar{x}_{0},\bar{z}_{0}),\bar{x}_{1}, \ldots ,\bar{x}_{N} \bigr)\in L^{2}_{ \mathcal{F}} \bigl(0,T;\mathbb{R}^{2+N} \bigr). $$

Since the BFSDE system (6.3) is only weakly coupled, its BSDE part can be solved directly. It is then easy to show that there exists a constant M independent of N such that

$$ \mathbb{E}\Bigl[\sup_{0\leqslant t\leqslant T} \bigl\vert \bar{x}_{0}(t) \bigr\vert ^{2} \Bigr]< M. $$

Then we turn to estimate the SDE part of (6.3). By using the BDG inequality, there exists a constant M independent of N such that, for any \(t\in [0,T]\),

$$ \begin{aligned} \mathbb{E}\Bigl[\sup_{0\leqslant s\leqslant t} \bigl\vert \bar{x}_{i}(s) \bigr\vert ^{2} \Bigr]\leqslant & M+M\mathbb{E}\biggl[ \int _{0}^{t} \bigl( \bigl\vert \bar{x}_{i}(s) \bigr\vert ^{2}+ \bigl\vert \bar{x}_{0}(s) \bigr\vert ^{2}+ \bigl\vert \bar{x}^{(N)}(s) \bigr\vert ^{2} \bigr)\,\mathrm{d}s \biggr] \\ \leqslant &M+M\mathbb{E}\Biggl[ \int _{0}^{t} \Biggl( \bigl\vert \bar{x}_{i}(s) \bigr\vert ^{2}+ \frac{1}{N} \sum_{i=1}^{N} \bigl\vert \bar{x}_{i}(s) \bigr\vert ^{2} \Biggr)\,\mathrm{d}s \Biggr] \end{aligned} $$

and by Gronwall’s inequality, we obtain

$$ \mathbb{E}\Bigl[\sup_{0\leqslant s\leqslant t} \bigl\vert \bar{x}_{i}(s) \bigr\vert ^{2} \Bigr]\leqslant M+M\mathbb{E}\Biggl[ \int _{0}^{t}\frac{1}{N}\sum _{i=1}^{N} \bigl\vert \bar{x}_{i}(s) \bigr\vert ^{2}\,\mathrm{d}s \Biggr]. $$
(6.4)

Thus,

$$ \mathbb{E}\Biggl[\sup_{0\leqslant s\leqslant t}\sum_{i=1}^{N} \bigl\vert \bar{x}_{i}(s) \bigr\vert ^{2} \Biggr] \leqslant \mathbb{E}\Biggl[\sum_{i=1}^{N}\sup _{0\leqslant s \leqslant t} \bigl\vert \bar{x}_{i}(s) \bigr\vert ^{2} \Biggr]\leqslant M N+2M\mathbb{E}\Biggl[ \int _{0}^{t}\sum_{i=1}^{N} \bigl\vert \bar{x}_{i}(s) \bigr\vert ^{2}\,\mathrm{d}s \Biggr]. $$

By Gronwall’s inequality, it follows that \(\mathbb{E}[\sup_{0\leqslant s\leqslant t}\sum_{i=1}^{N} |\bar{x}_{i}(s) |^{2} ]=O(N)\). Substituting this estimate into (6.4), we have \(\mathbb{E}[\sup_{0\leqslant s\leqslant t} |\bar{x}_{i}(s) |^{2} ]\leqslant M\). This completes the proof. □

Now, we recall that

$$ \bar{x}^{(N)}(t)=\frac{1}{N}\sum _{i=1}^{N}\bar{x}_{i}(t), $$

and we have the following estimate.

Lemma 6.4

Under assumptions (H1), (H2) and those of Theorems 4.1, 4.2, there exists a constant M independent of N such that

$$ \mathbb{E}\Bigl[\sup_{0\leqslant t\leqslant T} \bigl\vert \bar{x}^{(N)}(t)- \bar{x}(t) \bigr\vert ^{2} \Bigr]\leqslant \frac{M}{N}. $$

Proof

In fact, we have

$$ \textstyle\begin{cases} \mathrm{d} ( \bar{x}^{(N)}-\bar{x} )=(A+E) (\bar{x}^{(N)}- \bar{x} )\,\mathrm{d}t+\frac{1}{N}\sum_{i=1}^{N}[ \cdots ]\,\mathrm{d}W_{i}(t), \\ (\bar{x}^{(N)}-\bar{x} ) (0)=0. \end{cases} $$
(6.5)

From (6.5), by using the BDG inequality and Lemma 6.3, there exists a constant M independent of N such that, for any \(t\in [0,T]\),

$$ \mathbb{E}\Bigl[\sup_{0\leqslant s\leqslant t} \bigl\vert \bar{x}^{(N)}- \bar{x} \bigr\vert ^{2}(s) \Bigr]\leqslant \frac{M}{N}+M \mathbb{E}\biggl[ \int _{0}^{t} \bigl\vert \bar{x}^{(N)}-\bar{x} \bigr\vert ^{2}(s)\,\mathrm{d}s \biggr], $$

and by Gronwall’s inequality, we obtain

$$ \mathbb{E}\Bigl[\sup_{0\leqslant s\leqslant t} \bigl\vert \bar{x}^{(N)}- \bar{x} \bigr\vert ^{2}(s) \Bigr]\leqslant \frac{M}{N}. $$

 □
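The factor \(1/N\) in this proof comes from the martingale term in (6.5). As a sketch, write \(\sigma _{i}(r)\) for the integrand abbreviated by \([\cdots ]\) in (6.5) (a shorthand introduced here for illustration), and assume that the Brownian motions \(W_{1},\ldots ,W_{N}\) are independent. Then the cross terms vanish in expectation, and Doob’s \(L^{2}\) maximal inequality together with the Itô isometry gives

$$ \mathbb{E}\Biggl[\sup_{0\leqslant s\leqslant t} \Biggl\vert \frac{1}{N}\sum_{i=1}^{N} \int _{0}^{s}\sigma _{i}(r)\,\mathrm{d}W_{i}(r) \Biggr\vert ^{2} \Biggr]\leqslant \frac{4}{N^{2}}\sum_{i=1}^{N} \mathbb{E}\biggl[ \int _{0}^{t} \bigl\vert \sigma _{i}(r) \bigr\vert ^{2}\,\mathrm{d}r \biggr]\leqslant \frac{M}{N}, $$

provided that \(\mathbb{E}[\int _{0}^{T} |\sigma _{i}(r) |^{2}\,\mathrm{d}r ]\) is bounded uniformly in i and N, which is where Lemma 6.3 enters.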

Lemma 6.5

Under assumptions (H1), (H2) and those of Theorems 4.1, 4.2, we have

$$ \bigl\vert \mathcal{J}_{i}(\bar{u}_{i},\bar{ \mathbf{u}}_{-i})-J_{i}(\bar{u}_{i}) \bigr\vert =O \biggl(\frac{1}{\sqrt{N}} \biggr), \quad 0\leqslant i\leqslant N. $$

Proof

Let us first consider the leader agent \(\mathcal{A}_{0}\). Recalling (2.4) and (4.10), we have

$$ \begin{aligned} &\mathcal{J}_{0}( \bar{u}_{0},\bar{\mathbf{u}}_{-0})-J_{0}( \bar{u}_{0}) \\ &\quad =\frac{1}{2}\mathbb{E}\biggl\{ \int _{0}^{T} \bigl[Q_{0} \bigl( \bar{x}_{0}(t)- \bar{x}^{(N)}(t) \bigr)^{2}-Q_{0} \bigl(\bar{x}_{0}(t)-\bar{x}(t) \bigr)^{2} \bigr]\,\mathrm{d}t \biggr\} \\ &\quad =-\mathbb{E}\biggl\{ \int _{0}^{T} \bigl[Q_{0} \bigl( \bar{x}_{0}(t)-\bar{x}(t) \bigr) \bigl(\bar{x}^{(N)}(t)- \bar{x}(t) \bigr) \bigr]\,\mathrm{d}t \biggr\} \\ &\qquad {}+ \frac{1}{2}\mathbb{E}\biggl\{ \int _{0}^{T} \bigl[Q_{0} \bigl( \bar{x}^{(N)}(t)- \bar{x}(t) \bigr)^{2} \bigr]\,\mathrm{d}t \biggr\} . \end{aligned}$$
(6.6)

By Hölder’s inequality and Lemma 6.3, there exists a constant M independent of N such that

$$ \begin{aligned} &\mathbb{E}\biggl\{ \int _{0}^{T} \bigl[Q_{0} \bigl( \bar{x}_{0}(t)-\bar{x}(t) \bigr) \bigl(\bar{x}^{(N)}(t)- \bar{x}(t) \bigr) \bigr]\,\mathrm{d}t \biggr\} \\ &\quad \leqslant \mathbb{E}\biggl\{ \int _{0}^{T} \bigl\vert \bar{x}_{0}(t)- \bar{x}(t) \bigr\vert ^{2}\,\mathrm{d}t \biggr\} ^{\frac{1}{2}}\mathbb{E}\biggl\{ \int _{0}^{T} \bigl\vert Q_{0} \bigl(\bar{x}^{(N)}(t)-\bar{x}(t) \bigr) \bigr\vert ^{2} \,\mathrm{d}t \biggr\} ^{ \frac{1}{2}} \\ &\quad \leqslant M\mathbb{E}\biggl\{ \int _{0}^{T} \bigl\vert Q_{0} \bigl(\bar{x}^{(N)}(t)- \bar{x}(t) \bigr) \bigr\vert ^{2} \,\mathrm{d}t \biggr\} ^{\frac{1}{2}}. \end{aligned}$$
(6.7)

Noting (6.6), (6.7) and Lemma 6.4, there exists a constant M independent of N such that

$$ \begin{aligned} &\mathbb{E}\biggl\{ \int _{0}^{T} \bigl\vert Q_{0} \bigl(\bar{x}^{(N)}(t)-\bar{x}(t) \bigr) \bigr\vert ^{2} \,\mathrm{d}t \biggr\} ^{\frac{1}{2}} \\ &\quad \leqslant \biggl\{ \mathbb{E}\Bigl[\sup_{0\leqslant s\leqslant T} \bigl\vert \bar{x}^{(N)}-\bar{x} \bigr\vert ^{2}(s) \Bigr] \int _{0}^{T} \vert Q_{0} \vert ^{2}\,\mathrm{d}t \biggr\} ^{\frac{1}{2}}\leqslant \frac{M}{\sqrt{N}}=O \biggl( \frac{1}{\sqrt{N}} \biggr). \end{aligned} $$
(6.8)

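Since the last term in (6.6) is of order \(O (\frac{1}{N} )\) by Lemma 6.4, the estimate for the leader agent follows. The remaining claims for the followers can be proved in the same way. □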

Remark 6.6

We denote by M a generic constant appearing in the various bounds. In the above lemmas, M may vary from line to line, but it is always independent of the number N of follower agents.
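The \(\frac{1}{N}\) rate of Lemma 6.4 can also be observed numerically. The following Python sketch uses a toy model that is an assumption made here and is not the dynamics of this paper: the follower states are \(x_{i}(t)=\sigma W_{i}(t)\) with independent Brownian motions, so that the limiting state-average is \(\bar{x}\equiv 0\). The function name sup_sq_gap and all parameter values are hypothetical. In this toy model Doob's inequality gives \(\mathbb{E}[\sup_{0\leqslant t\leqslant T} \vert x^{(N)}(t) \vert ^{2}]\leqslant 4\sigma ^{2}T/N\), so the product of N and the estimate should stay bounded as N grows.

```python
import math
import random


def sup_sq_gap(n_agents, n_steps=50, horizon=1.0, sigma=1.0, n_paths=200, seed=0):
    """Monte Carlo estimate of E[sup_t |x^(N)(t) - xbar(t)|^2] in a toy model.

    Toy assumption (not the paper's dynamics): x_i(t) = sigma * W_i(t) with
    independent Brownian motions W_i, so the mean-field limit is xbar(t) = 0.
    """
    rng = random.Random(seed)
    dt = horizon / n_steps
    step_std = sigma * math.sqrt(dt)
    total = 0.0
    for _ in range(n_paths):
        states = [0.0] * n_agents
        worst = 0.0
        for _ in range(n_steps):
            # Euler step for every follower (exact for this toy model).
            states = [x + rng.gauss(0.0, step_std) for x in states]
            avg = sum(states) / n_agents   # state-average x^(N)(t)
            worst = max(worst, avg * avg)  # running sup of the squared gap
        total += worst
    return total / n_paths


if __name__ == "__main__":
    for n in (10, 40):
        est = sup_sq_gap(n)
        # n * est should remain of the same order when n changes.
        print(n, est, n * est)
```

The estimate itself shrinks as N grows, while N times the estimate stays of order one, which is the behaviour expressed by a constant M independent of N.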

6.1 Leader agent’s perturbation

In this subsection, we prove that the set of control strategies \((\bar{u}_{0},\bar{u}_{1},\ldots ,\bar{u}_{N})\) given by Theorem 6.2 is an ε-Nash equilibrium of Problem (I) for the leader agent \(\mathcal{A}_{0}\), i.e., there exists \(\varepsilon =\varepsilon (N)\geqslant 0\) with \(\lim_{N\rightarrow \infty }\varepsilon (N)=0\) such that

$$ \mathcal{J}_{0}(\bar{u}_{0},\bar{\mathbf{u}}_{-0}) \leqslant \mathcal{J}_{0}(u_{0}, \bar{ \mathbf{u}}_{-0})+ \varepsilon , \quad \forall u_{0}\in \mathcal{U}_{0}[0,T]. $$

Suppose that the leader agent \(\mathcal{A}_{0}\) applies an alternative strategy \(u_{0}\) while each follower agent \(\mathcal{A}_{i}\) uses the control \(\bar{u}_{i}(t)=-R^{-1}(B\bar{y}_{i}(t)+D\bar{z}_{i}(t))\). To prove that \((\bar{u}_{0},\bar{u}_{1},\ldots ,\bar{u}_{N})\) is an ε-Nash equilibrium for the leader agent, we need to show that \(\inf_{u_{0}\in \mathcal{U}_{0}[0,T]}\mathcal{J}_{0}(u_{0},\bar{\mathbf{u}}_{-0}) \geqslant \mathcal{J}_{0}(\bar{u}_{0},\bar{\mathbf{u}}_{-0})-\varepsilon \). It therefore suffices to consider perturbations \(u_{0}\in \mathcal{U}_{0}[0,T]\) such that \(\mathcal{J}_{0}(u_{0},\bar{\mathbf{u}}_{-0})\leqslant \mathcal{J}_{0}(\bar{u}_{0}, \bar{\mathbf{u}}_{-0})\), since the desired inequality holds trivially for all other \(u_{0}\). Following [21, 28], the cost functional admits the following representation.

Proposition 6.7

Let (H1)–(H2) hold. There exist a bounded self-adjoint linear operator \(N_{0}:\mathcal{U}_{0}[0,T]\rightarrow \mathcal{U}_{0}[0,T]\), a bounded linear operator \(N_{1}:\mathbb{R}\rightarrow \mathcal{U}_{0}[0,T]\), and a bounded real-valued function \(N_{2}:\mathbb{R}\rightarrow \mathbb{R}\) such that

$$\begin{aligned}& \mathcal{J}_{0} \bigl(\xi ;u_{0},\bar{ \mathbf{u}}_{-0}[u_{0}] \bigr)=\frac{1}{2} \bigl\{ \bigl\langle N_{0}u_{0}(\cdot ),u_{0}( \cdot ) \bigr\rangle +2 \bigl\langle N_{1}( \xi ),u_{0}(\cdot ) \bigr\rangle +N_{2}(\xi ) \bigr\} , \\& \quad \forall (\xi ,u_{0}) \in \mathbb{R}\times \mathcal{U}_{0}[0,T]. \end{aligned}$$

Proof

Refer to Proposition 3.1 in [21]. □

So, if \(N_{0}\gg 0\), then, using Lemma 6.5 for the last inequality, there exists a constant \(c>0\) such that

$$\begin{aligned}& \mathbb{E}\biggl[ \int _{0}^{T} \bigl\vert N_{0}^{\frac{1}{2}}u_{0}(t)+N_{0}^{- \frac{1}{2}}N_{1}( \xi ) \bigr\vert ^{2}\,\mathrm{d}t \biggr] \\& \quad \leqslant \mathcal{J}_{0}(u_{0}, \bar{\mathbf{u}}_{-0})+c \leqslant \mathcal{J}_{0}(\bar{u}_{0}, \bar{ \mathbf{u}}_{-0})+c\leqslant J_{0}(\bar{u}_{0})+c+O \biggl( \frac{1}{\sqrt{N}} \biggr), \end{aligned}$$

which implies that \(\mathbb{E}[\int _{0}^{T} |u_{0}(t) |^{2}\,\mathrm{d}t ]\leqslant M\), where M is a constant independent of N. In fact, by the bounded inverse theorem, \(N_{0}^{-1}\) is bounded, so there exists a constant \(0<\gamma \leqslant \|N_{0}^{\frac{1}{2}}\|\) such that

$$ \gamma \mathbb{E}\biggl[ \int _{0}^{T} \bigl\vert u_{0}(t) \bigr\vert ^{2}\,\mathrm{d}t \biggr] \leqslant \bigl\Vert N_{0}^{\frac{1}{2}} \bigr\Vert \mathbb{E}\biggl[ \int _{0}^{T} \bigl\vert u_{0}(t)+N_{0}^{-1}N_{1}( \xi ) \bigr\vert ^{2}\,\mathrm{d}t \biggr]\leqslant J_{0}( \bar{u}_{0})+c+O \biggl( \frac{1}{\sqrt{N}} \biggr). $$

Then we have \(\mathbb{E}[\int _{0}^{T} |u_{0}(t) |^{2}\,\mathrm{d}t ]\leqslant M\). Similar to Lemma 6.3, we can show that

$$ \mathbb{E}\Bigl[\sup_{0\leqslant t\leqslant T} \bigl\vert x_{0}(t) \bigr\vert ^{2} \Bigr] \leqslant M. $$
(6.9)
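The estimate above rests on completing the square in the representation of Proposition 6.7. As a sketch, assume that \(N_{0}\) is uniformly positive, so that \(N_{0}^{\pm \frac{1}{2}}\) exist and are bounded, and write \(\langle \cdot ,\cdot \rangle \) for the inner product of \(\mathcal{U}_{0}[0,T]\) and \(\Vert \cdot \Vert \) for the induced norm (this norm notation is introduced here). Then the self-adjointness of \(N_{0}\) gives

$$ \bigl\langle N_{0}u_{0},u_{0} \bigr\rangle +2 \bigl\langle N_{1}(\xi ),u_{0} \bigr\rangle +N_{2}(\xi )= \bigl\Vert N_{0}^{\frac{1}{2}}u_{0}+N_{0}^{-\frac{1}{2}}N_{1}(\xi ) \bigr\Vert ^{2}+N_{2}(\xi )- \bigl\langle N_{0}^{-1}N_{1}(\xi ),N_{1}(\xi ) \bigr\rangle . $$

Hence the squared norm on the right equals twice the cost plus a quantity depending only on ξ, which is why a bound on the cost that is uniform in N yields a bound on \(u_{0}\) in \(\mathcal{U}_{0}[0,T]\) that is uniform in N.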

Remark 6.8

Here, in fact, \(N_{0}=R_{0}\), which is assumed to be a positive number, so (6.9) follows directly. For a more complicated cost functional, one may use the representation of the cost functional in [21, 28]. In this paper this tool can be avoided; we present it only as a method for cases in which the problem is less transparent.

Lemma 6.9

Under assumptions (H1)–(H2) and those of Theorems 4.1 and 4.2, for the leader agent’s perturbation control \(u_{0}\), we have

$$ \bigl\vert \mathcal{J}_{0}(u_{0},\bar{ \mathbf{u}}_{-0})-J_{0}(u_{0}) \bigr\vert =O \biggl( \frac{1}{\sqrt{N}} \biggr). $$

Proof

Recalling (2.4) and (4.10), we have

$$ \begin{aligned} &\mathcal{J}_{0}(u_{0}, \bar{\mathbf{u}}_{-0})-J_{0}(u_{0}) \\ &\quad =\frac{1}{2}\mathbb{E}\biggl\{ \int _{0}^{T} \bigl[Q_{0} \bigl(x_{0}(t)-\bar{x}^{(N)}(t) \bigr)^{2}-Q_{0} \bigl(x_{0}(t)-\bar{x}(t) \bigr)^{2} \bigr]\,\mathrm{d}t \biggr\} \\ &\quad =-\mathbb{E}\biggl\{ \int _{0}^{T} \bigl[Q_{0} \bigl(x_{0}(t)-\bar{x}(t) \bigr) \bigl(\bar{x}^{(N)}(t)- \bar{x}(t) \bigr) \bigr]\,\mathrm{d}t \biggr\} \\ &\qquad {}+\frac{1}{2} \mathbb{E}\biggl\{ \int _{0}^{T} \bigl[Q_{0} \bigl( \bar{x}^{(N)}(t)-\bar{x}(t) \bigr)^{2} \bigr]\,\mathrm{d}t \biggr\} . \end{aligned}$$
(6.10)

By Hölder’s inequality and (6.9), there exists a constant M independent of N such that

$$ \begin{aligned} &\biggl\vert \mathbb{E}\biggl\{ \int _{0}^{T} \bigl[Q_{0} \bigl(x_{0}(t)-\bar{x}(t) \bigr) \bigl(\bar{x}^{(N)}(t)- \bar{x}(t) \bigr) \bigr]\,\mathrm{d}t \biggr\} \biggr\vert \\ &\quad \leqslant \mathbb{E}\biggl\{ \int _{0}^{T} \bigl\vert x_{0}(t)- \bar{x}(t) \bigr\vert ^{2} \,\mathrm{d}t \biggr\} ^{\frac{1}{2}}\mathbb{E}\biggl\{ \int _{0}^{T} \bigl\vert Q_{0} \bigl( \bar{x}^{(N)}(t)-\bar{x}(t) \bigr) \bigr\vert ^{2} \,\mathrm{d}t \biggr\} ^{\frac{1}{2}} \\ &\quad \leqslant M\mathbb{E}\biggl\{ \int _{0}^{T} \bigl\vert Q_{0} \bigl(\bar{x}^{(N)}(t)- \bar{x}(t) \bigr) \bigr\vert ^{2} \,\mathrm{d}t \biggr\} ^{\frac{1}{2}}. \end{aligned}$$
(6.11)

Finally, as in the proof of Lemma 6.5, the last term in (6.10) is of order \(O (\frac{1}{N} )\) by Lemma 6.4, and, noting (6.11), there exists a constant M independent of N such that

$$ \begin{aligned} &\mathbb{E}\biggl\{ \int _{0}^{T} \bigl\vert Q_{0} \bigl(\bar{x}^{(N)}(t)-\bar{x}(t) \bigr) \bigr\vert ^{2} \,\mathrm{d}t \biggr\} ^{\frac{1}{2}} \\ &\quad \leqslant \biggl\{ \mathbb{E}\Bigl[\sup_{0\leqslant s\leqslant T} \bigl\vert \bar{x}^{(N)}-\bar{x} \bigr\vert ^{2}(s) \Bigr] \int _{0}^{T} \vert Q_{0} \vert ^{2}\,\mathrm{d}t \biggr\} ^{\frac{1}{2}}\leqslant \frac{M}{\sqrt{N}}=O \biggl( \frac{1}{\sqrt{N}} \biggr). \end{aligned} $$
(6.12)

 □

Then, applying Lemmas 6.5 and 6.9, we can give the first part of the proof of Theorem 6.2, i.e., the set of control strategies \((\bar{u}_{0},\bar{u}_{1},\ldots ,\bar{u}_{N})\) given by Theorem 6.2 is an ε-Nash equilibrium of Problem (I) for the leader agent.

Part A of the proof of Theorem 6.2

Combining Lemmas 6.5 and 6.9, we have

$$ \mathcal{J}_{0}(\bar{u}_{0},\bar{\mathbf{u}}_{-0}) \leqslant J_{0}(\bar{u}_{0})+O \biggl( \frac{1}{\sqrt{N}} \biggr)\leqslant J_{0}(u_{0})+O \biggl( \frac{1}{\sqrt{N}} \biggr)\leqslant \mathcal{J}_{0}(u_{0}, \bar{\mathbf{u}}_{-0})+O \biggl(\frac{1}{\sqrt{N}} \biggr), $$

where the second inequality comes from the fact that \(J_{0}(\bar{u}_{0})=\inf_{u_{0}\in \mathcal{U}_{0}[0,T]}J_{0}(u_{0})\). Consequently, Theorem 6.2 holds for the leader agent with \(\varepsilon =O (\frac{1}{\sqrt{N}} )\). □

6.2 Follower agent’s perturbation

Now, let us consider the following perturbation: a given follower agent \(\mathcal{A}_{i}\) uses an alternative strategy \(u_{i}\in \mathcal{U}_{i}[0,T]\), while the leader agent \(\mathcal{A}_{0}\) uses \(\bar{u}_{0}\). By the same argument as for the leader agent, to prove that \((\bar{u}_{0},\bar{u}_{1},\ldots ,\bar{u}_{N})\) is an ε-Nash equilibrium for each follower agent, we only need to consider perturbations \(u_{i}\in \mathcal{U}_{i}[0,T]\) satisfying

$$ \mathbb{E}\biggl[ \int _{0}^{T} \bigl\vert u_{i}(t) \bigr\vert ^{2}\,\mathrm{d}t \biggr]\leqslant M, $$

where M is a constant independent of N. Then, similar to Lemma 6.3, we can show that

$$ \sup_{1\leqslant i\leqslant N}\mathbb{E}\Bigl[\sup _{0\leqslant t \leqslant T} \bigl\vert x_{i}(t) \bigr\vert ^{2} \Bigr]\leqslant M. $$
(6.13)

Lemma 6.10

Under assumptions (H1)–(H2) and those of Theorems 4.1 and 4.2, there exists a constant M independent of N such that

$$ \mathbb{E}\Bigl[\sup_{0\leqslant t\leqslant T} \bigl\vert x^{(i,N)}(t)- \bar{x}(t) \bigr\vert ^{2} \Bigr]\leqslant \frac{M}{N}, $$

where \(x^{(i,N)}(t)=\frac{1}{N} (x_{i}(t)+\sum_{k\neq i}\bar{x}_{k}(t) )\).

Proof

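In fact, since \(\bar{x}^{(N)}(t)=\frac{1}{N}\sum_{k=1}^{N}\bar{x}_{k}(t)\), we have

$$ x^{(i,N)}(t)-\bar{x}^{(N)}(t)=\frac{1}{N} \bigl(x_{i}(t)-\bar{x}_{i}(t) \bigr), $$

so that, by (6.13) and Lemma 6.3,

$$ \mathbb{E}\Bigl[\sup_{0\leqslant t\leqslant T} \bigl\vert x^{(i,N)}(t)- \bar{x}^{(N)}(t) \bigr\vert ^{2} \Bigr]\leqslant \frac{M}{N}. $$

Combining this with Lemma 6.4 and the elementary inequality \(\vert a+b \vert ^{2}\leqslant 2 \vert a \vert ^{2}+2 \vert b \vert ^{2}\), we get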

$$ \mathbb{E}\Bigl[\sup_{0\leqslant t\leqslant T} \bigl\vert x^{(i,N)}(t)- \bar{x}(t) \bigr\vert ^{2} \Bigr]\leqslant \frac{M}{N}. $$

 □

Lemma 6.11

Under assumptions (H1)–(H2) and those of Theorems 4.1 and 4.2, for the follower agent’s perturbation control \(u_{i}\), we have

$$ \bigl\vert \mathcal{J}_{i}(u_{i},\bar{ \mathbf{u}}_{-i})-J_{i}(u_{i}) \bigr\vert =O \biggl( \frac{1}{\sqrt{N}} \biggr). $$

Proof

Recalling (2.5) and (3.2), we have

$$ \begin{aligned} &\mathcal{J}_{i}(u_{i}, \bar{\mathbf{u}}_{-i})-J_{i}(u_{i}) \\ &\quad =\frac{1}{2}\mathbb{E}\biggl\{ \int _{0}^{T} \bigl[Q \bigl(x_{i}(t)-x^{(i,N)}(t) \bigr)^{2}-Q \bigl(x_{i}(t)-\bar{x}(t) \bigr)^{2} \bigr]\,\mathrm{d}t \biggr\} \\ &\quad =-\mathbb{E}\biggl\{ \int _{0}^{T} \bigl[Q \bigl(x_{i}(t)- \bar{x}(t) \bigr) \bigl(x^{(i,N)}(t)-\bar{x}(t) \bigr) \bigr]\,\mathrm{d}t \biggr\} \\ &\qquad {}+\frac{1}{2}\mathbb{E}\biggl\{ \int _{0}^{T} \bigl[Q \bigl(x^{(i,N)}(t)-\bar{x}(t) \bigr)^{2} \bigr]\,\mathrm{d}t \biggr\} . \end{aligned}$$
(6.14)

The last term in (6.14) is of order \(O (\frac{1}{N} )\) by Lemma 6.10. For the cross term, by the same technique, applying Hölder’s inequality, Lemma 6.10, and (6.13), there exists a constant M independent of N such that

$$ \begin{aligned} &\biggl\vert \mathbb{E}\biggl\{ \int _{0}^{T} \bigl[Q \bigl(x_{i}(t)- \bar{x}(t) \bigr) \bigl(x^{(i,N)}(t)- \bar{x}(t) \bigr) \bigr]\,\mathrm{d}t \biggr\} \biggr\vert \\ &\quad \leqslant \mathbb{E}\biggl\{ \int _{0}^{T} \bigl\vert x_{i}(t)- \bar{x}(t) \bigr\vert ^{2} \,\mathrm{d}t \biggr\} ^{\frac{1}{2}}\mathbb{E}\biggl\{ \int _{0}^{T} \bigl\vert Q \bigl(x^{(i,N)}(t)- \bar{x}(t) \bigr) \bigr\vert ^{2} \,\mathrm{d}t \biggr\} ^{\frac{1}{2}} \\ &\quad \leqslant M\mathbb{E}\biggl\{ \int _{0}^{T} \bigl\vert Q \bigl(x^{(i,N)}(t)-\bar{x}(t) \bigr) \bigr\vert ^{2}\,\mathrm{d}t \biggr\} ^{\frac{1}{2}} \\ &\quad \leqslant M \biggl\{ \mathbb{E}\Bigl[\sup_{0\leqslant s\leqslant T} \bigl\vert x^{(i,N)}- \bar{x} \bigr\vert ^{2}(s) \Bigr] \int _{0}^{T} \vert Q \vert ^{2} \,\mathrm{d}t \biggr\} ^{\frac{1}{2}} \leqslant \frac{M}{\sqrt{N}}=O \biggl( \frac{1}{\sqrt{N}} \biggr). \end{aligned} $$
(6.15)

 □

Taking advantage of Lemmas 6.5 and 6.11, we can give the second part of the proof of Theorem 6.2, i.e., the set of control strategies \((\bar{u}_{0},\bar{u}_{1},\ldots ,\bar{u}_{N})\) given by Theorem 6.2 is an ε-Nash equilibrium of Problem (I) for each of the follower agents.

Part B of the proof of Theorem 6.2

Combining Lemmas 6.5 and 6.11, we have

$$ \mathcal{J}_{i}(\bar{u}_{i},\bar{\mathbf{u}}_{-i}) \leqslant J_{i}(\bar{u}_{i})+O \biggl( \frac{1}{\sqrt{N}} \biggr)\leqslant J_{i}(u_{i})+O \biggl( \frac{1}{\sqrt{N}} \biggr)\leqslant \mathcal{J}_{i}(u_{i}, \bar{\mathbf{u}}_{-i})+O \biggl(\frac{1}{\sqrt{N}} \biggr), $$

where the second inequality comes from the fact that \(J_{i}(\bar{u}_{i})=\inf_{u_{i}\in \mathcal{U}_{i}[0,T]}J_{i}(u_{i})\). Consequently, Theorem 6.2 holds for each of the follower agents with \(\varepsilon =O (\frac{1}{\sqrt{N}} )\). Finally, combined with Part A, this completes the proof of Theorem 6.2. □

Remark 6.12

So far, we have characterized the optimal strategies through the BFSDE, but in this setting we cannot introduce a Riccati equation to decouple the system. It is therefore natural to ask how the results can be applied in practice. Fortunately, many existing methods are available for explicit computation.

Regarding numerical algorithms and simulations for BSDEs, Peng and Xu [20] studied the convergence of an explicit scheme based on approximating Brownian motion by a random walk, which is efficient to program, and they developed a software package for BSDEs based on this algorithm. More recently, Sun, Zhao, and Zhou [22] proposed an explicit θ-scheme for mean-field BSDEs (MF-BSDEs); further results on simulations and numerical methods for MF-FBSDEs can be found in their other works.
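To illustrate the random-walk idea mentioned above, the following Python sketch solves a scalar BSDE \(-\mathrm{d}Y_{t}=f(t,Y_{t},Z_{t})\,\mathrm{d}t-Z_{t}\,\mathrm{d}W_{t}\), \(Y_{T}=g(W_{T})\), by explicit backward induction on a binomial tree. It is a minimal illustration written for this remark and not the algorithm or software package of [20]; the function name solve_bsde_random_walk, the restriction to terminal values of the form \(g(W_{T})\), and the default step count are assumptions made here.

```python
import math


def solve_bsde_random_walk(terminal, generator, horizon=1.0, n_steps=200):
    """Explicit backward scheme on a binomial tree approximating Brownian motion.

    terminal(w)        -- terminal value xi = terminal(W_T)
    generator(t, y, z) -- driver f of  -dY = f(t, Y, Z) dt - Z dW
    Returns the pair (Y_0, Z_0) produced by the discrete scheme.
    """
    dt = horizon / n_steps
    sqrt_dt = math.sqrt(dt)
    # Node k at the final time carries the walk value (2k - n_steps) * sqrt(dt).
    y = [terminal((2 * k - n_steps) * sqrt_dt) for k in range(n_steps + 1)]
    z0 = 0.0
    for j in range(n_steps - 1, -1, -1):
        t = j * dt
        new_y = []
        for k in range(j + 1):
            down, up = y[k], y[k + 1]
            cond_mean = 0.5 * (up + down)       # E[Y_{j+1} | F_j]
            z = (up - down) / (2.0 * sqrt_dt)   # discrete martingale integrand
            # Explicit step: the driver is evaluated at the conditional mean.
            new_y.append(cond_mean + generator(t, cond_mean, z) * dt)
            if j == 0:
                z0 = z
        y = new_y
    return y[0], z0


if __name__ == "__main__":
    # Linear driver f = 0.5 * y with terminal value W_T^2.
    y0, z0 = solve_bsde_random_walk(lambda w: w * w, lambda t, y, z: 0.5 * y)
    print(y0, z0)
```

For \(f\equiv 0\) and \(g(w)=w\), the scheme returns \(Y_{0}=0\) and \(Z_{0}=1\) up to rounding, in agreement with the exact solution \(Y_{t}=W_{t}\), \(Z_{t}\equiv 1\).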

Another common approach to computing the solution of FBSDEs is to solve the related partial differential equations (PDEs); one of the best-known methods is the four step scheme introduced by Ma, Protter, and Yong [17]. By means of a quasilinear parabolic PDE, the adapted solution can be obtained under suitable conditions. We refer to Chap. 9 of [28] for more details on these numerical methods.
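For orientation, the idea of the four step scheme can be summarized in the scalar case as follows. This is a sketch under simplifying assumptions made here (one-dimensional state, diffusion coefficient not depending on Z, smooth coefficients); the symbols b, σ, h, g, and θ are generic placeholders and not the coefficients of this paper. For the FBSDE

$$ \begin{aligned} &\mathrm{d}X_{t}=b(t,X_{t},Y_{t},Z_{t})\,\mathrm{d}t+\sigma (t,X_{t},Y_{t})\,\mathrm{d}W_{t}, \quad X_{0}=x, \\ &\mathrm{d}Y_{t}=-h(t,X_{t},Y_{t},Z_{t})\,\mathrm{d}t+Z_{t}\,\mathrm{d}W_{t}, \quad Y_{T}=g(X_{T}), \end{aligned} $$

one seeks a deterministic function θ with \(Y_{t}=\theta (t,X_{t})\). Applying Itô's formula to \(\theta (t,X_{t})\) and matching coefficients gives \(Z_{t}=\sigma \theta _{x}\) and the quasilinear parabolic PDE

$$ \theta _{t}+\frac{1}{2}\sigma ^{2}(t,x,\theta )\theta _{xx}+b \bigl(t,x,\theta ,\sigma (t,x,\theta )\theta _{x} \bigr)\theta _{x}+h \bigl(t,x,\theta ,\sigma (t,x,\theta )\theta _{x} \bigr)=0, \quad \theta (T,x)=g(x). $$

Once θ is found, the forward equation is solved with Y and Z replaced by \(\theta (t,X_{t})\) and \(\sigma (t,X_{t},\theta (t,X_{t}))\theta _{x}(t,X_{t})\), and the pair \((Y,Z)\) is then recovered from these two expressions.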

Availability of data and materials

Not applicable.

Abbreviations

BSDE: Backward stochastic differential equation

BFSDE: Backward-forward stochastic differential equation

CC: Consistency condition

CL: Closed loop

FBSDE: Forward-backward stochastic differential equation

LQ: Linear quadratic

LQG: Linear-quadratic-Gaussian

MFG: Mean-field game

OL: Open loop

PDE: Partial differential equation

SDE: Stochastic differential equation

References

  1. Bardi, M.: Explicit solutions of some linear-quadratic mean field games. Netw. Heterog. Media 7(2), 243–261 (2012)

  2. Basar, T.: Stochastic stagewise Stackelberg strategies for linear quadratic systems. In: Stochastic Control Theory and Stochastic Differential Systems, pp. 264–276. Springer, Berlin (1979)

  3. Bensoussan, A., Chen, S., Sethi, S.: The maximum principle for global solutions of stochastic Stackelberg differential games. SIAM J. Control Optim. 53(4), 1956–1981 (2015)

  4. Bensoussan, A., Frehse, J., Yam, P.: Mean Field Games and Mean Field Type Control Theory. Springerbriefs in Mathematics, vol. 101. Springer, New York (2013)

  5. Bismut, J.: An introductory approach to duality in optimal stochastic control. SIAM Rev. 20(1), 62–78 (1978)

  6. Carmona, R., Delarue, F.: Probabilistic analysis of mean-field games. SIAM J. Control Optim. 51(4), 2705–2734 (2013)

  7. Du, K., Wu, Z.: Linear-quadratic Stackelberg game for mean-field backward stochastic differential system and application. Math. Probl. Eng. 2019, Article ID 1798585 (2019)

  8. Garnier, J., Papanicolaou, G., Yang, T.: Large deviations for a mean field model of systemic risk. SIAM J. Financ. Math. 4(1), 151–184 (2013)

  9. Guéant, O., Lasry, J., Lions, P.: Mean field games and applications. In: Paris–Princeton Lectures on Mathematical Finance 2010, pp. 205–266. Springer, Berlin (2011)

  10. Huang, J., Wang, S., Wu, Z.: Backward-forward linear-quadratic mean-field games with major and minor agents. Probab. Uncertain. Quant. Risk 1(8), 1–27 (2016)

  11. Huang, M., Caines, P., Malhamé, R.: Individual and mass behaviour in large population stochastic wireless power control problems: centralized and Nash equilibrium solutions. In: Proceedings of the 42nd IEEE Conference on Decision and Control, IEEE, vol. 1, pp. 98–103 (2003)

  12. Huang, M., Malhamé, R., Caines, P.: Large population stochastic dynamic games: closed-loop McKean–Vlasov systems and the Nash certainty equivalence principle. Commun. Inf. Syst. 6(3), 221–252 (2006)

  13. Lasry, J., Lions, P.: Jeux à champ moyen. I—Le cas stationnaire. C. R. Math. 343(9), 619–625 (2006)

  14. Lasry, J., Lions, P.: Jeux à champ moyen. II—Horizon fini et contrôle optimal. C. R. Math. 343(10), 679–684 (2006)

  15. Lasry, J., Lions, P.: Mean field games. Jpn. J. Math. 2(1), 229–260 (2007)

  16. Lim, E., Zhou, X.: Linear-quadratic control of backward stochastic differential equations. SIAM J. Control Optim. 40(2), 450–474 (2001)

  17. Ma, J., Protter, P., Yong, J.: Solving forward-backward stochastic differential equations explicitly—a four step scheme. Probab. Theory Relat. Fields 98(3), 339–359 (1994)

  18. Pardoux, E., Peng, S.: Adapted solution of a backward stochastic differential equation. Syst. Control Lett. 14(1), 55–61 (1990)

  19. Peng, S., Wu, Z.: Fully coupled forward-backward stochastic differential equations and applications to optimal control. SIAM J. Control Optim. 37(3), 825–843 (1999)

  20. Peng, S., Xu, M.: Numerical algorithms for backward stochastic differential equations with 1-D Brownian motion: convergence and simulations. ESAIM: Math. Model. Numer. Anal. 45(2), 335–360 (2011)

  21. Sun, J., Li, X., Yong, J.: Open-loop and closed-loop solvabilities for stochastic linear quadratic optimal control problems. SIAM J. Control Optim. 54(5), 2274–2308 (2016)

  22. Sun, Y., Zhao, W., Zhou, T.: Explicit θ-schemes for mean-field backward stochastic differential equation. SIAM J. Numer. Anal. 56(4), 2672–2697 (2018)

  23. Von Stackelberg, H.: Marktform und Gleichgewicht. Springer, Berlin (1934). An English translation appeared in The Theory of the Market Economy. Oxford University Press, Oxford (1952)

  24. Wang, G., Wu, Z.: The maximum principles for stochastic recursive optimal control problems under partial information. IEEE Trans. Autom. Control 54(6), 1230–1242 (2009)

  25. Yong, J.: Linear forward-backward stochastic differential equations. Appl. Math. Optim. 39(1), 93–119 (1999)

  26. Yong, J.: A leader-follower stochastic linear quadratic differential game. SIAM J. Control Optim. 41(4), 1015–1041 (2002)

  27. Yong, J.: Forward-backward stochastic differential equations with mixed initial-terminal conditions. Trans. Am. Math. Soc. 362(2), 1047–1096 (2010)

  28. Yong, J., Zhou, X.: Stochastic Controls: Hamiltonian Systems and HJB Equations. Stochastic Modelling and Applied Probability, vol. 43. Springer, New York (1999)

Acknowledgements

The first author, K. Si, acknowledges the financial support from Shandong University; the present work constitutes a part of his postgraduate dissertation. The corresponding author, Z. Wu, acknowledges the financial support by NSFC (11831010, 61961160732) and Shandong Provincial Natural Science Foundation (ZR2019ZD42).

Funding

Z. Wu acknowledges the financial support by NSFC (11831010, 61961160732) and Shandong Provincial Natural Science Foundation (ZR2019ZD42).

Author information

Contributions

ZW carried out the problem and gave the instructions while writing the paper. KS deduced the mathematical computation and theorems involved and wrote the manuscript. All authors read and approved the final manuscript.

Corresponding author

Correspondence to Zhen Wu.

Ethics declarations

Competing interests

The authors declare that they have no competing interests.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.

About this article

Cite this article

Si, K., Wu, Z. Backward-forward linear-quadratic mean-field Stackelberg games. Adv Differ Equ 2021, 73 (2021). https://doi.org/10.1186/s13662-021-03236-9

  • DOI: https://doi.org/10.1186/s13662-021-03236-9
