Article

Quantum Mean-Field Games with the Observations of Counting Type

Vassili N. Kolokoltsov 1,2
1 Department of Statistics, University of Warwick, Coventry CV4 7AL, UK
2 Higher School of Economics, 109028 Moscow, Russia
Submission received: 4 November 2020 / Revised: 28 December 2020 / Accepted: 6 January 2021 / Published: 14 January 2021

Abstract

Quantum games and mean-field games (MFGs) represent two important new branches of game theory. In a recent paper the author developed quantum MFGs, merging these two branches. Those quantum MFGs were based on the theory of continuous quantum observations and filtering of diffusive type. In the present paper we develop the analogous quantum MFG theory based on continuous quantum observations and filtering of counting type. However, proving existence and uniqueness of the solutions of the resulting limiting forward-backward system, based on jump-type processes on manifolds, seems to be more complicated than for diffusions. In this paper we only prove that if a solution exists, then it gives an ϵ-Nash equilibrium for the corresponding N-player quantum game. The existence of solutions is suggested as an interesting open problem.

1. Introduction

In [1], two recently developed branches of game theory, quantum games and mean-field games (MFGs), were merged, creating quantum MFGs. MFGs represent a very popular recent development in game theory; the field was initiated in [2,3]. For recent developments one can consult the monographs [4,5,6,7] and numerous references therein. Quantum games were initiated by Meyer [8], Eisert, Wilkens and Lewenstein [9], and Marinatto and Weber [10], and were dealt with afterwards in numerous publications; see, e.g., the surveys [11,12] and Chapter 13 of the textbook [13].
Using the approaches from [9,10], one can transform any game into a new quantum version. This transformation modifies in a systematic way all properties of the games: equilibria, their stability, etc. For instance, the stability of the equilibria of the transformed replicator dynamics for two-player two-action games was analyzed in [14]. ESS (evolutionary stable strategies) for the transformed Rock-Paper-Scissors game were analyzed in [15], and for three-player games in [16]. The transformations of the simplest cooperative games were analyzed in [17]. In [18] the EWL (Eisert, Wilkens and Lewenstein) protocol was applied to the Battle of the Sexes, in [19] to the general Prisoner's Dilemma, and in [20] to the three-player quantum Prisoner's Dilemma. Peculiar behavior and remarkable phase transitions were found. The extension of the EWL protocol to games with continuous strategy spaces was suggested in [21].
For application of related quantum concepts (including quantum probability) to cognitive sciences we refer to [22,23] and references therein.
The main accent in all these developments was on stationary or repeated games; see, e.g., [24,25] for the latter, and [26] for their interpretation in economics. Not only for games but for quantum control in general, the mainstream of research is based on open-loop controls, with only rare appearances of feedback control; see, e.g., [27,28].
The present paper initiates the study of a truly dynamic theory with observations of counting type and with the strategies chosen by the players in real time. Since direct continuous observations are known to destroy quantum evolutions (the so-called quantum Zeno paradox), the necessary new ingredient for quantum dynamic games must be the theory of indirect observations and the corresponding quantum filtering. This theory is usually developed in two forms: the diffusive (or homodyne) type and the counting type. In paper [1] the author developed quantum MFGs based on filtering of diffusive type. In the present paper quantum MFGs are built for quantum observations and filtering of counting type.
As a part of the construction we show that the limiting behavior of mean-field interacting controlled quantum particles (or of the N-player quantum game) can be described by a certain classical MFG forward-backward system of jump-type equations on manifolds, the forward part being given by a new kind of nonlinear jump-type stochastic Schrödinger equation. One of the objectives of the paper is to draw the attention of game theorists to this type of games and this type of forward-backward systems, which were not studied before, and for which no results even on the existence of solutions are available. These objects are fully classical, but represent the limit of quantum games.
The main result states that any solution to this forward-backward system represents an approximate $N^{-1/4}$-Nash equilibrium for the initial N-player dynamic quantum game.
The content of the paper is as follows. In the next section we recall the basic theory of continuous quantum measurement and filtering. In Section 3, as a warm-up, we discuss briefly an example of a two-player quantum dynamic game on a qubit with observation and feedback control of counting type. In Section 4 the new nonlinear equations are introduced for the case of controlled counting detection, and the convergence of the N-particle observed quantum evolutions to the decoupled system of these equations is obtained, together with explicit rates of convergence. In Section 5 the MFG limits for quantum N-player games are introduced and it is proven that solutions of the limiting MFG equations specify ϵ-Nash equilibria for the N-player quantum game, with ϵ of order $N^{-1/4}$. The limiting MFGs can also be looked at as classical MFGs, though complex-valued and evolving on infinite-dimensional manifolds. In the final section we state the problem of the existence of solutions, even in the simplest case of the control problem on a qubit.

2. Quantum Filtering of Counting Type

The general theory of quantum non-demolition observation, filtering and resulting feedback control was built essentially in papers [29,30,31]. For alternative simplified derivations of the main filtering equations given below (by-passing the heavy theory of quantum filtering) we refer to [32,33,34,35,36] and references therein. For the technical side of organising feedback quantum control in real time, see, e.g., [37,38,39].
We shall describe briefly the main result of this theory.
The non-demolition measurement of quantum systems can be organised in two versions: photon counting and homodyne detection. As was stressed above, here we shall deal only with counting measurements. In this case the main equation of quantum filtering takes the form
$$ d\gamma_t = -i[H, \gamma_t]\,dt + \sum_j \Big(-\tfrac{1}{2}\{L_j^* L_j, \gamma_t\} + \mathrm{tr}(L_j \gamma_t L_j^*)\,\gamma_t\Big)dt + \sum_j \Big(\frac{L_j \gamma_t L_j^*}{\mathrm{tr}(L_j \gamma_t L_j^*)} - \gamma_t\Big) dN_t^j, $$
in terms of the density matrices $\gamma_t$, where $H$ is the Hamiltonian of the free (not observed) motion of the quantum system, the operators $\{L_j\}$ define the coupling of the system with the measurement devices, and the (counting) observed Poisson processes $N_t^j$ are independent and have the position-dependent intensities $\mathrm{tr}(L_j^* L_j \gamma_t)$, so that the compensated processes $M_t^j = N_t^j - \int_0^t \mathrm{tr}(L_j^* L_j \gamma_s)\,ds$ are martingales. By $\{A, B\}$ we denote the anticommutator of two operators: $\{A, B\} = AB + BA$. In terms of the compensated processes $M_t^j$, Equation (1) rewrites as
$$ d\gamma_t = -i[H, \gamma_t]\,dt + \sum_j \Big(L_j \gamma_t L_j^* - \tfrac{1}{2}\{L_j^* L_j, \gamma_t\}\Big)dt + \sum_j \Big(\frac{L_j \gamma_t L_j^*}{\mathrm{tr}(L_j \gamma_t L_j^*)} - \gamma_t\Big) dM_t^j. $$
In this paper we shall deal only with the simplest case, when the operators $L_j$ are unitary. In this case $\mathrm{tr}(L_j^* L_j \gamma_t) = \mathrm{tr}(\gamma_t) = 1$, so that $dM_t^j = dN_t^j - dt$, and Equations (1) and (2) become linear and take the form
$$ d\gamma_t = -i[H, \gamma_t]\,dt + \sum_j (L_j \gamma_t L_j^* - \gamma_t)\, dN_t^j = -i[H, \gamma_t]\,dt + \sum_j (L_j \gamma_t L_j^* - \gamma_t)(dM_t^j + dt). $$
This dynamics preserves the set of pure states. Namely, if ϕ satisfies the equation
$$ d\phi_t = -iH\phi_t\,dt + \sum_j (L_j - 1)\phi_t\, dN_t^j = -iH\phi_t\,dt + \sum_j (L_j - 1)\phi_t\,(dM_t^j + dt), $$
then $\gamma_t = \phi_t \otimes \bar\phi_t$ satisfies Equation (3).
The theory of quantum filtering reduces the analysis of quantum dynamic control and games to the controlled version of evolutions (1). Two types of control can be naturally considered (see [40]). The players can control the Hamiltonian $H$, say, by applying appropriate electric or magnetic fields to the atom, or they can control the coupling operators $L_j$. Thus (3) extends to the equation
$$ d\gamma_t = -i[H + u\hat H, \gamma_t]\,dt + \sum_j \big(L_j(v)\,\gamma_t\, L_j^*(v) - \gamma_t\big)dN_t^j = -i[H + u\hat H, \gamma_t]\,dt + \sum_j \big(L_j(v)\,\gamma_t\, L_j^*(v) - \gamma_t\big)(dM_t^j + dt), $$
with some self-adjoint H ^ , control u and a family of unitary operators L ( v ) depending on a control parameter v.
It is seen from Equation (5) that its evolution preserves traces of matrices. One can also show that these evolutions preserve positivity of matrices γ (see, e.g., [36]).
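To make the counting-type filtering dynamics concrete, here is a minimal simulation sketch (added for illustration and not part of the original paper; the Hamiltonian and the choice L = σ₃ are arbitrary). It integrates the pure-state filtering equation above for a single qubit, using the fact noted above that for unitary L the observed counting process has unit intensity, and it checks that the norm of the state is preserved, in line with the trace preservation of the density-matrix evolution.

```python
import numpy as np
from scipy.linalg import expm

rng = np.random.default_rng(0)

# Illustrative one-qubit data: an arbitrary Hermitian H and the unitary coupling L = sigma_3.
H = np.array([[1.0, 0.5], [0.5, -1.0]], dtype=complex)
L = np.diag([1.0, -1.0]).astype(complex)

def simulate_pure_state(psi0, T=5.0, dt=1e-3):
    """Simulate d psi = -i H psi dt + (L - 1) psi dN_t with N_t a unit-intensity
    Poisson process (the intensity tr(L* L gamma) equals 1 for unitary L)."""
    U = expm(-1j * dt * H)                 # exact free propagation over one step
    psi = psi0.astype(complex)
    for _ in range(int(T / dt)):
        psi = U @ psi                      # evolution between jumps
        if rng.random() < dt:              # a jump of the counting process
            psi = L @ psi                  # at a jump the state is multiplied by L
    return psi

psi_T = simulate_pure_state(np.array([1.0, 0.0]))
print(np.linalg.norm(psi_T))               # stays equal to 1 (up to rounding)
```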

3. Example of a Quantum Dynamic Two-Player Game

Let us stress again that the whole physics of quantum dynamic games with feedback control of a finite number of players is incorporated into the stochastic filtering Equation (1), so that such quantum dynamic games reduce to stochastic games with jumps governed by this equation, with operators H and L that may depend on the controls. As a warm-up before the mean-field setting, let us consider a simple example of a zero-sum quantum dynamic two-player game on a qubit, where a complete analytic solution can be found.
Working with a qubit means that the Hilbert space of the quantum system is two dimensional. Let L be fixed and the Hamiltonian be the sum of two parts, controlled by the first and the second player respectively. Stochastic filtering Equation (1) simplifies to the equation (omitting index t)
$$ d\psi = -iH\psi\,dt + (L\psi - \psi)\,dN_t, \qquad H = uH_1 + vH_2, $$
$u, v$ being the control parameters of players I and II. Assume $u \in [-U, U]$, $v \in [-V, V]$ with some positive $U, V$. Moreover, $\psi$ has only two coordinates: $\psi = (\psi_0, \psi_1)$. Using Ito's rule $dN_t\,dN_t = dN_t$ we find the equation for $\psi_0^{-1}$:
$$ d\psi_0^{-1} = i\psi_0^{-2}(H\psi)_0\,dt - \frac{(L\psi)_0 - \psi_0}{\psi_0\,(L\psi)_0}\,dN_t. $$
Consequently, again by Ito's rule, we find the equation for $w = \psi_1/\psi_0$:
$$ dw = i\big[w(HW)_0 - (HW)_1\big]dt + \big[(LW)_1 - w(LW)_0\big]dN_t, $$
where $W = (w_0, w_1) = (1, w)$.
Let us choose the simplest possible $L$: $L = \sigma_3$, the third Pauli matrix (diagonal with diagonal elements 1 and $-1$). Then Equation (7) simplifies to
$$ dw = i\big[w(HW)_0 - (HW)_1\big]dt - 2w\,dN_t. $$
The payoffs in quantum setting are given by certain operators, that is, they have the form
$$ P(t, W; u(\cdot), v(\cdot)) = \int_t^T (\psi_s, J\psi_s)\,ds + (\psi_T, F\psi_T), $$
where $J$ and $F$ are some self-adjoint operators. They may depend on the control parameters, but we shall consider the case when they do not. In terms of $w$ this payoff rewrites as
$$ P(t, W; u(\cdot), v(\cdot)) = \int_t^T \frac{(W_s, JW_s)}{1 + |w_s|^2}\,ds + \frac{(W_T, FW_T)}{1 + |w_T|^2}. $$
Thus the zero-sum quantum dynamic two-player game (with feedback control) with a fixed horizon $T$ is, in this setting, a stochastic dynamic game with state space $\mathbb{C}$, with the evolution described by the jump-type stochastic Equation (8) and the payoff (10). The aim of the first player is to maximise the expectation of (10) using appropriate feedback strategies $u(\cdot) = u(t, W_t)$; the second player tries to minimise it using appropriate feedback strategies $v(\cdot) = v(t, W_t)$.
A remarkable feature of this game is that the possible jumps are only of the type $w \mapsto -w$. Consequently, in the coordinates $r = x^2 + y^2$ and $\xi = y/x$ (where $w = x + iy$), the dynamics is deterministic. Therefore, if the operators $J$ and $F$ of the current and terminal payoffs are invariant under the transformation $w \mapsto -w$, the game can be reduced to a deterministic differential game. This game is still very complicated.
Let us now consider the most trivial example of commuting operators $H_1$ and $H_2$ controlled by the two players. To be concrete, let us choose $H_1$ diagonal with diagonal elements 1 and 0, and $H_2$ diagonal with elements 0 and 1. Then Equation (8) becomes linear in $w$:
$$ dw = i(u - v)w\,dt - 2w\,dN_t, $$
and then the squared modulus $\rho = |w|^2$ becomes an integral of motion: $d(|w|^2) = 0$. Choosing $\rho = 1$ for definiteness, we get the equation for the angle $\phi$ on the circle $\rho = 1$:
$$ d\cos\phi = -(u - v)\sin\phi\,dt - 2\cos\phi\,dN_t. $$
If $J$ and $F$ are invariant under the transformation $w \mapsto -w$, we can identify points where $\cos\phi$ differs only by a sign (so that the possible jumps $\cos\phi \mapsto -\cos\phi$ become irrelevant), and the evolution on the circle, given by the set $\phi \in [-\pi/2, \pi/2]$ with identified endpoints, becomes deterministic:
$$ \frac{d}{dt}\cos\phi = -(u - v)\sin\phi \quad\Longleftrightarrow\quad \dot\phi = u - v, $$
that is, a simple rotation. Now choose $F = 0$ and the simplest nontrivial $J$ with zero diagonal elements and a real number $j$ as the off-diagonal entries. The payoff (10) for $\rho = |w| = 1$ then simplifies to
$$ P(t, W; u(\cdot), v(\cdot)) = j\int_t^T \cos\phi_s\,ds. $$
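Before turning to the HJB analysis, the reduction just described is easy to confirm numerically; the following small script (added here for illustration, with arbitrary constant controls) simulates the linear jump equation for w derived above and checks that |w_t| is conserved while |cos ϕ_t| follows the deterministic rotation ϕ̇ = u − v.

```python
import numpy as np

rng = np.random.default_rng(1)
u, v, dt, T = 0.7, 0.2, 1e-3, 3.0        # arbitrary constant controls and step size

w = np.exp(1j * 0.3)                      # start on the circle |w| = 1
phi = 0.3                                 # deterministic angle with phi_dot = u - v
for _ in range(int(T / dt)):
    w *= np.exp(1j * (u - v) * dt)        # exact integration of the drift i(u - v) w dt
    if rng.random() < dt:                 # unit-intensity Poisson jump
        w = -w                            # the jump term -2w dN_t sends w to -w
    phi += (u - v) * dt

print(abs(w))                             # = 1: |w| is an integral of motion
print(abs(w.real), abs(np.cos(phi)))      # |cos phi| agrees with the deterministic rotation
```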
The HJB-Isaacs equation takes the form
$$ \frac{\partial S}{\partial t} + \max_u\Big(u\frac{\partial S}{\partial\phi}\Big) + \min_v\Big(-v\frac{\partial S}{\partial\phi}\Big) + j\cos\phi = 0. $$
Assuming for definiteness that U > V , so that the first player has an edge in this game, the equation rewrites as
$$ \frac{\partial S}{\partial t} + (U - V)\Big|\frac{\partial S}{\partial\phi}\Big| + j\cos\phi = 0. $$
This is the HJB equation of a pure maximisation problem. It can be solved via the method of viscosity solutions. For instance, let us find a stationary solution describing the average winnings of the first player per unit of time in a long-lasting game. For this one searches for a solution to (15) in the form $S = \lambda(T - t) + S_0(\phi)$ with a constant $\lambda$. Then $S_0(\phi)$ (obviously defined up to an additive constant, so that we can set $S_0(0) = 0$) satisfies the equation
$$ -\lambda + (U - V)\Big|\frac{\partial S_0}{\partial\phi}\Big| + j\cos\phi = 0. $$
To guess the right solution one can derive from the meaning of this equation that $S_0$ must be an even function of $\phi$ with a maximum at $\phi = 0$, decreasing on $[0, \pi/2]$. Hence $(\partial S_0/\partial\phi)(0) = 0$, and thus $\lambda = j$, and Equation (16) on $[0, \pi/2]$ becomes
$$ -(U - V)\frac{\partial S_0}{\partial\phi} = j(1 - \cos\phi), $$
so that
$$ S_0(\phi) = -\frac{j}{U - V}(\phi - \sin\phi). $$
This function (considered as periodically continued with period $\pi$ to the whole line) is smooth outside the points $(2k + 1)\pi/2$, where it has convex kinks. Hence it is indeed the viscosity solution to (16), confirming that our educated guess above was correct and that $\lambda = j$ is the income per unit of time of the first player in a long-lasting game.
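The stationary pair λ = j, S₀(ϕ) = −j(ϕ − sin ϕ)/(U − V) can also be checked directly on a grid; the short script below (an added illustration with arbitrary values of j, U, V) verifies that the residual of Equation (16) vanishes on (−π/2, π/2).

```python
import numpy as np

j, U, V = 2.0, 1.5, 0.5                    # arbitrary illustrative constants with U > V
lam = j                                    # the candidate value lambda = j

phi = np.linspace(-np.pi / 2 + 1e-6, np.pi / 2 - 1e-6, 2001)
S0 = -j / (U - V) * (phi - np.sin(phi))
dS0 = np.gradient(S0, phi)                 # numerical derivative of S0

residual = -lam + (U - V) * np.abs(dS0) + j * np.cos(phi)
print(np.max(np.abs(residual)))            # ~0 up to finite-difference error
```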
Another example for the case of quantum control (without games) was given in [28].

4. Controlled Limiting Stochastic Equation

Let X be a Borel space with a fixed Borel measure that we denote d x .
For a linear operator $O$ in $L^2(X)$ we shall denote by $O_j$ the operator in $L^2(X^N)$ that acts on functions $f(x_1, \ldots, x_N)$ as $O$ acting on the variable $x_j$. For a linear operator $A$ in $L^2(X^2)$ we shall denote by $A_{ij}$ the operator in $L^2(X^N)$ that acts as $A$ on the variables $x_i, x_j$.
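For a finite-dimensional X these embeddings are just Kronecker products. The small numpy helper below (added for illustration; the function name and the check are not from the paper) builds O_j and confirms that operators attached to different particles commute, a fact used repeatedly in the proofs below.

```python
import numpy as np

def embed_single(O, j, d, N):
    """Return O_j: the operator O (d x d) acting on the j-th factor (0-based)
    of the N-fold tensor product, and the identity on all other factors."""
    M = np.eye(1)
    for k in range(N):
        M = np.kron(M, O if k == j else np.eye(d))
    return M

# Operators attached to different particles commute: [O_i, O'_j] = 0 for i != j.
rng = np.random.default_rng(2)
O1, O2 = rng.normal(size=(2, 2)), rng.normal(size=(2, 2))
A1 = embed_single(O1, 0, 2, 3)
B3 = embed_single(O2, 2, 2, 3)
print(np.allclose(A1 @ B3, B3 @ A1))   # True
```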
Let $H$ and $\hat H$ be two self-adjoint operators in $L^2(X)$, and let $A$ be a self-adjoint integral operator in $L^2(X^2)$ with the kernel $A(x, y; x', y')$ that acts on functions of two variables as
$$ A\psi(x, y) = \int_{X^2} A(x, y; x', y')\,\psi(x', y')\,dx'\,dy'. $$
It is assumed that A is symmetric in the sense that it takes symmetric functions ψ ( x , y ) (symmetric with respect to permutation of x and y) to symmetric functions.
Let us consider the quantum evolution of N particles driven by the interaction Hamiltonian
$$ H_u^{(N)} f(x_1, \ldots, x_N) = \sum_{j=1}^N \big(H_j + u_j(t, \Gamma_N^{(j)})\hat H_j\big) f(x_1, \ldots, x_N) + \frac{1}{N}\sum_{i<j}^N A_{ij} f(x_1, \ldots, x_N). $$
Here the continuous functions $u_j(t, \gamma)$ describe the controls of the $j$th agent, who is supposed to have access to the $j$th subsystem, namely to the partial trace $\Gamma_{N,t}^{(j)}$ (the trace with respect to all variables except the $j$th) of the state $\Gamma_{N,t}$. All $u_j$ take values in a bounded interval $[-U, U]$.
In order to be able to carry out a feedback control, we assume further that this quantum system is observed via coupling with a collection of (possibly controlled) identical one-particle unitary families $L(v)$. That is, we consider a filtering equation of type (3),
$$ d\Psi_{N,t} = -i H_u^{(N)} \Psi_{N,t}\,dt + \sum_{j=1}^N \big(L_j(v_j(t, \Gamma_N^{(j)})) - 1\big)\Psi_{N,t}\,dN_t^j. $$
The corresponding density matrix $\Gamma_{N,t} = \Psi_{N,t} \otimes \bar\Psi_{N,t}$ satisfies an equation of type (5):
$$ d\Gamma_{N,t} = -i[H_u^{(N)}, \Gamma_{N,t}]\,dt + \sum_j \big(L_j(v_j(t, \Gamma_N^{(j)}))\,\Gamma_{N,t}\,L_j^*(v_j(t, \Gamma_N^{(j)})) - \Gamma_{N,t}\big)dN_t^j. $$
The main ingredient in the construction of the quantum MFG theory is the quantum law of large numbers, which states that, as $N \to \infty$, the limiting evolution of each particle (precise conditions are given in the theorem below) is described by the nonlinear stochastic equation
$$ d\psi_{j,t} = -i\big[H + u(t, \gamma_{j,t})\hat H + A^{\bar\eta_t}\big]\psi_{j,t}\,dt + \big(L(v_j(t, \gamma_{j,t}))\,\psi_{j,t} - \psi_{j,t}\big)dN_t^j, $$
where $A^{\bar\eta_t}$ is the integral operator in $L^2(X)$ with the integral kernel
$$ A^{\bar\eta_t}(x; y) = \int_{X^2} A(x, x'; y, y')\,\overline{\eta_t(x', y')}\,dx'\,dy' $$
and
$$ \eta_t(y, z) = E\big(\psi_{j,t}(y)\,\bar\psi_{j,t}(z)\big). $$
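In coordinates this is just a contraction of the four-index kernel of A with η̄ over the "second-particle" variables. The numpy fragment below (an added illustration; the interaction tensor is the photon-exchange example used later in Section 6) computes the kernel of A^η for a qubit and reproduces the relations A^η₁₀ = η₀₁, A^η₀₁ = η₁₀ stated there; in Equation (21) the argument of this map is η̄_t.

```python
import numpy as np

def mean_field_operator(A, eta):
    """Kernel of A^eta: contract A(x, x'; y, y') with eta(x', y') over the primed
    variables; the array index order of A is (x, x', y, y')."""
    return np.einsum('abcd,bd->ac', A, eta)

# Photon-exchange interaction from Section 6: A(1,0;0,1) = A(0,1;1,0) = 1.
A = np.zeros((2, 2, 2, 2), dtype=complex)
A[1, 0, 0, 1] = 1.0
A[0, 1, 1, 0] = 1.0

eta = np.array([[0.6, 0.2 + 0.1j], [0.2 - 0.1j, 0.4]])   # an arbitrary density matrix
A_eta = mean_field_operator(A, eta)
print(np.allclose(A_eta[1, 0], eta[0, 1]), np.allclose(A_eta[0, 1], eta[1, 0]))  # True True
```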
The equation for the corresponding density matrix $\gamma_{j,t} = \psi_{j,t}\otimes\bar\psi_{j,t}$ reads
$$ d\gamma_{j,t} = -i[H + u(t, \gamma_{j,t})\hat H, \gamma_{j,t}]\,dt - i[A^{\bar\eta_t}, \gamma_{j,t}]\,dt + \big(L(v_j(t, \gamma_{j,t}))\,\gamma_{j,t}\,L^*(v_j(t, \gamma_{j,t})) - \gamma_{j,t}\big)dN_t^j, \qquad \eta_t(y, z) = E\big(\psi_{j,t}(y)\,\bar\psi_{j,t}(z)\big) = E\,\gamma_{j,t}(y, z). $$
For the analysis of the limiting behavior we use an approach from [41,42], where the main measures of the deviation of the solutions Ψ N , t to N-particle systems from the product of the solutions ψ t to the Hartree equations are the following positive numbers from the interval [ 0 , 1 ] :
$$ \alpha_N(t) = 1 - (\psi_t, \Gamma_{N,t}\,\psi_t). $$
In the present stochastic case, these quantities depend not just on the number of particles in the product, but on the concrete choice of these particles. The proper stochastic analog of the quantity α N ( t ) is the collection of random variables
$$ \alpha_{N,j}(t) = 1 - (\psi_{j,t}, \Gamma_{N,t}\,\psi_{j,t}) = 1 - \mathrm{tr}(\gamma_{j,t}\,\Gamma_{N,t}) = 1 - \mathrm{tr}(\gamma_{j,t}\,\Gamma_{N,t}^{(j)}), $$
where the latter equation holds by the definition of the partial trace. Here γ j , t is identified with the operator in L 2 ( X N ) acting on the jth variable and Γ N , t ( j ) denotes the partial trace of Γ N , t with respect to all variables except for the jth.
Since evolutions (20) preserve the set of operators with the unit trace, (23) rewrites as
$$ \alpha_{N,j}(t) = \mathrm{tr}\big((1 - \gamma_{j,t})\,\Gamma_{N,t}\big) = \mathrm{tr}\big((1 - \gamma_{j,t})\,\Gamma_{N,t}^{(j)}\big). $$
Assuming that all controls $u_j$ and $v_j$ are given by identical feedback functions $u(t, \gamma)$, $v(t, \gamma)$, and that the initial condition for Equation (19) is the tensor product of i.i.d. random vectors, the expectations $E\,\alpha_N(t) = E\,\alpha_{N,j}(t)$ are well defined (they do not depend on the particular choice of particles).
The quantities $\alpha_{N,j}$ can be linked with the trace distances by the following inequalities, due to Knowles and Pickl:
$$ \alpha_{N,j}(t) \le \mathrm{tr}\,\big|\Gamma_{N,t}^{(j)} - \gamma_{j,t}\big| \le 2\sqrt{2\,\alpha_{N,j}(t)}, $$
see Lemma 2.3 from [42].
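To make these objects concrete, the following numpy sketch (added for illustration; the random two-qubit state is arbitrary) computes the partial trace Γ^(1) of a pure two-particle state, the quantity α from (23)-(24) for a random comparison state γ, and checks the inequality (25).

```python
import numpy as np

rng = np.random.default_rng(3)
d = 2

def random_pure(dim):
    v = rng.normal(size=dim) + 1j * rng.normal(size=dim)
    return v / np.linalg.norm(v)

Psi = random_pure(d * d)                                # a two-particle pure state
Gamma = np.outer(Psi, Psi.conj())                       # its density matrix Gamma_N
psi = random_pure(d)
gamma = np.outer(psi, psi.conj())                       # a one-particle comparison state

# Partial trace over the second particle gives Gamma^(1).
Gamma1 = np.trace(Gamma.reshape(d, d, d, d), axis1=1, axis2=3)

alpha = 1 - np.real(np.trace(gamma @ Gamma1))                        # definition (23)-(24)
trace_dist = np.sum(np.abs(np.linalg.eigvalsh(Gamma1 - gamma)))      # tr |Gamma^(1) - gamma|
print(alpha <= trace_dist <= 2 * np.sqrt(2 * alpha))                 # inequality (25): True
```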
Theorem 1.
Let $H$, $\hat H$ be self-adjoint operators in $L^2(X)$, with $\hat H$ bounded, and let $L(v)$ be a family of unitary operators depending Lipschitz-continuously on $v$:
$$ \|L(v_1) - L(v_2)\| \le \varkappa_L |v_1 - v_2|. $$
Let $A$ be a symmetric self-adjoint integral operator in $L^2(X^2)$ with a Hilbert-Schmidt kernel, that is, a kernel $A(x, y; x', y')$ such that
$$ \|A\|_{HS}^2 = \int_{X^4} |A(x, y; x', y')|^2\,dx\,dy\,dx'\,dy' < \infty, $$
$$ A(x, y; x', y') = A(y, x; y', x'), \qquad A(x, y; x', y') = \overline{A(x', y'; x, y)}. $$
Let the functions $u(t, \gamma)$ and $v(t, \gamma)$, with values in the bounded intervals $[-U, U]$ and $[-V, V]$ respectively, be Lipschitz in the sense that
$$ |u(t, \gamma) - u(t, \tilde\gamma)| \le \varkappa\,\mathrm{tr}\,|\gamma - \tilde\gamma|, \qquad |v(t, \gamma) - v(t, \tilde\gamma)| \le \varkappa\,\mathrm{tr}\,|\gamma - \tilde\gamma|. $$
Let $\psi_{j,t}$ be the solutions to Equation (21) with i.i.d. initial conditions $\psi_{j,0}$, $\|\psi_{j,0}\| = 1$. Let $\Psi_{N,t}$ be the solution to the $N$-particle Equation (19) with $H_u^{(N)}$ given by (18) and with the initial condition
$$ \Psi_{N,0}(x_1, \ldots, x_N) = \prod_{j=1}^N \psi_{j,0}(x_j). $$
Then
$$ E\,\alpha_N(t) \le \Big(\exp\big\{\big(7\|A\|_{HS} + 12\varkappa(\|\hat H\| + \varkappa_L + \varkappa_L^2\varkappa)\big)t\big\} - 1\Big)\frac{1}{\sqrt N}. $$
Proof. 
By Ito’s product rule for counting processes,
$$ d\alpha_{N,j}(t) = -\mathrm{tr}(d\Gamma_{N,t}\,\gamma_{j,t}) - \mathrm{tr}(\Gamma_{N,t}\,d\gamma_{j,t}) - \mathrm{tr}(d\Gamma_{N,t}\,d\gamma_{j,t}), $$
with the Ito product rule being $dN_t^j\,dN_t^i = \delta_{ij}\,dN_t^j$.
Let us denote by $I$ the part of the differential $d\alpha_{N,j}(t)$ that contains the operators $L_j$ and by $II$ the part that does not.
Starting with $II$ we obtain, denoting by $A_j^{\bar\eta_t}$ the operator $A^{\bar\eta_t}$ acting on the $j$th variable and writing $q_{j,t} = 1 - \gamma_{j,t}$, that
$$ II = i\,\mathrm{tr}\big([H_j + u_j(t, \gamma_{j,t})\hat H_j + A_j^{\bar\eta_t}, \gamma_{j,t}]\,\Gamma_{N,t}\big)dt + i\,\mathrm{tr}\big(\gamma_{j,t}[H_u^{(N)}, \Gamma_{N,t}]\big)dt $$
$$ = i\,\mathrm{tr}\big([H_j + u_j(t, \gamma_{j,t})\hat H_j + A_j^{\bar\eta_t}, \gamma_{j,t}]\,\Gamma_{N,t}\big)dt + i\,\mathrm{tr}\big([\gamma_{j,t}, H_u^{(N)}]\,\Gamma_{N,t}\big)dt $$
$$ = -i\,\mathrm{tr}\big([H_j + u_j(t, \gamma_{j,t})\hat H_j + A_j^{\bar\eta_t}, q_{j,t}]\,\Gamma_{N,t}\big)dt + i\,\mathrm{tr}\big([H_u^{(N)}, q_{j,t}]\,\Gamma_{N,t}\big)dt $$
$$ = i\,\mathrm{tr}\big([H_u^{(N)} - H_j - u_j(t, \gamma_{j,t})\hat H_j - A_j^{\bar\eta_t}, q_{j,t}]\,\Gamma_{N,t}\big)dt = II_1 + II_2, $$
with
$$ II_1 = i\,\mathrm{tr}\Big(\Big[\frac{1}{N}\sum_{m\ne j} A_{mj} - A_j^{\bar\eta_t},\; q_{j,t}\Big]\Gamma_{N,t}\Big)dt $$
and
$$ II_2 = i\,\mathrm{tr}\Big(\Big[\big(u_j(t, \Gamma_{N,t}^{(j)}) - u_j(t, \gamma_{j,t})\big)\hat H_j,\; q_{j,t}\Big]\Gamma_{N,t}\Big)dt = B_{j,t}\,dt, $$
with
$$ |B_{j,t}| \le 2\big|u_j(t, \Gamma_{N,t}^{(j)}) - u_j(t, \gamma_{j,t})\big|\,\|\hat H\|\,\|q_{j,t}\Psi_{N,t}\| $$
$$ \le 2\varkappa\,\mathrm{tr}\,\big|\Gamma_{N,t}^{(j)} - \gamma_{j,t}\big|\,\|\hat H\|\,\sqrt{\alpha_{N,j}(t)} $$
$$ \le 4\varkappa\sqrt{2\alpha_{N,j}(t)}\,\|\hat H\|\,\sqrt{\alpha_{N,j}(t)} = 4\sqrt 2\,\varkappa\,\|\hat H\|\,\alpha_{N,j}(t), $$
where for the last inequality we used (25).
The term $II_1$ was dealt with in [1] (proof of Theorem 3.1), yielding the estimate
$$ E\,|II_1| \le 7\|A\|_{HS}\Big(E\,\alpha_N(t) + \frac{1}{\sqrt N}\Big). $$
Let us turn to I. We have
$$ I = -\mathrm{tr}\big[\big(L(v_j(t, \gamma_{j,t}))\,\gamma_{j,t}\,L^*(v_j(t, \gamma_{j,t})) - \gamma_{j,t}\big)\Gamma_{N,t}\big]dN_t^j $$
$$ \quad - \sum_k \mathrm{tr}\big[\big(L_k(v_k(t, \Gamma_{N,t}^{(k)}))\,\Gamma_{N,t}\,L_k^*(v_k(t, \Gamma_{N,t}^{(k)})) - \Gamma_{N,t}\big)\gamma_{j,t}\big]dN_t^k $$
$$ \quad - \mathrm{tr}\big[\big(L_j(v_j(t, \Gamma_{N,t}^{(j)}))\,\Gamma_{N,t}\,L_j^*(v_j(t, \Gamma_{N,t}^{(j)})) - \Gamma_{N,t}\big)\big(L_j(v_j(t, \gamma_{j,t}))\,\gamma_{j,t}\,L_j^*(v_j(t, \gamma_{j,t})) - \gamma_{j,t}\big)\big]dN_t^j. $$
Since $\gamma_{j,t}$ and $L_k$ with $k \ne j$ commute, it follows that all the terms with $k \ne j$ cancel. Taking into account another cancellation (arising from the unitarity of $L_j$) we obtain
$$ I = \mathrm{tr}\big[\Gamma_{N,t}\,\gamma_{j,t} - L_j(v_j(t, \Gamma_{N,t}^{(j)}))\,\Gamma_{N,t}\,L_j^*(v_j(t, \Gamma_{N,t}^{(j)}))\,L_j(v_j(t, \gamma_{j,t}))\,\gamma_{j,t}\,L_j^*(v_j(t, \gamma_{j,t}))\big]\,dN_t^j. $$
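The two cancellations used here are easy to confirm numerically. The following sketch (added for illustration, with random data) checks, for two qubits and j = 1, that a term with k ≠ j vanishes and that the three j-terms of I collapse to the single trace displayed above.

```python
import numpy as np

rng = np.random.default_rng(4)

def rand_unitary(dim):
    z = rng.normal(size=(dim, dim)) + 1j * rng.normal(size=(dim, dim))
    q, _ = np.linalg.qr(z)
    return q

def rand_state(dim):
    v = rng.normal(size=dim) + 1j * rng.normal(size=dim)
    v /= np.linalg.norm(v)
    return np.outer(v, v.conj())

I2 = np.eye(2)
L_gam, L_Gam = rand_unitary(2), rand_unitary(2)     # stand for L(v(t,gamma)) and L(v(t,Gamma^(1)))
gamma1 = np.kron(rand_state(2), I2)                 # gamma_{1,t} viewed as acting on the 1st variable
Gamma = rand_state(4)                               # a two-particle density matrix
L1g, L1G = np.kron(L_gam, I2), np.kron(L_Gam, I2)   # L acting on the 1st variable
L2 = np.kron(I2, rand_unitary(2))                   # L acting on the 2nd variable

# A term with k != j vanishes by commutation and unitarity.
print(np.isclose(np.trace((L2 @ Gamma @ L2.conj().T - Gamma) @ gamma1), 0))

# The three j-terms of I equal the single displayed trace.
lhs = (-np.trace((L1g @ gamma1 @ L1g.conj().T - gamma1) @ Gamma)
       - np.trace((L1G @ Gamma @ L1G.conj().T - Gamma) @ gamma1)
       - np.trace((L1G @ Gamma @ L1G.conj().T - Gamma) @ (L1g @ gamma1 @ L1g.conj().T - gamma1)))
rhs = np.trace(Gamma @ gamma1 - L1G @ Gamma @ L1G.conj().T @ L1g @ gamma1 @ L1g.conj().T)
print(np.isclose(lhs, rhs))
```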
If the $L_j$ were constant, the expression for $I$ above would vanish. In the present controlled version some work is required. First of all, writing $\gamma_{j,t} = 1 - q_{j,t}$ we obtain
$$ I = -C_{j,t}\,dN_t^j = -C_{j,t}\,(dM_t^j + dt) $$
with
$$ C_{j,t} = \mathrm{tr}\big[\Gamma_{N,t}\,q_{j,t} - L_j(v_j(t, \Gamma_{N,t}^{(j)}))\,\Gamma_{N,t}\,L_j^*(v_j(t, \Gamma_{N,t}^{(j)}))\,L_j(v_j(t, \gamma_{j,t}))\,q_{j,t}\,L_j^*(v_j(t, \gamma_{j,t}))\big]. $$
To make the calculations more transparent, let us omit indices at v , γ , q , Γ . Thus
$$ C_{j,t} = \mathrm{tr}\big[\Gamma q - L(v(t, \Gamma_{N,t}^{(j)}))\,\Gamma\,L^*(v(t, \Gamma_{N,t}^{(j)}))\,L(v(t, \gamma))\,q\,L^*(v(t, \gamma))\big] $$
$$ = \mathrm{tr}\big[L(v(t, \gamma))\,\Gamma\,L^*(v(t, \gamma))\,L(v(t, \gamma))\,q\,L^*(v(t, \gamma))\big] - \mathrm{tr}\big[L(v(t, \Gamma_{N,t}^{(j)}))\,\Gamma\,L^*(v(t, \Gamma_{N,t}^{(j)}))\,L(v(t, \gamma))\,q\,L^*(v(t, \gamma))\big] $$
$$ = \mathrm{tr}\big[\big(L(v(t, \gamma))\,\Gamma\,L^*(v(t, \gamma)) - L(v(t, \Gamma_{N,t}^{(j)}))\,\Gamma\,L^*(v(t, \Gamma_{N,t}^{(j)}))\big)\,L(v(t, \gamma))\,q\,L^*(v(t, \gamma))\big] = C_{j,t}^1 + C_{j,t}^2, $$
where
$$ C_{j,t}^1 = \mathrm{tr}\big[\big(L(v(t, \gamma)) - L(v(t, \Gamma_{N,t}^{(j)}))\big)\,\Gamma\,L^*(v(t, \gamma))\,L(v(t, \gamma))\,q\,L^*(v(t, \gamma))\big] = \mathrm{tr}\big[\big(L(v(t, \gamma)) - L(v(t, \Gamma_{N,t}^{(j)}))\big)\,\Gamma\,q\,L^*(v(t, \gamma))\big] = \mathrm{tr}\big[\Gamma\,q\,L^*(v(t, \gamma))\,\big(L(v(t, \gamma)) - L(v(t, \Gamma_{N,t}^{(j)}))\big)\big], $$
$$ C_{j,t}^2 = \mathrm{tr}\big[L(v(t, \Gamma_{N,t}^{(j)}))\,\Gamma\,\big(L^*(v(t, \gamma)) - L^*(v(t, \Gamma_{N,t}^{(j)}))\big)\,L(v(t, \gamma))\,q\,L^*(v(t, \gamma))\big]. $$
We can now estimate $C_{j,t}^1$ in the same way as $II_2$ above, yielding
$$ |C_{j,t}^1| = \big|\big(q\Psi_{N,t},\, L^*(v(t, \gamma))\big(L(v(t, \gamma)) - L(v(t, \Gamma_{N,t}^{(j)}))\big)\Psi_{N,t}\big)\big| \le \|q\Psi_{N,t}\|\,\big\|L(v(t, \gamma)) - L(v(t, \Gamma_{N,t}^{(j)}))\big\| \le \sqrt{\alpha_{N,j}(t)}\,\varkappa_L\,\varkappa\,\mathrm{tr}\,\big|\gamma - \Gamma_{N,t}^{(j)}\big| \le 2\sqrt 2\,\varkappa_L\,\varkappa\,\alpha_{N,j}(t). $$
With $C_{j,t}^2$ yet another add-and-subtract manipulation is required. Namely,
$$ C_{j,t}^2 = \mathrm{tr}\big[\big(L^*(v(t, \gamma)) - L^*(v(t, \Gamma_{N,t}^{(j)}))\big)\,L(v(t, \gamma))\,q\,L^*(v(t, \gamma))\,L(v(t, \Gamma_{N,t}^{(j)}))\,\Gamma\big] = C_{j,t}^{21} + C_{j,t}^{22} $$
with
$$ C_{j,t}^{21} = \mathrm{tr}\big[\big(L^*(v(t, \gamma)) - L^*(v(t, \Gamma_{N,t}^{(j)}))\big)\,L(v(t, \gamma))\,q\,L^*(v(t, \gamma))\,L(v(t, \gamma))\,\Gamma\big] = \mathrm{tr}\big[\big(L^*(v(t, \gamma)) - L^*(v(t, \Gamma_{N,t}^{(j)}))\big)\,L(v(t, \gamma))\,q\,\Gamma\big], $$
$$ C_{j,t}^{22} = \mathrm{tr}\big[\big(L^*(v(t, \gamma)) - L^*(v(t, \Gamma_{N,t}^{(j)}))\big)\,L(v(t, \gamma))\,q\,L^*(v(t, \gamma))\,\big(L(v(t, \Gamma_{N,t}^{(j)})) - L(v(t, \gamma))\big)\,\Gamma\big]. $$
The first term is estimated as above yielding
$$ |C_{j,t}^{21}| \le 2\sqrt 2\,\varkappa_L\,\varkappa\,\alpha_{N,j}(t). $$
And the second one is estimated as
$$ |C_{j,t}^{22}| \le \varkappa_L^2\,\big|v(t, \Gamma_{N,t}^{(j)}) - v(t, \gamma)\big|^2 \le 8\,\varkappa_L^2\,\varkappa^2\,\alpha_{N,j}(t). $$
Thus
$$ d\alpha_{N,j}(t) = II_1 + (B_{j,t} - C_{j,t})\,dt - C_{j,t}\,dM_t^j. $$
Therefore, since M t j is a martingale and its differential does not contribute to the expectation, it follows that
$$ d\,E\,\alpha_N(t) \le 7\|A\|_{HS}\Big(E\,\alpha_N(t) + \frac{1}{\sqrt N}\Big)dt + \big(4\sqrt 2\,\varkappa\,\|\hat H\| + 8\sqrt 2\,\varkappa_L\,\varkappa + 8\,\varkappa_L^2\,\varkappa^2\big)\,E\,\alpha_N(t)\,dt. $$
Applying Gronwall’s inequality yields (30). □

5. Quantum MFG

Let us consider the quantum dynamic game of N players, where the dynamics of the density matrix Γ N , t is given by the controlled dynamics of type (20):
$$ d\Gamma_{N,t} = -i\sum_j\big[H_j + u_j(t, \Gamma_{N,t}^{(j)})\hat H_j,\; \Gamma_{N,t}\big]dt - \frac{i}{N}\sum_{l<j}^N\big[A_{lj}, \Gamma_{N,t}\big]dt + \sum_j\big(L_j(v_j(t, \Gamma_{N,t}^{(j)}))\,\Gamma_{N,t}\,L_j^*(v_j(t, \Gamma_{N,t}^{(j)})) - \Gamma_{N,t}\big)dN_t^j. $$
Assume, as above, that the controls $u_j$ and $v_j$ of each $j$th player can be chosen from some bounded closed intervals $[-U, U]$ and $[-V, V]$ respectively, that the initial matrix is the product of i.i.d. states,
$$ \Gamma_{N,0}(x_1, \ldots, x_N;\, y_1, \ldots, y_N) = \prod_{j=1}^N \psi_j(x_j)\,\overline{\psi_j(y_j)}, $$
and that the payoff of each player on the interval [ t , T ] is given by the expression
$$ P_j(t, W; u(\cdot)) = \int_t^T \Big(\mathrm{tr}(J_j\,\Gamma_{N,s}) - \frac{c}{2}\,u_j^2(s)\Big)ds + \mathrm{tr}(F_j\,\Gamma_{N,T}), $$
where $J$ and $F$ are some operators in $L^2(X)$ expressing the current and the terminal costs of an agent, $J_j$ and $F_j$ denote their actions on the $j$th variable, and the constant $c \ge 0$ measures the cost of applying the control $u$.
Remark 1. 
(i) We choose the simplest payoff function. Of course, a more general dependence on $u$, $v$ is possible; as long as the payoff is convex in $u$ and $v$, the results below remain valid. (ii) Everything also remains in force if only $H$ or only $L$ is controlled, that is, if either $u$ or $v$ is absent from all formulas.
Notice that by the property of the partial trace, the payoff (34) rewrites as
$$ P_j(t, W; u(\cdot)) = \int_t^T \Big(\mathrm{tr}(J_j\,\Gamma_{N,s}^{(j)}) - \frac{c}{2}\,u_j^2(s)\Big)ds + \mathrm{tr}(F_j\,\Gamma_{N,T}^{(j)}), $$
so that it really depends explicitly only on the individual partial traces Γ N , t ( j ) , which can be considered as quantum analogs of the positions of classical particles.
Let us stress again that, after all equations arising from physics are written, our quantum dynamic N-player game can be formulated in fully classical terms. Namely, the goal of each $j$th player is to maximise the expectation of the payoff (35) under the evolution (33), which depends on all controls $u = (u_j)$. The information available to the $j$th player is the 'position' of the $j$th player, that is, the partial trace $\Gamma_{N,t}^{(j)}$; thus the actions of the $j$th player are chosen among the feedback strategies $u_j$, which are measurable functions $u_j(t, \Gamma_{N,t}^{(j)})$. An additional technical assumption used in the analysis below is that the class of feedback strategies is reduced to Lipschitz continuous functions of the partial traces. Therefore both the information setting and the technical assumptions are slightly different from the simpler setting of the two-player game of Section 3, where the players were assumed to define their strategies on the basis of the whole state (not a partial trace). The restriction to partial traces is necessary to uncouple the dynamics in the limit $N \to \infty$.
The limiting evolution of each player can be expected to be described by the equations
$$ d\gamma_{j,t} = -i\big[H + u_j(t, \gamma_{j,t})\hat H,\; \gamma_{j,t}\big]dt - i\big[A^{\bar\eta_t}, \gamma_{j,t}\big]dt + \big(L(v_j(t, \gamma_{j,t}))\,\gamma_{j,t}\,L^*(v_j(t, \gamma_{j,t})) - \gamma_{j,t}\big)dN_t^j, $$
with
$$ \eta_t(x, y) = \lim_{N\to\infty}\frac{1}{N}\sum_{j=1}^N \gamma_{j,t}(x, y), $$
and with payoffs given by
$$ P_j(t, W; u(\cdot)) = \int_t^T \Big(\mathrm{tr}(J\,\gamma_{j,s}) - \frac{c}{2}\,u_j^2(s)\Big)ds + \mathrm{tr}(F\,\gamma_{j,T}). $$
For pure states $\gamma_{j,t} = \psi_{j,t}\otimes\bar\psi_{j,t}$ this payoff turns into
$$ P_j(t, W; u(\cdot)) = \int_t^T \Big((\psi_{j,s}, J\,\psi_{j,s}) - \frac{c}{2}\,u_j^2(s)\Big)ds + (\psi_{j,T}, F\,\psi_{j,T}). $$
Let us say that the pair of functions $(u, v)_t^{MFG}(\gamma) = (u, v)^{MFG}(t, \gamma)$, with $t \in [0, T]$ and $\gamma$ from the set of density matrices in $L^2(X)$, together with a function $\eta_t^{MFG}(x, y)$, $x, y \in X$, $t \in [0, T]$, solves the limiting MFG problem if (i) $(u, v)_t^{MFG}(\gamma)$ is an optimal feedback strategy for the stochastic control problem (36), (37) under the fixed flow $\eta_t = \eta_t^{MFG}$, and (ii) $\eta_t^{MFG}$ arises from the solution of (36) under the fixed control $(u, v)_t = (u, v)_t^{MFG}$.
Theorem 2.
Let the conditions on $H$, $L$, $A$ from Theorem 1 hold. Assume that the pair $(u, v)_t^{MFG}(\gamma)$ and $\eta_t^{MFG}(x, y)$ solves the limiting MFG problem and, moreover, that $u_t^{MFG}$ is Lipschitz in the sense of inequality (29). Then the strategies
$$ (u, v)_j(t, \Gamma_{N,t}) = (u, v)_t^{MFG}(\Gamma_{N,t}^{(j)}) $$
form a symmetric ϵ-Nash equilibrium for the $N$-agent quantum game described by (33) and (34), where the strategies of all players are sought among measurable controls $(u, v)(t, \gamma)$ that depend on $\gamma$ Lipschitz-continuously in the sense of inequality (29), with $\epsilon = C(T)N^{-1/4}$, where $C(T)$ depends on $\|A\|_{HS}$, $\|\hat H\|$, $\varkappa$, $\varkappa_L$.
Proof. 
Assume that all players, except for one of them, say the first, play according to the MFG strategy $(u, v)^{MFG}(t, \Gamma_{N,t}^{(j)})$, $j > 1$, while the first player follows some other strategy $(\tilde u, \tilde v)(t, \Gamma_{N,t}^{(1)})$. By the law of large numbers (which is not affected by a single deviation), all $\eta_t^j$ are equal and are given by the formula $\eta_t = E\,\gamma_{j,t}$ for all $j > 1$. Moreover, the $E\,\alpha_{N,j}(t) = E\,\alpha_N(t)$ are the same for all $j > 1$.
Following the proof of Theorem 1 we obtain
$$ d\alpha_{N,j}(t) = I + II_1 + II_2 $$
with the same $I$, $II_1$, $II_2$ as in the proof of Theorem 1, though with $(u, v)$ being $(u, v)^{MFG}(t, \Gamma_{N,t}^{(j)})$ for $j > 1$, and $(\tilde u, \tilde v)(t, \Gamma_{N,t}^{(1)})$ for $j = 1$. Looking first at $j > 1$ we note that, up to an additive correction of magnitude not exceeding $4\|A\|_{HS}/N$, the expression $II_1$ can be substituted by the expression
$$ i\,\mathrm{tr}\Big(\Big[\frac{1}{N}\sum_{m\ne j,1} A_{mj} - A_j^{\bar\eta_t},\; q_{j,t}\Big]\Gamma_{N,t}\Big), $$
which is then dealt with exactly as in the proof of Theorem 1 (with $N - 1$ instead of $N$), yielding the same estimate (30) (with a corrected multiplier) for $E\,\alpha_N(t) = E\,\alpha_{N,j}(t)$, $j > 1$, that is
$$ E\,\alpha_N(t) \le \Big(\exp\big\{\big(7\|A\|_{HS} + 12\varkappa(\|\hat H\| + \varkappa_L + \varkappa_L^2\varkappa)\big)t\big\} - 1\Big)\frac{1}{\sqrt N}\big(1 + 4\|A\|_{HS}\big). $$
The same estimate is obtained for $E\,\alpha_{N,1}(t)$ (even without the correcting term $4\|A\|_{HS}$), yielding
$$ E\,\alpha_{N,j}(t) \le C(T)\,N^{-1/2} $$
for all $j$ and a constant $C(T)$ depending on $\|A\|_{HS}$, $\varkappa$, $\varkappa_L$, $\|\hat H\|$.
We can now compare the expected payoffs (35) received by the players in the N-player quantum game with the expected payoff (37) received in the limiting game. For each jth player the difference is bounded by
$$ E\int_t^T \big|\mathrm{tr}\big(J(\Gamma_{N,s}^{(j)} - \gamma_{j,s})\big)\big|\,ds + E\,\big|\mathrm{tr}\big(F(\Gamma_{N,T}^{(j)} - \gamma_{j,T})\big)\big|. $$
Since
$$ \big|\mathrm{tr}\big(J(\Gamma_{N,s}^{(j)} - \gamma_{j,s})\big)\big| \le \|J\|\,\mathrm{tr}\,\big|\Gamma_{N,s}^{(j)} - \gamma_{j,s}\big|, $$
and by (25),
$$ \mathrm{tr}\,\big|\Gamma_{N,s}^{(j)} - \gamma_{j,s}\big| \le 2\sqrt{2\,\alpha_{N,j}(s)}, $$
it follows that the expectation of the difference of the payoffs is bounded by
$$ 2\sqrt 2\,\big(\|J\|\,T + \|F\|\big)\sup_t E\sqrt{\alpha_{N,j}(t)} \le 2\sqrt 2\,\big(\|J\|\,T + \|F\|\big)\sup_t \sqrt{E\,\alpha_{N,j}(t)} \le \big(\|J\|\,T + \|F\|\big)\,C(T)\,N^{-1/4}, $$
with a constant $C(T)$ depending on $\|A\|_{HS}$, $\varkappa$, $\varkappa_L$, $\|\hat H\|$.
But by the assumption of the Theorem, ( u , v ) t M F G is the optimal choice for the limiting optimization problem. Hence the claim of the theorem follows. □

6. Discussion

The problem of proving existence or uniqueness of the solution of the limiting MFG on a manifold seems to be nontrivial. We suggest it as an interesting open problem.
Let us give a bit more detail for the simplest case of a two-dimensional Hilbert space (a qubit), as in Section 3.
When there is no control $v$ (that is, the operator $L$ is constant) and there is no free (uncontrolled) part of the Hamiltonian, the limiting Equation (21) simplifies to the equation (omitting the indices $j$ and $t$ for simplicity)
$$ d\psi = -i\big[u\hat H + A^{\bar\eta}\big]\psi\,dt + (L\psi - \psi)\,dN_t. $$
Moreover, $\psi$ has only two coordinates: $\psi = (\psi_0, \psi_1)$. Using Ito's rule as in Section 3, we find the equation for $w = \psi_1/\psi_0$:
$$ dw = i\big[u\big(w(\hat HW)_0 - (\hat HW)_1\big) + w(A^{\bar\eta}W)_0 - (A^{\bar\eta}W)_1\big]dt + \big((LW)_1 - w(LW)_0\big)dN_t, $$
where $W = (w_0, w_1) = (1, w)$.
The most common interaction operator between qubits is the operator describing the possible exchange of photons, $A = a_1^* a_2 + a_2^* a_1$, with the annihilation operators $a_1$ and $a_2$ of the two atoms. This interaction is given by the tensor $A(j, k; m, n)$ such that $A(1, 0; 0, 1) = A(0, 1; 1, 0) = 1$, with all other elements vanishing. Hence $A^\eta_{10} = \eta_{01}$ and $A^\eta_{01} = \eta_{10}$, with all other elements vanishing. Let us also take the simplest possible $L$: $L = \sigma_3$, the third Pauli matrix. Then Equation (42) simplifies to
$$ dw = i\big[u\big(w(\hat HW)_0 - (\hat HW)_1\big) + \bar\eta_{10}w^2 - \bar\eta_{01}\big]dt - 2w\,dN_t. $$
If $\hat H$ is diagonal with diagonal elements $h_0$, $h_1$, this turns into
$$ dw = i\big[u\,w(h_0 - h_1) + \bar\eta_{10}w^2 - \bar\eta_{01}\big]dt - 2w\,dN_t. $$
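For the forward component of the limiting system one can at least experiment numerically. The sketch below (added for illustration; the constant feedback u and all numerical values are arbitrary choices, not from the paper) propagates M independent copies of the pure-state dynamics behind this equation, replacing the expectation η_t = E γ_t by the empirical average over the copies.

```python
import numpy as np

rng = np.random.default_rng(8)

M, dt, T = 500, 1e-3, 2.0                        # copies, time step, horizon (all arbitrary)
u, h0, h1 = 0.5, 1.0, 0.0                        # constant control and H-hat = diag(h0, h1)
Hhat = np.diag([h0, h1]).astype(complex)
sigma3 = np.diag([1.0, -1.0]).astype(complex)

psi = np.tile(np.array([1.0, 1.0], dtype=complex) / np.sqrt(2), (M, 1))   # i.i.d. initial copies

for _ in range(int(T / dt)):
    eta = np.einsum('mi,mj->ij', psi, psi.conj()) / M     # empirical substitute for E gamma_t
    etabar = eta.conj()
    A_eta = np.array([[0.0, etabar[1, 0]], [etabar[0, 1], 0.0]])   # mean-field operator A^eta-bar
    G = u * Hhat + A_eta                                   # generator of the drift
    psi = psi - 1j * dt * (psi @ G.T)                      # drift step of the limiting equation
    jumps = rng.random(M) < dt                             # unit-intensity counting processes
    psi[jumps] = psi[jumps] @ sigma3.T                     # jump: psi -> sigma_3 psi
    psi /= np.linalg.norm(psi, axis=1, keepdims=True)      # remove the O(dt^2) Euler norm error

print(np.round(np.einsum('mi,mj->ij', psi, psi.conj()) / M, 3))   # empirical eta at time T
```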
In this simplest case, choosing $h_0 - h_1 = 1$ and $c = 0$ in the payoff (38), we obtain the HJB equation for the individual control in the form
$$ \frac{\partial S}{\partial t} + \max_u\, u\Big(x\frac{\partial S}{\partial y} - y\frac{\partial S}{\partial x}\Big) + \frac{(W, JW)}{1 + |w|^2} + \frac{\partial S}{\partial y}\,\mathrm{Re}\big(\bar\eta_{10}w^2 - \bar\eta_{01}\big) - \frac{\partial S}{\partial x}\,\mathrm{Im}\big(\bar\eta_{10}w^2 - \bar\eta_{01}\big) + \big(S(-x, -y) - S(x, y)\big) = 0. $$
Already this equation on the complex plane $\mathbb{C}$, describing the optimal individual quantum feedback control of a qubit, is quite nonstandard. To deal with the corresponding forward-backward system one needs not only its well-posedness in some generalized sense, but also some continuous dependence on parameters. Maybe some methods from [43] or [28] can be used to gain insight into this problem.
As a future research direction it is worth mentioning the general development of the theory of the limiting classical mean-field games, which are mean-field games on infinite-dimensional curvilinear manifolds based on Markov processes with jumps; these are highly fascinating and nontrivial objects. Of course, the usual questions of classical mean-field games concerning the connection between stationary and time-dependent solutions are fully open here, as is the theory of the corresponding master equation. On the other hand, quantum dynamic games with a finite number of players (touched upon in Section 3) lead to new nonlinear functional-differential equations of Hamilton-Jacobi or Isaacs type on manifolds, which are also worthy of proper analysis.

Funding

The author gratefully acknowledges the funding by the Russian Academic Excellence project ‘5–100’.

Conflicts of Interest

The author declares no conflict of interest.

References

  1. Kolokoltsov, V.N. Quantum Mean Field Games. arXiv 2020, arXiv:2005.02350.
  2. Huang, M.; Malhamé, R.; Caines, P. Large population stochastic dynamic games: Closed-loop McKean-Vlasov systems and the Nash certainty equivalence principle. Commun. Inf. Syst. 2006, 6, 221–252.
  3. Lasry, J.-M.; Lions, P.-L. Jeux à champ moyen, I. Le cas stationnaire. C. R. Math. Acad. Sci. Paris 2006, 343, 619–625.
  4. Bensoussan, A.; Frehse, J.; Yam, P. Mean Field Games and Mean Field Type Control Theory; Springer: Berlin/Heidelberg, Germany, 2013.
  5. Carmona, R.; Delarue, F. Probabilistic Theory of Mean Field Games with Applications, Vols. I, II; Probability Theory and Stochastic Modelling; Springer: Berlin/Heidelberg, Germany, 2018; Volumes 83, 84.
  6. Gomes, D.A.; Pimentel, E.A.; Voskanyan, V. Regularity Theory for Mean-Field Game Systems; Springer: Berlin/Heidelberg, Germany, 2016.
  7. Kolokoltsov, V.N.; Malafeyev, O.A. Many Agent Games in Socio-Economic Systems: Corruption, Inspection, Coalition Building, Network Growth, Security; Springer Series in Operations Research and Financial Engineering; Springer Nature: Berlin/Heidelberg, Germany, 2019.
  8. Meyer, D.A. Quantum strategies. Phys. Rev. Lett. 1999, 82, 1052–1055.
  9. Eisert, J.; Wilkens, M.; Lewenstein, M. Quantum Games and Quantum Strategies. Phys. Rev. Lett. 1999, 83, 3077–3080.
  10. Marinatto, L.; Weber, T. A quantum approach to static games of complete information. Phys. Lett. A 2000, 272, 291–303.
  11. Khan, F.S.; Solmeyer, N.; Balu, R.; Humble, T. Quantum games: A review of the history, current state, and interpretation. Quantum Inf. Process. 2018, 17, 309.
  12. Guo, H.; Zhang, J.; Koehler, G.J. A survey of quantum games. Decis. Support Syst. 2008, 46, 318–332.
  13. Kolokoltsov, V.N.; Malafeyev, O.A. Understanding Game Theory, 2nd ed.; World Scientific: Singapore, 2019.
  14. Iqbal, A.; Toor, A.H. Equilibria of Replicator Dynamics in Quantum Games. arXiv 2001, arXiv:quant-ph/0106135.
  15. Iqbal, A.; Toor, A.H. Quantum mechanics gives stability to a Nash equilibrium. Phys. Rev. A 2002, 65, 022306.
  16. Iqbal, A.; Toor, A.H. Darwinism in quantum systems? Phys. Lett. A 2002, 294, 261–270.
  17. Iqbal, A.; Toor, A.H. Quantum cooperative games. Phys. Lett. A 2002, 293, 103–108.
  18. Du, J.; Li, H.; Xu, X.; Shi, M.; Zhou, X.; Han, R. Nash equilibrium in the Quantum Battle of the Sexes Game. arXiv 2001, arXiv:quant-ph/0010050v3.
  19. Du, J.; Li, H.; Xu, X.; Zhou, X.; Han, R. Phase-transition-like Behavior of Quantum Games. arXiv 2003, arXiv:quant-ph/0111138v4.
  20. Du, J.; Li, H.; Xu, X.; Zhou, X.; Han, R. Entanglement Enhanced Multiplayer Quantum Games. Phys. Lett. A 2002, 302, 229–233.
  21. Li, H.; Du, J.; Massar, S. Continuous variable quantum games. Phys. Lett. A 2002, 306, 73–78.
  22. Pothos, E.; Busemeyer, J. Can quantum probability provide a new direction for cognitive modeling? Behav. Brain Sci. 2013, 36, 255–274.
  23. Khrennikov, A.Y. Ubiquitous Quantum Structure: From Psychology to Finance; Springer: Berlin/Heidelberg, Germany, 2010.
  24. Aoki, S.; Ikeda, K. Repeated Quantum Games and Strategic Efficiency. Available online: https://arxiv.org/abs/2005.05588 (accessed on 6 January 2021).
  25. Ikeda, K. Foundation of quantum optimal transport and applications. Quantum Inf. Process. 2020, 19, 25.
  26. Aoki, S.; Ikeda, K. Theory of Quantum Games and Quantum Economic Behavior. Available online: https://arxiv.org/abs/2010.14098 (accessed on 6 January 2021).
  27. Bouten, L.; Handel, R.V. On the separation principle of quantum control. arXiv 2006, arXiv:math-ph/0511021v2.
  28. Kolokoltsov, V.N. The stochastic Bellman equation as a nonlinear equation in Maslov spaces. Perturbation theory. Dokl. Akad. Nauk 1992, 323, 223–228.
  29. Belavkin, V.P. Nondemolition measurement and control in quantum dynamical systems. In Information Complexity and Control in Quantum Physics; Diner, S., Lochak, G., Eds.; CISM Courses and Lectures; Springer: Vienna, Austria, 1987; Volume 294, pp. 331–336.
  30. Belavkin, V.P. Nondemolition stochastic calculus in Fock space and nonlinear filtering and control in quantum systems. In Proceedings of the XXIV Karpacz Winter School Stochastic Methods in Mathematics and Physics, Karpacz, Poland, 13–27 January 1988; Guelerak, R., Karwowski, W., Eds.; World Scientific: Singapore, 1988; pp. 310–324.
  31. Belavkin, V.P. Quantum stochastic calculus and quantum nonlinear filtering. J. Multivar. Anal. 1992, 42, 171–201.
  32. Belavkin, V.P.; Kolokoltsov, V.N. Stochastic evolution as interaction representation of a boundary value problem for Dirac type equation. Infin. Dimens. Anal. Probab. Relat. Fields 2002, 5, 61–92.
  33. Pellegrini, C. Poisson and Diffusion Approximation of Stochastic Schrödinger Equations with Control. Ann. Henri Poincaré 2009, 10, 995–1025.
  34. Barchielli, A.; Belavkin, V.P. Measurements continuous in time and a posteriori states in quantum mechanics. J. Phys. A Math. Gen. 1991, 24, 1495–1514.
  35. Holevo, A.S. Statistical Inference for quantum processes. In Quantum Aspects of Optical Communications; Springer: Berlin, Germany, 1991; Volume 378, pp. 127–137.
  36. Kolokoltsov, V.N. Continuous time random walks modeling of quantum measurement and fractional equations of quantum stochastic filtering and control. arXiv 2020, arXiv:2008.07355.
  37. Armen, M.A.; Au, J.K.; Stockton, J.K.; Doherty, A.C.; Mabuchi, H. Adaptive homodyne measurement of optical phase. Phys. Rev. Lett. 2002, 89, 133602.
  38. Bushev, P.; Rotter, D.; Wilson, A.; Dubin, F.; Becher, C.; Eschner, J.; Blatt, R.; Steixner, V.; Rabl, P.; Zoller, P. Feedback cooling of a single trapped ion. Phys. Rev. Lett. 2006, 96, 043003.
  39. Wiseman, H.M.; Milburn, G.J. Quantum Measurement and Control; Cambridge University Press: Cambridge, UK, 2010.
  40. Bouten, L.; Handel, R.V.; James, M. An introduction to quantum filtering. SIAM J. Control Optim. 2007, 46, 2199–2241.
  41. Pickl, P. A simple derivation of mean-field limits for quantum systems. Lett. Math. Phys. 2011, 97, 151–164.
  42. Knowles, A.; Pickl, P. Mean-field dynamics: Singular potentials and rate of convergence. Commun. Math. Phys. 2010, 298, 101–138.
  43. Averboukh, Y. Viability analysis of the first-order mean field games. ESAIM Control Optim. Calc. Var. 2020, 26, 33.
Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.
