
Strategic Mitigation of Agent Inattention in Drivers with Open-Quantum Cognition Models

Qizi Zhang1, Venkata Sriram Siddhardh Nadendla2, S. N. Balakrishnan1, Jerome Busemeyer3

1Dept. of Mechanical and Aerospace Engineering and 2Dept. of Computer Science
Missouri Univ. of Science and Technology
Rolla, Missouri 65401.
Email: {qzwtb, nadendla, bala}@mst.edu
3Dept. of Psychological and Brain Sciences,
Indiana University Bloomington
Bloomington, IN 47405.
Email: jbusemey@indiana.edu
Abstract

State-of-the-art driver-assist systems have failed to effectively mitigate driver inattention and have had minimal impact on the ever-growing number of road mishaps (e.g. loss of life and physical injuries in accidents caused by the various factors that lead to driver inattention). This is because traditional human-machine interaction settings are modeled in classical and behavioral game-theoretic domains, which are technically appropriate for characterizing strategic interaction between either two utility-maximizing agents or human decision makers. Therefore, in an attempt to improve the persuasive effectiveness of driver-assist systems, we develop a novel strategic and personalized driver-assist system which adapts to the driver's mental state and choice behavior. First, we propose a novel equilibrium notion in human-system interaction games, where the system maximizes its expected utility and human decisions can be characterized using any general decision model. We then use this equilibrium notion to investigate the strategic driver-vehicle interaction game, in which the car presents a persuasive recommendation to steer the driver towards safer driving decisions. We assume that the driver employs an open-quantum-system cognition model, which captures complex aspects of human decision making such as violations of the classical law of total probability and the incompatibility of certain mental representations of information. We present closed-form expressions for the players' final responses to each other's strategies so that both pure and mixed equilibria can be computed numerically, and we present numerical results illustrating both kinds of equilibria.

Index Terms:
Game theory, open quantum system model, mixed-strategy equilibrium, quantum cognition.

I Introduction

Driver inattention is a dangerous phenomenon that can arise for various reasons: distraction, drowsiness due to fatigue, reduced reaction time due to speeding, and intoxication. The consequences of inattentive driving can severely affect the driver's safety even under normal road conditions, and can be devastating in terms of loss of life and/or long-lasting injuries. According to NHTSA's latest statistics [1], in 2019, in the United States alone, 3142 lives were claimed by distracted driving, 795 by drowsy driving, 9378 by speeding, and 10,142 by drunk driving. Therefore, several types of driver-assist systems have been developed and deployed in modern vehicles to mitigate inattentiveness. However, traditional driver-assist technologies are static and not personalized, and are thus insufficient for futuristic transportation systems with mostly connected and/or autonomous vehicles. For example, several deadly accidents have been reported where Tesla driving assistants were working normally but the drivers were inattentive [2, 3]. As per SAE standard J3016 [4], state-of-the-art vehicles mostly fall under Levels 2/3, which continue to demand significant driver attention (e.g. Tesla Autopilot [5]), especially in uncertain road and weather conditions. Therefore, there is a strong need for dynamic, data-driven driver-alert systems that present effective interventions in a strategic manner based on their estimates of the driver's attention level and physical condition.

However, the design of strategic interventions to mitigate the ill effects of driver inattention is quite challenging for three fundamental reasons. Firstly, the driver may not follow the vehicle's recommendations (i) if the driver is inattentive, (ii) if the driver does not trust the vehicle's recommendations, and/or (iii) if the recommendation signal is not accurate enough to steer the driver's choices (e.g. the driver may not stop the vehicle because of a false alarm). Secondly, the persuasive effectiveness of the vehicle's recommendations is technically difficult to evaluate due to its complex/unknown relationship with the driver's (i) attention level [6], (ii) own judgment/prior about road conditions [7], and (iii) trust in the vehicle's recommendation system [8]. In addition, it is difficult to mathematically model and estimate these three quantities [9, 10, 11]. Finally, there is strong evidence in the psychology literature that human decisions exhibit several anomalies relative to traditional decision theory. Examples include deviations from expected utility maximization, such as the Allais paradox [12], the Ellsberg paradox [13], and violations of transitivity and/or independence between alternatives [14]; and deviations from classical Kolmogorov probability theory, such as the conjunction fallacy [15], the disjunction fallacy [16], and violations of the sure-thing principle [17].

There have been a few relevant efforts in the recent literature where the driver and the driver-assist system interact in a game-theoretic setting. These efforts can be broadly classified into two types: (i) direct methods, where the system uses its on-board AI to directly control the vehicle, and (ii) indirect methods, where the system indirectly controls the vehicle by relying on the driver to make decisions. On the one hand, Flad et al. proposed a direct method that models driver steering motion as a sequence of motion primitives so that the aims and steering actions of the driver can be predicted and the optimal torque calculated [18]. Another direct method is proposed by Na and Cole in [19], where four different paradigms were investigated: (i) decentralized, (ii) non-cooperative Nash, (iii) non-cooperative Stackelberg, and (iv) cooperative Pareto, to determine the most effective way to model driver reactions in collision-avoidance systems. Although direct methods can mimic driver actions, they do not consider the driver's cognitive state (in terms of preferences, biases and attention), and no intervention was designed/implemented to mitigate inattention. On the other hand, indirect methods have bridged this gap by taking the driver's cognitive state into account. Lutes et al. modeled driver-vehicle interaction as a Bayesian Stackelberg game, where the on-board AI in the vehicle (leader) presents binary signals (no-alert/alert) based on which the driver (follower) makes a binary decision (continue/stop) regarding controlling the vehicle on a road [20]. The present work shares with [20] the same setting of an unknown road condition and binary actions for both players, and likewise introduces a non-negative exponent parameter in the driver's overall utility to capture his/her attention level.
The difference is that [20] still follows the traditional game-theoretic framework of payoff maximization, while this work extends that framework so that the players need not maximize payoffs. Schwarting et al. integrated Social Value Orientation (SVO) into autonomous-vehicle decision making. Their model quantifies the degree of an agent's selfishness or altruism in order to predict the social behavior of other drivers, and models interactions between agents as a best-response game wherein each agent negotiates to maximize its own utility [21]. However, all the human players in the games of the above works are still assumed to be rational utility maximizers, even though the utilities are modified to capture attention level or social behavior, whether by a non-negative exponent parameter or by SVO. The present work bridges this gap by directly modeling the driver as an agent who does not seek to maximize payoff, but instead makes decisions through a quantum-cognition based decision process.

Note that most of the past literature has addressed each of these challenges independently. The main contribution of this paper is that we address all three challenges jointly in our driver-vehicle interaction setting. In Section II, we propose a novel strategic driver-vehicle interaction framework in which all the aforementioned challenges are simultaneously addressed in a novel game-theoretic setting. We assume that the vehicle constructs recommendations so as to balance a prescribed trade-off between information accuracy and persuasive effectiveness. On the other hand, we model driver decisions using an open quantum cognition model that treats driver attention as a model parameter and incorporates the driver's prior about the road condition into the initial state. In Section III, we present a closed-form expression for the cognition matrix in the driver's open quantum cognition model. Given that the agents' rationalities are fundamentally different (the vehicle being a utility maximizer and the driver following an open quantum cognition model), we also propose a novel equilibrium notion, inspired by Nash equilibrium, and compute both pure and mixed equilibrium strategies for the proposed driver-vehicle interaction game in Sections IV and V, respectively. Finally, we analyze the impact of driver inattention on the equilibrium of the proposed game.

II Strategic Driver-Vehicle Interaction Model

In this section, we model the strategic interaction between a driver-assist system (car) and an inattentive driver as a one-shot Bayesian game. We assume that the physical conditions of the road are classified into two states, namely, safe (denoted as $S$) and dangerous (denoted as $D$). The vehicle can choose one of two signaling strategies based on its belief about the road state: alert the driver (denoted as $A$), or no-alert (denoted as $N$). Meanwhile, based on the driver's belief about the road state and his/her own mental state (which defines the driver's type), the driver chooses either to continue driving (denoted as $C$) or to stop the vehicle (denoted as $S$). Note that although the letter $S$ denotes both the road state being safe and the driver decision to stop, the reader can easily decipher its meaning from context.

Depending on the true road state, we assume that the vehicle (row player) and the driver (column player) obtain the utilities defined in Table I. When the road is dangerous, we expect the car to alert the driver. If the car does not alert, it receives a low payoff. Furthermore, we assume this low payoff depends on the driver's action: if the driver stops, the payoff is only slightly low because no damage or injury is incurred; if the driver continues to drive, the payoff is very low because damage or injury is incurred. When the road is safe, the correct action for the car is not to alert. If the car does not alert, it receives a high payoff, which again depends on the driver's action: if the driver stops, the payoff is only slightly high because the stop does not help the driver and an unnecessary stop is a waste of time and energy; if the driver continues to drive, the reward is very high because everything is fine.

            C                        S
 N   $a_{1,s}$, $a_{2,s}$     $b_{1,s}$, $b_{2,s}$
 A   $c_{1,s}$, $c_{2,s}$     $d_{1,s}$, $d_{2,s}$

            C                        S
 N   $a_{1,d}$, $a_{2,d}$     $b_{1,d}$, $b_{2,d}$
 A   $c_{1,d}$, $c_{2,d}$     $d_{1,d}$, $d_{2,d}$

TABLE I: Utilities of the car and the driver when the road is safe (top) and dangerous (bottom)

In this paper, we assume that neither the car nor the driver knows the true road state. The car relies on observations from its on-board sensors and other extrinsic information sources (e.g. nearby vehicles, road-side infrastructure), together with its on-board machine-learning algorithm for road judgment, to construct a belief $q\in[0,1]$ that the road state is safe, while the driver constructs a belief $p\in[0,1]$ that the road state is safe based on what he/she sees and his/her prior experiences. Furthermore, as in the case of a traditional decision-theoretic agent, we assume that the car seeks to maximize its expected payoff. If $p_C$ is the probability with which the driver chooses $C$, then the expected payoffs of the car for choosing $N$ and $A$ are respectively given by

U_N(p_C) = p_C\big[a_{1,s}q + (1-q)a_{1,d}\big] + (1-p_C)\big[b_{1,s}q + (1-q)b_{1,d}\big]   (1)

and

U_A(p_C) = p_C\big[c_{1,s}q + (1-q)c_{1,d}\big] + (1-p_C)\big[d_{1,s}q + (1-q)d_{1,d}\big].   (2)
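As a concrete illustration (a minimal sketch; the payoff values below are hypothetical placeholders, since the paper leaves the payoffs symbolic), Equations (1) and (2) can be evaluated directly:

```python
def expected_payoffs(p_C, q, pay):
    """Car's expected payoffs U_N and U_A from Equations (1) and (2).

    `pay` maps symbols such as 'a1s' (for a_{1,s}) to numbers; the
    values used below are illustrative, not taken from the paper."""
    U_N = (p_C * (pay['a1s'] * q + (1 - q) * pay['a1d'])
           + (1 - p_C) * (pay['b1s'] * q + (1 - q) * pay['b1d']))
    U_A = (p_C * (pay['c1s'] * q + (1 - q) * pay['c1d'])
           + (1 - p_C) * (pay['d1s'] * q + (1 - q) * pay['d1d']))
    return U_N, U_A

# Hypothetical payoffs: no-alert pays off on safe roads, alert on dangerous ones.
pay = {'a1s': 10, 'b1s': 4, 'c1s': 2, 'd1s': 3,
       'a1d': -10, 'b1d': -1, 'c1d': 1, 'd1d': 5}
U_N, U_A = expected_payoffs(p_C=0.7, q=0.9, pay=pay)
```

With a car that is confident the road is safe ($q=0.9$) and a driver likely to continue, $U_N > U_A$ under these placeholder payoffs, so the car's best response is $N$.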

The calculation of $p_C$ is complicated by the fact that the driver exhibits bounded rationality. Fortunately, this bounded rationality can be characterized by the open quantum system cognition model, as described below.

II-A Driver’s Open Quantum Cognition Model

In this subsection, we present the basic elements of the open quantum system cognition model [22] and how it is applied to model driver behavior. The cognitive state of the agent is described by a mixed state or density matrix $\rho$, which is a statistical mixture of pure states. Formally, it is a Hermitian, non-negative operator whose trace is equal to one. Under the Markov assumption (the evolution $\mathcal{E}$ can be factorized as $\mathcal{E}_{t_2,t_0}=\mathcal{E}_{t_2,t_1}\mathcal{E}_{t_1,t_0}$ for a sequence of instants $t_0$, $t_1$, $t_2$), one can find the most general form of this time evolution from a time-local master equation $d\rho/dt=\mathcal{L}[\rho]$, with $\mathcal{L}$ a differential superoperator (it acts on operators) called the Lindbladian, which is defined as follows.

Definition 1 ([23]).

The Lindblad-Kossakowski equation for any open quantum system is defined as

\frac{d\rho}{dt} = -i(1-\alpha)[H,\rho] + \alpha\sum_{m,n}\gamma_{m,n}\Big[L_{m,n}\rho L^{\dagger}_{m,n} - \frac{1}{2}\big\{L^{\dagger}_{m,n}L_{m,n},\,\rho\big\}\Big],   (3)

where

  • $H$ is the Hamiltonian of the system,

  • $[H,\rho]=H\rho-\rho H$ is the commutator of the Hamiltonian $H$ and the density operator $\rho$,

  • $\gamma_{m,n}$ is the $(m,n)$-th entry of some positive semidefinite matrix (denoted as $C$),

  • $\{L_{m,n}\}$ is a set of linear operators,

  • $\big\{L^{\dagger}_{m,n}L_{m,n},\,\rho\big\}=L^{\dagger}_{m,n}L_{m,n}\rho+\rho L^{\dagger}_{m,n}L_{m,n}$ denotes the anticommutator. The superscript $\dagger$ represents the adjoint (transpose and complex conjugate) operation.

In this paper, we set

L_{m,n}=\ket{m}\bra{n}   (4)

as defined in [22], where, for any $m$, $\ket{m}$ is a column vector whose $m$-th entry is 1 and whose other entries are 0. Note that $\bra{n}$ is obtained by transposing $\ket{n}$ and taking its complex conjugate; thus $\bra{n}$ is a row vector whose $n$-th entry is 1 and whose other entries are 0.

The second term on the right side of Equation (3) is the dissipative term responsible for the irreversibility of the decision-making process [22]. It is weighted by the coefficient $\alpha\in[0,1]$, which interpolates between the von Neumann evolution ($\alpha=0$) and completely dissipative dynamics ($\alpha=1$). Furthermore, $\gamma_{m,n}$ is the $(m,n)$-th entry of the cognitive matrix $C(\lambda,\phi)$. This cognitive matrix is formalized as a linear combination of two matrices $\Pi(\lambda)$ and $B$, associated with the profitability comparison between alternatives and the formation of beliefs, respectively [22]:

C(\lambda,\phi) = \begin{bmatrix}\gamma_{1,1}&\cdots&\gamma_{1,N}\\ \vdots&\ddots&\vdots\\ \gamma_{N,1}&\cdots&\gamma_{N,N}\end{bmatrix} = (1-\phi)\,\Pi^{T}(\lambda) + \phi\,B^{T},   (5)

where $\phi\in[0,1]$ is a parameter assessing the relevance of belief formation during the decision-making process; $\Pi(\lambda)$ is the transition matrix whose $(i,j)$-th entry $\pi_{ij}(\omega_l)$ is the probability that the decision maker switches from strategy $s_i$ to $s_j$ for a given state of the world $\omega_l$; and the matrix $B$ allows the driver to introduce a change of belief about the state of the world into the cognitive process, by jumping from the connected component associated with one state of the world $\omega_k\in\Omega$ to the connected component associated with another $\omega_l\in\Omega$ while keeping the action $s_i$ fixed. The superscript $T$ denotes the matrix transpose. Finally, the dimension $N$ of the square matrix $C(\lambda,\phi)$ can be inferred from the detailed discussion below.

\Pi(\lambda) = \begin{bmatrix}\mu(\lambda)&1-\mu(\lambda)&0&0&0&0&0&0\\ \mu(\lambda)&1-\mu(\lambda)&0&0&0&0&0&0\\ 0&0&\nu(\lambda)&1-\nu(\lambda)&0&0&0&0\\ 0&0&\nu(\lambda)&1-\nu(\lambda)&0&0&0&0\\ 0&0&0&0&\xi(\lambda)&1-\xi(\lambda)&0&0\\ 0&0&0&0&\xi(\lambda)&1-\xi(\lambda)&0&0\\ 0&0&0&0&0&0&o(\lambda)&1-o(\lambda)\\ 0&0&0&0&0&0&o(\lambda)&1-o(\lambda)\end{bmatrix}.   (6)

For the driver, the world state primarily consists of two components: (i) the road condition, and (ii) the car's action; i.e., the set of world states of the driver is $\Omega=\{SN, SA, DN, DA\}$, where the first letter represents the road condition and the second the car's action. The utilities of the driver for choosing a strategy in each world state are as follows:

u(C|SN)=a_{2,s}, \quad u(S|SN)=b_{2,s},
u(C|SA)=c_{2,s}, \quad u(S|SA)=d_{2,s},
u(C|DN)=a_{2,d}, \quad u(S|DN)=b_{2,d},
u(C|DA)=c_{2,d}, \quad u(S|DA)=d_{2,d}.

We choose the basis of the road-car-driver system spanning the space of states to be

\{\ket{e_1},\ket{e_2},\ket{e_3},\ket{e_4},\ket{e_5},\ket{e_6},\ket{e_7},\ket{e_8}\} = \{\ket{SNC},\ket{SNS},\ket{SAC},\ket{SAS},\ket{DNC},\ket{DNS},\ket{DAC},\ket{DAS}\}.   (7)

Next we define the transition matrix $\Pi(\lambda)$. If $u(s_i|\omega_l)$ is the utility the decision maker obtains by choosing strategy $s_i$ in world state $\omega_l$, the probability that the decision maker switches to strategy $s_i$ at time step $k+1$ from strategy $s_j$ at time step $k$ is given, in the spirit of Luce's choice axiom [24, 25, 26], by

\pi_{(s_j\rightarrow s_i|\omega_l)} = P(s_i|s_j,\omega_l) = \frac{u(s_i|\omega_l)^{\lambda}}{\sum_{k=1}^{N_S} u(s_k|\omega_l)^{\lambda}},   (8)

where the exponent $\lambda\geq 0$ measures the decision maker's ability to discriminate the profitability of the different options. When $\lambda=0$, each strategy $s_i\in S$ has the same probability of being chosen ($1/N_S$), and when $\lambda\rightarrow\infty$ only the dominant alternative is chosen. This formulation of $P(s_i|s_j,\omega_l)$ has two implications: (1) $u(s_i|\omega_l)\geq 0$, to avoid a negative $P(s_i|s_j,\omega_l)$; (2) $P(s_i|s_j,\omega_l)$ depends only on the destination $s_i$ and not on the starting point $s_j$.

Below are the probabilities needed for the $\Pi$ matrix:

\mu(\lambda)=\frac{a^{\lambda}_{2,s}}{a^{\lambda}_{2,s}+b^{\lambda}_{2,s}}, \quad \nu(\lambda)=\frac{c^{\lambda}_{2,s}}{c^{\lambda}_{2,s}+d^{\lambda}_{2,s}}, \quad \xi(\lambda)=\frac{a^{\lambda}_{2,d}}{a^{\lambda}_{2,d}+b^{\lambda}_{2,d}}, \quad o(\lambda)=\frac{c^{\lambda}_{2,d}}{c^{\lambda}_{2,d}+d^{\lambda}_{2,d}},

where

  • $\mu(\lambda)$ is the probability that the driver picks $C$ when he/she assumes that the road state is $S$ and the car chooses $N$,

  • $\nu(\lambda)$ is the probability that the driver picks $C$ when he/she assumes that the road state is $S$ and the car chooses $A$,

  • $\xi(\lambda)$ is the probability that the driver picks $C$ when he/she assumes that the road state is $D$ and the car chooses $N$,

  • $o(\lambda)$ is the probability that the driver picks $C$ when he/she assumes that the road state is $D$ and the car chooses $A$.

Equation (6) puts all the terms together in matrix form and shows the physical meaning of the row and column labels of $\Pi(\lambda)$.
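As a sketch (with hypothetical nonnegative utility values, since the paper leaves $a_{2,s},\dots,d_{2,d}$ symbolic), the Luce rule of Equation (8) yields $\mu$, $\nu$, $\xi$, $o$, and the block structure of Equation (6) assembles them into $\Pi(\lambda)$:

```python
import numpy as np

def luce(u_c, u_s, lam):
    """Luce choice rule (Equation (8)): probability of picking C over S,
    given nonnegative utilities and discrimination parameter lam >= 0."""
    return u_c**lam / (u_c**lam + u_s**lam)

# Hypothetical driver utilities for (C, S) in each world state.
a2s, b2s = 8.0, 2.0   # SN: safe road, no alert
c2s, d2s = 4.0, 6.0   # SA: safe road, alert
a2d, b2d = 1.0, 7.0   # DN: dangerous road, no alert
c2d, d2d = 1.0, 9.0   # DA: dangerous road, alert

lam = 2.0
mu, nu = luce(a2s, b2s, lam), luce(c2s, d2s, lam)
xi, o = luce(a2d, b2d, lam), luce(c2d, d2d, lam)

# Equation (6): one 2x2 block per world state, with identical rows,
# so the next choice depends only on the destination strategy.
Pi = np.zeros((8, 8))
for i, prob in enumerate((mu, nu, xi, o)):
    Pi[2*i:2*i+2, 2*i:2*i+2] = [[prob, 1 - prob]] * 2
```

Each row of $\Pi(\lambda)$ sums to one; $\lambda=0$ reduces every choice to a coin flip, while large $\lambda$ selects the dominant option.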

The $H$ matrix in Equation (3) is set as in [22]: wherever an element of $\Pi(\lambda)$ is nonzero, the element of $H$ in the same position is 1; otherwise it is 0. Thus, the $H$ matrix is

H = \begin{bmatrix}1&1&0&0&0&0&0&0\\ 1&1&0&0&0&0&0&0\\ 0&0&1&1&0&0&0&0\\ 0&0&1&1&0&0&0&0\\ 0&0&0&0&1&1&0&0\\ 0&0&0&0&1&1&0&0\\ 0&0&0&0&0&0&1&1\\ 0&0&0&0&0&0&1&1\end{bmatrix} = \begin{bmatrix}1&1\\ 1&1\end{bmatrix}\oplus\begin{bmatrix}1&1\\ 1&1\end{bmatrix}\oplus\begin{bmatrix}1&1\\ 1&1\end{bmatrix}\oplus\begin{bmatrix}1&1\\ 1&1\end{bmatrix}.   (9)

In this paper, we set $\phi=0$ for two reasons: (1) since the world state of the driver is mainly the action of the car, and the car's action is known when calculating the equilibrium, the driver does not need to form such a belief; (2) we are considering a one-shot game and can assume the road condition does not change within one game, i.e., we only consider short-time dynamics. The $B$ matrix is therefore zeroed out and its content is not described here. Thus $C=\Pi^{T}(\lambda)$ by Equation (5), and we set $\gamma_{m,n}=C_{m,n}$ in Equation (3).
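Putting the pieces together, the driver's response to a given car action can be obtained by integrating Equation (3) to its steady state. The sketch below uses a crude forward-Euler scheme in plain NumPy; the Luce probabilities, the attention parameter $\alpha$, and the driver prior $p$ are hypothetical placeholders, and the car is assumed to play $N$ (so only the $SN$ and $DN$ components are populated initially). The probability of continuing is read off the diagonal of $\rho$ on the $C$ basis states.

```python
import numpy as np

def lindblad_rhs(rho, H, C, alpha):
    """Right-hand side of Equation (3) with L_{m,n} = |m><n|."""
    N = rho.shape[0]
    drho = -1j * (1 - alpha) * (H @ rho - rho @ H)
    for m in range(N):
        for n in range(N):
            L = np.zeros((N, N)); L[m, n] = 1.0
            LdL = L.T @ L                        # L is real, so L-dagger = L^T
            drho = drho + alpha * C[m, n] * (
                L @ rho @ L.T - 0.5 * (LdL @ rho + rho @ LdL))
    return drho

# H from Equation (9): four all-ones 2x2 blocks on the diagonal.
H = np.kron(np.eye(4), np.ones((2, 2)))

# Cognitive matrix for phi = 0: C = Pi^T(lambda) from Equation (5),
# built from placeholder Luce probabilities (mu, nu, xi, o).
mu, nu, xi, o = 0.9, 0.4, 0.2, 0.05
Pi = np.zeros((8, 8))
for i, prob in enumerate((mu, nu, xi, o)):
    Pi[2*i:2*i+2, 2*i:2*i+2] = [[prob, 1 - prob]] * 2
C = Pi.T

# Initial state: driver prior p on the road being safe, car plays N,
# driver initially undecided between C and S.
p = 0.8
diag = np.array([p/2, p/2, 0, 0, (1-p)/2, (1-p)/2, 0, 0])
rho = np.diag(diag).astype(complex)

alpha, dt, steps = 0.8, 0.01, 2000               # crude Euler integration
for _ in range(steps):
    rho = rho + dt * lindblad_rhs(rho, H, C, alpha)

p_continue = np.real(rho[0, 0] + rho[2, 2] + rho[4, 4] + rho[6, 6])
```

In the fully dissipative limit $\alpha=1$, this dynamics reduces to the classical Markov chain of $\Pi$, whose steady state under car action $N$ is the Luce mixture $p\,\mu + (1-p)\,\xi$; intermediate $\alpha$ blends in the coherent (von Neumann) term.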

II-B Pure and Mixed Strategy Equilibria

For the sake of simplicity, let us denote the car as Agent 1 and the driver as Agent 2, without any loss of generality. Since the car seeks to maximize its expected payoff given that the driver chooses a strategy $s_2\in\{C,S\}$, it is natural that the car's final response $FR_1(s_2)$ is its best response, i.e., the strategy maximizing its expected payoff given in Equations (1) and (2):

FR_1(s_2) = BR_1(s_2) \triangleq \arg\max_{s_1\in\{N,A\}} U_{s_1}(s_2).

In contrast, the driver's decisions are governed by the open quantum system model. If we denote the steady-state solution of Equation (3) by $OQ_{pure}(s_1;\alpha,\lambda)$ for a given car strategy $s_1\in\{A,N\}$, the final response of the driver is defined as

FR_2(s_1) = OQ_{pure}(s_1;\alpha,\lambda),

where $\alpha$ and $\lambda$ are the driver's model parameters in Equations (3) and (8), respectively. The (pure-strategy) equilibrium of this game is then defined as follows.

Definition 2.

A strategy profile $(s_1^*,s_2^*)\in\{A,N\}\times\{C,S\}$ is a pure-strategy equilibrium if and only if $s_1^*=BR_1(s_2^*)$ and $s_2^*=OQ_{pure}(s_1^*;\alpha,\lambda)$.

In fact, the concept of mixed-strategy equilibrium is more natural to the open-quantum-system model, since its solution gives the probabilities of taking the various actions rather than indicating a particular action: the open quantum system model directly produces a mixed strategy. Let the driver's mixed strategy be denoted by $\sigma_2=(p_C,1-p_C)$, where $p_C$ is the probability that the driver chooses to continue. Similarly, let the car's mixed strategy be denoted by $\sigma_1=(p_A,1-p_A)$, where $p_A$ is the probability that the car chooses to alert. A mixed-strategy profile is then denoted by $(\sigma_1,\sigma_2)$. In such a mixed-strategy setting, the car's final response is its best mixed-strategy response, i.e.

FR_1(\sigma_2) = BR_1(\sigma_2).

Similarly, the final response of the driver is obtained from the steady-state solution of Equation (3), i.e.

FR_2(\sigma_1) = OQ_{mix}(\sigma_1;\alpha,\lambda).

Then the mixed-strategy equilibrium of this game is defined as follows.

Definition 3.

A strategy profile $(\sigma_1^*,\sigma_2^*)$ is a mixed-strategy equilibrium if and only if $\sigma_1^*=BR_1(\sigma_2^*)$ and $\sigma_2^*=OQ_{mix}(\sigma_1^*;\alpha,\lambda)$.

Figure 1: Illustration of the car-driver interaction game

Note that the equilibrium notions presented in Definitions 2 and 3 are novel and differ from traditional equilibrium notions in game theory. This is because our game comprises two different kinds of players: (i) the car, modeled as an expected-utility maximizer, and (ii) the driver, modeled using the open quantum cognition equation, as illustrated in Figure 1. Both notions are nevertheless inspired by the traditional definition of Nash equilibrium, and are defined using the players' final responses as opposed to best responses in the Nash sense. By doing so, we can easily extend traditional equilibrium notions to any strategic setting where heterogeneous entities interact in a competitive manner.
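Concretely, a pure equilibrium in the sense of Definition 2 is a fixed point of the two final-response maps and can be found by enumerating the four pure profiles. The final-response maps below are illustrative placeholders (not the responses computed in this paper), chosen to show that the responses can cycle and leave no pure equilibrium, which motivates the mixed-strategy notion:

```python
def pure_equilibria(FR1, FR2, car_strategies=("N", "A"),
                    driver_strategies=("C", "S")):
    """Profiles (s1, s2) where each strategy is the other's final
    response, per Definition 2."""
    return [(s1, s2)
            for s1 in car_strategies
            for s2 in driver_strategies
            if FR1(s2) == s1 and FR2(s1) == s2]

# Placeholder responses: the car alerts iff the driver would continue;
# this driver stops iff alerted -- a cycle, so no pure equilibrium exists.
FR1 = lambda s2: "A" if s2 == "C" else "N"
FR2 = lambda s1: "S" if s1 == "A" else "C"
print(pure_equilibria(FR1, FR2))   # prints []
```

Replacing the driver map with one that always stops yields the single fixed point $(N,S)$, illustrating how equilibria appear and disappear as the driver's response changes.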

III Driver’s Final Response

Note that the dependent variable $\rho$ in Equation (3) is a matrix. In order to obtain the analytical solution, we vectorize $\rho$ by stacking its rows one after another to obtain the vector $\vv{\rho}$ (so that, e.g., $H\rho$ vectorizes to $(H\otimes I_N)\vv{\rho}$). The vectorized version of Definition 1 is then as follows.

Definition 4.

The vectorized form of the Lindblad-Kossakowski equation is given by

\frac{d\vv{\rho}}{dt} = \big[-i(1-\alpha)\vv{H} + \alpha\vv{L}\big]\vv{\rho},   (10)

where $I_N$ is the $N\times N$ identity matrix,

\vv{H} = H\otimes I_N - I_N\otimes H^{T},   (11)

\vv{L} = \sum_{m,n}\gamma_{m,n}\Lambda_{m,n},   (12)

\Lambda_{m,n} = L_{m,n}\otimes L^{*}_{m,n} - \Phi_{m,n},   (13)

\Phi_{m,n} = \frac{1}{2}\Big(L^{\dagger}_{m,n}L_{m,n}\otimes I_N + I_N\otimes(L^{\dagger}_{m,n}L_{m,n})^{*}\Big),   (14)

with the superscript $*$ representing the entrywise complex conjugate.
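The Kronecker formulas of Definition 4 correspond to row-major vectorization of $\rho$ (equivalently, column-stacking of $\rho^T$). A quick numerical sanity check, with random $H$, $\gamma_{m,n}$, and $\rho$ (all illustrative), confirms that the vectorized generator applied to $\vv{\rho}$ reproduces the matrix-form right-hand side of Equation (3):

```python
import numpy as np

rng = np.random.default_rng(0)
N, alpha = 4, 0.6                      # small N keeps the check cheap
H = rng.normal(size=(N, N))
gamma = rng.uniform(size=(N, N))
rho = rng.normal(size=(N, N)) + 1j * rng.normal(size=(N, N))

def vec(A):
    return A.flatten(order="C")        # stack rows one after another

# Matrix form of Equation (3) and vectorized generator of Definition 4.
comm = -1j * (1 - alpha) * (H @ rho - rho @ H)
vvH = np.kron(H, np.eye(N)) - np.kron(np.eye(N), H.T)
vvL = np.zeros((N * N, N * N))
dissip = np.zeros((N, N), dtype=complex)
for m in range(N):
    for n in range(N):
        L = np.zeros((N, N)); L[m, n] = 1.0   # L_{m,n} = |m><n| (real)
        LdL = L.T @ L
        dissip += gamma[m, n] * (L @ rho @ L.T
                                 - 0.5 * (LdL @ rho + rho @ LdL))
        Phi = 0.5 * (np.kron(LdL, np.eye(N)) + np.kron(np.eye(N), LdL))
        vvL += gamma[m, n] * (np.kron(L, L) - Phi)
rhs = comm + alpha * dissip

lhs = (-1j * (1 - alpha) * vvH + alpha * vvL) @ vec(rho)
```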

In the driver-car game presented in Section II, we have $N=8$ basis states, as stated in Equation (7). We first derive the sparse structure of $\vv{H}$ in Lemma 1.

Note that the symbol $\oplus$ denotes the direct sum, while the symbol $\otimes$ denotes the tensor (Kronecker) product. The following two simple examples show the difference.

\begin{bmatrix}a&b\\ c&d\end{bmatrix}\oplus\begin{bmatrix}e&f\\ g&h\end{bmatrix}=\begin{bmatrix}a&b&0&0\\ c&d&0&0\\ 0&0&e&f\\ 0&0&g&h\end{bmatrix}

\begin{bmatrix}a&b\\ c&d\end{bmatrix}\otimes\begin{bmatrix}e&f\\ g&h\end{bmatrix}=\begin{bmatrix}ae&af&be&bf\\ ag&ah&bg&bh\\ ce&cf&de&df\\ cg&ch&dg&dh\end{bmatrix}
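In NumPy terms (used here purely as an illustration), the direct sum places the operands as blocks on the diagonal, while `np.kron` computes the tensor product:

```python
import numpy as np

def direct_sum(*mats):
    """Block-diagonal direct sum of square matrices."""
    n = sum(m.shape[0] for m in mats)
    out = np.zeros((n, n))
    r = 0
    for m in mats:
        k = m.shape[0]
        out[r:r+k, r:r+k] = m
        r += k
    return out

A = np.array([[1, 2], [3, 4]])
B = np.array([[5, 6], [7, 8]])

S = direct_sum(A, B)   # 4x4: A and B on the diagonal, zeros elsewhere
K = np.kron(A, B)      # 4x4: each entry of A scales a full copy of B
```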
Lemma 1.

If the Hamiltonian $H$ of the 8-dimensional Lindblad-Kossakowski equation is defined as

H = \begin{bmatrix}1&1\\ 1&1\end{bmatrix}\oplus\begin{bmatrix}1&1\\ 1&1\end{bmatrix}\oplus\begin{bmatrix}1&1\\ 1&1\end{bmatrix}\oplus\begin{bmatrix}1&1\\ 1&1\end{bmatrix},

then its vectorized form $\vv{H}$ is given by

\vv{H} = J\oplus J\oplus J\oplus J,   (15)

where

J = \begin{bmatrix}X\oplus X\oplus X\oplus X&I_{8}\\ I_{8}&X\oplus X\oplus X\oplus X\end{bmatrix}

with $X=\begin{bmatrix}0&-1\\ -1&0\end{bmatrix}$.

Proof.

By Equation (11), we only need to calculate $H\otimes I_N$ and $I_N\otimes H^T$. Noting $N=8$, we have

H\otimes I_N = \begin{bmatrix}I_{8}&I_{8}\\ I_{8}&I_{8}\end{bmatrix}\oplus\begin{bmatrix}I_{8}&I_{8}\\ I_{8}&I_{8}\end{bmatrix}\oplus\begin{bmatrix}I_{8}&I_{8}\\ I_{8}&I_{8}\end{bmatrix}\oplus\begin{bmatrix}I_{8}&I_{8}\\ I_{8}&I_{8}\end{bmatrix}

and

I_N\otimes H^T = I_N\otimes H = K\oplus K\oplus K\oplus K,

where

K = \begin{bmatrix}\mathbf{1}\oplus\mathbf{1}\oplus\mathbf{1}\oplus\mathbf{1}&0\\ 0&\mathbf{1}\oplus\mathbf{1}\oplus\mathbf{1}\oplus\mathbf{1}\end{bmatrix}

with $\mathbf{1}$ the $2\times 2$ matrix whose elements are all 1.

Subtracting $I_N\otimes H^T$ from $H\otimes I_N$ blockwise then leads to the claimed $J$. ∎

Remark 1.

The condition of Lemma 1 simply sets the Hamiltonian of the Lindblad-Kossakowski equation as in Equation (9). $\vv{H}$ is a sparse block-diagonal matrix with four blocks, each being $J$. $J$ is a sparse matrix consisting of four blocks, where the off-diagonal blocks are identity matrices and the diagonal blocks are again block-diagonal matrices. This special structure results from the vectorization of the all-ones blocks.
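Lemma 1 is also easy to verify numerically (a sanity check, not part of the paper's derivation):

```python
import numpy as np

I8 = np.eye(8)
H = np.kron(np.eye(4), np.ones((2, 2)))        # Equation (9)

vvH = np.kron(H, I8) - np.kron(I8, H.T)        # Equation (11) with N = 8

# Claimed structure from Equation (15): vvH = J (+) J (+) J (+) J, with
# J = [[X(+)X(+)X(+)X, I8], [I8, X(+)X(+)X(+)X]] and X = [[0,-1],[-1,0]].
X = np.array([[0., -1.], [-1., 0.]])
XXXX = np.kron(np.eye(4), X)                   # direct sum of four X's
J = np.block([[XXXX, I8], [I8, XXXX]])
claimed = np.kron(np.eye(4), J)                # direct sum of four J's
```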

Theorem 1 presents the special sparse structure of $\vv{L}$. To prove Theorem 1, we need Lemma 2, which gives the sparse structure of $\Lambda_{m,n}$.

Lemma 2.

The $(M,N)$-th entry of the matrix $\Lambda_{m,n}$ with $m\neq n$ is given by

\Lambda_{m,n}(M,N) = \begin{cases}-\frac{1}{2},&\text{if } M=N=8(n-1)+k \text{ or } M=N=8(k-1)+n,\ k\in\{1,2,\cdots,8\}\setminus\{n\},\\ -1,&\text{if } M=N=9n-8,\\ 1,&\text{if } M=9m-8,\ N=9n-8,\\ 0,&\text{otherwise.}\end{cases}   (16)

The $(M,N)$-th entry of the matrix $\Lambda_{m,n}$ with $m=n$ is given by

\Lambda_{m,n}(M,N) = \begin{cases}-\frac{1}{2},&\text{if } M=N=8(n-1)+k \text{ or } M=N=8(k-1)+n,\ k\in\{1,2,\cdots,8\}\setminus\{n\},\\ 0,&\text{otherwise.}\end{cases}   (17)
Proof.

Lm,n=|mn|L_{m,n}=\ket{m}\bra{n} is a real matrix, so Lm,n=Lm,nL^{*}_{m,n}=L_{m,n} and Lm,n=Lm,nTL^{\dagger}_{m,n}=L^{T}_{m,n}.

Since only the (m,n)(m,n) entry of Lm,nL_{m,n} is 1 and the others are 0, Lm,nLm,nL_{m,n}\otimes L^{*}_{m,n} is a 64×6464\times 64 matrix with all entries zero except the (8(m1)+m,8(n1)+n)(8(m-1)+m,8(n-1)+n) entry, which is 1. Note that mm and nn range from 1 to 8.

Since the (n,n)(n,n) entry of Lm,nLm,nL^{\dagger}_{m,n}L_{m,n} is 1 and the other entries are 0, Lm,nLm,nI8L^{\dagger}_{m,n}L_{m,n}\otimes I_{8} is a 64×\times64 matrix whose entries are all zero except the [8(n1)+1][8(n-1)+1]th to the 8nnth diagonal entries (which are 1), and I8(Lm,nLm,n)I_{8}\otimes(L^{\dagger}_{m,n}L_{m,n})^{*} is a 64×\times64 matrix whose entries are all zero except the (M,M)(M,M) entries (which are 1) with M=8(k1)+nM=8(k-1)+n, k=1,2,,8k=1,2,...,8 for each fixed nn. Thus by Equation (14), Φm,n\Phi_{m,n} is a 64×\times64 matrix whose entries are all zero except the (M,M)(M,M) entries with M=8(n1)+kM=8(n-1)+k or M=8(k1)+nM=8(k-1)+n, k=1,2,,8k=1,2,...,8 for each fixed nn. The (M,M)(M,M) entries are 1/2 when 8(n1)+k8(k1)+n8(n-1)+k\neq 8(k-1)+n and is 1 when 8(n1)+k=8(k1)+n8(n-1)+k=8(k-1)+n.

By Equation (13), subtracting Φm,n\Phi_{m,n} from Lm,nLm,nL_{m,n}\otimes L^{*}_{m,n} leads to the claimed result: When mnm\neq n, there is no cancellation of nonzero entries between Φm,n\Phi_{m,n} and Lm,nLm,nL_{m,n}\otimes L^{*}_{m,n}. When m=nm=n, only the (9n8,9n8)(9n-8,9n-8) entry of Lm,nLm,nL_{m,n}\otimes L^{*}_{m,n} is nonzero (which is 1). The (9n8,9n8)(9n-8,9n-8) entry of Φm,n\Phi_{m,n} is also 1. Thus the result has only 14 nonzero entries. ∎

Remark 2.

Note that Λm,n\Lambda_{m,n} does not denote the (m,n)(m,n) entry of Λ\Lambda; Λm,n\Lambda_{m,n} is itself a matrix. There are 64 such matrices, and they will be weighted by γm,n\gamma_{m,n} and summed. The (M,N)(M,N) entry of Λm,n\Lambda_{m,n} depends on mm, nn, MM, and NN. Λm,n\Lambda_{m,n} is very sparse. Its nonzero entries can only take the values ±1\pm 1 and 1/2-1/2, since the building blocks Lm,nL_{m,n} and INI_{N} have 1 as their only nonzero entry value. Given mm and nn, the (M,N)(M,N) entries with M=N=8(n1)+kM=N=8(n-1)+k or M=N=8(k1)+nM=N=8(k-1)+n are special, since either Lm,nLm,nL_{m,n}\otimes L^{*}_{m,n} or Φm,n\Phi_{m,n} takes nonzero values at these entries.
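As a numerical sanity check on Lemma 2, the matrices Λ_{m,n} can be assembled directly in NumPy. This is a sketch assuming the standard vectorized-dissipator form Λ_{m,n} = L_{m,n}⊗L*_{m,n} − ½(L†_{m,n}L_{m,n}⊗I_8 + I_8⊗(L†_{m,n}L_{m,n})*) used in Equations (13)–(14), with indices shifted to 0-based inside the code:

```python
import numpy as np

def dissipator_term(m, n, d=8):
    """Lambda_{m,n} = L (x) L* - (1/2)(L†L (x) I + I (x) (L†L)*) for L = |m><n|.
    m and n are 1-indexed, matching the paper's convention."""
    L = np.zeros((d, d))
    L[m - 1, n - 1] = 1.0                      # only the (m,n) entry of L is 1
    LdL = L.T @ L                              # L is real, so L† = L^T
    Phi = 0.5 * (np.kron(LdL, np.eye(d)) + np.kron(np.eye(d), LdL))
    return np.kron(L, L) - Phi                 # L* = L for a real matrix

# Check Lemma 2 for an off-diagonal pair, e.g. (m, n) = (2, 5):
m, n = 2, 5
Lam = dissipator_term(m, n)
assert Lam[9*m - 9, 9*n - 9] == 1.0            # +1 at (9m-8, 9n-8), 0-indexed here
assert Lam[9*n - 9, 9*n - 9] == -1.0           # -1 at (9n-8, 9n-8)
assert np.count_nonzero(Lam == -0.5) == 14     # fourteen -1/2 diagonal entries
```

For m = n the same function reproduces Equation (17): the +1 and −1 entries at (9n−8, 9n−8) cancel, and only the fourteen −1/2 entries remain.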

Next we will multiply the Λm,n\Lambda_{m,n} obtained in Lemma 2 with γm,n\gamma_{m,n} and sum over all mm and nn to obtain \vvL\vv{L} in Theorem 1.

Theorem 1.

Let the coefficients γm,n\gamma_{m,n} in the 8-dimensional Lindblad-Kossakowski equation be the (m,n)(m,n) entries of the matrix (ref. to Equation (6))

C=Π(λ)=M[μ(λ)]M[ν(λ)]M[ξ(λ)]M[o(λ)],C=\Pi(\lambda)=M[\mu(\lambda)]\oplus M[\nu(\lambda)]\oplus M[\xi(\lambda)]\oplus M[o(\lambda)],

where M[a]M[a] is of the form

M[a]=[aa1a1a].M[a]=\begin{bmatrix}a&a\\ 1-a&1-a\end{bmatrix}.

Then, the (M,N)th(M,N)^{th} entries of \vvL\vv{L} within the vectorized Lindblad-Kossakowski equation (ref. to Def. 4) with M=NM=N are given by

\vvLM,N={12(Cn+1,n+Cl+1,l),nl,n is oddCn+1,n,n=l,n is odd12(Cn1,n+Cl1,l),nl,n is evenCn1,n,n=l,n is even\vv{L}_{M,N}=\begin{cases}-\displaystyle\frac{1}{2}\ (C_{n+1,n}+C_{l+1,l}),&n\neq l,n\text{ is odd}\\[4.30554pt] -\ C_{n+1,n},&n=l,n\text{ is odd}\\[4.30554pt] -\displaystyle\frac{1}{2}\ (C_{n-1,n}+C_{l-1,l}),&n\neq l,n\text{ is even}\\[4.30554pt] -\ C_{n-1,n},&n=l,n\text{ is even}\end{cases}

where n=M18+1n=\lfloor\frac{M-1}{8}+1\rfloor and l=(M1)mod8+1l=(M-1)\mod 8+1, and the (M,N)th(M,N)^{th} entries of \vvL\vv{L} with MNM\neq N are given by

\vvLM,N={Cn+1,n,M=9n+1,N=9n8,n=1,3,5,7Cn1,n,M=9n17,N=9n8,n=2,4,6,80,otherwise\vv{L}_{M,N}=\begin{cases}C_{n+1,n},&M=9n+1,N=9n-8,n=1,3,5,7\\[4.30554pt] C_{n-1,n},&M=9n-17,N=9n-8,n=2,4,6,8\\[4.30554pt] 0,&\text{otherwise}\end{cases}
Proof.

Interested readers may refer to Appendix A. ∎

Remark 3.

\vvLM,N\vv{L}_{M,N} depends on MM and NN, and its expression consists of entries of CC; Theorem 1 reveals these relations explicitly. The entries of CC appearing in the expression of \vvLM,N\vv{L}_{M,N} are Cn,nC_{n,n} and Cn±1,nC_{n\pm 1,n} where n=M18+1n=\lfloor\frac{M-1}{8}+1\rfloor or n=(M1)mod8+1n=(M-1)\mod 8+1. Such relations arise from vectorization (stacking columns): division by 8 and mod 8 appear because each stacked column is 8-dimensional. Despite the summation over all mm and nn, at most two entries of CC appear in \vvLM,N\vv{L}_{M,N}, since C=Π(λ)C=\Pi(\lambda) is itself sparse.

Next we will combine the \vvH\vv{H} obtained in Lemma 1 and the \vvL\vv{L} obtained in Theorem 1 to obtain i(1α)\vvH+α\vvL-i(1-\alpha)\vv{H}+\alpha\vv{L} in Corollary 1.

Corollary 1.

If the coefficients γm,n\gamma_{m,n} of the 8-dimensional Lindblad-Kossakowski equation are set as the (m,n)(m,n) entries of

C=Π(λ)=M(μ(λ))M(ν(λ))M(ξ(λ))M(o(λ)),C=\Pi(\lambda)=M(\mu(\lambda))\oplus M(\nu(\lambda))\oplus M(\xi(\lambda))\oplus M(o(\lambda)),

where M(a)M(a) is a matrix in the form of

M(a)=[aa1a1a],M(a)=\begin{bmatrix}a&a\\ 1-a&1-a\end{bmatrix},

then

i(1α)\vvH+α\vvL=A1A2A3A4-i(1-\alpha)\vv{H}+\alpha\vv{L}=A_{1}\oplus A_{2}\oplus A_{3}\oplus A_{4}

where

Ai=[Bi1Bi2Bi3Bi4i(1α)I8+αEii(1α)I8+αDiBi5Bi6Bi7Bi8].A_{i}=\begin{bmatrix}B_{i1}\oplus B_{i2}\oplus B_{i3}\oplus B_{i4}&-i(1-\alpha)I_{8}+\alpha E_{i}\\ -i(1-\alpha)I_{8}+\alpha D_{i}&B_{i5}\oplus B_{i6}\oplus B_{i7}\oplus B_{i8}\end{bmatrix}.

DiD_{i} and EiE_{i} are 8×\times8 matrices. They both have only one nonzero entry. The nonzero entries are taken from the cognition matrix CC:

D1(2,1)=C2,1,E1(1,2)=C1,2,D2(4,3)=C4,3,E2(3,4)=C3,4,D3(6,5)=C6,5,E3(5,6)=C5,6,D4(8,7)=C8,7,E4(7,8)=C7,8.\begin{array}[]{ccc}D_{1}(2,1)=C_{2,1},&E_{1}(1,2)=C_{1,2},&D_{2}(4,3)=C_{4,3},\\[8.61108pt] E_{2}(3,4)=C_{3,4},&D_{3}(6,5)=C_{6,5},&E_{3}(5,6)=C_{5,6},\\[8.61108pt] D_{4}(8,7)=C_{8,7},&E_{4}(7,8)=C_{7,8}.&\end{array}

The BijB_{ij}’s are 2×\times2 matrices:

Bii=Fiα2[C2i,2i1C2i1,2i100C2i1,2i+C2i,2i],B_{ii}=\displaystyle F_{i}-\frac{\alpha}{2}\begin{bmatrix}C_{2i,2i-1}-C_{2i-1,2i-1}&0\\ 0&C_{2i-1,2i}+C_{2i,2i}\end{bmatrix},
Bi(i+4)=Giα2[C2i1,2i1+C2i,2i100C2i1,2iC2i,2i],B_{i(i+4)}=G_{i}-\frac{\alpha}{2}\begin{bmatrix}C_{2i-1,2i-1}+C_{2i,2i-1}&0\\ 0&C_{2i-1,2i}-C_{2i,2i}\end{bmatrix},
Bij=Fiα2[C2j1,2j1+C2j,2j100C2j1,2j+C2j,2j],B_{ij}=F_{i}-\frac{\alpha}{2}\begin{bmatrix}C_{2j-1,2j-1}+C_{2j,2j-1}&0\\ 0&C_{2j-1,2j}+C_{2j,2j}\end{bmatrix},
Bi(j+4)=Giα2[C2j1,2j1+C2j,2j100C2j+1,2j+C2j,2j]B_{i(j+4)}=G_{i}-\frac{\alpha}{2}\begin{bmatrix}C_{2j-1,2j-1}+C_{2j,2j-1}&0\\ 0&C_{2j+1,2j}+C_{2j,2j}\end{bmatrix}

for iji\neq j, i=1,2,3,4i=1,2,3,4, j=1,2,3,4j=1,2,3,4, where

Fi=i(1α)[0110]α2(C2i1,2i1+C2i,2i1)I2,F_{i}=i(1-\alpha)\begin{bmatrix}0&1\\ 1&0\end{bmatrix}-\frac{\alpha}{2}(C_{2i-1,2i-1}+C_{2i,2i-1})I_{2},
Gi=i(1α)[0110]α2(C2i1,2i+C2i,2i)I2.G_{i}=i(1-\alpha)\begin{bmatrix}0&1\\ 1&0\end{bmatrix}-\frac{\alpha}{2}(C_{2i-1,2i}+C_{2i,2i})I_{2}.
Remark 4.

The Lindblad-Kossakowski equation itself is not a cognition model, since its coefficients γm,n\gamma_{m,n} are quite general. The open quantum cognition model is built by setting the γm,n\gamma_{m,n} as the (m,n)(m,n) entries of the cognition matrix CC. The condition in Corollary 1 simply sets ϕ=0\phi=0 in Equation (5) and uses the Π(λ)\Pi(\lambda) prescribed in Equation (6). This is exactly the scenario of the car-driver game.
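The block-diagonal cognition matrix C = Π(λ) of Corollary 1 is easy to assemble directly. In this sketch, mu, nu, xi, and o stand for the λ-dependent values μ(λ), ν(λ), ξ(λ), o(λ) from Equation (6); the numeric values passed in below are placeholders, not the actual functions:

```python
import numpy as np

def M_block(a):
    """The 2x2 block M[a] used in C = M[mu] ⊕ M[nu] ⊕ M[xi] ⊕ M[o]."""
    return np.array([[a, a], [1.0 - a, 1.0 - a]])

def cognition_matrix(mu, nu, xi, o):
    """Assemble the 8x8 block-diagonal cognition matrix C = Pi(lambda)."""
    C = np.zeros((8, 8))
    for i, a in enumerate((mu, nu, xi, o)):
        C[2*i:2*i + 2, 2*i:2*i + 2] = M_block(a)   # place M[a] on the diagonal
    return C

C = cognition_matrix(0.75, 0.25, 0.5, 0.5)   # placeholder values, not mu(lambda) etc.
assert np.allclose(C.sum(axis=0), 1.0)       # each column of C sums to 1
# Column n has nonzero entries only at (n, n) and (n±1, n), as used in Appendix A:
assert C[0, 0] == 0.75 and C[1, 0] == 0.25 and np.count_nonzero(C[:, 0]) == 2
```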

Remark 5.

The vectorized operator i(1α)\vvH+α\vvL-i(1-\alpha)\vv{H}+\alpha\vv{L} of the vectorized Lindblad-Kossakowski equation is a block-diagonal matrix with four blocks of very similar structure. Each block is quite sparse: it is itself a block matrix with four sub-blocks, and its two off-diagonal sub-blocks differ from scaled identity matrices in only one entry.

IV Pure-strategy equilibrium

The diagonal elements of the steady-state solution ρ\rho of Equation (3) are just Pr(SNC),Pr(SNS),,Pr(DAS)Pr(SNC),Pr(SNS),\cdots,Pr(DAS). Then we can calculate the probability for the driver to continue as

Pr(C)=Pr(SNC)+Pr(SAC)+Pr(DNC)+Pr(DAC).Pr(C)=Pr(SNC)+Pr(SAC)+Pr(DNC)+Pr(DAC). (18)
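The marginalization in Equation (18) is a four-term sum over the diagonal of ρ. A sketch, following the basis ordering e1, …, e8 = SNC, SNS, SAC, SAS, DNC, DNS, DAC, DAS implied by the text:

```python
import numpy as np

# Basis ordering implied by the text: e1..e8 = SNC, SNS, SAC, SAS, DNC, DNS, DAC, DAS.
CONTINUE_STATES = [0, 2, 4, 6]   # 0-indexed positions of SNC, SAC, DNC, DAC

def prob_continue(rho):
    """Pr(C): sum of the 'Continue' diagonal entries of the 8x8 density matrix."""
    return float(np.real(np.diag(rho)[CONTINUE_STATES].sum()))

# The maximally mixed state assigns 1/8 to every basis state, so Pr(C) = 1/2:
assert abs(prob_continue(np.eye(8) / 8) - 0.5) < 1e-12
```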

Let pp be the probability that the driver judges the road to be safe before knowing the car’s action, and let U2U_{2} be the utility function of the driver. In this paper, we model the driver’s pure strategy s2s_{2} as the output of the open quantum cognition model with parameters α\alpha and λ\lambda, taking the car’s pure strategy s1s_{1} as input:

s2=OQpure(s1;Θ)={C, if Pr(C)0.5,S, if Pr(C)<0.5.s_{2}=OQ_{pure}(s_{1};\Theta)=\begin{cases}C,&\text{ if }Pr(C)\geq 0.5,\\[8.61108pt] S,&\text{ if }Pr(C)<0.5.\end{cases} (19)

where Θ=(α,λ,p,U2)\Theta=(\alpha,\lambda,p,U_{2}) is the parameter tuple of the open quantum model.

Remark 6.

In this paper, we use Pr(C)Pr(C) in two different ways to obtain pure and mixed strategy equilibria. We obtain a pure strategy at the driver by employing a hard threshold on Pr(C)Pr(C) (in our case, Continue if Pr(C)0.5Pr(C)\geq 0.5, Stop otherwise). By treating Pr(C)Pr(C) as the driver’s mixed strategy in Section V, we will obtain the mixed-strategy equilibrium.

We set the initial density matrix as ρ0=|Ψ0Ψ0|\rho_{0}=\ket{\Psi_{0}}\bra{\Psi_{0}}, where

|Ψ0=p/2(|e3+|e4)+(1p)/2(|e7+|e8)\ket{\Psi_{0}}=\sqrt{p/2}(\ket{e_{3}}+\ket{e_{4}})+\sqrt{(1-p)/2}(\ket{e_{7}}+\ket{e_{8}})

when the car action is A and

|Ψ0=p/2(|e1+|e2)+(1p)/2(|e5+|e6)\ket{\Psi_{0}}=\sqrt{p/2}(\ket{e_{1}}+\ket{e_{2}})+\sqrt{(1-p)/2}(\ket{e_{5}}+\ket{e_{6}})

when the car action is N, with |ei\ket{e_{i}} prescribed in Subsection II-A. The calculation of the generalized pure-strategy equilibrium is similar to that of the Nash equilibrium: we simply replace the best response with the final response. We loop over the car’s strategies; in each iteration, the car strategy is fed as input to the open quantum model, which outputs a driver strategy. If the car strategy is a best response given the output driver strategy, then the strategy profile is returned as a pure-strategy equilibrium. Algorithm 1 lists the procedure for calculating the pure-strategy equilibrium.

(a) Driver-Agnostic car
(b) Driver-Conscient car
Figure 2: Equilibrium points of the driver-car games with a driver-agnostic car and with a driver-conscient car (i.e., a car that assumes the driver uses the open quantum model with λ=10,α=0.2\lambda=10,\alpha=0.2 to make decisions) under various prior beliefs.
Safe road:
        C                          S
N       a_{1,s}=85, a_{2,s}=85     b_{1,s}=75, b_{2,s}=50
A       c_{1,s}=40, c_{2,s}=85     d_{1,s}=50, d_{2,s}=50

Dangerous road:
        C                          S
N       a_{1,d}=25, a_{2,d}=25     b_{1,d}=30, b_{2,d}=60
A       c_{1,d}=75, c_{2,d}=25     d_{1,d}=85, d_{2,d}=85

TABLE II: Utilities used in our numerical results when the road is safe (above) and dangerous (below)

Furthermore, in our numerical evaluation, we assume the utilities at both the car and the driver shown in Table II. In addition to the case of a driver-conscient car, we consider a benchmark case where the car does not account for the driver and makes decisions solely based on its prior, i.e., alerts if q<0.5q<0.5 and does not alert if q0.5q\geq 0.5. In this benchmark case, the final response of the car is independent of the driver’s strategy. The equilibrium points of the driver-car games with a driver-agnostic car and with a driver-conscient car (the driver making decisions according to the open quantum model with λ=10,α=0.2\lambda=10,\alpha=0.2) under various prior beliefs on the road condition are shown in Fig. 2. When both the driver and the car are sure of safety, the equilibrium is (N,C)(N,C). When both are sure of danger, the equilibrium is (A,S)(A,S). When the driver is sure of safety but the car is sure of danger, the equilibrium is (A,C)(A,C). When the driver is sure of danger but the car is sure of safety, the equilibrium is (N,S)(N,S). Note that the dividing lines are not at p=q=0.5p=q=0.5, and the (A,S)(A,S) region has the largest area. When the car is driver-agnostic, the border between Not Alert and Alert in the equilibrium plot is always q=0.5q=0.5, regardless of the equilibrium strategy of the driver. When the car is driver-conscient, the border between Not Alert and Alert depends on the equilibrium strategy of the driver (or equivalently, the driver’s road prior): the border is located close to q=0.7q=0.7 when p0.50p\leq 0.50 and close to q=0.52q=0.52 when p0.52p\geq 0.52.

input : Parameters: α,λ\alpha,\lambda; prior about road safety of the driver: pp; prior about road safety of the car: qq; Utility function of the car: U1(s1,s2,r)U_{1}(s_{1},s_{2},r) where r=Sr=S means safe and r=Dr=D means dangerous; Utilities of the driver: U2U_{2}.
output : Pure-strategy equilibrium: SS^{*}
// Empty set means no equilibrium
S=S^{*}=\emptyset;
for s1𝐢𝐧{N,A}s_{1}\;\mathbf{in}\;\{N,A\} do
       s¯1=elementof{N,A}{s1}\bar{s}_{1}=\mathrm{element\;of\;}\{N,A\}\setminus\{s_{1}\};
       s2=OQpure(s1;Θ)s_{2}=OQ_{pure}(s_{1};\Theta);
       u=qU1(s1,s2,S)+(1q)U1(s1,s2,D)u=qU_{1}(s_{1},s_{2},S)+(1-q)U_{1}(s_{1},s_{2},D);
       u¯=qU1(s¯1,s2,S)+(1q)U1(s¯1,s2,D)\bar{u}=qU_{1}(\bar{s}_{1},s_{2},S)+(1-q)U_{1}(\bar{s}_{1},s_{2},D);
       if uu¯u\geq\bar{u} then
             S=S{s1,s2}S^{*}=S^{*}\cup\{s_{1},s_{2}\};
            
      
Algorithm 1 Calculating the pure-strategy equilibrium of the car-driver game
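Algorithm 1 can be sketched in a few lines of Python. Here oq_pure is a hypothetical placeholder for OQ_pure(s1; Θ): any map from a car action to 'C' or 'S' can be plugged in, and the utility table is the car's half of Table II:

```python
def pure_strategy_equilibria(U1, q, oq_pure):
    """Return all pure-strategy equilibria (s1, s2) of the car-driver game.
    U1(s1, s2, r) is the car's utility, r in {'S', 'D'}; q is the car's prior of safety."""
    equilibria = []
    for s1 in ('N', 'A'):
        s1_bar = 'A' if s1 == 'N' else 'N'       # the alternative car action
        s2 = oq_pure(s1)                          # driver's final response to s1
        u = q * U1(s1, s2, 'S') + (1 - q) * U1(s1, s2, 'D')
        u_bar = q * U1(s1_bar, s2, 'S') + (1 - q) * U1(s1_bar, s2, 'D')
        if u >= u_bar:                            # s1 is a best response to s2
            equilibria.append((s1, s2))
    return equilibria

# Table II utilities for the car (the first payoff in each cell):
U1_TABLE = {('N', 'C', 'S'): 85, ('N', 'S', 'S'): 75, ('A', 'C', 'S'): 40, ('A', 'S', 'S'): 50,
            ('N', 'C', 'D'): 25, ('N', 'S', 'D'): 30, ('A', 'C', 'D'): 75, ('A', 'S', 'D'): 85}
U1 = lambda s1, s2, r: U1_TABLE[(s1, s2, r)]

# With a car that is almost sure the road is safe and a driver who always continues,
# the only equilibrium is (Not Alert, Continue):
assert pure_strategy_equilibria(U1, q=0.9, oq_pure=lambda s1: 'C') == [('N', 'C')]
```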

The equilibrium points of the driver-car game with λ\lambda = 0, 1, 2, 3, 4, 10 and α\alpha = 0.8 under various prior beliefs on the road condition are shown in Fig. 3 (C: Continue, S: Stop, A: Alert, N: Not Alert). When λ\lambda drops from 10 to 4, the border between S and C shifts from left to right. When λ\lambda drops from 4 to 3, the border between (S, A) and (C, A) shifts from left to right and a region with two equilibrium points, (C, N) and (S, A), appears inside the (C, N) region. As λ\lambda drops further from 3 to 2, the border between (S, A) and (C, A) keeps shifting from left to right and the two-equilibrium region enlarges with the border shift. When λ\lambda drops from 2 to 0, the driver can no longer distinguish the utilities: the (C, N) region merges into the (S, N) region, the two-equilibrium region merges into the (S, A) region, the border between (S, A) and (C, A) shifts from left to right, and a new no-equilibrium region appears inside the previous (C, N) region.

(a) λ=10\lambda=10
(b) λ=3\lambda=3
(c) λ=1\lambda=1
Figure 3: Equilibrium points of the driver-car game with α=0.8\alpha=0.8 under various prior beliefs on road condition
Remark 7.

When λ=0\lambda=0, the driver cannot distinguish the utilities at all and acts completely at random, so the concept of final response does not apply. In general, the type of pure-strategy equilibrium strongly aligns with the priors of the driver and the car. The desired equilibria are (C,N)(C,N) and (S,A)(S,A), where the driver’s action is in harmony with the car’s action.

Remark 8.

Since Fig. 2 and Fig. 3 are plotted over the (p,q)(p,q) axes, we can find out which type of equilibrium is most common. With the prescribed utilities, the most common pure-strategy equilibrium is (S,A)(S,A). This is the most favorable equilibrium, since following the car’s recommendation on a dangerous road can save lives.

Remark 9.

As the driver’s ability to distinguish utilities weakens (λ\lambda decreases), (S,A)(S,A) becomes more likely. This means that the driver follows the car’s advice diligently especially when he/she is incapable of making decisions on a dangerous road.

V Mixed-strategy equilibrium

When calculating the mixed-strategy equilibrium, pp and pAp_{A} appear in the initial state of the open quantum model, since the mixed strategy of the car is completely determined by pAp_{A} (ref. to Subsection II-B). Theorem 2 gives a closed-form expression for Pr(C)Pr(C) by solving the vectorized Lindblad-Kossakowski equation (ref. to Definition 4).

Theorem 2.

Let the initial density matrix be given as ρ0=|Ψ0Ψ0|,\rho_{0}=\ket{\Psi_{0}}\bra{\Psi_{0}}, where

|Ψ0=p(1pA)/2(|e1+|e2)+ppA/2(|e3+|e4)+(1p)(1pA)/2(|e5+|e6)+(1p)pA/2(|e7+|e8).\ket{\Psi_{0}}=\sqrt{p(1-p_{A})/2}(\ket{e_{1}}+\ket{e_{2}})+\sqrt{pp_{A}/2}(\ket{e_{3}}+\ket{e_{4}})+\sqrt{(1-p)(1-p_{A})/2}(\ket{e_{5}}+\ket{e_{6}})+\sqrt{(1-p)p_{A}/2}(\ket{e_{7}}+\ket{e_{8}}).

The probability that the driver chooses to continue is

Pr(C)=2(1α)2c+α2cr+rh(t)+(122(1α)2c)eαtcos[2(1α)t]α(1α)ceαtsin[2(1α)t],\displaystyle Pr(C)=\frac{2(1-\alpha)^{2}}{c}+\frac{\alpha^{2}}{c}r+rh(t)+\Big{(}\frac{1}{2}-\frac{2(1-\alpha)^{2}}{c}\Big{)}e^{-\alpha t}\cos\left[2(1-\alpha)t\right]-\frac{\alpha(1-\alpha)}{c}e^{-\alpha t}\sin\left[2(1-\alpha)t\right],

where c=α2+4(1α)2c=\alpha^{2}+4(1-\alpha)^{2}, r=p(1pA)C1,2+ppAC3,4+(1p)(1pA)C5,6+(1p)pAC7,8r=p(1-p_{A})C_{1,2}+pp_{A}C_{3,4}+(1-p)(1-p_{A})C_{5,6}+(1-p)p_{A}C_{7,8}, and

h(t)=αceαt{2(1α)sin[2(1α)t]αcos[2(1α)t]}.h(t)=\displaystyle\frac{\alpha}{c}e^{-\alpha t}\Big{\{}2(1-\alpha)\sin\left[2(1-\alpha)t\right]-\alpha\cos\left[2(1-\alpha)t\right]\Big{\}}.
Proof.

Interested readers may refer to Appendix B. ∎

Remark 10.

The output of the open quantum model, Pr(C)Pr(C), presented in Theorem 2 consists of both transient and stationary parts. The transient part consists of sines and cosines multiplied by exponential decay; thus a steady state always exists when the exponential decay rate α>0\alpha>0. When α=0\alpha=0 or λ=0\lambda=0, Pr(C)=1/2Pr(C)=1/2 and the driver is completely random and absent-minded. Furthermore, a driver with a higher α\alpha can make a decision faster; thus α\alpha also represents the brain power and attentiveness of the driver. On the other hand, the parameter λ\lambda appears only within the CijC_{ij} terms, which are linearly weighted by monomial terms of pp and pAp_{A}. This is essentially a prior probability multiplied with a likelihood, or an initial probability multiplied with a transition probability. Viewing α\alpha and pAp_{A} as constants, the steady-state Pr(C)Pr(C) is a linear function of the driver’s prior pp.
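The closed-form expression in Theorem 2 is easy to evaluate and check numerically. In this sketch the prior-weighted sum r is passed in directly instead of being assembled from the entries of C:

```python
import math

def pr_continue(t, alpha, r):
    """Evaluate the closed-form Pr(C) of Theorem 2 at time t."""
    c = alpha**2 + 4 * (1 - alpha)**2
    w = 2 * (1 - alpha) * t                          # oscillation phase
    decay = math.exp(-alpha * t)
    h = (alpha / c) * decay * (2 * (1 - alpha) * math.sin(w) - alpha * math.cos(w))
    return (2 * (1 - alpha)**2 / c + (alpha**2 / c) * r + r * h
            + (0.5 - 2 * (1 - alpha)**2 / c) * decay * math.cos(w)
            - (alpha * (1 - alpha) / c) * decay * math.sin(w))

# At t = 0 the transient terms cancel and Pr(C) = 1/2 for any alpha and r:
assert abs(pr_continue(0.0, alpha=0.6, r=0.8) - 0.5) < 1e-12
# For alpha > 0 the decay kills the transient, leaving 2(1-alpha)^2/c + alpha^2 r / c:
c = 0.6**2 + 4 * 0.4**2
assert abs(pr_continue(50.0, alpha=0.6, r=0.8) - (2 * 0.4**2 / c + 0.36 * 0.8 / c)) < 1e-9
```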

For brevity, the mixed-strategy equilibrium is denoted as (pA,pCp_{A}^{*},p_{C}^{*}). If the car knows that the driver will play pCp_{C}^{*}, then its expected payoffs from playing Alert and Not Alert must be equal; otherwise, it would either always alert or never alert, and would not need to mix between them. Thus we have

pC[a1,sq+a1,d(1q)]+(1pC)[b1,sq+b1,d(1q)]=pC[c1,sq+c1,d(1q)]+(1pC)[d1,sq+d1,d(1q)].\begin{split}p_{C}^{*}[a_{1,s}q+a_{1,d}(1-q)]+(1-p_{C}^{*})[b_{1,s}q+b_{1,d}(1-q)]\\ =p_{C}^{*}[c_{1,s}q+c_{1,d}(1-q)]+(1-p_{C}^{*})[d_{1,s}q+d_{1,d}(1-q)].\end{split} (20)

Solving for pCp_{C}^{*}, we obtain

pC=qΔs+(1q)Δdq(Δs+c1,sa1,s)+(1q)(Δd+c1,da1,d).p_{C}^{*}=\frac{q\Delta_{s}+(1-q)\Delta_{d}}{q(\Delta_{s}+c_{1,s}-a_{1,s})+(1-q)(\Delta_{d}+c_{1,d}-a_{1,d})}. (21)

where Δs=b1,sd1,s\Delta_{s}=b_{1,s}-d_{1,s} and Δd=b1,dd1,d\Delta_{d}=b_{1,d}-d_{1,d}.

For the sake of illustration, we consider the bi-matrix game presented in Table I. Upon substituting the utility values in Table II for this example in Equation (21), we obtain

pC=1116q3q+1.\begin{array}[]{lcl}p_{C}^{*}&=&\displaystyle\frac{11-16q}{3q+1}.\end{array} (22)

Note that the above pCp_{C}^{*} may lie outside [0, 1]; if so, there is no mixed-strategy equilibrium. In order for pCp_{C}^{*} to lie within [0, 1] under the prescribed utilities, qq must lie within [10/19, 11/16], which is a very narrow range. Given pCp_{C}^{*}, the car can assign any pAp_{A} to A, because A and N give the same payoff. Next we need to search for the pAp_{A} that produces pCp_{C}^{*}; such a pAp_{A} is exactly the desired pAp_{A}^{*}.
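Equations (21) and (22), and the range [10/19, 11/16] for q, can be verified directly with the Table II utilities:

```python
def p_c_star(q, a1s=85, b1s=75, c1s=40, d1s=50, a1d=25, b1d=30, c1d=75, d1d=85):
    """Driver's equilibrium mixing probability, Equation (21), with Table II defaults."""
    Ds, Dd = b1s - d1s, b1d - d1d                # Delta_s and Delta_d
    num = q * Ds + (1 - q) * Dd
    den = q * (Ds + c1s - a1s) + (1 - q) * (Dd + c1d - a1d)
    return num / den

# With the Table II utilities, (21) reduces to (11 - 16q)/(3q + 1), Equation (22):
q = 0.6
assert abs(p_c_star(q) - (11 - 16 * q) / (3 * q + 1)) < 1e-12
# p_C* sweeps [0, 1] exactly as q sweeps [11/16, 10/19]:
assert abs(p_c_star(11 / 16)) < 1e-12 and abs(p_c_star(10 / 19) - 1.0) < 1e-12
```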

Since pCp_{C}^{*} is completely determined by qq, it is more convenient to plot pAp_{A}^{*} versus pp and pCp_{C}^{*}. Fig. 4 shows pAp_{A}^{*} versus pp and pCp_{C}^{*} for various values of λ\lambda. The mixed-strategy equilibria exist only in a narrow band extending from a low-pCp_{C}^{*}, low-pp region to a high-pCp_{C}^{*}, high-pp region. A mixed-strategy equilibrium may not exist for a given qq, but one always exists for a given pp. When pCp_{C}^{*} and pp increase, the band gets narrower. Within the band, the gradient of pAp_{A}^{*} is perpendicular to the band, i.e., pAp_{A}^{*} increases when pp increases and pCp_{C}^{*} decreases simultaneously. When λ\lambda decreases, the band gets flatter; it first widens and then narrows. As α\alpha decreases, the band gets flatter and narrower.

Figure 4: Existence of pAp_{A}^{*} on the plane of pp and pCp_{C}^{*} for different values of λ\lambda and α\alpha
Remark 11.

In the case of mixed equilibria, when the driver is attentive, the equilibrium strategy pCp^{*}_{C} is well aligned with her prior (higher pp, higher pCp^{*}_{C}), as shown in Figure 4. However, as the driver gradually loses her attention (α\alpha or λ\lambda decreases), pCp^{*}_{C} steadily approaches 0.5 regardless of pp. This means that, at equilibrium, an inattentive driver becomes indifferent between choosing CC and SS.

VI Conclusion and Future Work

In this paper, we developed a strategic driver-assist system based on a novel vehicle-driver interaction game. While the car is modeled as an expected-utility maximizer, the driver is characterized by an open-quantum cognition model that captures his/her attentiveness (α\alpha) as well as sensitivity to the utilities (λ\lambda). Based on a novel equilibrium concept proposed to solve any general human-system interaction game, we showed that both the car and the driver employ threshold-based rules on their respective priors regarding the road state at equilibrium. Through numerical results, we also demonstrated how these thresholds vary under different settings based on road conditions and agent behavior. Specifically, in our proposed framework, we showed that an inattentive driver (λ1)(\lambda\leq 1) would stop the car in about 65% of all possible belief-profile settings, and in at least 77% of belief-profile settings when the car alerts the driver. On the contrary, if there were no driver-assist system in the car, an inattentive driver would have stopped the car in only 50% of all possible belief-profile settings (the region where p<0.5p<0.5), and in about 38.5% of all scenarios if the car were to alert the driver using our driver-assist system. At the same time, our proposed driver-assist system has improved persuasive ability because it takes driver behavior into account, in addition to its inferences regarding the road state. This improvement was demonstrated by the increase in the threshold on a driver-conscient car’s belief, as opposed to that of a driver-agnostic car. Furthermore, we also proved that a mixed-strategy equilibrium always exists for any given driver prior, but only within a small range of the car’s prior values. As the driver loses attention, we demonstrated that the driver’s equilibrium mixed strategy drifts towards uniformly distributed probabilistic decisions.
In the future, we will investigate repeated interaction games where the car can learn the driver’s model parameters over multiple iterations. Furthermore, we will also incorporate the BB matrix in the Lindblad model to account for the effects of mental deliberation in resolving conflicts between the driver’s own prior and the car’s signal.

VII Acknowledgment

Dr. S. N. Balakrishnan passed away shortly before this paper was ready for submission. However, Dr. Balakrishnan participated in every aspect of the research related to this paper. Therefore, we decided to keep Dr. Balakrishnan as a co-author of this paper.


Appendix A Proof of Theorem 1

In total, there are 64 terms γm,nΛm,n\gamma_{m,n}\Lambda_{m,n} to be summed in Equation (10) when γm,n=Cm,n\gamma_{m,n}=C_{m,n}. We only need to consider terms with Cm,n0C_{m,n}\neq 0. The nonzero entries of CC are: (n,n)(n,n) and (n+1,n)(n+1,n) when nn is odd, and (n,n)(n,n) and (n1,n)(n-1,n) when nn is even. Hence there are only 16 nonzero terms.

Consider the (n,n)(n,n) terms. According to Lemma 2, Cn,nΛn,nC_{n,n}\Lambda_{n,n} has only 14 nonzero entries. They are all Cn,n/2-C_{n,n}/2, located at the (M,M)(M,M) entries where M=8(n1)+k1M=8(n-1)+k_{1} or M=8(k21)+nM=8(k_{2}-1)+n with k1,k2{1,2,,8}{n}k_{1},k_{2}\in\{1,2,\cdots,8\}\setminus\{n\}. Each such (M,M)(M,M) entry of n=18Cn,nΛn,n\sum_{n=1}^{8}C_{n,n}\Lambda_{n,n} has exactly two Cn,nΛn,nC_{n,n}\Lambda_{n,n}’s contributing to it: one with n=(M1)mod8+1n=(M-1)\mod 8+1, and the other with n=M18+1n=\lfloor\frac{M-1}{8}+1\rfloor. That is, the (M,M)(M,M) entry of n=18Cn,nΛn,n\sum_{n=1}^{8}C_{n,n}\Lambda_{n,n} is

\Big{(}\sum_{n=1}^{8}C_{n,n}\Lambda_{n,n}\Big{)}_{M,M}=\begin{cases}-(C_{n,n}+C_{l,l})/2,&n\neq l\\ 0,&n=l\end{cases}

where n=M18+1n=\lfloor\frac{M-1}{8}+1\rfloor and l=(M1)mod8+1l=(M-1)\mod 8+1. Note that n=ln=l when M=9n8M=9n-8.

Next consider the (n±1,n)(n\pm 1,n) terms. According to Lemma 2, Cn±1,nΛn±1,nC_{n\pm 1,n}\Lambda_{n\pm 1,n} has 16 nonzero entries. Fourteen of them are (M,M)(M,M) entries taking value Cn±1,n/2-C_{n\pm 1,n}/2, where M=8(n1)+k1M=8(n-1)+k_{1} or M=8(k21)+nM=8(k_{2}-1)+n with k1,k2{1,2,,8}{n}k_{1},k_{2}\in\{1,2,\cdots,8\}\setminus\{n\}. One of them is Cn±1,n-C_{n\pm 1,n} at the (9n8,9n8)(9n-8,9n-8) entry. The last one is Cn±1,nC_{n\pm 1,n} at the (9(n±1)8,8(n1)+n)(9(n\pm 1)-8,8(n-1)+n) entry. We only need to sum the Cn+1,nΛn+1,nC_{n+1,n}\Lambda_{n+1,n} terms when nn is odd and the Cn1,nΛn1,nC_{n-1,n}\Lambda_{n-1,n} terms when nn is even. The (M,M)(M,M) entry of the sum of all Cn±1,nΛn±1,nC_{n\pm 1,n}\Lambda_{n\pm 1,n} terms is

{(Cn+1,n+Cl+1,l)/2,nl,n is oddCn+1,n,n=l,n is odd(Cn1,n+Cl1,l)/2,nl,n is evenCn1,n,n=l,n is even\begin{cases}-(C_{n+1,n}+C_{l+1,l})/2,&n\neq l,n\text{ is odd}\\ -C_{n+1,n},&n=l,n\text{ is odd}\\ -(C_{n-1,n}+C_{l-1,l})/2,&n\neq l,n\text{ is even}\\ -C_{n-1,n},&n=l,n\text{ is even}\end{cases}

where n=M18+1n=\lfloor\frac{M-1}{8}+1\rfloor and l=(M1)mod8+1l=(M-1)\mod 8+1.

Summing the Cn,nΛn,nC_{n,n}\Lambda_{n,n} terms and the Cn±1,nΛn±1,nC_{n\pm 1,n}\Lambda_{n\pm 1,n} terms of \vvL\vv{L} leads to the claimed result.

Appendix B Proof of Theorem 2

Let β=i(1α)\beta=-i(1-\alpha). The solution is of the form \vvρt=\vvρ(t)=exp((β\vvH+α\vvL)t)\vvρ0\vv{\rho}_{t}=\vv{\rho}(t)=\mathrm{exp}((\beta\vv{H}+\alpha\vv{L})t)\vv{\rho}_{0}.

By Corollary 1,

exp((β\vvH+α\vvL)t)=k=0(β\vvH+α\vvL)ktk/k!=k=0A1ktkA2ktkA3ktkA4ktk/k!=exp(A1t)exp(A2t)exp(A3t)exp(A4t).\begin{array}[]{l}\exp((\beta\vv{H}+\alpha\vv{L})t)=\displaystyle\sum^{\infty}_{k=0}(\beta\vv{H}+\alpha\vv{L})^{k}t^{k}/k!\\[8.61108pt] \qquad=\ \displaystyle\sum^{\infty}_{k=0}A_{1}^{k}t^{k}\oplus A_{2}^{k}t^{k}\oplus A_{3}^{k}t^{k}\oplus A_{4}^{k}t^{k}/k!\\[12.91663pt] \qquad=\ \displaystyle\mathrm{exp}(A_{1}t)\oplus\mathrm{exp}(A_{2}t)\oplus\mathrm{exp}(A_{3}t)\oplus\mathrm{exp}(A_{4}t).\end{array} (23)
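Equation (23) uses the fact that the exponential of a direct sum is the direct sum of the exponentials, which is easy to confirm numerically (expm_series below is a simple truncated power series written just for this check, not a production matrix exponential):

```python
import numpy as np

def expm_series(A, terms=40):
    """Truncated power series for exp(A); adequate for small, well-scaled matrices."""
    out = np.eye(A.shape[0], dtype=complex)
    term = np.eye(A.shape[0], dtype=complex)
    for k in range(1, terms):
        term = term @ A / k          # next Taylor term A^k / k!
        out = out + term
    return out

rng = np.random.default_rng(0)
A1, A2 = rng.normal(size=(3, 3)), rng.normal(size=(3, 3))
Z = np.zeros((3, 3))
direct_sum = np.block([[A1, Z], [Z, A2]])
lhs = expm_series(direct_sum)
rhs = np.block([[expm_series(A1), Z], [Z, expm_series(A2)]])
assert np.allclose(lhs, rhs)         # exp(A1 ⊕ A2) = exp(A1) ⊕ exp(A2)
```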

By Equation (18) and with the indices defined in Equation (6),

Pr(C)=\vvρt(1)+\vvρt(19)+\vvρt(37)+\vvρt(55)Pr(C)=\displaystyle\vv{\rho}_{t}(1)+\vv{\rho}_{t}(19)+\vv{\rho}_{t}(37)+\vv{\rho}_{t}(55) (24)

where \vvρt(i)\vv{\rho}_{t}(i) is the ith element of vector \vvρt\vv{\rho}_{t}.

Calculating \vvρt(1)\vv{\rho}_{t}(1) only needs the first row of exp(A1t)\mathrm{exp}(A_{1}t).

We will calculate exp(A1t)\mathrm{exp}(A_{1}t) from the powers A1kA_{1}^{k}. Again, only the first row of A1kA_{1}^{k} is needed: given A1kA_{1}^{k}, calculating the first row of A1k+1=A1kA1A_{1}^{k+1}=A_{1}^{k}A_{1} requires only the first row of A1kA_{1}^{k}. Let the first row of A1kA_{1}^{k} be denoted as vkv_{k}. Then vk+1=vkA1v_{k+1}=v_{k}A_{1}. Denote the jjth element of vkv_{k} as vk,jv_{k,j}. Then by Corollary 1,

[vk+1,2j1vk+1,2j]=[vk,2j1vk,2j]B1j+[vk,2j+7vk,2j+8]Fj,\begin{array}[]{l}\;\;\;\;[v_{k+1,2j-1}\;v_{k+1,2j}]\\[4.30554pt] =[v_{k,2j-1}\;v_{k,2j}]B_{1j}+[v_{k,2j+7}\;v_{k,2j+8}]F_{j},\end{array} (25)
[vk+1,2j+7vk+1,2j+8]=[vk,2j1vk,2j]Gj+[vk,2j+7vk,2j+8]B1(j+4),\begin{array}[]{l}\;\;\;\;[v_{k+1,2j+7}\;v_{k+1,2j+8}]\\[4.30554pt] =[v_{k,2j-1}\;v_{k,2j}]G_{j}+[v_{k,2j+7}\;v_{k,2j+8}]B_{1(j+4)},\end{array} (26)

where j=1,2,3,4j=1,2,3,4,

F1=[β0αC2,1β],G1=[βαC1,20β]F_{1}=\begin{bmatrix}\beta&0\\ \alpha C_{2,1}&\beta\end{bmatrix},\;G_{1}=\begin{bmatrix}\beta&\alpha C_{1,2}\\ 0&\beta\end{bmatrix}

and $F_{j}=G_{j}=I_{2}$ for $j=2,3,4$.

Combining Equation (25) and Equation (26), we have

\begin{array}[]{l}\;\;\;\;[v_{k+1,2j-1}\;v_{k+1,2j}\;v_{k+1,2j+7}\;v_{k+1,2j+8}]\\[4.30554pt] =[v_{k,2j-1}\;v_{k,2j}\;v_{k,2j+7}\;v_{k,2j+8}]\begin{bmatrix}B_{1j}&G_{j}\\ F_{j}&B_{1(j+4)}\end{bmatrix}.\end{array} (27)

Applying Equation (27) iteratively from $1$ to $k$, we have

\begin{array}[]{l}\;\;\;\;[v_{k,2j-1}\;v_{k,2j}\;v_{k,2j+7}\;v_{k,2j+8}]\\[4.30554pt] =[v_{1,2j-1}\;v_{1,2j}\;v_{1,2j+7}\;v_{1,2j+8}]\begin{bmatrix}B_{1j}&G_{j}\\ F_{j}&B_{1(j+4)}\end{bmatrix}^{k-1}.\end{array} (28)

By Corollary 1,

v_{1}=\begin{bmatrix}-\alpha C_{2,1}&-\beta&0_{1\times 6}&\beta&\alpha C_{1,2}\end{bmatrix}

where $0_{1\times k}$ is a $1\times k$ row vector of zeros. Plugging these initial values into Equation (28), we have

\begin{array}[]{l}\;\;\;\;[v_{k,1}\;v_{k,2}\;v_{k,9}\;v_{k,10}]\\[4.30554pt] =[-\alpha C_{2,1}\;-\beta\;\beta\;\alpha C_{1,2}]\begin{bmatrix}B_{11}&G_{1}\\ F_{1}&B_{15}\end{bmatrix}^{k-1}.\end{array} (29)

The other $v_{k,j}$'s with $j\neq 1,2,9,10$ are all zero.
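The row recurrence above can be checked numerically: the first row of $\exp(A_{1}t)$ equals $\sum_{k}v_{k}t^{k}/k!$ with $v_{0}$ the first standard basis row and $v_{k+1}=v_{k}A_{1}$. A minimal sketch, using a random placeholder matrix in place of the paper's $A_{1}$:

```python
import numpy as np
from scipy.linalg import expm

rng = np.random.default_rng(1)
A1 = rng.standard_normal((16, 16))  # placeholder for the paper's A_1
t = 0.3

# First row of exp(A1 t) via the recurrence v_{k+1} = v_k A1:
# only row vectors are propagated, never full matrix powers.
v0 = np.zeros(16)
v0[0] = 1.0                          # v_0 = e_1, so v_k = first row of A1^k
row, term = v0.copy(), v0.copy()
for k in range(1, 60):
    term = term @ A1 * (t / k)       # term = v_k t^k / k!
    row += term

assert np.allclose(row, expm(A1 * t)[0, :])
```

This is exactly why the derivation never needs the full matrices $A_{i}^{k}$: each step multiplies a row vector by $A_{1}$, and the structure of $A_{1}$ confines the nonzeros to four entries.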

Similarly, calculating $\vv{\rho}_{t}(19)$, $\vv{\rho}_{t}(37)$, and $\vv{\rho}_{t}(55)$ requires only the 3rd row of $\exp(A_{2}t)$, the 5th row of $\exp(A_{3}t)$, and the 7th row of $\exp(A_{4}t)$, respectively. Denote the 3rd row of $A_{2}^{k}$, the 5th row of $A_{3}^{k}$, and the 7th row of $A_{4}^{k}$ as $x_{k}$, $y_{k}$, and $z_{k}$, respectively. Then $x_{k+1}=x_{k}A_{2}$, $y_{k+1}=y_{k}A_{3}$, and $z_{k+1}=z_{k}A_{4}$. By Corollary 1,

x_{1}=\begin{bmatrix}0&0&-\alpha C_{4,3}&-\beta&0_{1\times 6}&\beta&\alpha C_{3,4}&0_{1\times 4}\end{bmatrix},
y_{1}=\begin{bmatrix}0_{1\times 4}&-\alpha C_{6,5}&-\beta&0_{1\times 6}&\beta&\alpha C_{5,6}&0&0\end{bmatrix},
z_{1}=\begin{bmatrix}0_{1\times 6}&-\alpha C_{8,7}&-\beta&0_{1\times 6}&\beta&\alpha C_{7,8}\end{bmatrix}.

Denote the $j$th elements of $x_{k}$, $y_{k}$, and $z_{k}$ as $x_{k,j}$, $y_{k,j}$, and $z_{k,j}$, respectively. Then the nonzero elements can be calculated as

\begin{array}[]{l}\;\;\;\;[x_{k,3}\;x_{k,4}\;x_{k,11}\;x_{k,12}]\\[4.30554pt] =[-\alpha C_{4,3}\;-\beta\;\beta\;\alpha C_{3,4}]\begin{bmatrix}B_{22}&G_{2}\\ F_{2}&B_{26}\end{bmatrix}^{k-1},\end{array} (30)
\begin{array}[]{l}\;\;\;\;[y_{k,5}\;y_{k,6}\;y_{k,13}\;y_{k,14}]\\[4.30554pt] =[-\alpha C_{6,5}\;-\beta\;\beta\;\alpha C_{5,6}]\begin{bmatrix}B_{33}&G_{3}\\ F_{3}&B_{37}\end{bmatrix}^{k-1},\end{array} (31)
\begin{array}[]{l}\;\;\;\;[z_{k,7}\;z_{k,8}\;z_{k,15}\;z_{k,16}]\\[4.30554pt] =[-\alpha C_{8,7}\;-\beta\;\beta\;\alpha C_{7,8}]\begin{bmatrix}B_{44}&G_{4}\\ F_{4}&B_{48}\end{bmatrix}^{k-1}.\end{array} (32)

In Equation (29),

\begin{bmatrix}B_{11}&G_{1}\\ F_{1}&B_{15}\end{bmatrix}=\begin{bmatrix}-\alpha C_{2,1}&-\beta&\beta&\alpha C_{1,2}\\ -\beta&b_{1}&0&\beta\\ \beta&0&b_{5}&-\beta\\ \alpha C_{2,1}&\beta&-\beta&-\alpha C_{1,2}\end{bmatrix} (33)

where

b_{1}=-\frac{\alpha}{2}(C_{1,1}+C_{2,2}+C_{2,1}+C_{1,2})=-\alpha

and

b_{5}=-\frac{\alpha}{2}(C_{2,2}+C_{1,1}+C_{1,2}+C_{2,1})=-\alpha.

Thus it can be diagonalized as

\begin{bmatrix}B_{11}&G_{1}\\ F_{1}&B_{15}\end{bmatrix}=P\begin{bmatrix}-\alpha-2\beta&&&\\ &-\alpha+2\beta&&\\ &&-\alpha&\\ &&&0\end{bmatrix}P^{-1} (34)

where

P=\begin{bmatrix}1&1&0&\alpha^{2}C_{1,2}-2\beta^{2}\\ 1&-1&1&-\alpha\beta(C_{1,2}-C_{2,1})\\ -1&1&1&\alpha\beta(C_{1,2}-C_{2,1})\\ -1&-1&0&\alpha^{2}C_{2,1}-2\beta^{2}\end{bmatrix}

and

P^{-1}=\begin{bmatrix}\frac{1}{2}-\frac{b}{c}&\frac{1}{4}&-\frac{1}{4}&-\frac{b}{c}\\ \frac{1}{2}-\frac{a}{c}&-\frac{1}{4}&\frac{1}{4}&-\frac{a}{c}\\ 0&\frac{1}{2}&\frac{1}{2}&0\\ \frac{1}{c}&0&0&\frac{1}{c}\end{bmatrix}

with

a=\frac{\alpha\beta}{2}(C_{1,2}-C_{2,1})+\frac{\alpha^{2}}{2}C_{1,2}-\beta^{2},
b=-\frac{\alpha\beta}{2}(C_{1,2}-C_{2,1})+\frac{\alpha^{2}}{2}C_{1,2}-\beta^{2},

and $c=\alpha^{2}-4\beta^{2}$. Thus

\exp\left(\begin{bmatrix}B_{11}&G_{1}\\ F_{1}&B_{15}\end{bmatrix}t\right)=Pe^{-\alpha t}\begin{bmatrix}e^{-2\beta t}&&&\\ &e^{2\beta t}&&\\ &&1&\\ &&&e^{\alpha t}\end{bmatrix}P^{-1}. (35)
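The spectrum claimed in Equation (34) can be verified numerically. The sketch below uses placeholder values for $\alpha$, $C_{2,1}$, and $C_{1,2}$; it assumes $C_{1,2}+C_{2,1}=1$, which the trace of the matrix in Equation (33) requires for the stated eigenvalues (and which is consistent with $b_{1}=b_{5}=-\alpha$):

```python
import numpy as np

# Placeholder parameters (assumed, not from the paper): C12 + C21 = 1.
alpha = 0.4
beta = -1j * (1 - alpha)
C21, C12 = 0.3, 0.7

# The 4x4 matrix of Equation (33), with b1 = b5 = -alpha.
M = np.array([[-alpha*C21, -beta,   beta,   alpha*C12],
              [-beta,      -alpha,  0,      beta     ],
              [ beta,       0,     -alpha, -beta     ],
              [ alpha*C21,  beta,  -beta,  -alpha*C12]])

eigs = np.linalg.eigvals(M)
expected = [-alpha - 2*beta, -alpha + 2*beta, -alpha, 0]

# Each expected eigenvalue should appear in the computed spectrum.
for ev in expected:
    assert np.min(np.abs(eigs - ev)) < 1e-8
```

The zero eigenvalue is visible by inspection (the fourth row of the matrix is the negative of the first), and $[0,1,1,0]^{\top}$ is an eigenvector for $-\alpha$ regardless of the $C$ entries.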

The nonzero elements of the 1st row of $\exp(A_{1}t)$ (the 1st, 2nd, 9th, and 10th elements, denoted $d_{1}$, $d_{2}$, $d_{9}$, $d_{10}$) form the 1st row of $\exp\left(\begin{bmatrix}B_{11}&G_{1}\\ F_{1}&B_{15}\end{bmatrix}t\right)$. Similarly, the nonzero elements of the 3rd row of $\exp(A_{2}t)$ (the 3rd, 4th, 11th, and 12th elements, denoted $e_{3}$, $e_{4}$, $e_{11}$, $e_{12}$) form the 1st row of $\exp\left(\begin{bmatrix}B_{22}&G_{2}\\ F_{2}&B_{26}\end{bmatrix}t\right)$; the nonzero elements of the 5th row of $\exp(A_{3}t)$ (the 5th, 6th, 13th, and 14th elements, denoted $f_{5}$, $f_{6}$, $f_{13}$, $f_{14}$) form the 1st row of $\exp\left(\begin{bmatrix}B_{33}&G_{3}\\ F_{3}&B_{37}\end{bmatrix}t\right)$; and the nonzero elements of the 7th row of $\exp(A_{4}t)$ (the 7th, 8th, 15th, and 16th elements, denoted $g_{7}$, $g_{8}$, $g_{15}$, $g_{16}$) form the 1st row of $\exp\left(\begin{bmatrix}B_{44}&G_{4}\\ F_{4}&B_{48}\end{bmatrix}t\right)$. Thus

\begin{array}[]{lcl}\vv{\rho}_{t}(1)&=&\vv{\rho}_{0}(1)d_{1}+\vv{\rho}_{0}(2)d_{2}+\vv{\rho}_{0}(9)d_{9}+\vv{\rho}_{0}(10)d_{10}\\[8.61108pt] &=&\rho_{0}(1,1)d_{1}+\rho_{0}(2,1)d_{2}+\rho_{0}(1,2)d_{9}+\rho_{0}(2,2)d_{10}\\[8.61108pt] &=&\rho_{0}(1,1)d_{1}+\rho_{0}(2,2)d_{10}\\[8.61108pt] &=&\displaystyle p\left(\frac{1-p_{A}}{2}\right)d_{1}+p\left(\frac{1-p_{A}}{2}\right)d_{10}\\[8.61108pt] &=&\displaystyle p(1-p_{A})\left[\frac{1}{4}e^{-\alpha t}(e^{-2\beta t}+e^{2\beta t})+d_{10}\right]\end{array} (36)

where $d_{10}=\frac{\alpha^{2}C_{1,2}-2\beta^{2}}{c}-\frac{e^{-\alpha t}}{c}(ae^{2\beta t}+be^{-2\beta t})$. Similarly, we have

\vv{\rho}_{t}(19)=p\cdot p_{A}\left[\frac{1}{4}e^{-\alpha t}(e^{-2\beta t}+e^{2\beta t})+e_{12}\right] (37)
\vv{\rho}_{t}(37)=(1-p)(1-p_{A})\left[\frac{1}{4}e^{-\alpha t}(e^{-2\beta t}+e^{2\beta t})+f_{14}\right] (38)
\vv{\rho}_{t}(55)=(1-p)p_{A}\left[\frac{1}{4}e^{-\alpha t}(e^{-2\beta t}+e^{2\beta t})+g_{16}\right] (39)

where

e_{12}=\frac{\alpha^{2}C_{3,4}-2\beta^{2}}{c}-\frac{e^{-\alpha t}}{c}(ae^{2\beta t}+be^{-2\beta t}),
f_{14}=\frac{\alpha^{2}C_{5,6}-2\beta^{2}}{c}-\frac{e^{-\alpha t}}{c}(ae^{2\beta t}+be^{-2\beta t}),
g_{16}=\frac{\alpha^{2}C_{7,8}-2\beta^{2}}{c}-\frac{e^{-\alpha t}}{c}(ae^{2\beta t}+be^{-2\beta t}).

Summing $\vv{\rho}_{t}(1)$, $\vv{\rho}_{t}(19)$, $\vv{\rho}_{t}(37)$, and $\vv{\rho}_{t}(55)$ yields the claimed expression for $Pr(C)$.
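As a consistency check on the closed form: since $a+b=\alpha^{2}C_{1,2}-2\beta^{2}$, the term $d_{10}$ vanishes at $t=0$, so $\vv{\rho}_{t}(1)\big|_{t=0}$ reduces to the initial value $p(1-p_{A})/2$; and because $\beta$ is purely imaginary while $e^{-\alpha t}$ decays, $d_{10}\to(\alpha^{2}C_{1,2}-2\beta^{2})/c$ as $t\to\infty$. A sketch with placeholder parameter values:

```python
import numpy as np

# Placeholder parameters (assumed, not from the paper): C12 + C21 = 1.
alpha = 0.4
beta = -1j * (1 - alpha)
C21, C12 = 0.3, 0.7

# a, b, c as defined below Equation (34).
a = (alpha*beta/2)*(C12 - C21) + (alpha**2/2)*C12 - beta**2
b = -(alpha*beta/2)*(C12 - C21) + (alpha**2/2)*C12 - beta**2
c = alpha**2 - 4*beta**2

def d10(t):
    """d_10 from Equation (36)."""
    return (alpha**2*C12 - 2*beta**2)/c \
        - np.exp(-alpha*t)/c * (a*np.exp(2*beta*t) + b*np.exp(-2*beta*t))

# At t = 0 the two terms cancel exactly (a + b equals the first numerator).
assert abs(d10(0.0)) < 1e-12
# For large t the oscillating term is damped away by exp(-alpha t).
assert abs(d10(50.0) - (alpha**2*C12 - 2*beta**2)/c) < 1e-8
```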