
1 Graduate School of Informatics, Nagoya University, Furo-cho, Chikusa, Nagoya 464-8601, Japan
  email: {rindo,seki}@sqlab.jp
2 School of Informatics, Kochi University of Technology, Tosayamada, Kami City, Kochi 782-8502, Japan
  email: takata.yoshiaki@kochi-tech.ac.jp

Verification with Common Knowledge of Rationality for Graph Games

Rindo Nakanishi 1, Yoshiaki Takata 2, Hiroyuki Seki 1
Abstract

Realizability asks whether there exists a program satisfying its specification. In this problem, we assume that each agent has her own objective and behaves rationally to satisfy her objective. Traditionally, the rationality of agents is modeled by a Nash equilibrium (NE), in which no agent has an incentive to change her strategy because she cannot satisfy her objective by changing her strategy alone. However, an NE is not always an appropriate notion of rationality because its condition is too strong: each agent is assumed to know the strategies of the other agents completely. In this paper, we use an epistemic model to define common knowledge of rationality of all agents (CKR). We define the verification problem as a variant of the realizability problem, based on CKR instead of NE. We then analyze the complexity of the verification problems for the class of positional strategies.

Keywords:
graph game, epistemic model, common knowledge of rationality

1 Introduction

A graph game is a formal model for analyzing or controlling a system consisting of multiple agents (or processes) that behave independently according to their own preferences or objectives. One of the useful applications of graph games is reactive synthesis, the problem of synthesizing a reactive system that satisfies a given specification. The standard approach to the problem is as follows [1]. When a specification is given by a linear temporal logic (LTL) formula (or a nondeterministic $\omega$-automaton) $\varphi$, we translate $\varphi$ into an equivalent deterministic $\omega$-automaton $\mathcal{A}$. Next, we convert $\mathcal{A}$ to a tree automaton (or equivalently, a parity game) $\mathcal{B}$. Then, we test whether the language recognized by $\mathcal{B}$ is empty, i.e., whether $L(\mathcal{B})=\varnothing$ (or equivalently, whether there is a winning strategy for player 0, the system player, in $\mathcal{B}$). The answer to the problem is affirmative if and only if $L(\mathcal{B})\neq\varnothing$, and any $t\in L(\mathcal{B})$ (or any winning strategy for the system player in $\mathcal{B}$) is an implementation of the specification.

As described above, reactive synthesis can be viewed as a two-player zero-sum game, in which the system player aims at satisfying the specification as her goal (or winning objective), whereas the objective of the environment player is the negation of the specification. If the system player has a winning strategy, then any such strategy is an implementation satisfying the specification. However, the assumption that the objective of the environment is antagonistic to the system's objective is too conservative; usually, the environment behaves based on its own preferences or interests. Also, the environment often consists of multiple agents, and hence a multi-player non-zero-sum game is a more appropriate model than a two-player zero-sum game. Furthermore, it is natural to require that the system satisfy the specification under the assumption that all players behave rationally, i.e., they aim to satisfy their own objectives.

Rational synthesis (abbreviated as RS) asks whether a given specification is satisfied whenever all the players behave rationally. Rational verification (abbreviated as RV) is the problem asking whether every rational strategy profile satisfies a specification. RV is defined by adding the rationality assumption on the usual model checking, which asks whether every execution of a given model satisfies a specification [2]. Note that RV and RS are closely related. The answer of RV for a specification ψ\psi is no if and only if the answer of RS for ¬ψ\neg\psi is yes. Namely, RV fails for ψ\psi iff there is a counter-example to ψ\psi (a rational behavior that satisfies ¬ψ\neg\psi). As described in the related work section below, rationality is traditionally captured by Nash equilibrium, which is one of the most important concepts in game theory. We say that a tuple of strategies of all players (called a strategy profile) is a Nash equilibrium (abbreviated as NE) when no one can improve her own payoff, which is the reward she receives from the game, by changing her strategy alone. An NE locally maximizes each player’s payoff and hence each player has no incentive to change her strategy. From the viewpoint of epistemic game theory, however, NE is not always suitable for the concept of rationality because each player is assumed to know the strategies of the other players. (Also see the related work below.)

Epistemic game theory [3] uses a Kripke frame consisting of a set of worlds (or states) $W$ and a subset $R_{p}(w)\subseteq W$ for each player $p$ and world $w$. For a world $w$ and a player $p$, $R_{p}(w)$ represents the set of possible worlds from the viewpoint of $p$ when the actual world is $w$. For instance, if $R_{p}(w)=\{w,w^{\prime},w^{\prime\prime}\}$, the information given to $p$ in $w$ is incomplete and $p$ cannot distinguish $w$ from $w^{\prime}$ or $w^{\prime\prime}$. Possible worlds are useful for modelling a situation in which each player (or process) cannot know the internal states of the other players (e.g., the contents of local variables of the other processes). An epistemic model is a pair of a Kripke frame and a mapping that associates a strategy profile with each world. We say that a player $p$ is rational in a world $w$ if, for every possible world $w^{\prime}$ of $p$ in $w$, there is no better strategy of $p$ than the one associated with $w$. Then, a strategy profile is said to be epistemically rational if there exist an epistemic model $M$ and a world $w$ in $M$ such that every player is rational in $w$ and this property is common knowledge among all players. (The formal definition of common knowledge is postponed to the next section.)

In this paper, we propose a new framework for reactive synthesis and verification, by augmenting graph game with epistemic models. We then define the rational verification problem based on the proposed model and present some results on the complexity of the problem.

Related work

The study of reactive synthesis has its origin in the 1960s and has been one of the central topics in formal methods, as well as model checking. The problem is EXPTIME-complete when a specification is given by an $\omega$-automaton [4] and 2EXPTIME-complete when a specification is given by an LTL formula [5]. Rational synthesis (RS) is in 2EXPTIME [6] when a specification and objectives (in what follows, we simply say objectives) are given as LTL formulas. It is PSPACE-complete when the objectives of the players are restricted to GR(1) [7]. The complexity of RS has also been studied for $\omega$-regular objectives: RS is PTIME-complete with Büchi objectives, NP-complete with coBüchi, parity, and Streett objectives [8], and PSPACE-complete with Muller objectives [9]. RS has been applied to the synthesis of non-repudiation and fair exchange protocols [10, 11]. RS is optimistic in the sense that the system first proposes a strategy profile and all environment players follow it as long as they have no profitable deviations. For this reason, another type of RS, called non-cooperative rational synthesis (NCRS), was proposed in [12]. NCRS asks whether there is a strategy $s_{0}$ of the system such that every 0-fixed NE (a strategy profile in which no environment player has a profitable deviation) including $s_{0}$ satisfies the specification. The decidability and complexity of NCRS have also been studied [9, 12, 13].

Bruyère et al. [14] investigated the complexity of rational verification (RV) taking Pareto-optimality as the notion of rationality, and showed that RV is coNP-complete, $\Pi^{\mathsf{P}}_{2}$-complete, and PSPACE-complete with parity, Boolean Büchi, and LTL objectives, respectively. Brice et al. [15] considered weighted (or duration) games, adopted NE and subgame-perfect equilibrium as the notions of rationality, and showed that RV is coNP-complete with mean-payoff objectives and undecidable with energy objectives.

Epistemic rationality does not always imply NE. In [16], Aumann and Brandenburger gave epistemic sufficient conditions for NE in terms of strategic form games (not graph games). (Also see [17].)

As described above, there are already many studies on the decidability and complexity of RS and RV. However, all of them take NE or its refinement as criteria of rationality. This paper is the first step for defining and analyzing RV where the rationality is defined in an epistemic way. Also, we think that combining an epistemic model with a usual graph game enables us to express incomplete information of a player in a natural way.

Outline

In Sections 2 and 3, we review graph games and epistemic models. In Section 4, we define the rational verification problems $\mathit{VPCKR}_{S}$, $\mathit{VPCKR}_{\mathsf{P},S}$ and $\mathit{VPNash}_{S}$. The problem $\mathit{VPCKR}_{S}$ asks whether all strategy profiles over the class $S$ of strategies satisfy a given specification when CKR holds. The problem $\mathit{VPCKR}_{\mathsf{P},S}$ is a variant of $\mathit{VPCKR}_{S}$ in which the size of an epistemic model is bounded by a polynomial in the size of the game arena. The problem $\mathit{VPNash}_{S}$ asks whether all NE over the class $S$ of strategies satisfy a given specification. Table 1 shows the complexities of these problems.

Table 1: The complexities of verification problems

              | $\mathit{VPCKR}_{\mathsf{Pos}}$ | $\mathit{VPCKR}_{\mathsf{P},\mathsf{Pos}}$ | $\mathit{VPNash}_{\mathsf{Pos}}$ | $\mathit{VPNash}_{\mathsf{Str}}$
Upper bound   | $\text{co}\mathrm{NEXP}^{\mathrm{NP}}$ | $\Pi^{\mathsf{P}}_{2}$ | $\Pi^{\mathsf{P}}_{2}$-complete | $\mathrm{PSPACE}$-complete [9]
Lower bound   | $\Sigma^{\mathsf{P}}_{2}$-hard | coNP-hard | $\Pi^{\mathsf{P}}_{2}$-complete | $\mathrm{PSPACE}$-complete [9]

$\mathsf{Str}$ is the class of all strategies and $\mathsf{Pos}$ is the class of all positional strategies. In Section 5, we summarize the paper and discuss future work.

2 Graph Game

In this section, we provide basic definitions and notions on graph games, which are needed to present our new framework. We start with the definition of game arena, winning objective and strategy and so on, followed by the definition of Nash equilibrium (NE).

A graph game is a directed graph with an initial vertex. Each vertex is controlled by a player, who chooses the next vertex. A game starts at the initial vertex, and the players repeatedly choose the next vertex according to their strategies. The infinite sequence of vertices generated by this process is called a play. If the play satisfies the winning objective of a player, then she wins; otherwise, she loses. Note that our setting is non-zero-sum, hence it is possible that there are multiple winners. An NE is a tuple of strategies of all players in which no loser can become a winner by changing her strategy alone.

For a binary relation $R\subseteq X\times X$ over a set $X$ and a subset $A\subseteq X$, we define $R(A)=\{x\in X\mid(a,x)\in R\wedge a\in A\}\subseteq X$. For a single element $a\in X$, we simply write $R(a)$ for $R(\{a\})$.

Game arena

Definition 1

A game arena is a tuple $G=(P,V,(V_{p})_{p\in P},v_{0},\Delta)$, where

  • $P$ is a finite set of players,

  • $V$ is a finite set of vertices,

  • $(V_{p})_{p\in P}$ is a partition of $V$, namely, $V_{i}\cap V_{j}=\varnothing$ for all $i\neq j$ $(i,j\in P)$ and $\bigcup_{p\in P}V_{p}=V$,

  • $v_{0}\in V$ is the initial vertex, and

  • $\Delta\subseteq V\times V$ is a set of edges such that $\Delta(v)\neq\varnothing$ for all $v\in V$.

Play and history

An infinite sequence of vertices $v_{0}v_{1}v_{2}\cdots$ $(v_{i}\in V,\,i\geq 0)$ starting from the initial vertex $v_{0}$ is a play if $(v_{i},v_{i+1})\in\Delta$ for all $i\geq 0$. A history is a non-empty (finite) prefix of a play. The set of all plays is denoted by $\mathit{Play}_{G}$ and the set of all histories is denoted by $\mathit{Hist}_{G}$. We often write a history as $hv$ where $h\in\mathit{Hist}\cup\{\varepsilon\}$ and $v\in V$. For a player $p\in P$, let $\mathit{Hist}_{G,p}=\{hv\in\mathit{Hist}\mid v\in V_{p}\}$. That is, $\mathit{Hist}_{G,p}$ is the set of histories ending with a vertex controlled by player $p$. We abbreviate $\mathit{Play}_{G}$, $\mathit{Hist}_{G,p}$ and $\mathit{Hist}_{G}$ as $\mathit{Play}$, $\mathit{Hist}_{p}$ and $\mathit{Hist}$ respectively, if $G$ is clear from the context. For a play $\rho=v_{0}v_{1}v_{2}\cdots\in\mathit{Play}$, we define $\mathit{Inf}(\rho)=\{v\in V\mid\forall i\geq 0.\ \exists j\geq i.\ v_{j}=v\}$, the set of vertices visited infinitely often by $\rho$.

Strategy

For a player $p\in P$, a strategy of $p$ is a function $s_{p}:\mathit{Hist}_{p}\to V$ such that $(v,s_{p}(hv))\in\Delta$ for all $hv\in\mathit{Hist}_{p}$. At a vertex $v\in V_{p}$, player $p$ chooses $s_{p}(hv)$ as the next vertex according to her strategy $s_{p}$. Note that because the domain of $s_{p}$ is $\mathit{Hist}_{p}$, the next vertex may in general depend on the whole history. Let $\mathsf{Str}_{G,p}$ denote the set of all strategies of $p$.

When a strategy $s_{p}\in\mathsf{Str}_{G,p}$ of $p$ satisfies $s_{p}(hv)=s_{p}(h^{\prime}v)$ for all $hv,h^{\prime}v\in\mathit{Hist}_{p}$, we say that $s_{p}$ is positional because the next vertex depends only on the current vertex $v$. We regard a function $s_{p}:V_{p}\to V$ with $s_{p}(v)\in\Delta(v)$ as a positional strategy of $p$, where $s_{p}(hv)=s_{p}(v)$ for all $hv\in\mathit{Hist}_{p}$. Let $\mathsf{Pos}_{G,p}\subseteq\mathsf{Str}_{G,p}$ denote the set of all positional strategies of $p$.

We abbreviate $\mathsf{Str}_{G,p}$ and $\mathsf{Pos}_{G,p}$ as $\mathsf{Str}_{p}$ and $\mathsf{Pos}_{p}$ respectively, if $G$ is clear from the context.

Strategy profile

A strategy profile is a tuple $\bm{s}=(s_{p})_{p\in P}$ of strategies of all players, namely $s_{p}\in\mathsf{Str}_{p}$ for all $p\in P$. Let $\mathsf{Str}_{G}$ (resp. $\mathsf{Pos}_{G}$) be the set of all strategy profiles (resp. the set of all strategy profiles ranging over positional strategies). We define the function $\mathrm{out}_{G}:\mathsf{Str}_{G}\to\mathit{Play}$ as $\mathrm{out}_{G}((s_{p})_{p\in P})=v_{0}v_{1}v_{2}\cdots$ where $v_{i+1}=s_{p}(v_{0}\cdots v_{i})$ for all $i\geq 0$ and for the $p\in P$ with $v_{i}\in V_{p}$. We call the play $\mathrm{out}_{G}(\bm{s})$ the outcome of $\bm{s}$. We abbreviate $\mathsf{Str}_{G}$, $\mathsf{Pos}_{G}$ and $\mathrm{out}_{G}$ as $\mathsf{Str}$, $\mathsf{Pos}$ and $\mathrm{out}$ respectively, if $G$ is clear from the context. For a strategy profile $\bm{s}\in\mathsf{Str}_{G}$ and a strategy $s^{\prime}_{p}\in\mathsf{Str}_{p}$ of a player $p\in P$, let $\bm{s}[p\mapsto s^{\prime}_{p}]\in\mathsf{Str}_{G}$ denote the strategy profile obtained from $\bm{s}$ by replacing the strategy of $p$ in $\bm{s}$ with $s^{\prime}_{p}$.
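To make these definitions concrete, the following minimal Python sketch (our own illustration, not part of the paper; the names `Arena`, `outcome` and `inf_vertices` and the dictionary encoding are assumptions) represents a game arena and a positional strategy profile, and computes the outcome $\mathrm{out}(\bm{s})$ of a positional profile. Since a positional profile always produces an eventually periodic play, the outcome is returned as a pair (prefix, cycle) with $\mathrm{out}(\bm{s})=u_{0}u^{\omega}$.

```python
from dataclasses import dataclass

@dataclass
class Arena:
    """A game arena G = (P, V, (V_p)_{p in P}, v0, Delta); hypothetical encoding."""
    players: set      # P
    vertices: set     # V
    owner: dict       # owner[v] = the player p with v in V_p
    v0: object        # initial vertex
    edges: dict       # edges[v] = set of successors of v (never empty)

def outcome(arena, profile):
    """out(s) for a positional profile, with profile[p][v] = chosen successor.
    Returns (prefix, cycle) such that out(s) = prefix . cycle^omega."""
    seen = {}                  # vertex -> position of its first occurrence
    path = []
    v = arena.v0
    while v not in seen:
        seen[v] = len(path)
        path.append(v)
        v = profile[arena.owner[v]][v]   # the owner of v picks the next vertex
    i = seen[v]                # the play repeats from position i onward
    return path[:i], path[i:]

def inf_vertices(prefix, cycle):
    """Inf(rho) for an eventually periodic play rho = prefix . cycle^omega."""
    return set(cycle)
```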

Objective

We assume that the result a player obtains from a play is either winning or losing. Each player has her own winning condition over plays, and we represent a winning condition by a subset $O\subseteq\mathit{Play}$ of plays; i.e., the player wins if and only if the play belongs to the subset $O$. We call the subset $O$ the objective of that player. In this paper, we focus on the following important classes of objectives.

Definition 2

Let $U\subseteq V$ be a subset of vertices and $\varphi$ be a Boolean formula whose variables are the vertices of $V$. We will use $U$ and $\varphi$ as finite representations for specifying an objective as follows.

  • Büchi objective:
    $\mathrm{B\ddot{u}chi}(U)=\{\rho\in\mathit{Play}\mid\mathit{Inf}(\rho)\cap U\neq\varnothing\}$.

  • Muller objective:
    $\mathrm{Muller}(\varphi)=\{\rho\in\mathit{Play}\mid\varphi\text{ is true under }\theta_{\rho}\}$, where $\theta_{\rho}$ is the truth assignment defined by $\theta_{\rho}(v)=\mathit{true}$ iff $v\in\mathit{Inf}(\rho)$.

Note that a Büchi objective is also a Muller objective: for any $U\subseteq V$, it holds that $\mathrm{B\ddot{u}chi}(U)=\mathrm{Muller}(\bigvee_{u\in U}u)$.

Objective profile

An objective profile is a tuple $\bm{\alpha}=(O_{p})_{p\in P}$ of objectives of all players, namely $O_{p}\subseteq\mathit{Play}$ for all $p\in P$. For a strategy profile $\bm{s}\in\mathsf{Str}$ and an objective profile $\bm{\alpha}=(O_{p})_{p\in P}$, we define the set $\mathrm{Win}_{G}(\bm{\alpha},\bm{s})\subseteq P$ of winners as $\mathrm{Win}_{G}(\bm{\alpha},\bm{s})=\{p\in P\mid\mathrm{out}_{G}(\bm{s})\in O_{p}\}$. That is, a player $p$ is a winner if and only if $\mathrm{out}_{G}(\bm{s})$ belongs to the objective $O_{p}$ of $p$. If $p\in\mathrm{Win}_{G}(\bm{\alpha},\bm{s})$, we also say that $p$ wins for $G$ and $\bm{\alpha}$ (by the strategy profile $\bm{s}$). Note that it is possible that no player wins the game or that all players win the game; in this sense, a game is non-zero-sum. If an objective profile $\bm{\alpha}=(O_{p})_{p\in P}$ is a partition of $\mathit{Play}$, i.e., $O_{i}\cap O_{j}=\varnothing$ for all $i\neq j$ $(i,j\in P)$ and $\bigcup_{p\in P}O_{p}=\mathit{Play}$, then the game is called zero-sum. When a game is zero-sum, there is exactly one winner and the other players are all losers. We abbreviate $\mathrm{Win}_{G}$ as $\mathrm{Win}$ if $G$ is clear from the context.
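Continuing the sketch above (again our own illustration), a Muller objective given by a Boolean formula over the vertices can be evaluated on an eventually periodic play via the truth assignment $\theta_{\rho}$ built from $\mathit{Inf}(\rho)$, and the winner set $\mathrm{Win}_{G}(\bm{\alpha},\bm{s})$ then follows directly. For simplicity, a formula is assumed to be encoded as a Python callable that takes the set $\mathit{Inf}(\rho)$ and returns a Boolean.

```python
def muller_holds(formula, inf_set):
    """A Muller objective, represented as a predicate over Inf(rho)."""
    return formula(inf_set)

def winners(arena, objectives, profile):
    """Win_G(alpha, s) = { p | out(s) in O_p } for a positional profile."""
    prefix, cycle = outcome(arena, profile)
    inf_set = inf_vertices(prefix, cycle)
    return {p for p, obj in objectives.items() if muller_holds(obj, inf_set)}

# Buchi(U) as a Muller objective: the play must visit U infinitely often,
# i.e. Inf(rho) intersects U (cf. Buchi(U) = Muller( OR of the vertices in U )).
def buchi(targets):
    return lambda inf_set: bool(inf_set & set(targets))
```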

Winning strategy

Let $S\in\{\mathsf{Pos},\mathsf{Str}\}$ be a class of strategy profiles. When the objective of player $p$ is $O_{p}$, a strategy $s\in S_{p}$ is called a winning strategy of $p$ if it holds that $\mathrm{out}(\bm{s}[p\mapsto s])\in O_{p}$ for every strategy profile $\bm{s}\in S$. That is, $s$ is a winning strategy of $p$ if $p$ always wins by taking $s$, regardless of the strategies of the other players.

Nash equilibrium

Let $\bm{\alpha}=(O_{p})_{p\in P}$ be an objective profile and $S\in\{\mathsf{Pos},\mathsf{Str}\}$ be a class of strategy profiles. A strategy profile $\bm{s}\in S$ is called a Nash equilibrium (NE) for $\bm{\alpha}$ and $S$ if it holds that $\forall p\in P.\ \forall s_{p}\in S_{p}.\ p\in\mathrm{Win}(\bm{\alpha},\bm{s}[p\mapsto s_{p}])\implies p\in\mathrm{Win}(\bm{\alpha},\bm{s})$. Intuitively, $\bm{s}$ is an NE if no player $p$ can improve her result (from losing to winning) by changing her strategy alone. Because $p\in\mathrm{Win}(\bm{\alpha},\bm{s})$ is equivalent to $\mathrm{out}(\bm{s})\in O_{p}$, a strategy profile $\bm{s}\in S$ is an NE for $\bm{\alpha}$ and $S$ if and only if $\forall p\in P.\ \forall s_{p}\in S_{p}.\ \mathrm{out}(\bm{s}[p\mapsto s_{p}])\in O_{p}\implies\mathrm{out}(\bm{s})\in O_{p}$. We write this condition as $\mathit{Nash}(\bm{s},\bm{\alpha},S)$.
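For positional strategies, the NE condition can be checked by enumerating, for each player, all of her positional deviations. The following brute-force sketch (our own illustration, exponential in the number of vertices a player controls, and not an optimized procedure) does exactly that, reusing the helpers defined above.

```python
from itertools import product

def positional_strategies(arena, p):
    """All positional strategies of player p, as dicts V_p -> chosen successor."""
    own = [v for v in arena.vertices if arena.owner[v] == p]
    for choice in product(*(list(arena.edges[v]) for v in own)):
        yield dict(zip(own, choice))

def is_nash(arena, objectives, profile):
    """Nash(s, alpha, Pos): no losing player has a winning positional deviation."""
    base = winners(arena, objectives, profile)
    for p in arena.players:
        if p in base:
            continue                      # p already wins; no profitable deviation
        for dev in positional_strategies(arena, p):
            deviated = dict(profile)
            deviated[p] = dev
            if p in winners(arena, objectives, deviated):
                return False              # p could improve by deviating alone
    return True
```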

Example 1

Figure 1 shows a 3-player game arena $G=(P,V,(V_{p})_{p\in P},v_{0},\Delta)$ where $P=\{0,1,2\}$, $V=\{v_{0},v_{1},v_{2}\}$, $V_{p}=\{v_{p}\}$ $(p\in P)$ and $\Delta=\{(v_{i},v_{j})\mid i,j\in P,\ i\neq j\}$.

Figure 1: 3-player game arena with Büchi objectives

The objective of player $p$ is $O_{p}=\mathrm{B\ddot{u}chi}(\{v_{(p+1)\bmod 3}\})$, namely to visit the vertex $v_{(p+1)\bmod 3}$ infinitely often. The objective profile is $\bm{\alpha}=(O_{p})_{p\in P}$. Let $\bm{s}=(s_{p})_{p\in P}\in\mathsf{Pos}$ be the strategy profile over positional strategies where $s_{p}(h)=v_{(p+1)\bmod 3}$ for all $h\in\mathit{Hist}_{p}$. Let $\bm{s}^{\prime}=(s^{\prime}_{p})_{p\in P}\in\mathsf{Pos}$ be the strategy profile over positional strategies where $s^{\prime}_{0}(h_{0})=v_{1}$ and $s^{\prime}_{1}(h_{1})=s^{\prime}_{2}(h_{2})=v_{0}$ for all $h_{p}\in\mathit{Hist}_{p}$ $(p\in P)$. It holds that $\mathrm{out}(\bm{s})=(v_{0}v_{1}v_{2})^{\omega}\in O_{p}$ for all $p$; hence, $\mathrm{Win}(\bm{\alpha},\bm{s})=\{0,1,2\}$. On the other hand, it holds that $\mathrm{out}(\bm{s}^{\prime})=(v_{0}v_{1})^{\omega}\in O_{0}\cap O_{2}$ and $\mathrm{out}(\bm{s}^{\prime})\notin O_{1}$; hence, $\mathrm{Win}(\bm{\alpha},\bm{s}^{\prime})=\{0,2\}$. The strategy profile $\bm{s}$ is an NE for $\bm{\alpha}$. The strategy profile $\bm{s}^{\prime}$ is not an NE for $\bm{\alpha}$ because there is a positional strategy $s_{1}\in\mathsf{Pos}_{1}$ of player 1 such that $1\in\mathrm{Win}(\bm{\alpha},\bm{s}^{\prime}[1\mapsto s_{1}])$.
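Under the same assumptions as the sketches above, Example 1 can be replayed mechanically; the computed winner sets and NE checks agree with the ones stated in the example.

```python
# Example 1: three players, each owning one vertex, edges between distinct vertices.
V = {"v0", "v1", "v2"}
arena = Arena(players={0, 1, 2},
              vertices=V,
              owner={"v0": 0, "v1": 1, "v2": 2},
              v0="v0",
              edges={v: V - {v} for v in V})
alpha = {p: buchi({f"v{(p + 1) % 3}"}) for p in range(3)}     # O_p = Buchi({v_{(p+1) mod 3}})

s  = {p: {f"v{p}": f"v{(p + 1) % 3}"} for p in range(3)}      # the profile s of Example 1
s2 = {0: {"v0": "v1"}, 1: {"v1": "v0"}, 2: {"v2": "v0"}}      # the profile s' of Example 1

print(winners(arena, alpha, s))    # {0, 1, 2}
print(winners(arena, alpha, s2))   # {0, 2}
print(is_nash(arena, alpha, s))    # True
print(is_nash(arena, alpha, s2))   # False: player 1 can deviate (v1 -> v2) and win
```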

The following problem asks whether a given strategy profile $\bm{s}$ satisfies a specification $O$. We use Lemma 1 to prove the upper bounds of the complexities of the verification problems in Section 4.

Problem 1

Let $G$ be a game arena, $O\subseteq\mathit{Play}$ be a Muller objective and $S\in\{\mathsf{Pos},\mathsf{Str}\}$ be a class of strategy profiles. We define the simple verification problem as follows:

$\mathit{sVP}_{S}=\{\langle G,\bm{s},O\rangle\mid\bm{s}\in S\wedge\mathrm{out}(\bm{s})\in O\}.$
Lemma 1

𝑠𝑉𝑃𝖯𝗈𝗌\mathit{sVP}_{\mathsf{Pos}} is in PTIME\mathrm{PTIME}.

Proof

Let $\varphi$ be a given Boolean formula representing a Muller objective $O$ (i.e., $O=\mathrm{Muller}(\varphi)$). Because $\bm{s}\in\mathsf{Pos}$ is a strategy profile over positional strategies, the play $\mathrm{out}(\bm{s})$ can be written as $\mathrm{out}(\bm{s})=u_{0}u^{\omega}$ for some $u_{0}\in V^{*}$ and $u\in V^{+}$ such that $u_{0}u$ does not contain any vertex twice. The vertices in $u$ are visited infinitely often and the vertices not in $u$ are visited only finitely many times, and thus we can construct the truth assignment $\theta_{\mathrm{out}(\bm{s})}$ such that $\theta_{\mathrm{out}(\bm{s})}(v)=\mathit{true}$ iff $v\in\mathit{Inf}(\mathrm{out}(\bm{s}))$ in polynomial time. By simply evaluating $\varphi$ under $\theta_{\mathrm{out}(\bm{s})}$, we can check whether the play satisfies the Muller objective $O$. ∎

3 Epistemic Model

In this section, we first review Kripke frame and epistemic model together with the important notion: (epistemic) rationality and common knowledge of rationality and give simple examples. We then propose a new characterization of the notion of common knowledge of rationality based on graph games.

KT5 Kripke frame

Definition 3

A KT5 Kripke frame is a pair $(W,(R_{p})_{p\in P})$, where

  • $P$ is a finite set of players,

  • $W$ is a finite set of (possible) worlds, and

  • $R_{p}\subseteq W\times W$ is an equivalence relation on $W$, namely, $R_{p}$ satisfies

    (reflexivity) $\forall w\in W.\ (w,w)\in R_{p}$,

    (symmetry) $\forall w,w^{\prime}\in W.\ \left((w,w^{\prime})\in R_{p}\implies(w^{\prime},w)\in R_{p}\right)$, and

    (transitivity) $\forall w_{1},w_{2},w_{3}\in W.\ \left((w_{1},w_{2}),(w_{2},w_{3})\in R_{p}\implies(w_{1},w_{3})\in R_{p}\right)$.

A Kripke frame expresses the structure of the players' knowledge. In a world $w$, player $p$ only knows that she is in one of the worlds of $R_{p}(w)$. In other words, in the world $w$, player $p$ cannot distinguish the worlds of $R_{p}(w)$ from one another.

Knowledge operator

For a given KT5 Kripke frame $(W,(R_{p})_{p\in P})$, we call any subset $E\subseteq W$ an event.

Definition 4

Let $(W,(R_{p})_{p\in P})$ be a KT5 Kripke frame and $p\in P$ be a player. The knowledge operator $\mathcal{K}_{p}:2^{W}\to 2^{W}$, the mutual knowledge operator $\mathcal{MK}:2^{W}\to 2^{W}$ and the common knowledge operator $\mathcal{CK}:2^{W}\to 2^{W}$ are defined as follows:

$\mathcal{K}_{p}(E)=\{w\in W\mid R_{p}(w)\subseteq E\}$,
$\mathcal{MK}(E)=\bigcap_{p\in P}\mathcal{K}_{p}(E)$, and
$\mathcal{CK}(E)=\bigcap_{i\geq 1}\mathcal{MK}^{i}(E)$,

where $\mathcal{MK}^{i}$ $(i\geq 0)$ is defined as

$\mathcal{MK}^{0}(E)=E$, and
$\mathcal{MK}^{i+1}(E)=\mathcal{MK}(\mathcal{MK}^{i}(E))$.

Equivalently, we can define $\mathcal{CK}$ as $\mathcal{CK}(E)=\{w\in W\mid R^{+}(w)\subseteq E\}$, where $R^{+}$ is the transitive closure of $\bigcup_{p\in P}R_{p}$. Note that there is no constant upper bound on the depth of the recursive definition of $\mathcal{CK}$.

Recall that in a world $w$, player $p$ knows only that she is in one of the worlds of $R_{p}(w)$. If $w\in\mathcal{K}_{p}(E)$, then $R_{p}(w)\subseteq E$ holds by the definition of $\mathcal{K}_{p}$. Hence, in a world $w\in\mathcal{K}_{p}(E)$, player $p$ knows that she is in one of the worlds of $E$. When $w\in\mathcal{K}_{p}(E)$, we say that player $p$ knows that the event $E$ occurs in $w$, or simply that $p$ knows $E$ in $w$. The set $\mathcal{MK}(E)$ is the event that all players know the event $E$. If $w\in\mathcal{MK}(E)$, we say that the event $E$ is mutual knowledge in $w$.

If all players know an event $E$, all players know that all players know $E$, all players know that all players know that all players know $E$, and so on, we say that $E$ is common knowledge. The set $\mathcal{CK}(E)$ is the event that $E$ is common knowledge. If $w\in\mathcal{CK}(E)$, we say that the event $E$ is common knowledge in $w$.
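On a finite KT5 frame, the operators $\mathcal{K}_{p}$, $\mathcal{MK}$ and $\mathcal{CK}$ can be computed directly. The sketch below (our own illustration; a frame is assumed to be given as the family $(R_{p})_{p\in P}$ encoded as sets of pairs) uses the transitive-closure characterization of $\mathcal{CK}$ stated above.

```python
def related(R, w):
    """R(w) = { w' | (w, w') in R } for a relation given as a set of pairs."""
    return {w2 for (w1, w2) in R if w1 == w}

def K(R_p, E, worlds):
    """K_p(E) = { w | R_p(w) is a subset of E }."""
    return {w for w in worlds if related(R_p, w) <= E}

def MK(relations, E, worlds):
    """MK(E) = intersection over all players p of K_p(E)."""
    result = set(worlds)
    for R_p in relations.values():
        result &= K(R_p, E, worlds)
    return result

def CK(relations, E, worlds):
    """CK(E) = { w | R+(w) subset of E }, where R+ is the transitive closure
    of the union of the relations R_p."""
    closure = set().union(*relations.values())
    changed = True
    while changed:                       # naive transitive-closure fixpoint
        new = {(a, c) for (a, b) in closure for (b2, c) in closure if b == b2}
        changed = not new <= closure
        closure |= new
    return {w for w in worlds if related(closure, w) <= E}
```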

Epistemic model

Definition 5

Let $G=(P,V,(V_{p})_{p\in P},v_{0},\Delta)$ be a game arena and $S\in\{\mathsf{Pos},\mathsf{Str}\}$ be a class of strategy profiles. An epistemic model for $G$ and $S$ is a tuple $(W,(R_{p})_{p\in P},\bm{\sigma})$ where $(W,(R_{p})_{p\in P})$ is a KT5 Kripke frame and $\bm{\sigma}:W\to S$ is a function assigning a strategy profile to each world, such that, writing $\sigma_{p}(w)$ for the strategy of $p$ in $\bm{\sigma}(w)$,

$\forall p\in P.\ \forall w,w^{\prime}\in W.\ \left((w,w^{\prime})\in R_{p}\implies\sigma_{p}(w)=\sigma_{p}(w^{\prime})\right).$ (1)

Condition (1) guarantees that each player takes the same strategy in the worlds she cannot distinguish. For a game arena $G$ and a class of strategies $S\in\{\mathsf{Pos},\mathsf{Str}\}$, let $M(G,S)$ be the set of all epistemic models for $G$ and $S$.

Example 2

Let $G$ be the game arena defined in Example 1. Let $s^{R}_{p},s^{L}_{p}\in\mathsf{Pos}_{p}$ be the positional strategies of player $p$ defined as $s^{R}_{p}(h)=v_{(p-1)\bmod 3}$ and $s^{L}_{p}(h)=v_{(p+1)\bmod 3}$ for all $h\in\mathit{Hist}_{p}$. Let $M=(W,(R_{p})_{p\in P},\bm{\sigma})\in M(G,\mathsf{Pos})$ be an epistemic model for $G$ and $\mathsf{Pos}$, where $W=\{RRR,RRL,RLR,RLL,LRR,LRL,LLR,LLL\}$, $R_{p}=\{(X_{0}X_{1}X_{2},Y_{0}Y_{1}Y_{2})\mid X_{i},Y_{i}\in\{R,L\}\text{ for }0\leq i\leq 2\text{ and }X_{p}=Y_{p}\}$, and $\bm{\sigma}(XYZ)=(s^{X}_{0},s^{Y}_{1},s^{Z}_{2})$ for all $X,Y,Z\in\{R,L\}$.

Figure 2: Equivalence classes in $W$

Figure 2 shows the equivalence classes in $W$. The solid, dashed and densely dotted lines divide $W$ into the equivalence classes of $R_{0}$, $R_{1}$ and $R_{2}$, respectively. Let $E=\{RRR,RRL,RLR,RLL\}$ be the event that player 0 takes the strategy $s^{R}_{0}$. Then, player 0 knows $E$ in each world of $E$ because $\mathcal{K}_{0}(E)=E$. On the other hand, players 1 and 2 never know $E$ in any world of $W$ because $\mathcal{K}_{1}(E)=\mathcal{K}_{2}(E)=\varnothing$. The event $W$ is both mutual knowledge and common knowledge in all worlds because it holds that $\mathcal{MK}(W)=\mathcal{CK}(W)=W$.
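Using the helper functions sketched after the definition of the knowledge operators (purely illustrative, not part of the paper), the claims of Example 2 can be checked mechanically.

```python
from itertools import product

worlds = {"".join(t) for t in product("RL", repeat=3)}           # RRR, RRL, ..., LLL
relations = {p: {(w1, w2) for w1 in worlds for w2 in worlds if w1[p] == w2[p]}
             for p in range(3)}                                   # w ~_p w' iff p's letter agrees

E = {w for w in worlds if w[0] == "R"}                            # "player 0 plays s0^R"
print(K(relations[0], E, worlds) == E)                            # True:  K_0(E) = E
print(K(relations[1], E, worlds) == set())                        # True:  K_1(E) is empty
print(CK(relations, worlds, worlds) == worlds)                    # True:  CK(W) = W
```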

Rationality

Definition 6

Let $G=(P,V,(V_{p})_{p\in P},v_{0},\Delta)$ be a game arena, $\bm{\alpha}$ be an objective profile, $S\in\{\mathsf{Pos},\mathsf{Str}\}$ be a class of strategy profiles and $M=(W,(R_{p})_{p\in P},\bm{\sigma})$ be an epistemic model for $G$ and $S$. For a world $w\in W$ and a player $p\in P$, if there is no strategy $s_{p}\in S_{p}$ of $p$ such that

$\forall w^{\prime}\in R_{p}(w).\ \left(p\in\mathrm{Win}(\bm{\alpha},\bm{\sigma}(w^{\prime}))\implies p\in\mathrm{Win}(\bm{\alpha},\bm{\sigma}(w^{\prime})[p\mapsto s_{p}])\right)$, and (2)
$\exists w^{\prime}\in R_{p}(w).\ \left(p\notin\mathrm{Win}(\bm{\alpha},\bm{\sigma}(w^{\prime}))\wedge p\in\mathrm{Win}(\bm{\alpha},\bm{\sigma}(w^{\prime})[p\mapsto s_{p}])\right)$, (3)

then $p$ is rational in $w$. (The rationality in Definition 6 is called the strong notion of rationality [3].)

We write the set of all worlds where $p$ is rational as $\mathit{RAT}^{p}_{G,\bm{\alpha},M,S}\subseteq W$. The set of all worlds where every player is rational is written as $\mathit{RAT}_{G,\bm{\alpha},M,S}=\bigcap_{p\in P}\mathit{RAT}^{p}_{G,\bm{\alpha},M,S}$.
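Combining the game-side and knowledge-side sketches (our own illustration), the sets $\mathit{RAT}^{p}$ and $\mathit{RAT}$ of Definition 6 can be computed by brute force for a finite epistemic model over positional strategies: a player $p$ fails to be rational in $w$ exactly if some positional strategy $s_{p}$ preserves all her wins over $R_{p}(w)$ (condition (2)) and turns at least one of her losses into a win (condition (3)). Here `sigma` is assumed to be a dict from worlds to positional profiles.

```python
def is_rational(arena, objectives, relations, sigma, p, w):
    """Definition 6 for player p in world w, with sigma[w] a positional profile."""
    for s_p in positional_strategies(arena, p):
        keeps_all_wins = True
        gains_some_win = False
        for w2 in related(relations[p], w):
            before = p in winners(arena, objectives, sigma[w2])
            deviated = dict(sigma[w2]); deviated[p] = s_p
            after = p in winners(arena, objectives, deviated)
            if before and not after:
                keeps_all_wins = False        # violates condition (2)
                break
            if (not before) and after:
                gains_some_win = True         # witnesses condition (3)
        if keeps_all_wins and gains_some_win:
            return False                      # such an s_p makes p irrational in w
    return True

def RAT(arena, objectives, relations, sigma, worlds):
    """RAT = { w | every player is rational in w }."""
    return {w for w in worlds
            if all(is_rational(arena, objectives, relations, sigma, p, w)
                   for p in arena.players)}
```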

Characterization of the notion of common knowledge of rationality

Definition 7

Let $G=(P,V,(V_{p})_{p\in P},v_{0},\Delta)$ be a game arena, $\bm{\alpha}$ be an objective profile and $S\in\{\mathsf{Pos},\mathsf{Str}\}$ be a class of strategy profiles. We define a characterization $T_{G,\bm{\alpha},S}\subseteq S$ of the notion of common knowledge of rationality for $G$, $\bm{\alpha}$ and $S$ as

$T_{G,\bm{\alpha},S}=\{\bm{s}\in S\mid\exists M=(W,(R_{p})_{p\in P},\bm{\sigma})\in M(G,S).\ \exists w\in W.\ (w\in\mathcal{CK}\,\mathit{RAT}_{G,\bm{\alpha},M,S}\wedge\bm{\sigma}(w)=\bm{s})\}.$
Lemma 2

Let $T_{G,\bm{\alpha},S}$ be a characterization for a game arena $G=(P,V,(V_{p})_{p\in P},v_{0},\Delta)$, an objective profile $\bm{\alpha}$ and a class $S$ of strategy profiles. If $p\in P$ has a winning strategy, then $p$ is a winner, namely $p\in\mathrm{Win}(\bm{\alpha},\bm{t})$, for all $\bm{t}\in T_{G,\bm{\alpha},S}$.

Proof

Assume that $p\in P$ has a winning strategy. Let $\bm{t}$ be an arbitrary strategy profile in $T_{G,\bm{\alpha},S}$. By the definition of $T_{G,\bm{\alpha},S}$, there exist $M=(W,(R_{p})_{p\in P},\bm{\sigma})\in M(G,S)$ and $w\in W$ such that $w\in\mathcal{CK}\,\mathit{RAT}_{G,\bm{\alpha},M,S}$ and $\bm{t}=\bm{\sigma}(w)$. Since $w\in R^{+}(w)$, $w\in\mathcal{CK}\,\mathit{RAT}_{G,\bm{\alpha},M,S}$ implies $w\in\mathit{RAT}_{G,\bm{\alpha},M,S}$. If $p$ were a loser under $\bm{t}$, namely $p\notin\mathrm{Win}(\bm{\alpha},\bm{\sigma}(w))$, then $p$ would not be rational in $w$ because her winning strategy satisfies both conditions (2) and (3) in Definition 6 (letting $w^{\prime}=w$ in condition (3)). This contradicts $w\in\mathit{RAT}_{G,\bm{\alpha},M,S}$, and thus $p$ wins under $\bm{t}$. ∎

Example 3 (continued)

Let $G$ and $M$ be the game arena and the epistemic model defined in Example 2, respectively. Let $\bm{\alpha}^{\prime}=(O_{p})_{p\in P}$ be the objective profile where $O_{p}=\mathrm{B\ddot{u}chi}(\{v_{p}\})$. Then, it holds that $\mathit{RAT}_{G,\bm{\alpha}^{\prime},M,\mathsf{Pos}}=W$ because for any world $w\in W$, player $p$ and positional strategy $s_{p}\in\mathsf{Pos}_{p}$ of $p$, condition (2) in Definition 6 holds but (3) does not. For example, let $w=RRR$, $p=0$ and $s_{p}=s^{L}_{0}$. Note that by the structure of $G$ and $\bm{\alpha}^{\prime}$, player 0 loses if and only if players 1 and 2 take $s_{1}^{L}$ and $s_{2}^{R}$ (regardless of the strategy of player 0). Hence, $0\in\mathrm{Win}(\bm{\alpha}^{\prime},\bm{\sigma}(w^{\prime}))\iff 0\in\mathrm{Win}(\bm{\alpha}^{\prime},\bm{\sigma}(w^{\prime})[0\mapsto s^{L}_{0}])$ for all $w^{\prime}\in R_{0}(RRR)$. By $\mathit{RAT}_{G,\bm{\alpha}^{\prime},M,\mathsf{Pos}}=W$, it is easy to see from the structure of $M$ that $\mathcal{CK}\,\mathit{RAT}_{G,\bm{\alpha}^{\prime},M,\mathsf{Pos}}=W$, and hence $T_{G,\bm{\alpha}^{\prime},\mathsf{Pos}}=\{\bm{\sigma}(w)\mid w\in W\}=\mathsf{Pos}$.

Restriction of epistemic models

So far, we have made no assumption about the size of an epistemic model, and hence there could be an epistemic model whose size is extremely large. An epistemic model represents a structure of the information that players have. It is unnatural to assume that players can use extremely large information within a limited time or with limited computational power. Therefore, we assume that there is a polynomial $p(n)$ such that the size of a given epistemic model is not greater than $p(n)$, where $n$ is the size of a given game arena.

Let $G$ be a game arena and $S\in\{\mathsf{Pos},\mathsf{Str}\}$ be a class of strategy profiles. We write the set of all epistemic models for $G$ and $S$ whose size is not greater than $p(n)$ for some polynomial $p$, where $n$ is the size of $G$, as $M_{\mathsf{P}}(G,S)$. Let $\bm{\alpha}$ be an objective profile. We also define a characterization $T_{\mathsf{P},G,\bm{\alpha},S}\subseteq S$ as

$T_{\mathsf{P},G,\bm{\alpha},S}=\{\bm{s}\in S\mid\exists M=(W,(R_{p})_{p\in P},\bm{\sigma})\in M_{\mathsf{P}}(G,S).\ \exists w\in W.\ (w\in\mathcal{CK}\,\mathit{RAT}_{G,\bm{\alpha},M,S}\wedge\bm{\sigma}(w)=\bm{s})\}.$

4 Verification Problems with Common Knowledge of Rationality

We define three types of rational verification problems. The first two of them are defined based on epistemic rationality, while the last one is defined based on Nash equilibrium (NE). The second problem is a variant of the first problem in which the size of an epistemic model is bounded by a polynomial in the size of the game arena. We start with the analysis of the last problem because NE is easier to analyze than epistemic rationality.

Problem 2

We define verification problems with common knowledge of rationality (VPCKR) as

$\mathit{VPCKR}_{S}=\{\langle G,\bm{\alpha},O\rangle\mid\forall\bm{t}\in T_{G,\bm{\alpha},S}.\ \mathrm{out}(\bm{t})\in O\}$,
$\mathit{VPCKR}_{\mathsf{P},S}=\{\langle G,\bm{\alpha},O\rangle\mid\forall\bm{t}\in T_{\mathsf{P},G,\bm{\alpha},S}.\ \mathrm{out}(\bm{t})\in O\}$, and
$\mathit{VPNash}_{S}=\{\langle G,\bm{\alpha},O\rangle\mid\forall\bm{s}\in S.\ \mathit{Nash}(\bm{s},\bm{\alpha},S)\implies\mathrm{out}(\bm{s})\in O\}$,

where $G=(P,V,(V_{p})_{p\in P},v_{0},\Delta)$ is a game arena, $\bm{\alpha}$ is an objective profile over Muller objectives, $O\subseteq\mathit{Play}$ is a specification given by a Muller objective and $S\in\{\mathsf{Pos},\mathsf{Str}\}$ is a class of strategy profiles.

Example 4 (continued)

Let $G$ and $\bm{\alpha}^{\prime}=(O_{p})_{p\in P}$ be the game arena and the objective profile defined in Example 3, respectively. Let $O$ be the specification defined as $O=\bigcap_{p\in P}O_{p}$. Recall that $T_{G,\bm{\alpha}^{\prime},\mathsf{Pos}}=\mathsf{Pos}$. Then, $\langle G,\bm{\alpha}^{\prime},O\rangle\notin\mathit{VPCKR}_{\mathsf{Pos}}$ because there is a strategy profile $\bm{s}_{3}=(s^{R}_{0},s^{R}_{1},s^{L}_{2})\in T_{G,\bm{\alpha}^{\prime},\mathsf{Pos}}$ such that $\mathrm{out}(\bm{s}_{3})\notin O$.

Note that $\mathit{VPCKR}_{\mathsf{P},S}$ is not a restricted problem of $\mathit{VPCKR}_{S}$, because the fact that $\langle G,\bm{\alpha},O\rangle\notin\mathit{VPCKR}_{S}$ (i.e., there is some $\bm{t}\in T_{G,\bm{\alpha},S}$ that satisfies $\mathrm{out}(\bm{t})\notin O$) gives no information on whether every $\bm{t}\in T_{\mathsf{P},G,\bm{\alpha},S}$ satisfies $\mathrm{out}(\bm{t})\in O$ or not. For the same reason, $\mathit{VPNash}_{S}$ is not a restricted problem of $\mathit{VPCKR}_{S}$ or $\mathit{VPCKR}_{\mathsf{P},S}$. On the other hand, $\mathit{VPCKR}_{S}\subseteq\mathit{VPCKR}_{\mathsf{P},S}\subseteq\mathit{VPNash}_{S}$ holds, because every NE belongs to $T_{\mathsf{P},G,\bm{\alpha},S}$ (for every NE $\bm{s}\in S$, the epistemic model $M$ with a single world $w$ where $\bm{\sigma}(w)=\bm{s}$ satisfies $w\in\mathcal{CK}\,\mathit{RAT}_{G,\bm{\alpha},M,S}$) and $T_{\mathsf{P},G,\bm{\alpha},S}\subseteq T_{G,\bm{\alpha},S}$. Also note that $\mathit{VPCKR}_{\mathsf{Pos}}$, $\mathit{VPCKR}_{\mathsf{P},\mathsf{Pos}}$ and $\mathit{VPNash}_{\mathsf{Pos}}$ are not restricted problems of $\mathit{VPCKR}_{\mathsf{Str}}$, $\mathit{VPCKR}_{\mathsf{P},\mathsf{Str}}$ and $\mathit{VPNash}_{\mathsf{Str}}$, respectively. Moreover, because $\bm{t}\in T_{G,\bm{\alpha},\mathsf{Pos}}$ does not imply $\bm{t}\in T_{G,\bm{\alpha},\mathsf{Str}}$, $\mathit{VPCKR}_{\mathsf{Str}}\not\subseteq\mathit{VPCKR}_{\mathsf{Pos}}$ in general. (For the same reason, $\mathit{VPCKR}_{\mathsf{P},\mathsf{Str}}\not\subseteq\mathit{VPCKR}_{\mathsf{P},\mathsf{Pos}}$ and $\mathit{VPNash}_{\mathsf{Str}}\not\subseteq\mathit{VPNash}_{\mathsf{Pos}}$ in general.)

Before investigating the complexity of $\mathit{VPCKR}_{\mathsf{P},S}$ and $\mathit{VPCKR}_{S}$, we mention the complexity of $\mathit{VPNash}_{S}$. As described in the Introduction, $\mathit{VPNash}_{S}$ is closely related to rational synthesis (RS), and with the class of Muller objectives (which is closed under negation, and the negation does not cause an exponential blow-up), we can easily show that the complexity of RS with the class $S$ of strategy profiles is the same as that of $\overline{\mathit{VPNash}_{S}}$. By the results of [9], we have the following proposition for $\mathit{VPNash}_{\mathsf{Str}}$.

Proposition 1

𝑉𝑃𝑁𝑎𝑠ℎ𝖲𝗍𝗋\mathit{VPNash}_{\mathsf{Str}} is PSPACE-complete.

Although the complexity of the same problem with positional strategies was not studied in [9], we can show that $\mathit{VPNash}_{\mathsf{Pos}}$ is $\Pi^{\mathsf{P}}_{2}$-complete as follows.

Lemma 3

𝑉𝑃𝑁𝑎𝑠ℎ𝖯𝗈𝗌\mathit{VPNash}_{\mathsf{Pos}} is Π2𝖯\mathrm{\Pi}^{\mathsf{P}}_{2}-hard.

Proof

We reduce $\forall\exists\mathsf{SAT}$ to $\mathit{VPNash}_{\mathsf{Pos}}$. Let $\varphi=\forall x_{1}\ldots x_{n}\exists y_{1}\ldots y_{m}\,\psi$ be an instance of $\forall\exists\mathsf{SAT}$, where $x_{1},\ldots,x_{n},y_{1},\ldots,y_{m}$ are variables and $\psi$ is a Boolean formula over them. From $\varphi$, we construct an instance $\langle G,\bm{\alpha},O\rangle$ of $\mathit{VPNash}_{\mathsf{Pos}}$ where $G=(\{A,E\},V_{A}\cup V_{E},(V_{A},V_{E}),a_{1},\Delta)$ as follows.

$V_{A}=\{a_{1},\ldots,a_{n},\,x_{1},\ldots,x_{n},\,\overline{x_{1}},\ldots,\overline{x_{n}}\}$,
$V_{E}=\{e_{1},\ldots,e_{m},\,y_{1},\ldots,y_{m},\,\overline{y_{1}},\ldots,\overline{y_{m}}\}$,
$\Delta=\{(a_{i},u)\mid 1\leq i\leq n,\ u\in\{x_{i},\overline{x_{i}}\}\}$
$\phantom{\Delta=}\cup\{(u,a_{i+1})\mid 1\leq i<n,\ u\in\{x_{i},\overline{x_{i}}\}\}\cup\{(u,e_{1})\mid u\in\{x_{n},\overline{x_{n}}\}\}$
$\phantom{\Delta=}\cup\{(e_{i},u)\mid 1\leq i\leq m,\ u\in\{y_{i},\overline{y_{i}}\}\}$
$\phantom{\Delta=}\cup\{(u,e_{i+1})\mid 1\leq i<m,\ u\in\{y_{i},\overline{y_{i}}\}\}\cup\{(u,a_{1})\mid u\in\{y_{m},\overline{y_{m}}\}\}$,
$\bm{\alpha}=(O_{A},O_{E})$ where $O_{A}=\mathit{true}$ and $O_{E}=\psi$, and
$O=\psi$.

Figure 3 shows the game arena obtained from a formula over $x_{1}$, $x_{2}$ and $y_{1}$.

Figure 3: The game arena constructed from a formula having $x_{1}$, $x_{2}$ and $y_{1}$

Note that we regard a Boolean formula $\psi$ as a Muller objective. For example, the formula $\psi=x_{1}\lor\overline{x_{2}}$ is considered as the Muller objective such that a player whose objective is $\psi$ wins if the play visits the vertex $x_{1}$ infinitely often or visits $x_{2}$ only finitely many times.

By the structure of $G$, we can consider every strategy profile $\bm{s}\in\mathsf{Pos}$ over positional strategies as a truth assignment to the variables in $\varphi$: choosing $x_{i}$ (resp. $\overline{x_{i}}$) as the next vertex at vertex $a_{i}$ corresponds to letting $x_{i}=\mathit{true}$ (resp. $x_{i}=\mathit{false}$), and similarly for $y_{i}$ and $\overline{y_{i}}$. The play $\mathrm{out}(\bm{s})$ contains the chosen vertices infinitely many times while it does not contain any unchosen vertex. Therefore, $\mathrm{out}(\bm{s})$ satisfies $\psi$ as a Muller objective if and only if $\psi$ is true under the truth assignment represented by $\bm{s}$.

We show that $\varphi\in\forall\exists\mathsf{SAT}\iff\langle G,\bm{\alpha},O\rangle\in\mathit{VPNash}_{\mathsf{Pos}}$.

(\Longrightarrow)  Assume that φ𝖲𝖠𝖳\varphi\in{\forall\exists\mathsf{SAT}}. We have to show 𝒔𝖯𝗈𝗌.𝑁𝑎𝑠ℎ(𝒔,𝜶,𝖯𝗈𝗌)\forall{\bm{s}}\in\mathsf{Pos}.\,\mathop{\mathit{Nash}}\nolimits({\bm{s}},{\bm{\alpha}},\mathsf{Pos}) out(𝒔)O\implies\mathop{\mathrm{out}}\nolimits({\bm{s}})\in O. Assume that 𝒔=(sA,sE)𝖯𝗈𝗌{\bm{s}}=(s_{\!A},s_{E})\in\mathsf{Pos} and 𝑁𝑎𝑠ℎ(𝒔,𝜶,𝖯𝗈𝗌)\mathop{\mathit{Nash}}\nolimits({\bm{s}},{\bm{\alpha}},\mathsf{Pos}). Since φ𝖲𝖠𝖳\varphi\in{\forall\exists\mathsf{SAT}}, for the truth assignment represented by sAs_{\!A}, there must exist a truth assignment to y1,,ymy_{1},\ldots,y_{m} that makes ψ\psi true. Let sEs^{\prime}_{E} denote the positional strategy of EE corresponding to this assignment; hence, out(𝒔[EsE])OE\mathop{\mathrm{out}}\nolimits({{\bm{s}}}[E\mapsto s^{\prime}_{E}])\in O_{E}. On the other hand, by the definition of NE, either out(𝒔)OE\mathop{\mathrm{out}}\nolimits({\bm{s}})\in O_{E} or out(𝒔[EsE])OE\mathop{\mathrm{out}}\nolimits({{\bm{s}}}[E\mapsto s^{\prime}_{E}])\notin O_{E} should hold for any positional strategy sEs^{\prime}_{E} of EE. As shown above, the latter does not hold. Therefore, the former holds and thus out(𝒔)O\mathop{\mathrm{out}}\nolimits({\bm{s}})\in O since O=OEO=O_{E}.

(\Longleftarrow)  Assume that 𝒔𝖯𝗈𝗌.𝑁𝑎𝑠ℎ(𝒔,𝜶,𝖯𝗈𝗌)out(𝒔)O\forall{\bm{s}}\in\mathsf{Pos}.\,\mathop{\mathit{Nash}}\nolimits({\bm{s}},{\bm{\alpha}},\mathsf{Pos})\implies\mathop{\mathrm{out}}\nolimits({\bm{s}})\in O. Let sAs_{\!A} be an arbitrary positional strategy of AA. We have to show that for sAs_{\!A}, there exists a positional strategy sEs_{E} of EE such that the truth assignment corresponding to (sA,sE)(s_{\!A},s_{E}) makes ψ\psi true. Let sEs_{E} be an arbitrary chosen positional strategy of EE and let 𝒔=(sA,sE){\bm{s}}=(s_{\!A},s_{E}). By the assumption, out(𝒔)O\mathop{\mathrm{out}}\nolimits({\bm{s}})\in O holds or 𝑁𝑎𝑠ℎ(𝒔,𝜶,𝖯𝗈𝗌)\mathop{\mathit{Nash}}\nolimits({\bm{s}},{\bm{\alpha}},\mathsf{Pos}) does not hold. If out(𝒔)O\mathop{\mathrm{out}}\nolimits({\bm{s}})\in O holds, then sEs_{E} is just the desired strategy that makes ψ\psi true. If 𝑁𝑎𝑠ℎ(𝒔,𝜶,𝖯𝗈𝗌)\mathop{\mathit{Nash}}\nolimits({\bm{s}},{\bm{\alpha}},\mathsf{Pos}) does not hold, then there exist a player pp and a positional strategy ss^{\prime} of pp that satisfy out(𝒔)Op\mathop{\mathrm{out}}\nolimits({\bm{s}})\notin O_{p} and out(𝒔[ps])Op\mathop{\mathrm{out}}\nolimits({{\bm{s}}}[p\mapsto s^{\prime}])\in O_{p}. We have p=Ep=E because out(𝒔)OA\mathop{\mathrm{out}}\nolimits({\bm{s}})\notin O_{\!A} never holds. Since out(𝒔[Es])OE\mathop{\mathrm{out}}\nolimits({{\bm{s}}}[E\mapsto s^{\prime}])\in O_{E}, ss^{\prime} is the desired strategy that makes ψ\psi true. ∎

Lemma 4

𝑉𝑃𝑁𝑎𝑠ℎ𝖯𝗈𝗌\mathit{VPNash}_{\mathsf{Pos}} is in Π2𝖯\mathrm{\Pi}^{\mathsf{P}}_{2}.

Proof

By Lemma 1, deciding whether $\mathrm{out}(\bm{s})\in O$ is in $\mathsf{P}$ for a given game, a Muller objective $O$ and a strategy profile $\bm{s}$ over positional strategies. Deciding whether $\mathit{Nash}(\bm{s},\bm{\alpha},\mathsf{Pos})$ holds for a given game and $\bm{s}$ is in coNP: guess a player $p$ and a positional strategy $s_{p}$ of $p$, and check whether $\mathrm{out}(\bm{s})\notin O_{p}$ and $\mathrm{out}(\bm{s}[p\mapsto s_{p}])\in O_{p}$. Let $A$ be an oracle for this problem. Using $A$, we can construct a non-deterministic polynomial-time oracle Turing machine for deciding whether $\langle G,\bm{\alpha},O\rangle\notin\mathit{VPNash}_{\mathsf{Pos}}$: guess $\bm{s}$ and check whether $\mathit{Nash}(\bm{s},\bm{\alpha},\mathsf{Pos})$ and $\mathrm{out}(\bm{s})\notin O$. Therefore, $\mathit{VPNash}_{\mathsf{Pos}}$ is in $\text{co}\mathrm{NP}^{\mathrm{NP}}=\Pi^{\mathsf{P}}_{2}$. ∎
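The proof above uses a nondeterministic oracle machine; as a purely illustrative reference (our own, and exponential rather than $\Pi^{\mathsf{P}}_{2}$), the definition of $\mathit{VPNash}_{\mathsf{Pos}}$ can also be checked by brute force with the helpers sketched in Section 2.

```python
from itertools import product

def all_positional_profiles(arena):
    """Enumerate the finite set Pos of positional strategy profiles (exponential)."""
    players = list(arena.players)
    per_player = [list(positional_strategies(arena, p)) for p in players]
    for combo in product(*per_player):
        yield dict(zip(players, combo))

def vp_nash_pos(arena, objectives, spec):
    """<G, alpha, O> in VPNash_Pos: every positional NE satisfies the specification O."""
    for s in all_positional_profiles(arena):
        if is_nash(arena, objectives, s):
            prefix, cycle = outcome(arena, s)
            if not muller_holds(spec, inf_vertices(prefix, cycle)):
                return False      # a rational (NE) counter-example to the specification
    return True
```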

Theorem 4.1

𝑉𝑃𝑁𝑎𝑠ℎ𝖯𝗈𝗌\mathit{VPNash}_{\mathsf{Pos}} is Π2𝖯\mathrm{\Pi}^{\mathsf{P}}_{2}-complete.

Proof

By Lemmas 3 and 4. ∎

Next, let us consider the complexity of $\mathit{VPCKR}_{S}$ and $\mathit{VPCKR}_{\mathsf{P},S}$. In this paper, we concentrate on the complexity of $\mathit{VPCKR}_{\mathsf{Pos}}$ and $\mathit{VPCKR}_{\mathsf{P},\mathsf{Pos}}$. For $\mathit{VPCKR}_{\mathsf{P},\mathsf{Pos}}$, we can show that it is coNP-hard and in $\Pi^{\mathsf{P}}_{2}$ as follows.

Theorem 4.2

𝑉𝑃𝐶𝐾𝑅𝖯,𝖯𝗈𝗌\mathit{VPCKR}_{\mathsf{P},\mathsf{Pos}} is coNP-hard.

Proof

We reduce $\overline{\mathsf{SAT}}$, the complement of the satisfiability problem, to $\mathit{VPCKR}_{\mathsf{P},\mathsf{Pos}}$. Let $\varphi$ be a Boolean formula given as an instance of $\overline{\mathsf{SAT}}$, where $x_{1},\dots,x_{n}$ are the variables of $\varphi$. From $\varphi$, we construct an instance $\langle G,\bm{\alpha},O\rangle$ of $\mathit{VPCKR}_{\mathsf{P},\mathsf{Pos}}$ where $G=(\{p\},V,(V),v_{1},\Delta)$ as follows.

$V=\{v_{1},\dots,v_{n},\,x_{1},\dots,x_{n},\,\overline{x_{1}},\dots,\overline{x_{n}}\}$,
$\Delta=\{(v_{i},u)\mid 1\leq i\leq n,\,u\in\{x_{i},\overline{x_{i}}\}\}$
$\phantom{\Delta=}\cup\{(u,v_{i+1})\mid 1\leq i<n,\,u\in\{x_{i},\overline{x_{i}}\}\}$
$\phantom{\Delta=}\cup\{(u,v_{1})\mid u\in\{x_{n},\overline{x_{n}}\}\}$,
$\bm{\alpha}=(O_{p})$ where $O_{p}=\varphi$, and
$O=\neg\varphi$.

We regard a Boolean formula $\varphi$ as a Muller objective and consider every strategy profile $\bm{s}\in\mathsf{Pos}$ over positional strategies as a truth assignment to the variables in $\varphi$, in the same way as in the proof of Lemma 3.

We show that $\varphi\in\overline{\mathsf{SAT}}\iff\langle G,\bm{\alpha},O\rangle\in\mathit{VPCKR}_{\mathsf{P},\mathsf{Pos}}$.

($\Longrightarrow$)  Assume $\varphi\in\overline{\mathsf{SAT}}$. Because $\varphi$ is unsatisfiable, any strategy profile $\bm{s}\in\mathsf{Pos}$ satisfies $\mathrm{out}(\bm{s})\in O$ $(=\neg\varphi)$. Therefore, $\forall\bm{t}\in T_{\mathsf{P},G,\bm{\alpha},\mathsf{Pos}}.\ \mathrm{out}(\bm{t})\in O$ holds.

($\Longleftarrow$)  Assume $\forall\bm{t}\in T_{\mathsf{P},G,\bm{\alpha},\mathsf{Pos}}.\ \mathrm{out}(\bm{t})\in O$. Note that $\varphi\in\overline{\mathsf{SAT}}$ is equivalent to $\forall\bm{s}\in\mathsf{Pos}.\ \mathrm{out}(\bm{s})\in O$. We show the latter by contradiction. Suppose $\bm{s}\in\mathsf{Pos}$ and $\mathrm{out}(\bm{s})\notin O$. Consider the epistemic model $M=(\{w\},(R_{p})_{p\in P},\bm{\sigma})$ where $R_{p}=\{(w,w)\}$ and $\bm{\sigma}(w)=\bm{s}$. Because $\mathrm{out}(\bm{s})\notin O$, player $p$ wins in the world $w$, and hence no strategy of $p$ satisfies condition (3) of Definition 6; thus $p$ is rational in $w$. It is then easy to see that $w\in\mathcal{CK}\,\mathit{RAT}$ because of the structure of $M$. Therefore, $\bm{s}\in T_{\mathsf{P},G,\bm{\alpha},\mathsf{Pos}}$. By the outer assumption, $\mathrm{out}(\bm{s})\in O$, which contradicts the inner assumption $\mathrm{out}(\bm{s})\notin O$. ∎

Theorem 4.3

𝑉𝑃𝐶𝐾𝑅𝖯,𝖯𝗈𝗌\mathit{VPCKR}_{\mathsf{P},\mathsf{Pos}} is in Π2𝖯\mathrm{\Pi}^{\mathsf{P}}_{2}.

Proof

In a similar way to the proof of Lemma 4, we can construct a non-deterministic polynomial-time oracle Turing machine for deciding $\overline{\mathit{VPCKR}_{\mathsf{P},\mathsf{Pos}}}$. Deciding whether $w\in\mathcal{CK}\,\mathit{RAT}_{G,\bm{\alpha},M,\mathsf{Pos}}$ for a given game, an epistemic model $M=(W,(R_{p})_{p\in P},\bm{\sigma})$ and a world $w\in W$ is in coNP: guess a world $u\in R^{+}(w)$, a player $p$ and a positional strategy $s_{p}\in\mathsf{Pos}_{p}$ of $p$, and check both conditions (2) and (3) in Definition 6, substituting $u$ for $w$. A non-deterministic polynomial-time oracle Turing machine for deciding whether $\langle G,\bm{\alpha},O\rangle\notin\mathit{VPCKR}_{\mathsf{P},\mathsf{Pos}}$ works as follows: guess an epistemic model $M=(W,(R_{p})_{p\in P},\bm{\sigma})\in M_{\mathsf{P}}(G,\mathsf{Pos})$ and a world $w\in W$, and check whether $w\in\mathcal{CK}\,\mathit{RAT}_{G,\bm{\alpha},M,\mathsf{Pos}}$ and $\mathrm{out}(\bm{\sigma}(w))\notin O$. Note that because the size of $M$ is not greater than some polynomial in the size of $G$, the construction of $M$ takes only polynomial time. Therefore, $\mathit{VPCKR}_{\mathsf{P},\mathsf{Pos}}$ is in $\text{co}\mathrm{NP}^{\mathrm{NP}}=\Pi^{\mathsf{P}}_{2}$. ∎

For $\mathit{VPCKR}_{\mathsf{Pos}}$, we can show that it is $\Sigma^{\mathsf{P}}_{2}$-hard and in $\text{co}\mathrm{NEXP}^{\mathrm{NP}}$ as follows.

Theorem 4.4

𝑉𝑃𝐶𝐾𝑅𝖯𝗈𝗌\mathit{VPCKR}_{\mathsf{Pos}} is Σ2𝖯\mathrm{\Sigma}^{\mathsf{P}}_{2}-hard.

Proof

We reduce $\exists\forall\mathsf{SAT}$ to $\mathit{VPCKR}_{\mathsf{Pos}}$. Let $\varphi=\exists y_{1}\dots y_{m}\,\forall x_{1}\dots x_{n}\,\psi$ be an instance of $\exists\forall\mathsf{SAT}$. From $\varphi$, we construct an instance $\langle G,\bm{\alpha},O\rangle$ of $\mathit{VPCKR}_{\mathsf{Pos}}$ where $G$ is the same game arena as in the proof of Lemma 3, $O=\psi$, and $\bm{\alpha}$ consists of $O_{A}=\neg\psi$ and $O_{E}=\psi$. As described in the proof of Lemma 3, there is a one-to-one correspondence between the strategy profiles in $\mathsf{Pos}$ and the truth assignments to the variables in $\varphi$, and player $E$ wins under a strategy profile $\bm{s}$ if and only if $\psi$ is true under the truth assignment corresponding to $\bm{s}$. Therefore, by the structure of $\varphi$, $\varphi$ is true if and only if $E$ has a winning strategy.

We show that $\varphi\in\exists\forall\mathsf{SAT}\iff\langle G,\bm{\alpha},O\rangle\in\mathit{VPCKR}_{\mathsf{Pos}}$.

($\Longrightarrow$)  As mentioned above, $\varphi\in\exists\forall\mathsf{SAT}$ implies that $E$ has a winning strategy. By Lemma 2, $E$ wins for every $\bm{t}\in T_{G,\bm{\alpha},\mathsf{Pos}}$; i.e., $\forall\bm{t}\in T_{G,\bm{\alpha},\mathsf{Pos}}.\ \mathrm{out}(\bm{t})\in O_{E}$ $(=O)$.

($\Longleftarrow$)  Assume that $\varphi\notin\exists\forall\mathsf{SAT}$; that is, $E$ has no winning strategy. If $A$ has a winning strategy $s_{A}$, then any strategy profile $\bm{s}\in\mathsf{Pos}$ in which $A$ takes $s_{A}$ is an NE. Since every NE belongs to $T_{G,\bm{\alpha},\mathsf{Pos}}$ and $A$ wins under $\bm{s}$, it holds that $\bm{s}\in T_{G,\bm{\alpha},\mathsf{Pos}}$ and $\mathrm{out}(\bm{s})\notin O$. Therefore, $\langle G,\bm{\alpha},O\rangle\notin\mathit{VPCKR}_{\mathsf{Pos}}$.

Consider the case where neither $A$ nor $E$ has a winning strategy. Let $T^{\infty}$ be the subset of strategy profiles obtained by the following iterative procedure, called the iterated deletion of inferior profiles (IDIP) [3, Def. 9.8]: for a subset $X$ of strategy profiles and its member $\bm{s}=(s_{p})_{p\in P}\in X$, $\bm{s}$ is inferior relative to $X$ if there exist a player $p$ and a strategy $t_{p}\in\mathsf{Pos}_{p}$ of $p$ such that

  1. $\mathrm{out}(\bm{s})\notin O_{p}\land\mathrm{out}(\bm{s}[p\mapsto t_{p}])\in O_{p}$, and

  2. $\mathrm{out}(\bm{s}^{\prime})\in O_{p}\implies\mathrm{out}(\bm{s}^{\prime}[p\mapsto t_{p}])\in O_{p}$ for every $\bm{s}^{\prime}=(s^{\prime}_{p})_{p\in P}\in X$ such that $s^{\prime}_{p}=s_{p}$.

Let $T^{0}=\mathsf{Pos}$, and let $T^{i+1}$ be the set obtained from $T^{i}$ by removing all strategy profiles inferior relative to $T^{i}$. We repeat this construction until there are no inferior strategy profiles. Since $\mathsf{Pos}$ is finite, this procedure always halts. Moreover, since neither $A$ nor $E$ has a winning strategy and the game is zero-sum, there must exist a strategy profile $\bm{s}\in\mathsf{Pos}$ remaining in $T^{\infty}$ that satisfies $\mathrm{out}(\bm{s})\notin O_{E}$ $(=O)$. As shown in [3, Proposition 9.4 (B)], $\bm{s}\in T^{\infty}$ also belongs to $T_{G,\bm{\alpha},\mathsf{Pos}}$. Therefore, $\langle G,\bm{\alpha},O\rangle\notin\mathit{VPCKR}_{\mathsf{Pos}}$. ∎
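The IDIP procedure used in this proof can be written down directly. The sketch below (our own illustration, exponential because it enumerates all positional profiles, reusing `all_positional_profiles` and the other helpers from the earlier sketches) removes inferior profiles until a fixpoint is reached; by [3, Proposition 9.4], as used in Theorems 4.4 and 4.5, the surviving set coincides with $T_{G,\bm{\alpha},\mathsf{Pos}}$.

```python
def inferior(arena, objectives, s, X):
    """Is profile s inferior relative to X? (conditions 1 and 2 of the IDIP definition)"""
    for p in arena.players:
        for t_p in positional_strategies(arena, p):
            dev = dict(s); dev[p] = t_p
            # condition 1: the deviation turns p's loss under s into a win
            if p in winners(arena, objectives, s) or p not in winners(arena, objectives, dev):
                continue
            # condition 2: the deviation preserves p's wins on every s' in X with s'_p = s_p
            preserves = True
            for s2 in X:
                if s2[p] != s[p] or p not in winners(arena, objectives, s2):
                    continue
                dev2 = dict(s2); dev2[p] = t_p
                if p not in winners(arena, objectives, dev2):
                    preserves = False
                    break
            if preserves:
                return True
    return False

def idip(arena, objectives):
    """Iterated deletion of inferior profiles; the fixpoint equals T_{G,alpha,Pos}."""
    T = list(all_positional_profiles(arena))
    while True:
        kept = [s for s in T if not inferior(arena, objectives, s, T)]
        if len(kept) == len(T):
            return T
        T = kept
```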

Theorem 4.5

$\mathit{VPCKR}_{\mathsf{Pos}}$ is in $\text{co}\mathrm{NEXP}^{\mathrm{NP}}$.

Proof

For a subset XX of 𝖯𝗈𝗌\mathsf{Pos}, let M(X)=(W,(Rp)pP,𝝈)M(X)=(W,(R_{p})_{p\in P},{\bm{\sigma}}) be an epistemic model where W=XW=X and 𝝈(𝒔)=𝒔{\bm{\sigma}}({\bm{s}})={\bm{s}} for all 𝒔X{\bm{s}}\in X, and Rp={(w1,w2)σp(w1)=σp(w2)}R_{p}=\{(w_{1},w_{2})\mid\sigma_{p}(w_{1})=\sigma_{p}(w_{2})\} for pPp\in P. Using this construction of an epistemic model from a subset of strategy profiles, we can construct a non-deterministic exponential-time oracle Turing machine for deciding 𝑉𝑃𝐶𝐾𝑅𝖯𝗈𝗌¯\overline{\mathit{VPCKR}_{\mathsf{Pos}}} as follows: Let G,𝜶,O\langle{G,{\bm{\alpha}},O}\rangle be a given instance of 𝑉𝑃𝐶𝐾𝑅𝖯𝗈𝗌\mathit{VPCKR}_{\mathsf{Pos}}. Guess a subset T𝖯𝗈𝗌T\subseteq\mathsf{Pos} and 𝒕T{\bm{t}}\in T, and construct M(T)M(T). Then, check whether 𝒕{\bm{t}} (as a world of M(T)M(T)) satisfies 𝒕𝒞𝒦𝑅𝐴𝑇G,𝜶,M(T),𝖯𝗈𝗌{\bm{t}}\in\mathop{\mathcal{CK}}\nolimits\mathit{RAT}_{G,{\bm{\alpha}},M(T),\mathsf{Pos}} and out(𝒕)O\mathop{\mathrm{out}}\nolimits({\bm{t}})\notin O.

As mentioned in the proof of Theorem 4.3, deciding whether $\bm{t}\in\mathcal{CK}\,\mathit{RAT}$ for given $M(T)$ and $\bm{t}$ is in coNP; the above Turing machine uses an NP oracle to decide $\bm{t}\in\mathcal{CK}\,\mathit{RAT}$. Since the sizes of $T$ and $M(T)$ are exponential in the size of $G$ in general, guessing $T$ and constructing $M(T)$ take exponential time. Deciding whether $\mathrm{out}(\bm{t})\notin O$ is in $\mathsf{P}$ by Lemma 1 and is not dominant.

If the answer of the Turing machine is yes, then obviously 𝒕TG,𝜶,𝖯𝗈𝗌{\bm{t}}\in T_{G,{\bm{\alpha}},\mathsf{Pos}} and thus G,𝜶,O𝑉𝑃𝐶𝐾𝑅𝖯𝗈𝗌¯\langle{G,{\bm{\alpha}},O}\rangle\in\overline{\mathit{VPCKR}_{\mathsf{Pos}}}. On the other hand, [3, Proposition 9.4] and its proof say that the subset TT^{\infty} of strategy profiles obtained by the IDIP procedure [3, Def. 9.8] satisfies T=TG,𝜶,𝖯𝗈𝗌T^{\infty}=T_{G,{\bm{\alpha}},\mathsf{Pos}}, and every world ww of the epistemic model M(T)M(T^{\infty}) satisfies w𝒞𝒦𝑅𝐴𝑇G,𝜶,M(T),𝖯𝗈𝗌w\in\mathop{\mathcal{CK}}\nolimits\mathit{RAT}_{G,{\bm{\alpha}},M(T^{\infty}),\mathsf{Pos}}. Hence, we do not need to consider epistemic models other than M(T)M(T) for T𝖯𝗈𝗌T\subseteq\mathsf{Pos}. Therefore, if the answer of the above Turing machine is no, then we can conclude that G,𝜶,O𝑉𝑃𝐶𝐾𝑅𝖯𝗈𝗌\langle{G,{\bm{\alpha}},O}\rangle\in\mathit{VPCKR}_{\mathsf{Pos}}. ∎
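Putting the pieces together (again our own illustration, in the spirit of the exponential-time procedure above rather than a faithful rendering of the oracle machine), the characterization $T^{\infty}=T_{G,\bm{\alpha},\mathsf{Pos}}$ from this proof yields a brute-force check of $\mathit{VPCKR}_{\mathsf{Pos}}$ using the IDIP sketch.

```python
def vpckr_pos(arena, objectives, spec):
    """<G, alpha, O> in VPCKR_Pos: every profile in T-infinity = T_{G,alpha,Pos} satisfies O."""
    for t in idip(arena, objectives):
        prefix, cycle = outcome(arena, t)
        if not muller_holds(spec, inf_vertices(prefix, cycle)):
            return False
    return True
```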

5 Conclusion

We introduced an epistemic approach to rational verification. We defined the rational verification problems $\mathit{VPCKR}_{S}$, $\mathit{VPCKR}_{\mathsf{P},S}$ and $\mathit{VPNash}_{S}$ based on common knowledge of rationality and Nash equilibrium. The problem $\mathit{VPCKR}_{\mathsf{P},S}$ is a variant of $\mathit{VPCKR}_{S}$ in which the size of an epistemic model is not greater than $p(n)$ for some polynomial $p$, where $n$ is the size of a given game arena. The problem $\mathit{VPNash}_{S}$ asks whether each Nash equilibrium satisfies a given specification. We then analyzed the complexities of these problems; Table 1 summarizes the results.

In this paper, we considered only the case $S=\mathsf{Pos}$. Analyzing the above problems for other classes $S$, such as the class of finite-memory strategies, is future work. Our epistemic model, based on a KT5 Kripke frame, adopts the knowledge-based setting; hence, if a player knows $E$ in a world $w$, then $E$ actually occurs in $w$. There is another setting, called the belief-based setting, in which even if a player believes $E$ in $w$, $E$ does not necessarily occur in $w$. Studying the belief-based setting is also future work.

References

  • [1] R. Bloem, K. Chatterjee and B. Jobstmann, Graph Games and Reactive Synthesis, E. M. Clarke et al. (eds.), Handbook of Model Checking, Chapter 27, 921–962, Springer, 2018.
  • [2] E. M. Clarke, O. Grumberg and D. A. Peled, Model Checking. MIT Press, 2001.
  • [3] G. Bonanno, Epistemic Foundations of Game Theory, H. van Ditmarsch et al. (eds.), Handbook of Epistemic Logic, Chapter 9, 411–450, College Publications, 2015.
  • [4] J. R. Büchi and L. H. Landweber, Solving sequential conditions by finite-state strategies, Trans. American Mathematical Society 138, 295–311, 1969.
  • [5] A. Pnueli and R. Rosner, On the synthesis of a reactive module, 16th ACM Symp. on Principles of Programming Languages (POPL 1989), 179–190.
  • [6] D. Fisman, O. Kupferman and O. Lustig, Rational synthesis, 16th Int. Conf. on Tools and Algorithms for the Construction and Analysis of Systems (TACAS 2010), LNCS 6015, 190–204.
  • [7] J. Gutierrez, M. Najib, G. Perelli and M. Wooldridge, On the complexity of rational verification, Annals of Mathematics and Artificial Intelligence 91, 409–430, 2023.
  • [8] M. Ummels, The complexity of Nash equilibria in infinite multiplayer games, 11th Int. Conf. on Foundations of Software Science and Computational Structures (FOSSACS 2008), LNCS 4962, 20–34.
  • [9] R. Condurache, E. Filiot, R. Gentilini and J-F. Raskin, The complexity of rational synthesis, 43rd Int. Colloq. on Automata, Languages, and Programming (ICALP 2016), LIPIcs 55, 121:1–121:15.
  • [10] S. Kremer and J.-F. Raskin, A game-based verification of non-repudiation and fair exchange protocols, 12th Int. Conf. on Concurrency Theory (CONCUR 2001), LNCS 2154, 551–565. extended version: J. Computer Security 11(3), 399–429, 2003.
  • [11] K. Chatterjee and V. Raman, Synthesizing protocols for digital contract signing, 13th Int. Conf. on Verification, Model checking, and Abstract Interpretation (VMCAI 2012), LNCS 7148, 152–168.
  • [12] O. Kupferman, G. Perelli and M. Vardi, Synthesis with rational environments, Annals of Mathematics and Artificial Intelligence 78(1), 3–20, 2016.
  • [13] O. Kupferman and N. Shenwald, The complexity of LTL rational synthesis, 28th Int. Conf. on Tools and Algorithms for the Construction and Analysis of Systems (TACAS 2022), LNCS 13243, 25–45.
  • [14] V. Bruyère, J.-F. Raskin and C. Tamines, Pareto-rational verification, 33rd Int. Conf. on Concurrency Theory (CONCUR 2022), LIPIcs.CONCUR.2022, 33:1–33:20.
  • [15] L. Brice, J.-F. Raskin and M. van den Bogaard, Rational verification for Nash and subgame-perfect equilibria in graph games, 48th Int. Symp. on Mathematical Foundations of Computer Science (MFCS 2023), 26:1–26:15.
  • [16] R. Aumann and A. Brandenburger, Epistemic conditions for Nash equilibrium, Econometrica: Journal of the Econometric Society, 1995, 1161–1180.
  • [17] B. Polak, Epistemic conditions for Nash equilibrium, and common knowledge of rationality, Econometrica 67(3), 673–676, 1999.