# Bayesian game

From Wikipedia the free encyclopedia

In game theory, a Bayesian game is a game in which players have incomplete information about the other players. For example, a player may not know the exact payoff functions of the other players, but instead have beliefs about these payoff functions. These beliefs are represented by a probability distribution over the possible payoff functions.

John C. Harsanyi describes a Bayesian game in the following way.[1] Each player in the game is associated with a set of types, with each type in the set corresponding to a possible payoff function for that player. In addition to the actual players in the game, there is a special player called Nature. Nature randomly chooses a type for each player according to a probability distribution across the players' type spaces. This probability distribution is known by all players (the "common prior assumption"). This modeling approach transforms games of incomplete information into games of imperfect information (in which the history of play within the game is not known to all players).

Incompleteness of information means that at least one player is unsure of the type (and therefore the payoff function) of another player. Such games are called Bayesian because players are typically assumed to update their beliefs according to Bayes' rule. In particular, the belief a player holds about another player's type might change according to his own type.

## Specification of games

In a Bayesian game, one has to specify type spaces, strategy spaces, payoff functions and prior beliefs. A strategy for a player is a complete plan of action that covers every contingency that might arise for every type that player might be. A type space for a player is just the set of all possible types of that player. The beliefs of a player describe the uncertainty of that player about the types of the other players. Each belief is the probability of the other players having particular types, given the type of the player with that belief. A payoff function is a function of strategy profiles and types.

Formally, such a game is given by:[2] ${\displaystyle G=\langle N,\Omega ,p,\langle A_{i},u_{i},T_{i},\tau _{i}\rangle _{i\in N}\rangle }$, where

1. ${\displaystyle N}$ is the set of players.
2. ${\displaystyle \Omega }$ is the set of states of nature.
3. ${\displaystyle A_{i}}$ is the set of actions for player ${\displaystyle i}$. Let ${\displaystyle A=A_{1}\times A_{2}\times \dotsb \times A_{N}}$.
4. ${\displaystyle T_{i}}$ is the set of types for player ${\displaystyle i}$. Given the state, the type of player ${\displaystyle i}$ is given by the function ${\displaystyle \tau _{i}\colon \Omega \rightarrow T_{i}}$. So, for each state of nature, the game will have different types of players.
5. ${\displaystyle u_{i}\colon T_{i}\times A\rightarrow \mathbb {R} }$ is the payoff function for player ${\displaystyle i}$.
6. ${\displaystyle p}$ is the (prior) probability distribution over ${\displaystyle \Omega }$.

A pure strategy for player ${\displaystyle i}$ is a function ${\displaystyle s_{i}\colon T_{i}\rightarrow A_{i}}$. A mixed strategy for player ${\displaystyle i}$ is a function ${\displaystyle \sigma _{i}\colon T_{i}\rightarrow \Delta A_{i}}$, where ${\displaystyle \Delta A_{i}}$ is the set of all probability distributions on ${\displaystyle A_{i}}$. Note that a strategy for any given player only depends on his own type.

A strategy profile ${\displaystyle \sigma }$ is a strategy for each player. A strategy profile determines expected payoffs for each player, where the expectation is taken over both the set of states of nature (and hence profiles of types) with respect to beliefs ${\displaystyle p}$, and the randomization over actions implied by any mixed strategies in the profile ${\displaystyle \sigma }$.

## Bayesian Nash equilibrium

In a non-Bayesian game, a strategy profile is a Nash equilibrium if every strategy in that profile is a best response to every other strategy in the profile; i.e., there is no strategy that a player could play that would yield a higher payoff, given all the strategies played by the other players.

An analogous concept can be defined for a Bayesian game, the difference being that every player's strategy maximizes his expected payoff given his beliefs about the state of nature. A player's beliefs about the state of nature are formed by conditioning the prior probabilities ${\displaystyle p}$ on his own type according to Bayes' rule.

A Bayesian Nash equilibrium (BNE) is defined as a strategy profile that maximizes the expected payoff for each player given their beliefs and given the strategies played by the other players. That is, a strategy profile ${\displaystyle \sigma }$ is a Bayesian Nash equilibrium if and only if for every player ${\displaystyle i,}$ keeping the strategies of every other player fixed, strategy ${\displaystyle \sigma _{i}}$ maximizes the expected payoff of player ${\displaystyle i}$ according to his beliefs.[2]

For finite Bayesian games, i.e., both the action and the type space are finite, there are two equivalent representations. The first is called the agent-form game (see Theorem 9.51 of the Game Theory book[3]) which expands the number of players from ${\displaystyle |N|}$ to ${\textstyle \sum _{i=1}^{|N|}|T_{i}|}$, i.e., every type of each player becomes a player. The second is called the induced normal form (see Section 6.3.3 of Multiagent Systems[4]) which still has ${\displaystyle |N|}$ players yet expands the number of each player i's actions from ${\displaystyle |A_{i}|}$ to ${\textstyle |A_{i}|^{|T_{i}|}}$, i.e., the pure policy is a combination of actions the player should take for different types. We can compute Nash Equilibrium (NE) in these two equivalent representations and then recover the BNE from the NE.

• If we further consider two players with a zero-sum objective function, then we can form a linear program to compute BNE.[5]
• If we further consider two players with a general-sum objective function, then we can form a math program of a bi-linear objective function and linear constrains to compute BNE (see Theorem 1 of the paper[6]). Although no nonlinear solver can guarantee the global optimal, we can check whether a BNE is achieved by looking at the value of the program's objective function, which should equal 0 at a BNE. Thus, the value closer to 0 represents a smaller error, which can serve as a metric of the solution quality. We can directly extend the above math programming method to N-player games.

## Variants of Bayesian equilibrium

### Perfect Bayesian equilibrium

Bayesian Nash equilibrium can result in implausible equilibria in dynamic games, where players move sequentially rather than simultaneously. As in games of complete information, these can arise via non-credible strategies off the equilibrium path. In games of incomplete information there is also the additional possibility of non-credible beliefs.

To deal with these issues, Perfect Bayesian equilibrium, in the spirit of subgame perfect equilibrium requires that, starting from any information set, subsequent play be optimal. Furthermore, it requires that beliefs be updated consistently with Bayes' rule on every path of play that occurs with positive probability.

### Stochastic Bayesian games

The definition of Bayesian games has been combined with stochastic games to allow for environment states (e.g. physical world states) and stochastic transitions between states.[7] The resulting "stochastic Bayesian game" model is solved via a recursive combination of the Bayesian Nash equilibrium and the Bellman optimality equation.

### Incomplete information over collective agency

The definition of Bayesian games and Bayesian equilibrium has been extended to deal with collective agency. One approach is to continue to treat individual players as reasoning in isolation, but to allow them, with some probability, to reason from the perspective of a collective.[8] Another approach is to assume that players within any collective agent know that the agent exists, but that other players do not know this, although they suspect it with some probability.[9] For example, Alice and Bob may sometimes optimize as individuals and sometimes collude as a team, depending on the state of nature, but other players may not know which of these is the case.

## Example

### Sheriff's Dilemma

A sheriff faces an armed suspect. Both must simultaneously decide whether to shoot the other or not.

The suspect can either be of type "criminal" or type "civilian". The sheriff has only one type. The suspect knows its type and the Sheriff's type, but the Sheriff does not know the suspect's type. Thus, there is incomplete information (because the suspect has private information), making it a Bayesian game. There is a probability p that the suspect is a criminal, and a probability 1-p that the suspect is a civilian; both players are aware of this probability (common prior assumption, which can be converted into a complete-information game with imperfect information).

The sheriff would rather defend himself and shoot if the suspect shoots, or not shoot if the suspect does not (even if the suspect is a criminal). The suspect would rather shoot if he is a criminal, even if the sheriff does not shoot, but would rather not shoot if he is a civilian, even if the sheriff shoots. Thus, the payoff matrix of this Normal-form game for both players depends on the type of the suspect. It is assumed that payoffs are given as follows:

Type = "Civilian" Sheriff's action
Shoot Not
Suspect's action Shoot -3, -1 -1, -2
Not -2, -1 0, 0

Type = "Criminal" Sheriff's action
Shoot Not
Suspect's action Shoot 0, 0 2, -2
Not -2, -1 -1,1

If both players are rational and both know that both players are rational and everything that is known by any player is known to be known by every player (i.e. player 1 knows player 2 knows that player 1 is rational and player 2 knows this, etc. ad infinitumcommon knowledge), play in the game will be as follows according to perfect Bayesian equilibrium:[10][11]

When the type is "civilian", the dominant strategy for the suspect is not to shoot, and when the type is "criminal", the dominant strategy for the suspect is to shoot; alternative strictly dominated strategy can thus be removed. Given this, if the sheriff shoots, he will have a payoff of 0 with probability p and a payoff of -1 with probability 1-p, i.e. an expected payoff of p-1; if the sheriff does not shoot, he will have a payoff of -2 with probability p and a payoff of 0 with probability 1-p, i.e. an expected payoff of -2p. Thus, the Sheriff will always shoot if p-1 > -2p, i.e. when p > 1/3.

## References

1. ^ Harsanyi, John C., 1967/1968. "Games with Incomplete Information Played by Bayesian Players, I-III." Management Science 14 (3): 159-183 (Part I), 14 (5): 320-334 (Part II), 14 (7): 486-502 (Part III).
2. ^ a b Kajii, A.; Morris, S. (1997). "The Robustness of Equilibria to Incomplete Information". Econometrica. 65 (6): 1283–1309. doi:10.2307/2171737.
3. ^ Maschler, Michael; Solan, Eilon; Zamir, Shmuel (2013). Game Theory. Cambridge: Cambridge University Press. doi:10.1017/cbo9780511794216. ISBN 978-0-511-79421-6.
4. ^ Shoham, Yoav; Leyton-Brown, Kevin (2008). Multiagent Systems. Cambridge: Cambridge University Press. ISBN 978-0-511-81165-4.
5. ^ Ponssard, J. -P.; Sorin, S. (June 1980). "The LP formulation of finite zero-sum games with incomplete information". International Journal of Game Theory. 9 (2): 99–105. doi:10.1007/bf01769767. ISSN 0020-7276.
6. ^ Huang, Linan; Zhu, Quanyan (2020-02-01). "A dynamic games approach to proactive defense strategies against Advanced Persistent Threats in cyber-physical systems". Computers & Security. 89: 101660. arXiv:1906.09687. doi:10.1016/j.cose.2019.101660. ISSN 0167-4048.
7. ^ Albrecht, Stefano; Crandall, Jacob; Ramamoorthy, Subramanian (2016). "Belief and Truth in Hypothesised Behaviours". Artificial Intelligence. 235: 63–94. arXiv:1507.07688. doi:10.1016/j.artint.2016.02.004.
8. ^ Bacharach, M. (1999). "Interactive team reasoning: A contribution to the theory of cooperation". Research in Economics. 53: 117–47. doi:10.1006/reec.1999.0188.
9. ^ Newton, J. (2019). "Agency equilibrium". Games. 10 (1). doi:10.3390/g10010014.
10. ^ "Coursera". Coursera. Retrieved 2016-06-16.
11. ^ Hu, Yuhuang; Loo, Chu Kiong (2014-03-17). "A Generalized Quantum-Inspired Decision Making Model for Intelligent Agent". The Scientific World Journal. 2014. doi:10.1155/2014/240983. ISSN 1537-744X. PMC 3977121. PMID 24778580.