Microeconomics II (Graduate Master's Program), Ryuji Sano, repeated games v2

Academic year: 2018

(1)

3. Repeated Games

(and Long-term Relationship)

• The same game is repeatedly played by the same players

• A theory of long-term relationship

• The past outcome (history) is

• Completely observable: Perfect monitoring

• Not observable completely: Imperfect monitoring

Players commonly observe something: Imperfect public monitoring

Players privately observe something: Imperfect private monitoring

• Can people cooperate with each other when they are in a long-term relationship?

Stage game

(2)

3.1. Finitely Repeated Games

E.g., Prisoners’ Dilemma

Stage game actions: $A_i = \{C, D\}$, $A = A_1 \times A_2$

Stage payoffs $u_1, u_2$

PD is played $T$ times ($2 \le T < \infty$)

History at period $t$: $h^t = (a^1, a^2, \dots, a^{t-1}) \in H^t = A^{t-1}$ (with $h^1 = \emptyset$)

• (Pure) strategy of player $i$: for each $t$,

$s_i^t : H^t \to A_i$

$s_i^t(h^t) = a_i^t$ is the action taken at period $t$ under history $h^t$

Mixed and behavior strategies are well defined, but we do not consider them here

• Total payoff: $U_i = \sum_{t=1}^{T} u_i(a^t)$

1 \ 2      C        D

C          2, 2     -1, 3

D          3, -1    0, 0
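To make the stage game concrete, here is a minimal sketch (not part of the original notes; the names `ACTIONS`, `u`, and `is_stage_nash` are mine) that encodes the payoff matrix above and confirms by brute force that (D,D) is the unique pure-strategy Nash equilibrium of the stage game.

```python
from itertools import product

ACTIONS = ["C", "D"]

# u[(a1, a2)] = (payoff of player 1, payoff of player 2), as in the table above
u = {
    ("C", "C"): (2, 2), ("C", "D"): (-1, 3),
    ("D", "C"): (3, -1), ("D", "D"): (0, 0),
}

def is_stage_nash(a1, a2):
    """True if neither player gains by a unilateral one-period deviation."""
    best1 = max(u[(b1, a2)][0] for b1 in ACTIONS)
    best2 = max(u[(a1, b2)][1] for b2 in ACTIONS)
    return u[(a1, a2)][0] == best1 and u[(a1, a2)][1] == best2

print([a for a in product(ACTIONS, ACTIONS) if is_stage_nash(*a)])
# -> [('D', 'D')] : the unique stage Nash equilibrium
```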

(3)

Pure Strategy in Repeated Games

At (the beginning of) each period $t$ of the repeated game, each player has observed what happened up to period $t-1$.

A (single) history at period $t$ is a description of what happened up to period $t-1$:

$h^t = (a^1, a^2, \dots, a^{t-1})$

where $a^s \in A$ is the action profile played at period $s$.

The set of all possible histories at the beginning of period $t$:

$H^t = A^{t-1}$ (since $a^s \in A$ for $s = 1, \dots, t-1$, $H^t$ is the set of sequences of $t-1$ action profiles $a \in A$)

Player $i$ chooses, for every possible history $h^t \in H^t$, the action $a_i \in A_i$ to play at period $t$: $s_i^t(h^t) = a_i^t \in A_i$

Player $i$'s strategy specifies, for every period $t = 1, 2, \dots$ and every possible history $h^t \in H^t$, the action to play at period $t$.

Player $i$'s action plan at period $t$: $s_i^t : H^t \to A_i$

Player $i$'s strategy

$s_i = (s_i^t)_{t=1,2,3,\dots}$
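As an illustration of these objects (a sketch with names of my choosing, not the notes'), a history $h^t$ can be represented as a tuple of past action profiles and a pure strategy as a function mapping any such history to an action in $A_i$:

```python
from typing import Tuple

# A history h^t is the tuple of action profiles played in periods 1,...,t-1;
# the empty tuple () represents h^1.
History = Tuple[Tuple[str, str], ...]

def always_defect(h: History) -> str:
    """A pure strategy: s_i(h^t) = D for every history h^t."""
    return "D"

def cooperate_until_t3(h: History) -> str:
    """Another (arbitrary) pure strategy: play C in periods 1 and 2, D afterwards."""
    return "C" if len(h) < 2 else "D"

print(always_defect(()), cooperate_until_t3((("C", "C"),)))   # actions at t=1 and t=2
```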

(4)

SPNE

• Backward induction

• Period T: Regardless of $h^T \in H^T$, (D,D) is the unique Nash eqm

Simply the one-shot PD

• Period T-1: Regardless of $h^{T-1} \in H^{T-1}$ and the current outcome, (D,D) is played at period T

Same as the case where T-1 is the last period

(D,D) is played as the unique Nash eqm

• Period T-2: Repeat the same consideration

Regardless of $h^{T-2} \in H^{T-2}$, (D,D) is the unique Nash eqm

• And so on

• Therefore, (D,D) is played in all T periods in the unique SPNE

Cooperation (C,C) is not achieved in the finitely repeated PD…

(5)

Proposition 3.1.1. In finitely repeated games, it is a subgame perfect Nash equilibrium to play a stage Nash equilibrium in each period. In addition, if there is a unique Nash equilibrium in the stage game, repetition of stage Nash equilibrium is a unique subgame perfect Nash equilibrium.

• Repetition of stage NE is a trivial equilibrium

In infinitely repeated games, cooperation can be sustained even if stage Nash is unique!

• When there are multiple NE in the stage game, we can construct a non-trivial SPNE in which the stage NE is not played in every period

“Carrot and stick”

We learn this in homework

(6)

3.2. Infinitely Repeated Games

• Infinitely repeated PD

Same definition regarding the stage game

Same definition for history $h^t \in H^t$ ($t = 1, 2, \dots$) and strategy

• Discount factor $\delta \in (0,1)$

• Payoff is defined as the discounted sum:

$U_i = \sum_{t=1}^{\infty} \delta^{t-1} u_i(a^t)$

Sometimes the average payoff is used:

$v_i = (1 - \delta) U_i$

When a player earns a constant stage payoff $\bar{u}_i$ for all $t$, the associated payoff is $(1 + \delta + \delta^2 + \cdots)\,\bar{u}_i = \dfrac{\bar{u}_i}{1 - \delta}$

Hence, when the discounted sum is $U_i$, the average payoff $v_i$ per period satisfies $\dfrac{v_i}{1 - \delta} = U_i$, i.e., $v_i = (1 - \delta) U_i$
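A quick numerical check of the two formulas above (a sketch; the truncation at a large finite `T` is my approximation of the infinite sum):

```python
# With a constant stage payoff c, the discounted sum is c/(1-delta), and the
# average payoff (1-delta)*U recovers c.
delta, c, T = 0.9, 2.0, 10_000   # truncate the infinite sum at a large T

U = sum(delta ** (t - 1) * c for t in range(1, T + 1))
print(U, c / (1 - delta))        # both are (approximately) 20.0
print((1 - delta) * U)           # approximately 2.0: the average payoff per period
```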

(7)

• Discount factor

• Represents time preference

• Probability that the game continues

With probability $1 - \delta$, either player dies

• When the interest rate is $r > 0$, the present value of future payoffs is discounted by $\delta = \dfrac{1}{1+r}$

• Large $\delta$: more patient

(8)

• In infinitely repeated games, we cannot use backward induction, because the “end of the game” does not exist

However, we can solve the game thanks to the recursive structure of repeated games

• The subgame starting after one period (one play of the stage game) has the same structure as the original game

(9)

3.2.1. Cooperation

• In infinitely repeated PD, (C,C) can be achieved by an SPNE!

Definition 3.2.1. The grim-trigger strategy $s_i^{GT}$ in the repeated PD is defined by (1) $s_i^{GT}(h^1) = C$, and

(2) for $t \ge 2$, $s_i^{GT}(h^t) = C$ if $h^t = ((C,C), (C,C), \dots, (C,C))$,

$s_i^{GT}(h^t) = D$ otherwise

In the grim-trigger strategy, a player starts with cooperation.

Keep choosing C whenever no one (including myself) has chosen defection

Once someone chooses D, the player turns to D and plays D forever

• When both players take the grim-trigger strategy, (C,C) is played in each period
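The following sketch (my own representation, assuming the history encoding used earlier) implements Definition 3.2.1 and confirms that when both players follow grim trigger, (C,C) is played in every period:

```python
def grim_trigger(h):
    """Grim trigger: play C iff every past action profile is (C, C); otherwise D forever."""
    return "C" if all(profile == ("C", "C") for profile in h) else "D"

history = ()
for t in range(5):
    a = (grim_trigger(history), grim_trigger(history))   # both players use grim trigger
    history += (a,)
print(history)   # (('C', 'C'), ('C', 'C'), ('C', 'C'), ('C', 'C'), ('C', 'C'))
```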

(10)

Proposition 3.2.1. Suppose $\delta \ge 1/3$. In the (infinitely) repeated prisoners' dilemma, the profile of grim-trigger strategies is a subgame perfect Nash equilibrium.

Proof.

Suppose that player 2 takes the grim-trigger strategy.

If player 1 also takes the grim-trigger strategy, his payoff is $2/(1-\delta)$. Suppose that P1 deviates and takes D at period $t$. Then P1 earns stage payoff $u_1 = 3$ at $t$.

Because P2 chooses D for all $s \ge t+1$, it is optimal for P1 to choose D for all $s \ge t+1$ too.

Because the realized payoff is the same up to period $t-1$, the deviation is not profitable if

$\delta^{t-1}(3 + \delta \cdot 0 + \delta^2 \cdot 0 + \cdots) \le \delta^{t-1}(2 + \delta \cdot 2 + \delta^2 \cdot 2 + \cdots)$

$\Leftrightarrow \delta \ge \dfrac{1}{3}$

In the punishment phase (after a history containing D), both players play D forever, which is a repetition of the stage Nash equilibrium, so no player has a profitable deviation there either.
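The inequality in the proof can be checked numerically. This sketch (not from the notes) compares the continuation payoff from conforming, $2/(1-\delta)$, with the payoff from deviating to D, namely $3$ followed by $0$ forever, on a grid of discount factors:

```python
def conform(delta):
    """Continuation payoff from period t on when (C,C) is played forever."""
    return 2 / (1 - delta)

def deviate(delta):
    """Continuation payoff from deviating at t: 3 once, then 0 forever under mutual D."""
    return 3.0

for delta in (0.2, 0.3, 1/3, 0.4, 0.9):
    ok = conform(delta) >= deviate(delta)
    print(f"delta={delta:.3f}  conform={conform(delta):.2f}  "
          f"deviate={deviate(delta):.2f}  deviation unprofitable: {ok}")
# The deviation stops being profitable exactly at delta = 1/3.
```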

(11)

3.2.2. One Shot Deviation Principle

• In repeated games, it looks hard to check whether a strategy profile is an SPNE, because there are infinitely many strategies (even if stage game is very simple)

• However, in fact, it is not so hard. We can check it without considering all the possible strategies: the one-shot deviation principle

Definition 3.2.2. For every period $t$ and history $h^t$, the continuation payoff

$U_i^t(h^t)$ is the total payoff, under a specified strategy profile, in the subgame starting at $t$:

$U_i^t(h^t) = \sum_{\tau = t}^{\infty} \delta^{\tau - t} u_i(a^\tau)$

(12)

Given a strategy profile $s \in S$, the action profile played at period $t$ is

$a^t = s(h^t) = (s_i^t(h^t))_{i \in N}$

• Continuation payoff

$U_i = u_i(a^1) + \delta u_i(a^2) + \cdots + \delta^{t-2} u_i(a^{t-1}) + \delta^{t-1} u_i(a^t) + \delta^t u_i(a^{t+1}) + \cdots$

$= u_i(a^1) + \cdots + \delta^{t-2} u_i(a^{t-1}) + \delta^{t-1}\left[ u_i(a^t) + \delta u_i(a^{t+1}) + \delta^2 u_i(a^{t+2}) + \cdots \right]$

$\equiv u_i(a^1) + \cdots + \delta^{t-2} u_i(a^{t-1}) + \delta^{t-1} U_i^t(h^t)$

The first terms are determined by the history $h^t$; the bracketed term is the continuation payoff $U_i^t(h^t)$.

(13)

Definition 3.2.3. In repeated games, a strategy profile $s$ is a subgame perfect Nash equilibrium if for all $i$, all $t$, and all $h^t \in H^t$,

$U_i^t(h^t \mid s_i, s_{-i}) \ge U_i^t(h^t \mid s_i', s_{-i})$ for every strategy $s_i'$.

• Fix any strategy profile of the players other than $i$: $s_{-i}$

• Given $s_{-i}$, we want to verify whether $s_i$ is optimal or not

• To check this, we do NOT have to consider all possible strategies

Proposition 3.2.2 [one-shot deviation principle]. Fix any $s_{-i}$. A strategy $s_i$ is optimal if, for every period $t$ and history $h^t$, player $i$ is not better off deviating only at period $t$ and going back to $s_i$ from period $t+1$.

To check whether $s_i$ is optimal, it suffices to check that player $i$ does not gain by deviating from $s_i$ only once, at period $t$.

(14)

Proof. (sketch)

• Fix $s_{-i}$ and consider any alternative strategy $s_i'$

• Let $U^0 = U_i(s_i, s_{-i})$

• Let $U' = U_i(s_i', s_{-i})$, where $s_i'$ is any strategy

• Let $U^k = U_i(s_i^k, s_{-i})$, where $s_i^k$ is any "k-shot deviation" from $s_i$

Deviation at periods $t, t+1, \dots, t+k-1$ only

• We want to show: [$U^0 \ge U^1$ for every one-shot deviation] $\Rightarrow$ [$U^0 \ge U'$]

• Suppose $U^0 \ge U^1$. Then, we have

$U^0 \ge U^1 \ge U^2 \ge U^3 \ge \cdots$

The first inequality is by assumption

The second inequality follows by applying the assumption after the deviation at period $t$ (at period $t+1$)

The third inequality follows by applying the assumption after the deviations at periods $t$ and $t+1$ (at period $t+2$), and so on

Thus, no finite k-shot deviation is profitable

• To complete the proof, we need to consider the case of "infinite k"

An "infinite-period deviation" is almost the same as a "k-shot deviation" with large $k$, because the remaining future payoff is discounted by $\delta^{k-1} \approx 0$ and is negligible
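As an application of the one-shot deviation principle, the following sketch (the phase labels and helper names are mine) checks the grim-trigger profile by comparing, in each of its two payoff-relevant phases, the continuation payoff from conforming with that from a single-period deviation followed by a return to the strategy:

```python
def continuation(phase, delta):
    """Continuation payoff when both players conform to grim trigger from now on."""
    return 2 / (1 - delta) if phase == "cooperate" else 0.0

def one_shot_deviation(phase, delta):
    """Deviate for one period only, then follow grim trigger again.
    Any deviation triggers (or keeps) the punishment phase afterwards."""
    stage = 3 if phase == "cooperate" else -1   # play D against C ; play C against D
    return stage + delta * continuation("punish", delta)

def passes_one_shot_check(delta):
    return all(continuation(p, delta) >= one_shot_deviation(p, delta)
               for p in ("cooperate", "punish"))

print([(d, passes_one_shot_check(d)) for d in (0.2, 1/3, 0.5, 0.9)])
# -> a profitable one-shot deviation exists only when delta < 1/3
```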

(15)

3.2.3. The Folk Theorem

• Infinitely repeated games with a general stage game:

Players $N = \{1, \dots, n\}$

Player $i$'s action $a_i \in A_i$; action profile $a \in A$

Stage payoff function $u_i(a)$

Stage Nash equilibrium $a^N$

Proposition 3.2.3 [Folk Theorem]. Suppose that an action profile $a^* \in A$ satisfies $u_i(a^*) > u_i(a^N)$ for all $i \in N$. If the discount factor $\delta$ is sufficiently close to 1, there exists a subgame perfect Nash equilibrium such that the profile of average payoffs is $v = (u_1(a^*), \dots, u_n(a^*))$.

• If players are sufficiently patient ($\delta$ is large), any stage outcome better than the stage NE can be sustained in an SPNE

(16)

• In infinitely repeated games, there are so many SPNE

• Basically, almost everything can be supported in an SPNE

For any $\alpha \in [0,1]$, we can construct an SPNE that induces the average payoffs

$v_i = \alpha\, u_i(a^*) + (1 - \alpha)\, u_i(a^N)$

E.g. (repeated PD): By taking (C,C) in every odd period and (D,D) in every even period, players achieve (almost) the "half cooperation" ($\alpha = 1/2$); see the sketch below

• In fact, even an average payoff such that $v_i < u_i(a^N)$ can be sustained
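A quick check of the "half cooperation" example above (a sketch, not from the notes): alternating (C,C) and (D,D) gives the discounted payoff $2/(1-\delta^2)$, so the average payoff $2/(1+\delta)$ tends to $1 = \tfrac{1}{2}\cdot 2 + \tfrac{1}{2}\cdot 0$ as $\delta \to 1$.

```python
# Average payoff from playing (C,C) in odd periods and (D,D) in even periods.
for delta in (0.5, 0.9, 0.99, 0.999):
    U = 2 / (1 - delta ** 2)          # 2 + 0*delta + 2*delta**2 + 0*delta**3 + ...
    print(delta, (1 - delta) * U)     # equals 2/(1+delta) -> 1 as delta -> 1
```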

(17)

3.2.4. Tit-for-Tat Strategy

• Axelrod (1984)

• Repeated PD

Definition 3.2.4. The tit-for-tat strategy $s_i^{TFT}$ in the repeated PD is defined by (1) $s_i^{TFT}(h^1) = C$, and

(2) $s_i^{TFT}(h^t) = a_j^{t-1}$ for $t \ge 2$ (the action taken by the other player $j$ in the previous period)

• In Axelrod's computer tournaments, various submitted strategies were run against one another. Tit-for-tat won.
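In the spirit of Axelrod's tournament, here is a toy round-robin (an illustration only: the field of strategies, the horizon `T = 100`, and the undiscounted scoring are my choices, not Axelrod's actual setup):

```python
u = {("C", "C"): (2, 2), ("C", "D"): (-1, 3), ("D", "C"): (3, -1), ("D", "D"): (0, 0)}

def tit_for_tat(own, opp):   return "C" if not opp else opp[-1]
def always_defect(own, opp): return "D"
def grim_trigger(own, opp):  return "C" if all(a == "C" for a in own + opp) else "D"

def play(s1, s2, T=100):
    """Total (undiscounted) payoff of the first strategy over T periods."""
    h1, h2, total1 = [], [], 0
    for _ in range(T):
        a1, a2 = s1(h1, h2), s2(h2, h1)   # each strategy sees (own history, opponent's history)
        total1 += u[(a1, a2)][0]
        h1.append(a1); h2.append(a2)
    return total1

strategies = {"TFT": tit_for_tat, "ALL-D": always_defect, "GRIM": grim_trigger}
scores = {name: sum(play(s, t) for t in strategies.values()) for name, s in strategies.items()}
print(scores)   # TFT and GRIM tie for the top score in this tiny field
```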

(18)

Proposition 3.2.4. In (generic) repeated PD games, the profile of tit-for-tat strategies is NOT a subgame perfect Nash equilibrium.

• Under the specification of sec 3.1, the profile of tit-for-tat strategies is a subgame perfect Nash equilibrium if and only if $\delta = 1/3$

• Proof is homework
