On Stochastic Acceptance and Gradual Cooperation in Voluntarily Repeated Prisoners’ Dilemma with No Information Flow∗

by

Takako Fujiwara-Greve Department of Economics

Keio University

2-15-45 Mita, Minatoku, Tokyo 108-8345 JAPAN E-mail: [email protected]

This version: October, 2007.

Abstract

There has been an accumulation of research on voluntarily repeated prisoners’ dilemma with no information flow. This paper provides a unified comparison of various systems to enforce cooperation, as the matching probability changes. In many cases, the loss of payoff by not being able to adjust actions until the next period turned out to be greater than the loss of payoff by not being able to adjust the acceptance probability. Therefore stochastic acceptance equilibria give higher payoffs than those of gradual cooperation equilibria. The cost associated with the discrete timing of decision making is a new observation that has not been noticed in ordinary repeated games.

Keywords: Voluntary partnership, cooperation, matching, timing.

JEL classification number: C 73

∗The author is grateful to Shigeo Muto and Mikio Nakayama for comments on an earlier draft.


1. INTRODUCTION

There has been an accumulation of research on voluntarily repeated games with no information flow.

In such games, players can choose whether to repeat the same game with the same partner or not, and if a partnership ends, players meet new partners without the knowledge of past actions of each other.

Since a deviator can run away without affecting his future partnerships, ordinary trigger strategies are not available to enforce non-myopic actions such as cooperation in a prisoners’ dilemma.

To enforce non-myopic actions by rational players from the beginning of a partnership, three mechanisms are known. One is to reduce the probability of starting a partnership (Fujiwara-Greve, 2002, and Eeckhout, 2006), even if players are matched in the random matching process. Another is gift-giving before a partnership starts (Carmichael and MacLeod, 1997). These two mechanisms can be summarized as “initial sunk cost” mechanisms, because something happens before a partnership starts that reduces the payoffs. The third is asymmetric strategy distributions in which non-myopic strategies are balanced with strategies which play myopically initially (Fujiwara-Greve and Okuno-Fujiwara, 2007).

If full cooperation cannot be enforced from the beginning of a partnership, one resorts to “gradual cooperation” (Datta, 1996, Kranton, 1992a, and Fujiwara-Greve and Okuno-Fujiwara, 2007). A gradual cooperation strategy initially plays myopic actions but keeps the partnership, and it starts a non-myopic action if the partnership lasted for some periods.

All of these equilibria reduce the continuation payoff when one ends a partnership, which is the lifetime payoff from starting anew. It is then important to investigate which type of equilibrium is relatively more efficient. A partial answer was given by Fujiwara-Greve and Okuno-Fujiwara (2007) for Prisoners’ Dilemma. They showed that asymmetric equilibria (if they exist) are more efficient than symmetric equilibria of gradual cooperation. However, they did not compare equilibria with “initial sunk cost” by random rejection or gift-giving with gradual cooperation equilibria.

In this paper we focus on symmetric equilibria of Prisoners’ Dilemma games and compare full cooperation equilibria of initial stochastic acceptance with gradual cooperation equilibria. Both have advantages and disadvantages. Stochastic acceptance equilibria require that there is a jointly observable random signal on which matched players coordinate whether to accept each other or not. Gradual cooperation equilibria do not need such a signal but, due to the discrete timing and discrete feasible actions of the game, players cannot adjust the payoffs in small increments.

Our findings are as follows. For low probabilities of matching, ending a partnership becomes a sufficient threat for defectors so that an optimal equilibrium accepts any match. Therefore both types of equilibria coincide. For medium values of matching probabilities, a stochastic acceptance equilibrium gives a higher payoff than the most efficient gradual cooperation equilibrium, for a range of payoff parameters. The point is that the loss of payoff by not being able to adjust actions until the next period is greater than the loss of payoff by not being able to adjust the acceptance probability well. Analogously, if players can give costly gifts to each other before a partnership starts, the cost of gift can be adjusted continuously and thus gift-giving equilibria yield higher payoffs than those of gradual cooperation equilibria.

It is an advantage of the voluntarily repeated game formulation that we are able to consider the cost associated with the discrete timing of decision making. The model can be applied to many economic situations which are voluntary and repeatable, such as transactions and employment relationships.

2. MODEL

Suppose that there is a large population of identical players.¹ Each player lives forever over a discrete time horizon τ = 1, 2, . . . and discounts the future payoffs by a factor δ ∈ (0,1).

The dynamic game proceeds as follows. At the beginning of τ = 1, all players are unmatched.

After τ ≥ 2, the configuration of unmatched players and players with a partner is determined by exogenous stochastic separation and endogenous decisions.

At the beginning of each period, unmatched players enter a random matching process. With probability p, a player is matched with another player. If a player does not get a match in this phase, he receives the unmatched payoff u and stays as an unmatched player until the period ends.

Matched players simultaneously decide whether to Accept the match (action a) or Reject it (action r). A partnership forms if and only if both players choose action a. If at least one of the players chooses action r, they become unmatched for that period and receive the unmatched payoff.

Players in a partnership play a symmetric stage game and receive the payoff accordingly. In this paper we focus on two-action Prisoners’ Dilemma as the stage game.

P1 \ P2 |   C   |   D
--------+-------+-------
   C    | c, c  | ℓ, g
   D    | g, ℓ  | d, d

Table 1: Prisoners’ Dilemma

¹ It is straightforward to extend the model into a two-population model with an asymmetric 2-person stage game.

[Figure 1: Outline of the game. Within a period: random matching; if matched and both accept, the PD is played; then either both maintain the partnership or at least one player ends it, followed by possible exogenous separation. Players with no match, with a rejected match, or with an ended or separated partnership enter random matching in the next period; continuing partners play the PD again.]

The payoff parameters are ordered as g > c > d > ℓ and 2c > g + ℓ. The latter assumption implies that the symmetric action profile (C, C) is efficient. Assume that c > u. Otherwise there is no point in trying to achieve cooperation. We also assume that u > d, which means that maintaining a partnership with mutual defection forever is not rational.
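The restrictions on the payoff parameters can be checked mechanically. The following is a minimal Python sketch; the function name is hypothetical, and the value ℓ = −1 is an illustrative assumption (the numerical examples in the figure captions report g, c, u, d but not ℓ).

```python
def check_pd_parameters(g, c, d, ell, u):
    """Return the list of violated model assumptions (empty if all hold)."""
    violated = []
    if not (g > c > d > ell):
        violated.append("g > c > d > ell")
    if not (2 * c > g + ell):
        violated.append("2c > g + ell")   # (C, C) is efficient
    if not (c > u):
        violated.append("c > u")          # cooperation beats being unmatched
    if not (u > d):
        violated.append("u > d")          # mutual defection forever is not rational
    return violated

# g, c, d, u as in the caption of Figure 4; ell = -1 is a hypothetical choice.
print(check_pd_parameters(g=10, c=6, d=0, ell=-1, u=2))
```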

The stage-game actions are observable only to the players in the same partnership. After the stage game, players in a partnership choose simultaneously whether to Maintain the partnership (action m) or End it (action e). They can base their actions/decisions only on the partnership history, which is the sequence of realized pure actions within the current partnership. At the end of a period, there is a probability 1−q that the partnership ends for some exogenous reason, even if both partners chose to maintain the relationship.² The separated players become unmatched and the period ends. The length of a partnership is thus determined by the strategic decisions and the exogenous separation and is counted as t = 1, 2, . . ..

The outline of the game is depicted in Figure 1. The game is of complete information.

² This makes it impossible for players to distinguish deviators and exogenously separated players.

Let H₁ = {∅} be the null history when the partnership is newly formed. For any t = 2, 3, . . ., let H_t = ({C, D} × {C, D})^{t−1} be the set of possible partnership histories until the t-th period of a partnership.³ A pure strategy of a player is a sequence of functions s = (f^A, {f_t^G}_{t=1}^∞, {f_t^C}_{t=1}^∞) for three phases as follows:

[Acceptance Phase] f^A ∈ {a, r} is the choice of accepting a new match or rejecting a new match (based on the empty partnership history);

[Stage Game Phase] for each t = 1, 2, . . ., f_t^G : H_t → {C, D} is the action plan in the stage game in the t-th period of a partnership, where the domain is the set of all partnership histories until the previous period; and

[Continuation Phase] for each t = 1, 2, . . ., f_t^C : H_t × ({C, D} × {C, D}) → {m, e} is the continuation decision in the t-th period of a partnership based on a partnership history including the current period.

In the above formulation, we are assuming that a strategy does not depend on the calendar time τ (but may depend on the partnership length t) and a player uses the same s in any match he faces.

Let S be the set of all pure strategies and Σ = ∆(S × S) be the set of correlated strategy profiles, i.e., the set of all probability distributions over S × S. (However, we focus only on correlated action in the acceptance phase.) For each strategy s, a continuation strategy of s after a partnership history is defined as the restricted strategy starting from that partnership history.

A strategy combination of all players (together with the exogenous separation and the random matching) determines a stochastic sequence of stage payoffs (either the payoff from the stage game or u) to each player. We assume that the objective of a player is to maximize the expected sum of the stochastic sequence of stage payoffs over the infinite horizon, with discounting:

$$U(\sigma) = E\sum_{\tau=1}^{\infty} \delta^{\tau-1} u(\tau;\sigma),$$

where σ ∈ Σ is a correlated strategy profile and u(τ;σ) is either the unmatched payoff u or the expected payoff of some correlated action profile in the stage game induced by the strategy profile σ in period τ. The expectation is taken over the correlation probabilities in σ and the random separation.

A commonly used equilibrium concept for extensive form games with imperfect information is sequential equilibrium (e.g., Abreu et al., 1986, 1990, and Kandori and Matsushima, 1998). A sequential equilibrium consists of a strategy combination of all players and their belief probability measures over the decision nodes satisfying the following two conditions: the strategies must be optimal given the beliefs (sequential rationality) and the beliefs must be consistent with the strategies (consistency) in the sense that there exists a sequence of completely mixed strategies and associated belief measures that converge to the equilibrium strategies and beliefs. (See Fudenberg and Tirole, 1990, and Kreps and Wilson, 1982.) Consistency ensures that the equilibrium strategies are optimal against small perturbations.

³ It is sufficient to consider that actions and decisions depend only on the stage game actions within the partnership. Even though players can observe the acceptance decisions and continuation decisions, only (a, a) and (m, m) will lead to the formation/continuation of a partnership, and thus different actions cannot be chosen based on different acceptance/continuation decision combinations.

We concentrate on sequential rationality considerations and do not explicitly construct belief measures. This is because only partnership histories are payoff-relevant, and those are perfectly observed.

We call a strategy combination a sequential equilibrium if for each player and for each partnership history, his continuation strategy after the partnership history is optimal given the current partner’s continuation strategy and the strategy distribution in the society.

We focus on symmetric strategy combinations such that all players in the same population play the same strategy. There are two justifications for this focus. One is that the average payoff of an individual player coincides with the average payoff of the population, under a symmetric strategy combination. The other is that if it is an equilibrium, it can be interpreted as a “social standard of behavior” (Okuno-Fujiwara and Postlewaite, 1995). Most literature (except Fujiwara-Greve and Okuno-Fujiwara, 2007) on voluntarily repeated games focuses on symmetric strategy combinations as well.

Finally, we note the existence of autarky equilibria for this game, for any parameter.

REMARK 1. (Fujiwara-Greve, 2002) For any parameter combination of the model, there are sequential equilibria such that any player rejects any new match, with arbitrary action choice in the stage game.

Proof: If no one accepts a new match, there is no incentive to accept a match, since that is not going to make a new partnership. Therefore rejecting any match is (weakly) optimal.

3. STOCHASTIC ACCEPTANCE

3.1. When a Continuum of Random Variables are Available

In this subsection we assume that there is a continuum of random variables that can be used as a joint correlation device at the beginning of every period. Hence, for any α ∈ (0,1), there is a random signal system such that all matched players jointly observe a signal (say, G) with probability α. An example is the “market condition”; if some publicly observable index is above some level, then the signal is G, and this threshold can be chosen arbitrarily in a continuum. One can interpret this section as an ideal situation or a benchmark for the finite-signal case of Section 3.2.

A stochastic acceptance strategy, denoted as s(α), is defined as follows.⁴

1. Accept a match if and only if a signal (say, G) which occurs with probability α is observed.

2. Play C if and only if either the partnership history is empty or it consists of (C, C) only.

3. Maintain the partnership if and only if (C, C) is observed in the current period.

Let U be the value when a player is unmatched and about to enter the random matching process and V be the value when a player has a partner at the beginning of a period. If all players use s(α), the value functions of a player satisfy the following simultaneous equations.

$$U = p\{\alpha V + (1-\alpha)(u+\delta U)\} + (1-p)(u+\delta U), \qquad V = c + \delta\{qV + (1-q)U\}.$$

By solving the simultaneous equations, we obtain explicitly

$$U = \frac{\alpha p c + (1-\delta q)(1-\alpha p)u}{(1-\delta)\{1-\delta q(1-\alpha p)\}}, \tag{1}$$

$$V = \frac{\{1-\delta(1-\alpha p)\}c + \delta(1-q)(1-\alpha p)u}{(1-\delta)\{1-\delta q(1-\alpha p)\}}. \tag{2}$$

From (1) and (2), we have

$$V - (u+\delta U) = \frac{c-u}{1-\delta q(1-\alpha p)} > 0, \tag{3}$$

and

$$V - U = \frac{(1-\alpha p)(c-u)}{1-\delta q(1-\alpha p)} > 0. \tag{4}$$
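As a numerical sanity check of (1)–(4), the following Python sketch evaluates the closed forms and plugs them back into the two recursive equations. The function name and the values α = 0.5 and p = 0.3 are illustrative assumptions; the remaining parameters are those of Figure 2.

```python
def stochastic_values(alpha, p, delta, q, c, u):
    """Closed forms (1) and (2) for U and V when all players use s(alpha)."""
    ap = alpha * p
    denom = (1 - delta) * (1 - delta * q * (1 - ap))
    U = (ap * c + (1 - delta * q) * (1 - ap) * u) / denom
    V = ((1 - delta * (1 - ap)) * c + delta * (1 - q) * (1 - ap) * u) / denom
    return U, V

# alpha and p are arbitrary illustrative values.
alpha, p, delta, q, c, u = 0.5, 0.3, 0.8, 0.7, 6.0, 2.0
U, V = stochastic_values(alpha, p, delta, q, c, u)

# Residuals of the two simultaneous equations; both should be numerically zero.
res_U = U - (p * (alpha * V + (1 - alpha) * (u + delta * U)) + (1 - p) * (u + delta * U))
res_V = V - (c + delta * (q * V + (1 - q) * U))

# Gaps (3) and (4), to be compared with their closed forms.
D = 1 - delta * q * (1 - alpha * p)
gap3 = V - (u + delta * U)
gap4 = V - U
print(res_U, res_V, gap3 - (c - u) / D, gap4 - (1 - alpha * p) * (c - u) / D)
```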

PROPOSITION 1. (g−c)/{(g−u)δq} < 1 if and only if there exists α ∈ (0,1] such that the stochastic acceptance strategy s(α) played by all players is a sequential equilibrium.

Proof: Let us consider sequential rationality at each decision phase.

1. Acceptance decision: If the G signal is not observed, the matched player would reject so that your acceptance decision does not matter. If G is observed, given that the matched player accepts, if you also accept the match, your continuation payoff is V. If you reject, the continuation payoff is u + δU. Hence acceptance is better from (3).

2. Stage game: If the players are in the stage game phase, it must be that either it is a newly formed partnership or no one has played D in the partnership history. Thus the current partner would play C. If you choose C, the continuation payoff is V. If you choose D, the partnership ends and the continuation payoff is g + δU. By computation, C is better if and only if

$$V - (g+\delta U) \ge 0 \iff \delta q(1-\alpha p) \ge \frac{g-c}{g-u}. \tag{5}$$

3. Continuation decision: If you observed a partnership history with D, the partner would end the partnership, so whether you choose m or e does not matter. If you observed a partnership history with only (C, C), choosing m yields the continuation payoff of V, while choosing e gives U. Hence maintaining is better from (4).

⁴ Needless to say, there are many strategies which differ from s(α) at off-path nodes and give the same equilibrium payoff. We are using s(α) as a representative.

Therefore condition (5) is necessary and sufficient for the sequential equilibrium to hold. It can be arranged as

$$\alpha \le \left(1 - \frac{g-c}{(g-u)\delta q}\right)\frac{1}{p}.$$

To warrant that such α > 0 exists, it is necessary and sufficient that the term in the large brackets is positive, i.e.,

$$\frac{g-c}{(g-u)\delta q} < 1. \tag{6}$$

COROLLARY 1. (Fujiwara-Greve, 2002) There exist (p, δ, q, α) ∈ (0,1] × (0,1) × (0,1) × (0,1] such that s(α) played by all players is a sequential equilibrium.

Proof: Since the right hand side of (5) is strictly between 0 and 1, for sufficiently large δq and sufficiently small αp, the inequality holds.

From now on we assume that δq is sufficiently large so that (6) holds. Then for given (p, δ, q), the maximal (i.e., most efficient) acceptance probability is

$$\alpha^*(p) := \min\left\{\left(1 - \frac{g-c}{(g-u)\delta q}\right)\frac{1}{p},\; 1\right\}.$$
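The definition of α^∗(p) and the equilibrium condition (5) can be sketched in Python as follows. The function names and the numerical tolerance are assumptions made for illustration; the parameter values are those of Figure 2.

```python
def alpha_star(p, delta, q, g, c, u):
    """Maximal acceptance probability alpha*(p); requires condition (6)."""
    ratio = (g - c) / ((g - u) * delta * q)
    if ratio >= 1:
        raise ValueError("condition (6) fails: no s(alpha) equilibrium with alpha > 0")
    return min((1 - ratio) / p, 1.0)

def satisfies_condition_5(alpha, p, delta, q, g, c, u, tol=1e-12):
    """Condition (5); tol absorbs floating-point rounding at equality."""
    return delta * q * (1 - alpha * p) >= (g - c) / (g - u) - tol

params = dict(delta=0.8, q=0.7, g=10.0, c=6.0, u=2.0)
for p in (0.05, 0.2, 0.5, 1.0):
    a = alpha_star(p, **params)
    print(p, round(a, 4), satisfies_condition_5(a, p, **params))
```

For these parameters every match is accepted at p = 0.05, while for larger p the acceptance probability falls so that (5) holds with equality.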

[Figure 2: Maximal payoff of stochastic acceptance equilibrium, plotted against the matching probability p. The payoff is attained by s(1) for p ≤ p^∗ and by s(α^∗(p)) for p > p^∗. (Parameter values: g = 10, c = 6, u = 2, δ = 0.8, q = 0.7.)]

Among the parameters, the matching probability p is likely to be controllable by a policy. Therefore we focus on how the maximal equilibrium payoff changes as p changes. As p increases from 0 to 1, the maximal α^∗(p) decreases from 1 to 1 − (g−c)/{(g−u)δq}. This means that as the matching becomes easier, we need to reduce the probability of starting a new partnership in order to make a potential defector wait, as a punishment. In other words, when the matching probability p is small, players can start cooperative relationships right away, because ending the partnership is a sufficient punishment. This is the logic of efficiency wage theory to motivate workers by the threat of firing (Shapiro and Stiglitz, 1984). Given (δ, q), the critical value of p above which α^∗(p) < 1 is

$$p^* = 1 - \frac{g-c}{(g-u)\delta q}.$$

For p ≤ p^∗, the maximal symmetric equilibrium payoff (attained by s(1)) is

$$U(s(1)) = \frac{pc + (1-\delta q)(1-p)u}{(1-\delta)\{1-\delta q(1-p)\}}.$$

By differentiation,

$$\frac{\partial U}{\partial p} = \frac{(1-\delta q)(c-u)}{(1-\delta)\{1-\delta q(1-p)\}^2} > 0.$$

Hence U(s(1)) is increasing in the matching probability p. For p > p^∗, the maximal symmetric equilibrium payoff is constant in p (since (5) is satisfied with equality), and it is

$$U(s(\alpha^*(p))) = \frac{\delta q g - (g-c)}{\delta q(1-\delta)}. \tag{7}$$

Therefore the overall maximum (as p changes) is also (7). See Figure 2.
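The shape of the maximal payoff, increasing up to p^∗ and flat at (7) afterwards, can be reproduced with a short Python sketch. The function name is hypothetical; the parameter values are those of Figure 2.

```python
def max_payoff_stochastic(p, delta, q, g, c, u):
    """Maximal symmetric payoff with a continuum of signals, as a function of p."""
    dq = delta * q
    p_star = 1 - (g - c) / ((g - u) * dq)
    if p <= p_star:
        # s(1): accept every match; payoff U(s(1)) is increasing in p.
        return (p * c + (1 - dq) * (1 - p) * u) / ((1 - delta) * (1 - dq * (1 - p)))
    # s(alpha*(p)): condition (5) binds; payoff equals the constant (7).
    return (dq * g - (g - c)) / (dq * (1 - delta))

params = dict(delta=0.8, q=0.7, g=10.0, c=6.0, u=2.0)
for p in (0.05, 0.1, 0.3, 1.0):
    print(p, round(max_payoff_stochastic(p, **params), 4))
```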

Note that for ordinary repeated games, the maximal symmetric equilibrium payoff under discounting is c/(1−δ), which is strictly greater than (7) since

$$\frac{c}{1-\delta} - \frac{\delta q g - (g-c)}{\delta q(1-\delta)} = \frac{(1-\delta q)(g-c)}{(1-\delta)\delta q} > 0.$$

Thus it is clear that there is a social cost when players can run away with no information flow. Note also that the matching probability cannot increase the payoff of players beyond (7). This is because the payoff must be kept not too large to prevent deviations. Hence, ironically, a government policy to increase the random matching probability may not improve the payoffs.

3.2. When Only Finite Random Variables are Available

Now suppose that we no longer have a continuum of random variables to correlate the initial acceptance probability. Instead, assume that there is a finite set A consisting of real numbers between 0 and 1 that can be the joint acceptance probability using some stochastic signal. An example is the ratio of “desirable characteristics” that both players want to see from the other, e.g., school background, gender, and so on. If a two-outcome random variable with probability structure (α, 1−α) exists, we can use either of α or 1−α as the acceptance probability. Therefore the set of feasible acceptance probabilities can be expressed from the largest to the smallest as A = {α₁, α₂, . . . , α_K}, where α_k ∈ (0,1) and α_k > α_{k+1} for all k, and α_K = 1−α₁, α_{K−1} = 1−α₂ and so on.

Moreover, if |A| = 1, then the unique element must be 0.5. If A contains α ≠ 0.5, then at least one element is greater than 0.5. Therefore if A is nonempty, there is an element in A which is not less than 0.5. This fact will be useful later.

In order to satisfy the equilibrium condition (5) and to have the acceptance probability as high as possible, the players need to use a declining step function of acceptance probability as the matching probability p increases. (See Figure 3 for the case of α₁ = 0.8, α₂ = 0.5, and α₃ = 1−α₁ = 0.2.) First, for each k = 1, 2, . . . , K, define p_{α_k} implicitly by

$$\alpha_k = \alpha^*(p_{\alpha_k}).$$

Let α_K̄ be the smallest probability value in A such that p_{α_K̄} ≤ 1. Let

$$\hat{\alpha}^*(p) = \begin{cases} 1 & \text{if } p \le p^*, \\ \alpha_1 & \text{if } p^* < p \le p_{\alpha_1}, \\ \alpha_k & \text{if } p_{\alpha_{k-1}} < p \le p_{\alpha_k} \text{ for } k = 2, 3, \dots, \bar{K}. \end{cases}$$

Then for any p ≤ p_{α_K̄}, the acceptance probability is not more than α^∗(p), satisfying (5), and it is as high as possible. Note that for p > p_{α_K̄} there is no α ∈ A that satisfies (5) to sustain a sequential equilibrium. (Still the autarky equilibria exist.) Within each interval (p_{α_{k−1}}, p_{α_k}], the equilibrium payoff is increasing in p. At p = p_{α_k} for some k, the payoff hits the overall maximum, since the coordination probability satisfies the condition (5) with equality.
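The step function can be sketched in Python as below, assuming the feasible set {0.8, 0.5, 0.2} of Figure 3 and the payoff parameters of Figure 2. The function name, the tolerance, and the convention of returning None when no cooperative equilibrium exists are illustrative choices.

```python
def alpha_hat(p, signals, delta, q, g, c, u, tol=1e-12):
    """Largest feasible acceptance probability satisfying (5), or None."""
    # Condition (5) is equivalent to alpha <= bound.
    bound = (1 - (g - c) / ((g - u) * delta * q)) / p
    if bound >= 1:
        return 1.0                      # p <= p*: accept every match
    feasible = [a for a in signals if a <= bound + tol]
    return max(feasible) if feasible else None   # None: only autarky remains

signals = [0.8, 0.5, 0.2]
params = dict(delta=0.8, q=0.7, g=10.0, c=6.0, u=2.0)
for p in (0.10, 0.12, 0.20, 0.40, 0.60):
    print(p, alpha_hat(p, signals, **params))
```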

[Figure 3: Optimal strategy and maximal payoff when the feasible correlation probabilities are {0.8, 0.5, 0.2}. One panel plots the acceptance probabilities α^∗(p) and α̂^∗(p) against p; the other plots the maximal payoff against p, with segments U(s(0.8)), U(s(0.5)), U(s(0.2)) ending at p_{0.8}, p_{0.5}, p_{0.2}, and no cooperative equilibrium beyond p_{0.2}.]

4. GRADUAL COOPERATION

4.1. Equilibrium

Next we consider strategies which initially play D but maintain the partnership and eventually cooperate. This type of strategy was the focus of Fujiwara-Greve and Okuno-Fujiwara (2007) for two-action prisoners’ dilemma. In the literature of generalized prisoners’ dilemma, such as Datta (1996) and Kranton (1996a), the level of cooperation can be continuously increased over time, but in a two-action game such as ours, this type of strategy can be interpreted as a gradual cooperation strategy.

Let s⁰(T) be a gradual cooperation strategy which plays D in the first T periods of a partnership, as follows.

1. Accept any match.

2. If the partnership length is t ≤ T, play D and maintain the partnership if and only if (D, D) is observed in the current period. (The latter is for simplicity of computation.)

3. If the partnership length is t > T, play C and maintain the partnership if and only if (C, C) is observed in the current period.

With this type of strategy, the punishment strength is adjusted by the length of T and thus they can accept any match. Let us find the condition that s⁰(T) played by all players is a sequential equilibrium.

Let U⁰ be the value when a player is unmatched before the random matching process of a period begins and V⁰(t) be the value when a player is in the t-th period of a partnership. Then they satisfy the following simultaneous equations.

$$U^0 = (1-p)(u+\delta U^0) + pV^0(0),$$

$$\begin{aligned}
V^0(0) &= \{1+\delta q+\cdots+\delta^{T-1}q^{T-1}\}d + (\delta^T q^T+\cdots)c + \delta(1-q)\{1+\delta q+\delta^2q^2+\cdots\}U^0 \\
&= \frac{1-\delta^T q^T}{1-\delta q}d + \frac{\delta^T q^T}{1-\delta q}c + \frac{\delta(1-q)}{1-\delta q}U^0.
\end{aligned}$$

By solving the simultaneous equations, we obtain explicitly

$$U^0 = \frac{(1-\delta q)(1-p)u + p\{(1-\delta^T q^T)d + \delta^T q^T c\}}{(1-\delta)\{1-\delta q(1-p)\}}. \tag{8}$$
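The closed form (8) can be checked against the recursive equations numerically. In the Python sketch below, the function names and the choice T = 1, p = 0.3 are illustrative assumptions; the payoff parameters are those of Figure 4.

```python
def gradual_U0(T, p, delta, q, c, d, u):
    """Closed form (8): value of an unmatched player when all play s0(T)."""
    dq = delta * q
    stage = (1 - dq**T) * d + dq**T * c
    return ((1 - dq) * (1 - p) * u + p * stage) / ((1 - delta) * (1 - dq * (1 - p)))

def gradual_V0_new(T, U0, delta, q, c, d):
    """Value V0(0) of a newly formed partnership, given U0."""
    dq = delta * q
    return ((1 - dq**T) * d + dq**T * c + delta * (1 - q) * U0) / (1 - dq)

T, p, delta, q, c, d, u = 1, 0.3, 0.8, 0.7, 6.0, 0.0, 2.0
U0 = gradual_U0(T, p, delta, q, c, d, u)
V0 = gradual_V0_new(T, U0, delta, q, c, d)

# Residual of U0 = (1-p)(u + delta*U0) + p*V0(0); should be numerically zero.
residual = U0 - ((1 - p) * (u + delta * U0) + p * V0)
print(round(U0, 4), round(V0, 4), residual)
```

With T = 0 the formula reduces to the payoff of s(1) from Section 3.1, as expected since s⁰(0) cooperates from the first period.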

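For general t,

$$\begin{aligned}
V^0(t) &= \{1+\delta q+\delta^2q^2+\cdots+\delta^{T-t-1}q^{T-t-1}\}d + \{\delta^{T-t}q^{T-t}+\cdots\}c + \delta(1-q)\{1+\delta q+\delta^2q^2+\cdots\}U^0 \\
&= \frac{1-\delta^{T-t}q^{T-t}}{1-\delta q}d + \frac{\delta^{T-t}q^{T-t}}{1-\delta q}c + \frac{\delta(1-q)}{1-\delta q}U^0, \quad \text{for } t \le T,
\end{aligned}$$

$$\begin{aligned}
V^0(t) = V^0(T+1) &= (1+\delta q+\delta^2q^2+\cdots)c + \delta(1-q)(1+\delta q+\delta^2q^2+\cdots)U^0 \\
&= \frac{1}{1-\delta q}c + \frac{\delta(1-q)}{1-\delta q}U^0, \quad \text{for } t \ge T+1.
\end{aligned}$$

Clearly, V⁰(t) > V⁰(t−1) for any t ≤ T + 1, since there is less and less time to suffer from (D, D) as time passes in a partnership. Note that

$$V^0(0) - (u+\delta U^0) = \frac{\delta^T q^T(c-d) - (u-d)}{1-\delta q(1-p)},$$

and

$$V^0(0) - U^0 = \frac{(1-p)\{\delta^T q^T(c-d) - (u-d)\}}{1-\delta q(1-p)}.$$

Therefore, V⁰(0) ≥ u + δU⁰ and V⁰(0) ≥ U⁰ hold if and only if

$$\delta^T q^T(c-d) \ge (u-d). \tag{9}$$

It then follows that V⁰(t) > u + δU⁰ and V⁰(t) > U⁰ for any t ≥ 1.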

PROPOSITION 2. For any T, s⁰(T) played by all players is a sequential equilibrium if and only if (9) and

$$(\delta q)^T(c-d) \le (u-d) + \frac{1-\delta q(1-p)}{\delta p q}\{\delta q(g-u) - (g-c)\} \tag{10}$$

hold simultaneously.

Proof: Let us consider sequential rationality at each decision phase.

1. Acceptance decision: Given that a newly matched player accepts, if you also accept, the continuation value is V⁰(0), while if you reject, the continuation value is u + δU⁰. Hence a player follows s⁰(T) in the acceptance decision phase if (9) holds.

2. Stage game: When t ≤ T, following s⁰(T) gives the continuation value of V⁰(t), while a one-shot deviation to play C gives ℓ + δU⁰ < u + δU⁰. Hence if (9) holds, following s⁰(T) is better. When t ≥ T + 1, following s⁰(T) gives V⁰(T + 1), while a one-shot deviation to play D gives g + δU⁰. By computation,

$$\{V^0(T+1) - (g+\delta U^0)\}(1-\delta q)\{1-\delta q(1-p)\} = \{1-\delta q(1-p)\}\{\delta q(g-u) - (g-c)\} + \delta p q(u-d) - \delta p q(\delta q)^T(c-d).$$

By rearranging terms, following s⁰(T) is better if (10) holds.

3. Continuation decision: If (D, D) is observed when t ≤ T, maintaining the partnership gives V⁰(t + 1) as the continuation value, while ending gives U⁰ so that maintaining is better if (9) holds. Similarly, if (C, C) is observed when t ≥ T + 1, maintaining is better if (9) holds. If a deviation is observed, the partner would end the partnership, so your choice does not matter.
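The two conditions of Proposition 2 can be evaluated directly. The Python sketch below uses a hypothetical function name and the parameter values of Figure 4; it is only a numerical check of (9) and (10), not a substitute for the proof.

```python
def is_gradual_equilibrium(T, p, delta, q, g, c, d, u):
    """Check conditions (9) and (10) of Proposition 2 for s0(T)."""
    dq = delta * q
    lhs = dq**T * (c - d)
    cond9 = lhs >= (u - d)
    rhs10 = (u - d) + (1 - dq * (1 - p)) / (dq * p) * (dq * (g - u) - (g - c))
    cond10 = lhs <= rhs10
    return cond9 and cond10

params = dict(delta=0.8, q=0.7, g=10.0, c=6.0, d=0.0, u=2.0)
for T in (0, 1, 2):
    for p in (0.1, 0.2, 0.5):
        print(T, p, is_gradual_equilibrium(T, p, **params))
```

For these parameters s⁰(0) passes only at the smallest p, s⁰(1) passes at p = 0.1 and p = 0.2, and s⁰(2) fails (9) at every p.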

Let

$$h(p) := (u-d) + \frac{1-\delta q(1-p)}{\delta p q}\{\delta q(g-u) - (g-c)\}.$$

Then (10) is expressed as (δq)^T(c−d) ≤ h(p). By differentiation,

$$\frac{\partial h}{\partial p} = -\frac{(1-\delta q)\{\delta q(g-u) - (g-c)\}}{\delta p^2 q},$$

so that h is strictly decreasing in p under the assumption (6). It diverges to infinity as p approaches 0 and stays strictly above u−d for every p > 0; as p grows without bound, it approaches (u−d) + {δq(g−u) − (g−c)} from above. (See Figure 4.) Hence if (δq)^T(c−d) > u−d, then up to a certain p, (10) holds. In particular, s⁰(0) played by all players is a sequential equilibrium for p ≤ p^∗, since s⁰(0) is the same as the stochastic acceptance strategy s(1) that accepts any match. Similarly, s⁰(1) played by all players is a sequential equilibrium if δq(c−d) ≥ u−d and p ≤ p⁰₁ holds, where p⁰₁ satisfies

$$\delta q(c-d) = h(p^0_1).$$
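Because h is affine in 1/p, the threshold defined by (δq)^T(c−d) = h(p) can be solved for explicitly. The Python sketch below does this; the helper names and the inversion formula are our own rearrangement (an assumption of this sketch, valid under (6)), and the parameter values are those of Figure 4.

```python
def h(p, delta, q, g, c, d, u):
    """Right-hand side of condition (10)."""
    dq = delta * q
    B = dq * (g - u) - (g - c)          # positive under assumption (6)
    return (u - d) + (1 - dq * (1 - p)) / (dq * p) * B

def p_threshold(T, delta, q, g, c, d, u):
    """Solve (dq)^T (c-d) = h(p) for p; None if (10) holds for every p > 0."""
    dq = delta * q
    B = dq * (g - u) - (g - c)
    # h(p) = (u-d) + B + B(1-dq)/(dq p), so invert the last term.
    gap = dq**T * (c - d) - (u - d) - B
    if gap <= 0:
        return None
    return B * (1 - dq) / (dq * gap)

params = dict(delta=0.8, q=0.7, g=10.0, c=6.0, d=0.0, u=2.0)
p_star = 1 - (10.0 - 6.0) / ((10.0 - 2.0) * 0.8 * 0.7)
print(p_threshold(0, **params), p_star)
print(p_threshold(1, **params), h(p_threshold(1, **params), **params))
```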

[Figure 4: Existence of gradual cooperation equilibrium as a function of p. The curve h(p) is plotted against the matching probability p together with the levels c−d, (δq)(c−d), u−d, and (δq)²(c−d). For p ≤ p^∗, s⁰(0) is an equilibrium; s⁰(1) is an equilibrium iff p ≤ p⁰₁; s⁰(2) violates (9). (Parameter values: g = 10, c = 6, u = 2, d = 0, δ = 0.8, q = 0.7.)]

In general, let T̄ be the largest T that satisfies (9). For each T = 0, 1, . . . , T̄, define p⁰_T by

$$(\delta q)^T(c-d) = h(p^0_T).$$

Then, for any p ≤ p⁰_T, s⁰(T) played by all players is a sequential equilibrium. Since h is strictly decreasing, p⁰_T < p⁰_{T+1} for any T. Therefore if s⁰(T) is an equilibrium strategy, so is s⁰(T + 1) as long as (9) is satisfied. We want to pick the shortest (i.e., most efficient) T which warrants an equilibrium.

Let T^∗ be the largest T such that p⁰_T < 1. If T^∗ = T̄, then s⁰(T^∗ + 1) cannot be an equilibrium. (This is the case for Figure 4.) If T^∗ < T̄, s⁰(T^∗ + 1) is the most efficient equilibrium for p > p⁰_{T^∗}. In summary, as p changes, the maximal equilibrium payoff is a step function such as

$$\begin{aligned}
&U^0(s^0(0)) &&\iff 0 < p \le p^0_0\ (= p^*), \\
&U^0(s^0(T)) &&\iff p^0_{T-1} < p \le p^0_T \quad \forall T = 1, 2, \dots, T^*, \\
&U^0(s^0(T^*+1)) &&\iff p^0_{T^*} < p \ \text{ and } \ T^*+1 \le \bar{T}.
\end{aligned}$$
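The selection of the shortest admissible T for a given p can be sketched as a simple search. In the Python code below, the function name and the search cap T_max are illustrative assumptions; the parameter values are those of Figure 4.

```python
def best_gradual_T(p, delta, q, g, c, d, u, T_max=50):
    """Shortest T for which s0(T) satisfies (9) and (10); None if none exists."""
    dq = delta * q
    B = dq * (g - u) - (g - c)
    h_p = (u - d) + (1 - dq * (1 - p)) / (dq * p) * B
    for T in range(T_max + 1):
        lhs = dq**T * (c - d)
        if lhs < (u - d):
            return None     # (9) fails here and for every longer T
        if lhs <= h_p:
            return T        # (10) holds: shortest, hence most efficient, T
    return None

params = dict(delta=0.8, q=0.7, g=10.0, c=6.0, d=0.0, u=2.0)
for p in (0.1, 0.3, 0.5):
    print(p, best_gradual_T(p, **params))
```

The search stops at the first T that satisfies (10) because a shorter defection phase gives a higher payoff (8).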

The maximal equilibrium payoff for the parameters of Figure 4 is depicted in Figure 5. At p = p⁰_T for some T, the maximal payoff of symmetric gradual cooperation equilibrium is

$$\begin{aligned}
U^0(s^0(T)) &= \frac{1}{(1-\delta)\{1-\delta q(1-p)\}}\Bigl[(1-\delta q)(1-p)u + p\Bigl\{d + (u-d) + \frac{1-\delta q(1-p)}{\delta p q}\{\delta q(g-u) - (g-c)\}\Bigr\}\Bigr] \\
&= \frac{\delta q g - (g-c)}{\delta q(1-\delta)},
\end{aligned}$$

which is precisely the same as the maximal payoff (7) of correlated acceptance equilibrium.

[Figure 5: Maximal payoff of gradual cooperation equilibrium, plotted against the matching probability p, with segments U⁰(s⁰(0)) up to p^∗ and U⁰(s⁰(1)) up to p⁰₁, and no cooperative equilibrium beyond p⁰₁. (Parameter values: g = 10, c = 6, u = 2, d = 0, δ = 0.8, q = 0.7.)]

Between p⁰_{T−1} and p⁰_T, the maximal equilibrium payoff is increasing in p as long as (9) is satisfied, since

$$\frac{\partial U^0(s^0(T))}{\partial p} = \frac{(1-\delta q)\{(1-\delta^T q^T)d + \delta^T q^T c - u\}}{(1-\delta)\{1-\delta q(1-p)\}^2} > 0.$$

4.2. Comparison of Maximal Payoffs

Comparing the maximal symmetric equilibrium payoffs, the gradual cooperation equilibrium is never more efficient than the stochastic acceptance equilibrium with a continuum of available random variables, and it is as efficient as the latter when p ≤ p^∗ or when p = p⁰_T for some T. (See Figure 6.) Clearly, the reason is the discrete time in which players adjust the actions.

When the matching probability is low (p < p^∗), the most efficient symmetric equilibrium is to accept all matches, start cooperating right away, and end as soon as defection is observed. The reason is simple: since it is difficult to get a new match, ending the partnership is a sufficient punishment.

When it becomes easier to find a new match, the incentive system becomes tricky. If players can coordinate on arbitrary continuous stochastic variables (as we assumed in Section 3.1), stochastic acceptance can finely tune the probability of the next match and thus loses the least efficiency.

[Figure 6: Comparison of maximal payoffs, plotted against the matching probability p. Shown are U(s(1)) = U⁰(s⁰(0)) up to p^∗, the maximal payoff with α^∗(p), the payoffs U(s(0.8)), U(s(0.5)), U(s(0.2)) ending at p_{0.8}, p_{0.5}, p_{0.2}, and U⁰(s⁰(1)) ending at p⁰₁.]

If players cannot coordinate on so many stochastic variables, however, the relative efficiency of the two types of equilibria depends on the parameters. See Figure 6 for an example.

We can, however, derive a general result for at least medium values of the matching probability.

Recall that if the set A is nonempty, there is a signal to coordinate on with probability not less than 0.5. (See the argument in Section 3.2.) This implies that the stochastic acceptance equilibrium is better than the s⁰(1) equilibrium for some parameter cases, as the following proposition shows.

PROPOSITION 3. If A is nonempty, δq(c−d) > u−d, and

$$\delta q \le \frac{(g-u)(u-d) + (c-u)(c-d)}{(g-u)(c-d)}, \tag{11}$$

then there exist α ∈ A and p_α ∈ (p^∗, p⁰₁) such that the stochastic acceptance strategy s(α) played by all players is a sequential equilibrium and is more efficient than the optimal gradual cooperation equilibrium s⁰(1) for any p ∈ (p^∗, p_α].

Proof: First, the condition for the acceptance probability α to make s(α) better than s⁰(1) is derived as follows.

$$U(s(\alpha)) \ge U^0(s^0(1)) \iff \alpha \ge \frac{\delta q(c-d) - (u-d)}{\delta p q(c-d) + (c-u)} =: \alpha(p).$$

Clearly, the lower bound α(p) is a decreasing function of p.
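The equivalence between the payoff comparison and the lower bound α(p) can be illustrated numerically. The Python sketch below compares payoffs only and does not test whether s(α) satisfies (5); the function names and the value p = 0.3 are illustrative assumptions, and the payoff parameters are those of Figure 4.

```python
def U_stochastic(alpha, p, delta, q, c, u):
    """Payoff (1) when all players use s(alpha)."""
    dq, ap = delta * q, alpha * p
    return (ap * c + (1 - dq) * (1 - ap) * u) / ((1 - delta) * (1 - dq * (1 - ap)))

def U_gradual_one(p, delta, q, c, d, u):
    """Payoff (8) with T = 1."""
    dq = delta * q
    stage = (1 - dq) * d + dq * c
    return ((1 - dq) * (1 - p) * u + p * stage) / ((1 - delta) * (1 - dq * (1 - p)))

def alpha_lower(p, delta, q, c, d, u):
    """Lower bound alpha(p) on the acceptance probability."""
    dq = delta * q
    return (dq * (c - d) - (u - d)) / (dq * p * (c - d) + (c - u))

delta, q, c, d, u, p = 0.8, 0.7, 6.0, 0.0, 2.0, 0.3
a_low = alpha_lower(p, delta, q, c, d, u)
U_g = U_gradual_one(p, delta, q, c, d, u)
tie_gap = U_stochastic(a_low, p, delta, q, c, u) - U_g   # should be ~0
print(round(a_low, 4), tie_gap)
```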

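Second, we show that for any p ≥ p^∗,

$$\alpha^*(p) \ge \alpha(p) \iff p \le p^0_1. \tag{12}$$

Let B := δq(g−u) − (g−c). Let us derive p⁰₁ explicitly in parameters.

$$\begin{aligned}
& h(p^0_1) = (u-d) + \frac{1-\delta q(1-p^0_1)}{\delta p^0_1 q}\{\delta q(g-u) - (g-c)\} = \delta q(c-d) \\
\iff\ & \delta p^0_1 q(u-d) + \{1-\delta q(1-p^0_1)\}\{\delta q(g-u) - (g-c)\} = \delta^2 q^2 p^0_1(c-d) \\
\iff\ & p^0_1\{\delta q(u-d) - \delta^2q^2(c-d) + \delta^2q^2(g-u) - \delta q(g-c)\} = -(1-\delta q)\{\delta q(g-u) - (g-c)\} \\
\iff\ & p^0_1 = \frac{\delta q(g-u) - (g-c)}{\delta q\{(g-u) - (c-d)\}}.
\end{aligned}$$

By computation,

$$\begin{aligned}
& \alpha^*(p) \ge \alpha(p) \\
\iff\ & \frac{B}{\delta q(g-u)} \ge p \cdot \frac{\delta q(c-d) - (u-d)}{\delta p q(c-d) + (c-u)} \\
\iff\ & B\delta p q(c-d) + B(c-u) \ge \delta p q(g-u)\{\delta q(c-d) - (u-d)\} \\
\iff\ & \delta p q\bigl[(g-u)\{\delta q(c-d) - (u-d)\} - B(c-d)\bigr] \le B(c-u) \\
\iff\ & \delta p q\bigl[(g-u)\delta q(c-d) - (g-u)(u-d) - \delta q(g-u)(c-d) + (g-c)(c-d)\bigr] \le B(c-u) \\
\iff\ & \delta p q\bigl[\{(g-u) - (c-u)\}(c-d) - (g-u)(u-d)\bigr] \le B(c-u) \\
\iff\ & \delta p q(c-u)\{(g-u) - (c-d)\} \le B(c-u) \\
\iff\ & p \le \frac{\delta q(g-u) - (g-c)}{\delta q\{(g-u) - (c-d)\}} = p^0_1.
\end{aligned}$$

Hence (12) is proved.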

Third, we show that under the assumption (11), α(p^∗) ≤ 1/2. By computation,

$$\begin{aligned}
& \alpha(p^*) = \frac{(g-u)\{\delta q(c-d) - (u-d)\}}{(g-u)\{\delta q(c-d) - (c-d)\} + (c-u)\{(g-u) + (c-d)\}} \le \frac{1}{2} \\
\iff\ & 2(g-u)\{\delta q(c-d) - (u-d)\} \le (g-u)\{\delta q(c-d) - (c-d)\} + (c-u)\{(g-u) + (c-d)\} \\
\iff\ & \delta q(g-u)(c-d) \le (g-u)\{2(u-d) - (c-d) + (c-u)\} + (c-u)(c-d) \\
\iff\ & \delta q \le \frac{(g-u)(u-d) + (c-u)(c-d)}{(g-u)(c-d)}.
\end{aligned}$$

Finally, as we noted in Section 3.2, if the set of feasible signals is nonempty, there is at least one signal with α ≥ 0.5. Therefore for that α, α ≥ 0.5 ≥ α(p^∗). Since the lower bound α(p) is decreasing in p, α > α(p) for any p > p^∗. Let p_α be defined by α = α^∗(p_α). Since α > α(p⁰₁) = α^∗(p⁰₁) and α^∗(p) is strictly decreasing for p > p^∗, we have p_α < p⁰₁. (See Figure 7.)
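The third step, that α(p^∗) ≤ 1/2 exactly when (11) holds, can be spot-checked for several values of the product δq. The Python sketch below uses a hypothetical function name; g, c, d, u are those of Figure 7, while the grid of δq values is an illustrative assumption (each value satisfies (6) and δq(c−d) > u−d).

```python
def prop3_check(dq, g, c, d, u):
    """Return (alpha(p*) <= 1/2, condition (11)) for dq = delta * q."""
    B = dq * (g - u) - (g - c)                  # positive under (6)
    p_star = B / (dq * (g - u))                 # same as 1 - (g-c)/((g-u) dq)
    lower_at_p_star = (dq * (c - d) - (u - d)) / (dq * p_star * (c - d) + (c - u))
    rhs11 = ((g - u) * (u - d) + (c - u) * (c - d)) / ((g - u) * (c - d))
    return lower_at_p_star <= 0.5, dq <= rhs11

for dq in (0.56, 0.7, 0.8, 0.9):
    print(dq, prop3_check(dq, g=10.0, c=6.0, d=0.0, u=2.0))
```

The two entries agree for every δq on the grid, and both switch from true to false between δq = 0.8 and δq = 0.9, consistent with the right-hand side of (11) being 5/6 for these payoffs.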

[Figure 7: The existence of α. The curves α^∗(p) and α(p) are plotted against p, with p^∗, p_α, and p⁰₁ marked on the horizontal axis. (Parameter values: g = 10, c = 6, u = 2, d = 0, δ = 0.8, and q = 0.7.)]

Under the assumption δq(c−d) ≥ u−d, s⁰(1) attains the maximal equilibrium payoff for p ∈ (p^∗, p⁰₁]. We have shown that, if (11) holds, for any p∈(p^∗, p_α], s(α) is a sequential equilibrium and U(s(α))> U⁰(s⁰(1)).

In terms of the payoff parameters only, we have the following sufficient condition for (11).

REMARK 2. If c−d ≥ g−u, then (11) holds for any δ ∈ (0,1) and q ∈ (0,1].

Proof: By computation,

$$
\begin{aligned}
& \frac{(g-u)(u-d)+(c-u)(c-d)}{(g-u)(c-d)} \ge 1 \\
\iff{}& (g-u)(u-d) + (c-u)(c-d) \ge (g-u)(c-d) \\
\iff{}& (c-u)(c-d) \ge (g-u)(c-d-u+d) \\
\iff{}& c-d \ge g-u.
\end{aligned}
$$

Therefore, if c−d ≥ g−u, the RHS of (11) is not less than 1, while δq is less than 1; hence (11) holds.
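The chain of equivalences in this proof can be spot-checked with exact rational arithmetic. The sketch below draws integer payoffs with d < u < c < g (this ordering is an assumption, matching the numerical example) and confirms that the RHS of (11) is at least 1 exactly when c−d ≥ g−u. The helper names are hypothetical.

```python
from fractions import Fraction
import random

def rhs_11(g, c, u, d):
    """Right-hand side of (11), computed exactly."""
    return Fraction((g - u) * (u - d) + (c - u) * (c - d), (g - u) * (c - d))

def remark2_agrees(g, c, u, d):
    """True when 'RHS of (11) >= 1' and 'c - d >= g - u' coincide."""
    return (rhs_11(g, c, u, d) >= 1) == (c - d >= g - u)

# Assumed payoff ordering d < u < c < g: sort four distinct random integers.
rng = random.Random(0)
draws = [sorted(rng.sample(range(50), 4)) for _ in range(5000)]
all_agree = all(remark2_agrees(g, c, u, d) for d, u, c, g in draws)
print(all_agree)
```

Using `Fraction` rather than floats keeps the boundary case c−d = g−u exact, where the RHS of (11) equals 1.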

Under the parameter values of the numerical example of Figure 6, α(p) takes values between 0.31 (when p = p^∗) and 0.25 (when p = p⁰₁). Hence the stochastic acceptance equilibria with α = 0.8 and α = 0.5 are more efficient than the most efficient gradual cooperation equilibrium, s⁰(1), but the one with α = 0.2 is not.
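These figures can be reproduced with a few lines of arithmetic. The sketch below assumes that the numerical example uses the parameter values listed under Figure 7 and that p^∗ = B/(δq(g−u)); both are inferences from the surrounding text rather than statements made in this passage.

```python
# Range of alpha(p) on [p*, p01] and the three candidate signals.
g, c, u, d, delta, q = 10.0, 6.0, 2.0, 0.0, 0.8, 0.7
dq = delta * q
B = dq * (g - u) - (g - c)

def alpha_lower(p):
    """Lower bound alpha(p) from the first step of the proof."""
    return (dq * (c - d) - (u - d)) / (dq * p * (c - d) + (c - u))

p_star = B / (dq * (g - u))              # assumed: alpha*(p*) = 1
p01 = B / (dq * ((g - u) - (c - d)))     # closed form derived in the proof

hi, lo = alpha_lower(p_star), alpha_lower(p01)

# alpha_lower is decreasing, so a signal at or above `hi` clears the bound on
# all of [p*, p01], and a signal below `lo` clears it nowhere on that interval.
verdict = {a: ("everywhere" if a >= hi else "nowhere" if a < lo else "partly")
           for a in (0.8, 0.5, 0.2)}
print(f"alpha(p*) = {hi:.2f}, alpha(p01) = {lo:.2f}")
print(verdict)
```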


5. CONCLUDING REMARKS

5.1. Gift-exchange Equilibrium

Carmichael and McLeod (1997) consider a game similar to ours, except that the matching probability is always 1 and players can simultaneously give costly gifts to each other when they are newly matched. (They also consider evolutionary stability instead of rational equilibrium.) Even though players cannot adjust the probability of forming a partnership, the costly gift serves as a punishment for defectors, who need to start a new partnership. Since the cost of the gift can be chosen from a continuum of values, it is a straightforward extension of our analysis to show that gift-giving equilibria give higher payoffs than gradual cooperation, for the same reason that stochastic acceptance equilibria are more efficient.

5.2. Combining Stochastic Acceptance and Gradual Cooperation

When there is a limited set of random variables to correlate the initial acceptance probability, we can consider strategies that use a coordinated acceptance probability as well as gradual cooperation.

Suppose that players use the following strategy ŝ(α, T):

1. Accept a new match if and only if a signal that occurs with probability α ∈ (0,1) is observed.

2. If the partnership length is t ≤ T, play D and maintain the partnership if and only if (D, D) is observed in the current period.

3. If the partnership length is t > T, play C and maintain the partnership if and only if (C, C) is observed in the current period.

This type of strategy has two choice variables: the acceptance probability α and the length T of the initial defection phase. For ŝ(α, T) played by all players to be a sequential equilibrium, we just need to change p into αp in the analysis of Section 4.1. Hence the equilibrium condition (9) is the same and (10) becomes

$$
(\delta q)^T (c-d) \le (u-d) + \frac{1-\delta q(1-\alpha p)}{\delta \alpha p q}\{\delta q(g-u)-(g-c)\} =: \hat h(\alpha, p). \tag{13}
$$

Since h is decreasing in p, ĥ is decreasing in both α and p. Therefore, by reducing α, (13) becomes easier to satisfy, i.e., we may be able to reduce T in equilibrium, or a cooperative equilibrium exists for a wider range of p.
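To see how lowering α relaxes (13), the following sketch evaluates ĥ at the Figure 7 parameter values (assumed to be the running numerical example). It checks only condition (13); condition (9) is not reproduced in this section and is not checked. The names `h_hat` and `satisfies_13` are hypothetical.

```python
# Effect of the acceptance probability alpha on condition (13).
g, c, u, d, delta, q = 10.0, 6.0, 2.0, 0.0, 0.8, 0.7
dq = delta * q
B = dq * (g - u) - (g - c)   # positive at these values

def h_hat(alpha, p):
    """Right-hand side of (13); h_hat(1, p) coincides with h(p)."""
    return (u - d) + (1 - dq * (1 - alpha * p)) / (dq * alpha * p) * B

def satisfies_13(alpha, p, T):
    """True when (delta*q)^T (c - d) <= h_hat(alpha, p)."""
    return dq ** T * (c - d) <= h_hat(alpha, p)

p = 0.2
print(f"h_hat(1.0, {p}) = {h_hat(1.0, p):.3f}")
print(f"h_hat(0.5, {p}) = {h_hat(0.5, p):.3f}")
print("T = 0 feasible with alpha = 1.0:", satisfies_13(1.0, p, 0))
print("T = 0 feasible with alpha = 0.5:", satisfies_13(0.5, p, 0))
```

At p = 0.2, (13) fails for T = 0 when every match is accepted but holds once α is reduced to 0.5, which is the sense in which reducing α can reduce T.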


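[Figure 8: Higher payoff by combining α and T choice. The curves h(p) and ĥ(0.5, p) are plotted against p, together with the horizontal levels c−d, (δq)(c−d), and u−d; the points p^∗, p_0.5, p⁰₁, and p̄_0.5 are marked on the horizontal axis. Annotations in the plot: "here ŝ(0.5,1) is optimal eq." and "here ŝ(0.5,0) optimal".]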

For example, take the numerical example we have been using in figures and suppose that α = 0.5 is available, as depicted in Figure 8. Note that the matching probability where (c−d) intersects with ĥ(0.5, p) is precisely p_0.5.

REMARK 3. For any α ∈ (0,1), let p_α be defined by α = α^∗(p_α). Then c−d = ĥ(α, p_α).

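Proof: As before, let B := δq(g−u)−(g−c). On the one hand, α = α^∗(p_α) is equivalent to

$$
\alpha p_\alpha = \frac{B}{\delta q(g-u)}.
$$

On the other hand, c−d = ĥ(α, p_α) is equivalent to

$$
\begin{aligned}
& c-d = (u-d) + \frac{1-\delta q(1-\alpha p_\alpha)}{\delta q \alpha p_\alpha}\, B \\
\iff{}& c-u = \frac{B\{1-\delta q(1-\alpha p_\alpha)\}}{\delta q \alpha p_\alpha} \\
\iff{}& \alpha p_\alpha\{\delta q(c-u) - \delta q B\} = B(1-\delta q) \\
\iff{}& \alpha p_\alpha = \frac{B}{\delta q(g-u)}.
\end{aligned}
$$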

The intuition is that α = α^∗(p_α) means that ŝ(α,0) barely satisfies the condition (5) to be sequentially optimal, and c−d = ĥ(α, p_α) also means the same thing.
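Remark 3 can be illustrated numerically. The sketch below uses the Figure 7 parameter values (an assumption about which example is meant) and the form α^∗(p) = B/(δpq(g−u)) used in the proof, so that p_α = B/(αδq(g−u)).

```python
# Numerical illustration of Remark 3: h_hat(alpha, p_alpha) equals c - d.
g, c, u, d, delta, q = 10.0, 6.0, 2.0, 0.0, 0.8, 0.7
dq = delta * q
B = dq * (g - u) - (g - c)

def h_hat(alpha, p):
    """Right-hand side of (13)."""
    return (u - d) + (1 - dq * (1 - alpha * p)) / (dq * alpha * p) * B

def p_alpha(alpha):
    """Solves alpha = alpha*(p) under the assumed form of alpha*(p)."""
    return B / (alpha * dq * (g - u))

# Gap between h_hat(alpha, p_alpha) and c - d for the three example signals.
gaps = {a: abs(h_hat(a, p_alpha(a)) - (c - d)) for a in (0.2, 0.5, 0.8)}
for a, gap in gaps.items():
    print(f"alpha = {a}: p_alpha = {p_alpha(a):.4f}, gap = {gap:.2e}")
```

The gap is zero up to floating-point rounding for each α, because ĥ depends on α and p only through the product αp.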


Therefore, for p ∈ (p^∗, p_0.5), ŝ(0.5,0) played by all players is a sequential equilibrium and is better than the optimal gradual cooperation equilibrium s⁰(1), as we have shown in Section 4.2. In this region of p, therefore, we are able to reduce T by reducing α. For p > p_0.5, we have a wider range of p, up to p̄_0.5, under which a cooperative sequential equilibrium exists. In particular, when p ∈ (p⁰₁, p̄_0.5], there was no cooperative equilibrium before, but now ŝ(0.5,1) is a cooperative equilibrium. Thus by combining stochastic acceptance and gradual cooperation, we can improve the equilibrium payoff.
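The regions described here can be traced out numerically. The sketch below is a rough illustration under several assumptions: the Figure 7 parameter values, p^∗ = B/(δq(g−u)), and p̄_0.5 read as the point where ĥ(0.5, p) meets δq(c−d). It searches only T ∈ {0, 1} and checks only (13), since condition (9) is not reproduced in this section.

```python
# Thresholds and the smallest T in {0, 1} satisfying (13) for a given alpha.
g, c, u, d, delta, q = 10.0, 6.0, 2.0, 0.0, 0.8, 0.7
dq = delta * q
B = dq * (g - u) - (g - c)

def h_hat(alpha, p):
    """Right-hand side of (13)."""
    return (u - d) + (1 - dq * (1 - alpha * p)) / (dq * alpha * p) * B

def smallest_T(alpha, p, T_max=1):
    """Smallest T <= T_max with (delta*q)^T (c - d) <= h_hat, else None."""
    for T in range(T_max + 1):
        if dq ** T * (c - d) <= h_hat(alpha, p):
            return T
    return None

alpha = 0.5
p_star = B / (dq * (g - u))              # assumed: alpha*(p*) = 1
p01 = B / (dq * ((g - u) - (c - d)))     # from the proof of (12)
p_half = p_star / alpha                  # p_0.5: h_hat(0.5, p) = c - d
p_bar_half = p01 / alpha                 # assumed: h_hat(0.5, p) = dq*(c - d)

print(f"p* = {p_star:.4f}, p_0.5 = {p_half:.4f}, "
      f"p01 = {p01:.4f}, p_bar_0.5 = {p_bar_half:.4f}")
for p in (0.15, 0.30, 0.60, 0.90):
    print(p, "alpha=0.5:", smallest_T(alpha, p), "alpha=1:", smallest_T(1.0, p))
```

For p = 0.6, which lies in (p⁰₁, p̄_0.5], the search finds T = 1 with α = 0.5 and nothing with α = 1, matching the region where only ŝ(0.5,1) is available.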

5.3. Discussion

First, let us interpret the results. When matching is difficult (p ≤ p^∗), ending the partnership is a sufficient punishment, so that both mechanisms, stochastic acceptance and gradual cooperation, yield the same optimal equilibrium (to accept any match and to cooperate from the beginning of a partnership). This may have been the case in the Japanese labor market during the 1960s and 1970s. The job market was limited to new graduates of schools, and both firms and workers cooperated from the beginning.

When matching is not so difficult, the equilibrium payoff depends on how well the strategy adjusts to the matching situation. For medium values of p, playing (D, D) once at the beginning of every partnership can be too harsh, as compared to everyone rejecting a match with some probability.

Since this result holds for a range of payoff parameters, the impact of discrete timing is large.

However, we often see that players adjust actions at a fixed interval. Wage negotiations occur at a fixed time interval, promotions occur at a fixed time of the year, and so on. Hence discrete-timing games are prevalent and we should be aware of their cost. This observation has not been made in ordinary repeated games, where matching and incentives to continue a game are not an issue.

Second, let us consider cases in which gradual cooperation is more efficient than stochastic acceptance in complete information games.⁵ If players are very patient (large δq) and d is not small, (11) may be violated so that gradual cooperation becomes better for medium p. This is because patient players do not mind the initial (D, D).

Another case is learning by doing. If players can learn to play a game better over time (to formalize, the stage game payoff may be increased as the partnership becomes longer), then strategies that start a partnership as soon as possible can be better than those not starting sometimes. This may be the reason that in sports and craftsmanship, a typical strategy is gradual cooperation instead of stochastic initial acceptance.

⁵ In incomplete information games, it is clear that a gradual cooperation strategy is useful for distinguishing myopic types from non-myopic types, as shown in Ghosh and Ray (1996).


REFERENCES

Abreu, D. (1988). “On the Theory of Infinitely Repeated Games with Discounting,” Econometrica 56, 383-396.

Abreu, D., Pearce, D., and Stacchetti, E. (1986). “Optimal Cartel Equilibria with Imperfect Monitoring,” Journal of Economic Theory 39, 251-269.

Abreu, D., Pearce, D., and Stacchetti, E. (1990). “Toward a Theory of Discounted Repeated Games with Imperfect Monitoring,” Econometrica 58, 1041-1064.

Carmichael, L., and McLeod, B. (1997). “Gift Giving and the Evolution of Cooperation,” International Economic Review 38, 485-509.

Datta, S. (1996). “Building Trust,” Discussion Paper No. TE/96/305, March 1996, London School of Economics and Political Science.

Eeckhout, J. (2006). “Minorities and Endogenous Segregation,” Review of Economic Studies 73, 31-53.

Ellison, G. (1994). “Cooperation in the Prisoner’s Dilemma with Anonymous Random Matching,” Review of Economic Studies 61, 567-588.

Fudenberg, D., Levine, D., and Maskin, E. (1994). “The Folk Theorem in Repeated Games with Imperfect Public Information,” Econometrica 62, 997-1039.

Fudenberg, D., and Maskin, E. (1986). “The Folk Theorem in Repeated Games with Discounting or with Incomplete Information,” Econometrica 54, 533-556.

Fudenberg, D., and Maskin, E. (1991). “On the Dispensability of Public Randomization in Discounted Repeated Games,” Journal of Economic Theory 53, 428-438.

Fudenberg, D., and Tirole, J. (1990). Game Theory. Boston MA: MIT Press.

Fujiwara-Greve, T. (2002). “On Voluntary and Repeatable Partnerships under No Information Flow,” Proceedings of the 2002 North American Summer Meetings of the Econometric Society. (http://www.dklevine.com/proceedings/game-theory.htm)

Fujiwara-Greve, T., and Okuno-Fujiwara, M. (2007). “Voluntarily Separable Repeated Prisoner’s Dilemma,” manuscript, Keio University and University of Tokyo.

Ghosh, P., and Ray, D. (1996). “Cooperation in Community Interaction without Information Flows,” Review of Economic Studies 63, 491-519.


Kandori, M. (1992). “Social Norms and Community Enforcement,” Review of Economic Studies 59, 63-80.

Kandori, M., and Matsushima, H. (1998). “Private Observation, Communication and Collusion,” Econometrica 66, 627-652.

Kranton, R. (1996a). “The Formation of Cooperative Relationships,” Journal of Law, Economics & Organization 12, 214-233.

Kranton, R. (1996b). “Reciprocal Exchange: A Self-Sustaining System,” American Economic Review 86, 830-851.

Kreps, D., and Wilson, R. (1982). “Sequential Equilibrium,” Econometrica 50, 863-894.

Lagunoff, R., and Matsui, A. (2001). “Organizations and Overlapping Generations Games: Memory, Communication, and Altruism,” mimeo, Georgetown University and University of Tokyo.

Matsushima, H. (1990). “Long-term Partnership in a Repeated Prisoner’s Dilemma with Random Matching,” Economics Letters 34, 245-248.

Okuno-Fujiwara, M. (1987). “Monitoring Cost, Agency Relationship, and Equilibrium Modes of Labor Contract,” Journal of Japanese and International Economies 1, 147-167.

Okuno-Fujiwara, M., and Postlewaite, A. (1995). “Social Norms and Random Matching Games,” Games and Economic Behavior 9, 79-109.

Rob, R., and Yang, H. (2005). “Long-Term Relationships as Safeguards,” mimeo, University of Pennsylvania.

Shapiro, C., and Stiglitz, J. (1984). “Equilibrium Unemployment as a Worker Discipline Device,” American Economic Review 74, 433-444.

Watson, J. (2002). “Starting Small and Commitment,” Games and Economic Behavior 38, 176-199.
