Chassang2010 (Building routines: Learning, cooperation, and the dynamics of incomplete relational contracts)
Model
Mechanics
Two agents, infinite horizon, discrete time; both agents share a common discount factor δ.
Each period agent 1 can choose to enter or exit.
Player 2 has a countably infinite set of actions 𝒜, identified with the natural numbers. Each period an i.i.d. subset A ⊂ 𝒜 is drawn; this is both the state of the economy in that period and agent 2's feasible action set for the period. Each action a ∈ 𝒜 is drawn with the same probability p each period.
Each period t is broken into two stages. In stage 1, agent 1 chooses to enter or exit. Upon exit, both players get 0 flow payoffs and the game moves to period t + 1.
If agent 1 chooses to enter, he pays a constant fixed cost k while agent 2 gets a benefit π. Then the i.i.d. state A_t ⊂ 𝒜 is drawn and stage 2 begins.
In stage 2, agent 2 chooses an action a ∈ A_t. This choice carries a deterministic cost c(a). With probability q the action succeeds and agent 1 receives a deterministic payoff b(a); with probability 1 − q the action fails and agent 1 gets 0.
Technology
In the set 𝒜 there are N productive actions (numbered 0 to N − 1); denote this set of productive actions by 𝒩.
The function c(a) takes on a constant value c for all productive actions, and 0 for all non-productive actions.
The function b(a) is positive (though not constant) for all productive actions, and 0 for all other actions.
There is no mechanism for agents to share utility.
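To make the stage-game mechanics and technology concrete, here is a minimal one-period simulation sketch in Python. All parameter values, the truncation of 𝒜 to a finite range, and names such as play_period, A_SIZE, and shirk are illustrative assumptions, not objects from the paper.

```python
import random

# Illustrative parameters (assumed, not from the paper).
N = 3            # productive actions are 0, ..., N-1
A_SIZE = 12      # finite stand-in for the countably infinite action set
p, q = 0.5, 0.8  # p: prob. each action is drawn; q: prob. a productive action succeeds
k = 1.0          # agent 1's fixed cost of entering
pi = 2.0         # agent 2's benefit when agent 1 enters
c = 0.5          # cost of a productive action (non-productive actions cost 0)
b = {a: 3.0 for a in range(N)}  # b(a) > 0 only for productive actions, fixed throughout

def play_period(enter, choose_action):
    """One period of the stage game; returns (agent 1 flow payoff, agent 2 flow payoff)."""
    if not enter:
        return 0.0, 0.0  # exit: both agents get zero flow payoffs
    # Entry costs k and gives agent 2 the benefit; then the i.i.d. feasible set A_t is drawn.
    A_t = [a for a in range(A_SIZE) if random.random() < p]
    if not A_t:
        return -k, pi    # degenerate draw with no feasible action
    a = choose_action(A_t)                       # agent 2's choice from A_t
    cost = c if a < N else 0.0                   # deterministic cost c(a)
    success = (a < N) and (random.random() < q)  # only productive actions can succeed
    return -k + (b[a] if success else 0.0), pi - cost

# Example: agent 2 "shirks" by picking a non-productive action whenever one is drawn.
shirk = lambda A_t: max(A_t) if max(A_t) >= N else A_t[0]
print(play_period(True, shirk))
```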
Information
Both agents know the parameters δ (the discount factor), p (the probability with which a particular action is drawn from 𝒜 each period), q (the probability that a productive action succeeds), and N (the total number of productive actions).
Both agents perfectly observe the state each period (though the time t state is realized after agent 1 makes his time t decision). Both agents also perfectly observe the action a taken by agent 2 and the payoff received by agent 1.
Information is asymmetric in that only agent 2 observes the cost of his actions; hence only agent 2 knows whether the selected action is productive.
The realizations of the function b(a) are drawn from a known distribution B with compact support and held fixed throughout the game. Agents do not know these realizations ex ante, but share the same prior about the distribution from which they were drawn.
Agent 1 has an improper uniform prior over which actions are productive. Specifically
$$\forall A \subset \mathcal{A},\ \forall a \in A, \quad \text{prob}\left\{a \in \mathcal{N} \,\middle|\, \text{card}(A \cap \mathcal{N}) = n\right\} = \frac{n}{\text{card}\, A}$$
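As a quick illustration with made-up numbers: if the realized feasible set A_t contains four actions and agent 1 conditions on exactly one of them being productive, then each element of A_t is productive with probability 1/4:

$$\text{card}\, A_t = 4,\quad \text{card}(A_t \cap \mathcal{N}) = 1 \;\Longrightarrow\; \text{prob}\left\{a \in \mathcal{N}\right\} = \frac{1}{4} \quad \text{for every } a \in A_t.$$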
Parameters are such that agent 1 does not exit in period 1.[^1]
Equilibrium concept
Let d_t ∈ {S, E} be agent 1's decision to stay or exit at time t. If d_t = E, then A_t = a_t = b̃(a_t) = ∅, where b̃(a_t) denotes agent 1's realized payoff. There are three types of history:
- ℋ_1 histories are of the form h_t^1 = {d_1, A_1, a_1, b̃(a_1), …, d_{t−1}, A_{t−1}, a_{t−1}, b̃(a_{t−1})} and correspond to agent 1's information set at his decision node in period t.
- ℋ_{2|1} histories are of the form h_t^{2|1} = h_t^1 ∪ {d_t, A_t} and correspond to agent 1's information at agent 2's decision node in period t.
- ℋ_2 histories are of the form h_t^2 = h_t^{2|1} ∪ 𝒩 and correspond to agent 2's information set at his decision node in period t (agent 2 additionally knows the set of productive actions 𝒩).
A pure strategy for agent 1 is s_1 : ℋ_1 → {S, E}. A pure strategy for agent 2 is s_2 : ℋ_2 → 𝒜 such that for all histories h_t^2 ∈ ℋ_2, s_2(h_t^2) ∈ A_t.
The equilibrium concept is perfect Bayesian equilibrium in pure strategies; the analysis focuses on Pareto efficient equilibria.
Equilibrium Analysis
Complete information benchmark
Consider a benchmark model of full information where agent 1 also knows agent 2βs cost function. In all Pareto efficient equilibria of this game, player 1 never chooses to exit.
Because flow payoffs are bounded above, the value functions of both agents are bounded. Denote the highest possible value for agent 2 in the perfect information game by $\bar{V}_2$. This will be used in the later analysis.
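Not stated in these notes or the paper, but a useful back-of-the-envelope bound given the primitives above: agent 2's flow payoff in any period is at most the entry benefit π (his action cost is non-negative), so

$$\bar{V}_2 \le \frac{\pi}{1-\delta}.$$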
Asymmetric information model
In the asymmetric information model, agent 1 gradually learns which actions are productive. In the early stages of the game, monitoring is imperfect: agent 1 cannot tell whether a zero payoff occurred because agent 2 chose an unproductive action or because a productive action failed. As agent 1 learns, the game transitions to one of perfect monitoring.
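As a back-of-the-envelope illustration of this learning (the belief ρ is notation introduced here, not from the paper): suppose agent 1 assigns probability ρ to a given action a being productive. If a is played and agent 1's realized payoff is zero, Bayes' rule gives

$$\text{prob}\left\{a \in \mathcal{N} \,\middle|\, a_t = a,\ \tilde{b}(a_t) = 0\right\} = \frac{\rho\,(1-q)}{\rho\,(1-q) + (1-\rho)} \;<\; \rho.$$

Repeated zero payoffs therefore push the belief toward zero, while a single positive payoff reveals that a is productive (unproductive actions never pay b(a) > 0); this is the sense in which monitoring becomes perfect over time.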
An interesting feature of the asymmetric information model is that along this path to perfect monitoring, inefficient punishment (exit by player 1) is rational and on the equilibrium path.
Consider an equilibrium (s_1, s_2). A history h_t^{2|1} ∈ ℋ_{2|1} is called a revelation stage if there is a non-zero probability that a productive action that has not been taken before will be taken. A history h_{t+1}^1 ∈ ℋ_1 is a confirmation stage for action a ∈ A_t iff a_t = a and agent 1's payoff is positive (so called because agent 1 now knows for certain that action a is productive). In this case we say a was confirmed at this confirmation stage.
A routine is a pair of strategies that, starting from a particular history h_{t+1}^1, prescribe only confirmed or unproductive actions in the continuation game.
Optimal inefficiency
A main result of the paper is that along an equilibrium path, agent 1's strategy can include inefficient exit.
To see why, consider a history h_t^{2|1} at which N′ < N productive actions have been confirmed.
Denote by $\underline{V}_2^{N'}$ the value agent 2 gets by selecting a non-productive action in every period, along paths where agent 1 never exits.
Then, if at this history $\delta (\bar{V}_2 - \underline{V}_2^{N'}) < c$, exit must occur with positive probability on the continuation path.
This inequality means that the cost of choosing a productive action today is greater than the discounted difference between agent 2's maximal value in the full information game and his value from always choosing unproductive actions. In other words, if agent 1 never exits, agent 2 can never hope to recoup the cost he incurs today by choosing a productive action.
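A purely illustrative numerical check (these values are not from the paper): with δ = 0.9 and c = 0.5 the condition binds whenever $\bar{V}_2 - \underline{V}_2^{N'} < c/\delta \approx 0.56$; for example,

$$\delta\left(\bar{V}_2 - \underline{V}_2^{N'}\right) = 0.9 \times 0.4 = 0.36 < 0.5 = c,$$

so even the largest possible gain in continuation value cannot compensate agent 2 for the cost of a productive action today, and exit must occur with positive probability.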
To restore agent 2's incentives, agent 1 must therefore sometimes exit, which denies agent 2 the flow benefit of agent 1's participation and lowers the value of the strategy of never choosing productive actions.
In this sense inefficient exit is used as a vehicle to motivate agent 2 to reveal information.
Chassang, Sylvain. 2010. βBuilding routines: Learning, cooperation, and the dynamics of incomplete relational contracts.β American Economic Review 100 (1):448β65. https://doi.org/10.1257/aer.100.1.448.
[^1]: In the first period agent 1 makes his decision based solely on his prior. If he chooses to exit, he learns nothing, and next period he must make his decision with the same information. Because we look for pure strategy equilibria, he would make the same decision forever and the game would be trivial.