Chassang2010 (Building routines: Learning, cooperation, and the dynamics of incomplete relational contracts)
Model
Mechanics
Two agents, infinite horizon, discrete time; both agents share a common discount factor δ.
Each period agent 1 can choose to enter or exit.
Player 2 has a countably infinite set of actions 𝒜, identified with the natural numbers. Each period an i.i.d. subset A ⊂ 𝒜 is drawn; this is both the state of the economy in that period and agent 2's feasible action set for the period. Each action a ∈ 𝒜 is drawn with the same probability p each period.
Each period t is broken into two stages. In stage 1, agent 1 chooses to enter or exit. Upon exit, both players get 0 flow payoffs and the game moves to period t + 1.
If agent 1 chooses to enter, he pays a constant fixed cost k while agent 2 gets a benefit π. Then the i.i.d. state A_t ⊂ 𝒜 is drawn and stage 2 begins.
In stage 2, agent 2 chooses an action a ∈ A_t. This choice carries a deterministic cost c(a). With probability q the action succeeds and agent 1 receives a deterministic payoff b(a); with probability 1 − q the action fails and agent 1 gets 0.
Technology
In the set 𝒜 there are N productive actions (numbered 0 to N − 1); denote this set of productive actions by 𝒩.
The function c(a) takes on a constant value c for all productive actions, and 0 for all non-productive actions.
The function b(a) is positive (though not constant) for all productive actions, and 0 for all other actions.
There is no mechanism for agents to share utility.
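To make the stage-game mechanics and technology concrete, here is a minimal one-period simulation sketch in Python. All parameter values, the truncation of 𝒜 to a finite range, and names such as play_period, A_SIZE, and shirk are illustrative assumptions, not objects from the paper.

```python
import random

# Illustrative parameters (assumed, not from the paper).
N = 3            # productive actions are 0, ..., N-1
A_SIZE = 12      # finite stand-in for the countably infinite action set
p, q = 0.5, 0.8  # p: prob. each action is drawn; q: prob. a productive action succeeds
k = 1.0          # agent 1's fixed cost of entering
pi = 2.0         # agent 2's benefit when agent 1 enters
c = 0.5          # cost of a productive action (non-productive actions cost 0)
b = {a: 3.0 for a in range(N)}  # b(a) > 0 only for productive actions, fixed throughout

def play_period(enter, choose_action):
    """One period of the stage game; returns (agent 1 flow payoff, agent 2 flow payoff)."""
    if not enter:
        return 0.0, 0.0  # exit: both agents get zero flow payoffs
    # Entry costs k and gives agent 2 the benefit; then the i.i.d. feasible set A_t is drawn.
    A_t = [a for a in range(A_SIZE) if random.random() < p]
    if not A_t:
        return -k, pi    # degenerate draw with no feasible action
    a = choose_action(A_t)                       # agent 2's choice from A_t
    cost = c if a < N else 0.0                   # deterministic cost c(a)
    success = (a < N) and (random.random() < q)  # only productive actions can succeed
    return -k + (b[a] if success else 0.0), pi - cost

# Example: agent 2 "shirks" by picking a non-productive action whenever one is drawn.
shirk = lambda A_t: max(A_t) if max(A_t) >= N else A_t[0]
print(play_period(True, shirk))
```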
Information
Both agents know the parameters δ (the discount factor), p (the probability with which a particular action is drawn from 𝒜 each period), q (the probability that a productive action succeeds), and N (the total number of productive actions).
Both agents perfectly observe the state each period (though the time t state is realized after agent 1 makes his time t decision). Both agents also perfectly observe the action a taken by agent 2 and the payoff received by agent 1.
Information is asymmetric in that only agent 2 observes the cost of his actions; hence only agent 2 knows whether the selected action is productive.
The realizations of the function b(a) are drawn from a known distribution B with compact support and held fixed throughout the game. Agents do not know these realizations ex ante, but share the same prior about the distribution from which they were drawn.
Agent 1 has an improper uniform prior over which actions are productive. Specifically
$$\forall A \subset \mathcal{A},\ \forall a \in A, \quad \text{prob}\left\{a \in \mathcal{N} \,\middle|\, \text{card}(A \cap \mathcal{N}) = n\right\} = \frac{n}{\text{card}\, A}$$
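As a quick illustration with made-up numbers: if the realized feasible set A_t contains four actions and agent 1 conditions on exactly one of them being productive, then each element of A_t is productive with probability 1/4:

$$\text{card}\, A_t = 4,\quad \text{card}(A_t \cap \mathcal{N}) = 1 \;\Longrightarrow\; \text{prob}\left\{a \in \mathcal{N}\right\} = \frac{1}{4} \quad \text{for every } a \in A_t.$$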
Parameters are such that agent 1 does not exit in period 1.[^1]
Equilibrium concept
Let d_t ∈ {S, E} be agent 1's decision to stay or exit at time t. If d_t = E, then A_t = a_t = b̃(a_t) = ∅, where b̃(a_t) denotes agent 1's realized payoff. There are three types of history:
- ℋ_1 histories are of the form h_t^1 = {d_1, A_1, a_1, b̃(a_1), …, d_{t−1}, A_{t−1}, a_{t−1}, b̃(a_{t−1})} and correspond to agent 1's information set at his decision node in period t.
- ℋ_{2|1} histories are of the form h_t^{2|1} = h_t^1 ∪ {d_t, A_t} and correspond to agent 1's information at agent 2's decision node in period t.
- ℋ_2 histories are of the form h_t^2 = h_t^{2|1} ∪ 𝒩 and correspond to agent 2's information set at his decision node in period t (agent 2 additionally knows the set of productive actions 𝒩).
A pure strategy for agent 1 is s_1 : ℋ_1 → {S, E}. A pure strategy for agent 2 is s_2 : ℋ_2 → 𝒜 such that for all histories h_t^2 ∈ ℋ_2, s_2(h_t^2) ∈ A_t.
The equilibrium concept is perfect Bayesian equilibrium in pure strategies; the analysis focuses on Pareto efficient equilibria.
Equilibrium Analysis
Complete information benchmark
Consider a benchmark model of full information where agent 1 also knows agent 2βs cost function. In all Pareto efficient equilibria of this game, player 1 never chooses to exit.
Because flow payoffs are bounded above, the value functions of both agents are bounded. Denote the highest possible value for agent 2 in the perfect information game by $\bar{V}_2$. This will be used in the later analysis.
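Not stated in these notes or the paper, but a useful back-of-the-envelope bound given the primitives above: agent 2's flow payoff in any period is at most the entry benefit π (his action cost is non-negative), so

$$\bar{V}_2 \le \frac{\pi}{1-\delta}.$$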
Asymmetric information model
In the asymmetric information model, agent 1 gradually learns which actions are productive. In the early stages of the game, monitoring is imperfect: agent 1 cannot tell whether a zero payoff occurred because agent 2 chose an unproductive action or because a productive action failed. As agent 1 learns, the game transitions to one of perfect monitoring.
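As a back-of-the-envelope illustration of this learning (the belief ρ is notation introduced here, not from the paper): suppose agent 1 assigns probability ρ to a given action a being productive. If a is played and agent 1's realized payoff is zero, Bayes' rule gives

$$\text{prob}\left\{a \in \mathcal{N} \,\middle|\, a_t = a,\ \tilde{b}(a_t) = 0\right\} = \frac{\rho\,(1-q)}{\rho\,(1-q) + (1-\rho)} \;<\; \rho.$$

Repeated zero payoffs therefore push the belief toward zero, while a single positive payoff reveals that a is productive (unproductive actions never pay b(a) > 0); this is the sense in which monitoring becomes perfect over time.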
An interesting feature of the asymmetric information model is that along this path to perfect monitoring, inefficient punishment (exit by player 1) is rational and on the equilibrium path.
Consider an equilibrium (s_1, s_2). A history h_t^{2|1} ∈ ℋ_{2|1} is called a revelation stage if there is a non-zero probability that a productive action that has not been taken before will be taken. A history h_{t+1}^1 ∈ ℋ_1 is a confirmation stage for action a ∈ A_t iff a_t = a and agent 1's payoff is positive (so called because agent 1 now knows for certain that action a is productive). In this case we say a was confirmed at this confirmation stage.
A routine is a pair of strategies that, starting from a particular history h_{t+1}^1, prescribe only confirmed or unproductive actions in the continuation game.
Optimal inefficiency
A main result of the paper is that along an equilibrium path, agent 1's strategy can include inefficient exit.
To see why, consider a history h_t^{2|1} at which N′ < N productive actions have been confirmed.
Denote by $\underline{V}_2^{N'}$ the value agent 2 gets by selecting a non-productive action in every period, along paths where agent 1 never exits.
Then, if at this history $\delta (\bar{V}_2 - \underline{V}_2^{N'}) < c$, exit must occur with positive probability on the continuation path.
This inequality means that the cost of choosing a productive action today is greater than the discounted difference between agent 2's maximal value in the full information game and his value from always choosing unproductive actions. In other words, if agent 1 never exits, agent 2 can never hope to recoup the cost he incurs today by choosing a productive action.
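A purely illustrative numerical check (these values are not from the paper): with δ = 0.9 and c = 0.5 the condition binds whenever $\bar{V}_2 - \underline{V}_2^{N'} < c/\delta \approx 0.56$; for example,

$$\delta\left(\bar{V}_2 - \underline{V}_2^{N'}\right) = 0.9 \times 0.4 = 0.36 < 0.5 = c,$$

so even the largest possible gain in continuation value cannot compensate agent 2 for the cost of a productive action today, and exit must occur with positive probability.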
To restore agent 2's incentives, agent 1 must therefore sometimes exit, which denies agent 2 the flow benefit of agent 1's participation and lowers the value of the strategy of never choosing productive actions.
In this sense inefficient exit is used as a vehicle to motivate agent 2 to reveal information.
Chassang, Sylvain. 2010. βBuilding routines: Learning, cooperation, and the dynamics of incomplete relational contracts.β American Economic Review 100 (1):448β65. https://doi.org/10.1257/aer.100.1.448.
[^1]: In the first period agent 1 makes his decision based solely on his prior. If he chooses to exit, he learns nothing, and next period he must make his decision with the same information. Because we look for pure strategy equilibria, he would make the same decision forever and the game would be trivial.