Cuadernos de economía
On-line version ISSN 0717-6821
Cuad. econ. vol.42 no.126 Santiago Nov. 2005
doi: 10.4067/S0717-68212005012600001
Cuadernos de Economía, Vol. 42 (Noviembre), pp. 199-207, 2005
Invited Paper: Learning and Belief Based Trade*
DREW FUDENBERG**, Harvard University
DAVID K. LEVINE, University of California, Los Angeles and Federal Reserve Bank of Minneapolis
We use the theory of learning in games to show that no-trade results do not require that gains from trade be common knowledge or that play be a Nash equilibrium.

JEL: D82, D83, D84, G14

Keywords: No-Trade Theorem, Common Knowledge, Learning, Self-Confirming Equilibria, Marginal Best Response Distribution

1. INTRODUCTION

The idea of speculation as trading based on information differences is a widespread one both inside and outside of economics. Such phenomena as betting on horse races, not to speak of speculation in the stock market, are difficult to imagine in a world in which everyone has identical beliefs. Indeed, authors such as Hirshleifer (1975) have argued that the very idea of speculation is meaningless unless there are differences in beliefs. Yet the idea of speculation as information-based trading runs quickly afoul of various no-trade theorems. The simplest such result is that if agents are risk averse and have a common prior, and the initial allocation is Pareto-optimal, then in a Nash equilibrium there must be no trade. This follows from the fact that if there were an equilibrium with trade, each agent would at least weakly improve his utility, contradicting the assumption that the initial allocation was optimal. Kreps (1977) and Tirole (1982) prove extensions of this result to rational expectations equilibria with risk-neutral traders. Milgrom and Stokey (1982) show that the assumption of Nash equilibrium can be replaced by the assumption that it is common knowledge that all players prefer the proposed allocation to the initial one.
Thus, either Nash equilibrium or common knowledge of agreement to trade, along with a common prior and risk averse agents, implies that there cannot be trade solely on the basis of differences in beliefs. From the viewpoint of non-equilibrium learning theory, though, both the assumption of a common prior on Nature's moves and the assumption of a Nash equilibrium (that is, a common belief on players' strategies) may be too strong. In the theory of learning in games, the assumption of exogenous knowledge about the distribution of moves is replaced with the idea that players acquire knowledge through learning. Thus common beliefs about either Nature's moves or the play of other players may or may not arise, depending on the environment. Consequently, the steady states of standard learning processes correspond not to the Nash equilibria but to the larger class of self-confirming equilibria that we introduced in Fudenberg and Levine (1993).1 In simultaneous-move complete-information games, if players observe the profiles of actions played in each round, the self-confirming equilibria coincide with the set of Nash equilibria of the game.2 By contrast, as argued in Dekel et al. (2004), in games of incomplete information, if players begin with inconsistent priors there are broad classes of games in which the self-confirming equilibria (and hence the steady states of standard learning processes) do not coincide with the Nash equilibria. Nevertheless, there are important classes of incomplete-information games where the steady states of learning models do coincide with the Nash equilibria. For example, Dekel et al. showed that this is the case when players observe one another's actions and there are independent private values. In the trading games that we consider here, it is not plausible that all agents observe one another's actions. 
Nevertheless, the equivalence of Nash and self-confirming equilibria still holds, because the games have the property that each agent knows his own utility function and hence knows the payoff he will get from not trading. As we show, it is this "known security level" property that underlies the no-trade results.

In addition, we show that not even self-confirming equilibrium is needed for the no-trade conclusion. Specifically, while the steady states of standard learning processes must be self-confirming equilibria, there is no guarantee that even well-behaved learning procedures converge to a steady state. For this reason, we also examine the notion of "marginal best response distributions" introduced by Fudenberg and Levine (1995). If all players follow learning procedures that are moderately rational, then the joint distribution of play must at least converge to the set of these distributions.

In both cases, we show that the no-trade theorem applies. The intuition is simple: if agents are risk averse, the only possibility of trade is based on information differences, and if trade takes place, then there must be an agent who would do better not to trade. A player need not be a terribly clever learner to discover that he is doing poorly; all that is required is that he knows the utility he would get by not trading. So in the long run, all trade must stop.

We should emphasize that we are not claiming that in practice there is no trade based on information differences. Rather, we are claiming that there must be some other underlying reason for trade, such as portfolio balance, the joy of betting on the horses, or noise traders who are not rational, before it becomes possible to trade based on information differences. See, for example, Zurita (2004) for a model in which underlying gains to balancing portfolios allow trading based on information differences in a model with common knowledge.

2. THE MODEL

There are $n$ traders $i = 1, \ldots, n$. Each trader has finitely many possible types, with trader $i$'s type denoted $\theta_i$. The profile of types $\theta$ is called the state. There are $m$ goods, so the consumption bundle consumed by trader $i$ is $x_i \in \mathbb{R}^m$. Trader $i$'s endowment is
The final allocation is determined from endowments by a finite simultaneous-move game.4 Each trader $i$ observes his own type $\theta_i$ and then chooses an action $a_i \in A_i$ from a finite set. Mixed actions are denoted by $\alpha_i$. The final allocation is given by $x_i = f_i(\theta,$
Each trader has the option of not trading, denoted by
If learning by traders is to be possible, the economy must meet repeatedly. We assume that each time the economy meets, the state is determined by an independent draw from a fixed (objective) probability distribution $\rho$ that is unknown to the traders. Traders do not necessarily observe the realized value of $\theta$, so if they start out with incorrect beliefs about $\rho$, it is not obvious that they will learn the true distribution. Since we are interested only in trade due to differences in beliefs, we must rule out other reasons for trade. Consequently, we assume that without differences in beliefs the endowment is ex ante Pareto efficient; that is
We consider two equilibrium concepts that relax Nash equilibrium. The key components of self-confirming (and Nash) equilibrium are each player $i$'s beliefs about Nature's move, her strategy, and her conjecture about the strategies used by her opponents. Player $i$'s beliefs, denoted by
Of course, what players might learn from repeated play depends on what they observe at the end of each round of play. To model this, the equilibrium concepts suppose that after each play of the game, players receive private signals $y_i = y_i($
Our equilibrium concept is a variation on the type of self-confirming equilibrium defined in Fudenberg and Levine (1993) and Dekel et al. (2004).
Definition 1: A strategy profile $\sigma$ is an $\varepsilon$-self-confirming equilibrium with conjectures
and for any pair $\theta_i$,
We say that $\sigma$ is a self-confirming equilibrium if there is some collection (
Our key assumption is that each trader observes enough information to determine her utility from the no-trade action. For example, suppose the endowment represents some complicated stock portfolio and the trader engages in a complicated series of trades. If the trader does not observe the prices of stocks that were held in positive quantities in her endowment but were traded away, then she may not be able to determine the utility of not having traded at all.
This immediately implies the following sufficient condition for an $\varepsilon$-self-confirming equilibrium, which underlies our first result:
This says that the expected utility from the action actually taken is within $\varepsilon$ of the utility from the endowment. The idea of self-confirming equilibrium is that we do not require players' beliefs about what they did not see their opponents do to be correct. However, there is no general theorem guaranteeing the global convergence of a sensible class of learning procedures to a self-confirming equilibrium. This leads us to our second "equilibrium" notion, a variation on the idea of a marginal best response distribution introduced in Fudenberg and Levine (1995).
This says that the utility that player $i$ actually gets is within $\varepsilon$ of the most he could get against the marginal distribution of his opponents' actions; that is, correlations are ignored. The significance of this notion is that there exists a broad class of approximately universally consistent learning strategies such that if players use those strategies, asymptotic play will be close to an approximate marginal best response distribution even if it never converges. From the definition, it appears necessary that players observe their opponents' actions. However, Fudenberg and Levine (1998) and Hart and Mas-Colell (2001) show that there are learning procedures that give this result when players observe only their own action and own utility. In particular, Assumption 5 need not be satisfied for these learning procedures to work. In other words, marginal best response distributions capture long-run non-equilibrium play under the very weak assumption that players know their past actions and payoffs.

3. THE RESULT

Our conclusion is that in the limit as $\varepsilon \to 0$, for either self-confirming equilibria or marginal best response distributions, there is convergence to no-trade. The idea is that under our assumption of strict concavity of the utility functions, any probability distribution over socially feasible allocations that Pareto dominates the endowment must involve no trade. As $\varepsilon \to 0$, both $\varepsilon$-self-confirming equilibria and $\varepsilon$-marginal best response distributions give each trader at least the utility that he could get from his endowment, and so the limiting allocation must weakly Pareto dominate the endowment. First we show that socially feasible allocations that weakly Pareto dominate the endowment involve no trade.
Proof: Since
Our main results now say that in the limit both self-confirming equilibria and marginal best response equilibria involve no trade. In the case of self-confirming equilibria, the fact that each trader gets at least the endowment utility in the limit follows from upper hemi-continuity of the $\varepsilon$-equilibrium correspondence and the fact that traders know the endowment utility.
This contradicts Lemma 2.
In the case of $\varepsilon$-marginal best response distributions, the fact that each trader gets at least the endowment utility in the limit follows from the fact that a marginal best response distribution gives each player at least the minmax.
this again contradicts Lemma 2.
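The intuition behind both results can be illustrated numerically. The following is a minimal sketch that is not part of the formal analysis: it assumes two hypothetical traders with logarithmic utility and equal riskless endowments who bet a fixed stake on a binary state about which they hold opposite beliefs. The function name, the parameter values, and the betting game itself are illustrative assumptions rather than objects defined in the model above.

```python
import math


def expected_log_utility(prob_win, wealth, stake):
    """Expected log utility of a bet paying +stake with prob_win, -stake otherwise."""
    return (prob_win * math.log(wealth + stake)
            + (1.0 - prob_win) * math.log(wealth - stake))


# Hypothetical numbers, chosen only for illustration (not from the paper).
WEALTH = 1.0            # each trader's endowment
STAKE = 0.2             # size of the bet
TRUE_PROB_HEADS = 0.5   # objective distribution, unknown to the traders
BELIEF_1_HEADS = 0.7    # trader 1's belief; he bets on heads
BELIEF_2_HEADS = 0.3    # trader 2's belief; he bets on tails

# Known security level: the utility from keeping the endowment.
no_trade_utility = math.log(WEALTH)

# What each trader expects from the bet under his own beliefs.
subjective_1 = expected_log_utility(BELIEF_1_HEADS, WEALTH, STAKE)
subjective_2 = expected_log_utility(1.0 - BELIEF_2_HEADS, WEALTH, STAKE)

# Expected utility under the objective distribution, which is what
# long-run average payoffs approach when the bet is repeated.
realized_1 = expected_log_utility(TRUE_PROB_HEADS, WEALTH, STAKE)
realized_2 = expected_log_utility(1.0 - TRUE_PROB_HEADS, WEALTH, STAKE)

for name, subjective, realized in [("trader 1", subjective_1, realized_1),
                                   ("trader 2", subjective_2, realized_2)]:
    print(f"{name}: expects {subjective:+.4f}, "
          f"realizes {realized:+.4f}, no trade gives {no_trade_utility:+.4f}")
```

Under their own beliefs both traders expect to gain from the bet, but under the objective distribution each one's expected utility falls below the no-trade utility. A trader who knows only his own past payoffs and his security level can therefore detect that trading is doing him harm, which is the mechanism the results above rely on.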
NOTES

* We thank Felipe Zurita for comments and encouragement. We are grateful to NSF grants SES-01-12018, SES-03-14713, and SES-04-26199 for financial support.
** E-mails: dfudenberg@harvard.edu, david@dklevine.com
1 See also Battigalli (1987), Fudenberg and Kreps (1988, 1995), and Rubinstein and Wolinsky (1994).
2 We will not formally model the dynamics of learning, but we have in mind "belief-based" processes in which players base their actions on their beliefs about opponents' play. Fudenberg and Kreps (1995) and Fudenberg and Levine (1993b) showed that the long-run outcomes of such processes correspond to the self-confirming equilibria; they considered general extensive form games and supposed that the signals corresponded to the terminal nodes of the game.
3 Since a player's type is supposed to encapsulate all private information available to him, and since we presume players know their own endowments before beginning trading, a player's own type should determine his endowment.
4 Or the game may be an elaborate dynamic game, in which case our simultaneous move game represents the strategic form.
5 We consider the case in which knowledge of opponents' play comes only from learning by observation and updating, and not from deduction based on opponents' rationality, so we do not require that players know their opponents' utility functions or beliefs. Rubinstein and Wolinsky (1994), Battigalli and Guaitoli (1997) and Dekel, Fudenberg and Levine (1999) present solution concepts based on steady states in which players do make deductions based on rationality of the other players.
6 It is appropriate to have a single
7 That is, if for some

REFERENCES

Battigalli, P. (1987), "Comportamento Razionale Ed Equilibrio Nei Giochi E Nelle Situazioni Sociali", unpublished undergraduate dissertation, Bocconi University, Milano.
Battigalli, P. and D. Guaitoli (1997), "Conjectural Equilibria and Rationalizability in a Game with Incomplete Information", in Decision, Games and Markets, P. Battigalli, A. Montesano and F. Panunzi, Eds., Dordrecht: Kluwer Academic Publishers.
Dekel, E., D. Fudenberg and D. K. Levine (1999), "Payoff Information and Self-Confirming Equilibrium", Journal of Economic Theory, 89(2): 165-85.
Dekel, E., D. Fudenberg and D. K. Levine (2004), "Learning to Play Bayesian Games", Games and Economic Behavior, 46: 282-303.
Fudenberg, D. and D. Kreps (1988), "A Theory of Learning, Experimentation, and Equilibrium in Games", unpublished mimeo.
Fudenberg, D. and D. Kreps (1995), "Learning in Extensive-Form Games. I. Self-Confirming Equilibria", Games and Economic Behavior, 8(1): 20-55.
Fudenberg, D. and D. K. Levine (1993), "Self-Confirming Equilibrium", Econometrica, 61: 523-546.
Fudenberg, D. and D. K. Levine (1995), "Consistency and Cautious Fictitious Play", Journal of Economic Dynamics and Control, 19: 1065-1090.
Fudenberg, D. and D. K. Levine (1998), The Theory of Learning in Games, MIT Press, Cambridge, MA.
Hart, S. and A. Mas-Colell (2001), "A General Class of Adaptive Strategies", Journal of Economic Theory, 98: 26-54.
Hirshleifer, J. (1975), "Speculation and Equilibrium: Information, Risk, and Markets", The Quarterly Journal of Economics, 89: 519-542.
Kalai, E. and E. Lehrer (1993), "Rational Learning Leads to Nash Equilibrium", Econometrica, 61(5): 1019-45.
Kreps, D. (1977), "A Note on 'Fulfilled Expectations' Equilibria", Journal of Economic Theory, 14(1): 32-43.
Milgrom, P. and N. Stokey (1982), "Information, Trade and Common Knowledge", Journal of Economic Theory, 26: 17-27.
Rubinstein, A. and A. Wolinsky (1994), "Rationalizable Conjectural Equilibrium: Between Nash and Rationalizability", Games and Economic Behavior, 6(2): 299-311.
Tirole, J. (1982), "On the Possibility of Trade under Rational Expectations", Econometrica, 50: 1163-1182.
Zurita, F. (2004), "On the Limits to Speculation in Centralized versus Decentralized Market Regimes", Journal of Financial Intermediation, 13: 378-408.


















