|
Article Excerpt For an n-person stochastic game with Borel state space S and compact metric action sets [A.sup.1], [A.sup.2], ..., [A.sup.n], sufficient conditions are given for the existence of subgame-perfect equilibria. One result is that such equilibria exist if the law of motion q(* | s, a) is, for fixed s, continuous in a = ([a.sup.1], [a.sup.2], ..., [a.sup.n]) for the total variation norm and the payoff functions [f.sup.1], [f.sup.2], ..., [f.sup.n] are bounded, Borel measurable functions of the sequence of states ([s.sub.1], [s.sub.2], ....) [member of] [S.sup.N ]and, in addition, are continuous when SN is given the product of discrete topologies on S.
Key words: stochastic games; subgame-perfect equilibria; Borel sets; finitary functions
1. Introduction. The stochastic games we study have n players 1, 2, ..., n. The state space S is a Borel subset of a Polish space. Each player i has a compact metric action set [A.sup.i]. The set [DELTA] (A.sup.i]) of probability measures defined on the Borel subsets of [A.sup.i] is equipped with its usual weak topology and, hence, [DELTA]([A.sup.i]) is also compact metrizable. Let A = [A.sup.1] x [A.sup.2] x ... x [A.sup.n] have its product topology and it too is compact metrizable. The law of motion q is a conditional probability distribution on S given S x A with the interpretation that, if the players choose actions a = ([a.sup.1], [a.sup.2], ..., [a.sup.n] [member of] A at state s [member of] S, then q(* | s, a) is the conditional distribution of the next state. We always assume that q(* | s, a) is Borel measurable jointly in (s, a) for B a Borel subset of S and we will make further assumptions below.
The game begins at some initial state [s.sub.1]. The players choose actions [a.sub.1] = ([a.sup.1.sub.1], [a.sup.2.sub.1], ..., [a.sup.n.sub.1]) and the next state [s.sub.2] has distribution q(x |[s.sub.1], [a.sub.1]). This process is iterated to generate a random history or play
h = ([s.sub.1], ([a.sub.1], [s.sub.2]), ([a.sub.2], [a.sub.3]), ...) [member of] H = S x [(A x S).sup.N].
Here, [a.sub.k] = ([a.sup.1.sub.k], [a.sup.2.sub.k], ..., [a.sup.n.sub.k]) is, for each k, the n-tuple of actions chosen by the players at stage k. Each player i has a bounded, Borel measurable payoff function [f.sup.1]: H [??] R and receives [f.sup.i](h) as payoff at history h. Let f = ([f.sup.1], [f.sup.2], ..., [f.sup.n]) be the n-tuple of payoff functions.
Denote by [H.sup.*] the disjoint union of the sets S, S x (A x S), S x [(A x S).sup.2], ...; that is,
[MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII].
The elements of [H.sup.*] are called partial histories.
A strategy [[sigma].sup.i] for player i assigns to each partial history p = ([S.sub.1], ([a.sub.1], [S.sub.2]), ..., ([a.sub.k-1], [s.sub.k])) in [H.sup.*] the conditional distribution [[sigma].sup.i](p) [member of] [DELTA]([A.sup.i]) for [a.sup.i.sub.k] given p. Formally, a strategy for player i is a Borel function from [H.sup.*] into [DELTA]([A.sup.i]). It is assumed that the players choose their actions independently. So the conditional distribution of [a.sub.k] given p is the product measure
[sigma](p) = [[sigma].sup.1](p) x [[sigma].sup.2](p) x ... x [[sigma].sup.n](p) (1)
on A.
An n-tuple [sigma] = ([[sigma].sup.1], [[sigma].sup.2], ..., [[sigma].sup.n]) consisting of a strategy for each player is called a profile. A profile [sigma] together with an initial state [s.sub.1] determines the distribution [MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII] of the history h. Note that, by an abuse of notation, we write [sigma](p) for the n-tuple ([[sigma].sup.1](p), [[sigma].sup.2](p), ..., [[sigma].sup.n](p)) as well as for the product measure in (1). The meaning will always be clear from the context.
The stochastic game [GAMMA](f, [s.sub.1]) begins at state [s.sub.1]. The players select strategies to form a profile [sigma] and each player i receives the expected payoff
[E.sub.[sigma]][f.sup.i] = [integral][f.sup.i](h)[P.sub.[sigma]](dh).
We write [E.sub.[sigma]]f for the vector of expected payoffs [E.sub.[sigma]][f.sup.1], [E.sub.[sigma]][f.sup.2], ..., [E.sub.[sigma]][f.sup.n].
To define a subgame of [GAMMA](f, [s.sub.1]), let p = ([s.sub.1], ([a.sub.1], [s.sub.2]), ..., ([a.sub.k-1], [s.sub.k])) be a partial history. The length of p, written lh(p), is defined to be k - 1. (Thus lh(s) = for s [member of] S.) Denote by l(p) the last state [s.sub.k] of p. The subgame [GAMMA](f, p) is the stochastic game with initial state [s.sub.k] = l(p), and payoff functions [f.sup.i]p defined for histories
h'= ([s.sub.k], ([b.sub.1], [t.sub.2]), ([b.sub.2], [t.sub.3]), ...)
to be
([f.sup.i])(h') = [f.sup.i]([s.sub.1], ([a.sub.1], [s.sub.2]), ..., ([a.sub.k-1], [s.sub.k]), ([b.sub.1], [t.sub.2]), ([b.sub.2], [t.sub.3]), ...). (2)
Thus [f.sup.i]p is the section of [f.sup.i] at p. We use [GAMMA](f, *) to denote the collection of all the games [GAMMA](f, p), p [member of] [H.sup.*]. (Notice that the game [GAMMA](f, s) is itself the subgame [GAMMA](f, p) for which p = s.)
For a strategy [[sigma].sup.i] and a partial history p, the conditional strategy given p is written [[sigma].sup.i][p]. If [sigma] = ([[sigma].sup.1], [[sigma].sup.2], ..., [[sigma].sup.n]) is a profile of strategies, the conditional profile [sigma][p] is just the profile of conditional strategies ([[sigma].sup.1], [p], [[sigma].sup.2][p], ..., [[sigma].sup.n][p]).
Now, we can define a subgame-perfect equilibrium (SPE) for F(f, x) as being a profile [sigma] such that, for all p [member of][H.sup.*], the conditional profile [sigma][p] is a Nash equilibrium for the subgame [GAMMA] (f, p).
To prove the existence of an SPE, further conditions on the game are necessary. (Without additional assumptions, there need not exist an SPE even for one-person games.) To state one of the conditions, let [DELTA](S) be the set of probability measures defined on the Borel subsets of the state space S.
CONDITION 1 (VARIATION NORM CONTINUITY). For every fixed s [member of] S, the law of motion q(* | s, a) is a continuous function of a when [DELTA](S) is given the topology induced by the total variation norm.
This is a very strong condition and it would be preferable to assume some sort of weak continuity for the law of motion such as Condition 3 in [section]3 below. However, as is explained in the introduction of Mertens and Parthasarathy [10], it seems difficult to do without Condition 1.
A key assumption on the payoffs needed for our results is inspired by Dubins and Savage [1], who study (finitely additive) probability measures on an infinite product of spaces, each of which is given the discrete topology.
DEFINITION 1.1. Suppose that each of the nonempty sets [X.sub.1], [X.sub.2], ... has the discrete topology and the product Y = [X.sub.1] x [X.sub.2] x ... has the product topology. Then, a continuous function g defined on Y is called DS continuous.
It follows from Definition 1.1 that a function g: Y [??] R is DS continuous if and only if, for each x = ([x.sub.1], [x.sub.2], ...) [member of] Y and [epsilon] > 0, there exists n such that [absolute value of g(x) - g(x')] < [epsilon] for every x' = ([x'.sub.1], [x'.sub.2], ...) such that [x.sub.i] = [x'.sub.i] for 1 [less than or equal to] i [less than or equal to] n. If Y = [X.sub.1] x [X.sub.2] x ... is a product of Borel sets, some of which are uncountable, then continuity in the product of the Borel topologies is a more stringent requirement than DS continuity. The function g is called finitary if it depends only on the coordinates [x.sub.1], [x.sub.2], ..., [x.sub.t], for some stop rule t such that t(x) < [infinity] for all x = ([x.sub.1], [x.sub.2], ...). Finitary functions are clearly DS continuous and, in fact, the DS-continuous functions are precisely those functions that can be uniformly approximated by finitary functions. (A Borel measurable version of this fact is Lemma 3.1 below. See Dubins and Savage [1] for a discussion of the properties of finitary functions.) Functions that depend on the tail of...
|