Home | Business News | Browse by Publication | M | Mathematics of Operations Research

Subgame-perfect equilibria for stochastic games.

Publication: Mathematics of Operations Research
Publication Date: 01-AUG-07
Format: Online
Delivery: Immediate Online Access

Article Excerpt
For an n-person stochastic game with Borel state space S and compact metric action sets [A.sup.1], [A.sup.2], ..., [A.sup.n], sufficient conditions are given for the existence of subgame-perfect equilibria. One result is that such equilibria exist if the law of motion q(* | s, a) is, for fixed s, continuous in a = ([a.sup.1], [a.sup.2], ..., [a.sup.n]) for the total variation norm and the payoff functions [f.sup.1], [f.sup.2], ..., [f.sup.n] are bounded, Borel measurable functions of the sequence of states ([s.sub.1], [s.sub.2], ....) [member of] [S.sup.N ]and, in addition, are continuous when SN is given the product of discrete topologies on S.

Key words: stochastic games; subgame-perfect equilibria; Borel sets; finitary functions

1. Introduction. The stochastic games we study have n players 1, 2, ..., n. The state space S is a Borel subset of a Polish space. Each player i has a compact metric action set [A.sup.i]. The set [DELTA] (A.sup.i]) of probability measures defined on the Borel subsets of [A.sup.i] is equipped with its usual weak topology and, hence, [DELTA]([A.sup.i]) is also compact metrizable. Let A = [A.sup.1] x [A.sup.2] x ... x [A.sup.n] have its product topology and it too is compact metrizable. The law of motion q is a conditional probability distribution on S given S x A with the interpretation that, if the players choose actions a = ([a.sup.1], [a.sup.2], ..., [a.sup.n] [member of] A at state s [member of] S, then q(* | s, a) is the conditional distribution of the next state. We always assume that q(* | s, a) is Borel measurable jointly in (s, a) for B a Borel subset of S and we will make further assumptions below.

The game begins at some initial state [s.sub.1]. The players choose actions [a.sub.1] = ([a.sup.1.sub.1], [a.sup.2.sub.1], ..., [a.sup.n.sub.1]) and the next state [s.sub.2] has distribution q(x |[s.sub.1], [a.sub.1]). This process is iterated to generate a random history or play

h = ([s.sub.1], ([a.sub.1], [s.sub.2]), ([a.sub.2], [a.sub.3]), ...) [member of] H = S x [(A x S).sup.N].

Here, [a.sub.k] = ([a.sup.1.sub.k], [a.sup.2.sub.k], ..., [a.sup.n.sub.k]) is, for each k, the n-tuple of actions chosen by the players at stage k. Each player i has a bounded, Borel measurable payoff function [f.sup.1]: H [??] R and receives [f.sup.i](h) as payoff at history h. Let f = ([f.sup.1], [f.sup.2], ..., [f.sup.n]) be the n-tuple of payoff functions.

Denote by [H.sup.*] the disjoint union of the sets S, S x (A x S), S x [(A x S).sup.2], ...; that is,

[MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII].

The elements of [H.sup.*] are called partial histories.

A strategy [[sigma].sup.i] for player i assigns to each partial history p = ([S.sub.1], ([a.sub.1], [S.sub.2]), ..., ([a.sub.k-1], [s.sub.k])) in [H.sup.*] the conditional distribution [[sigma].sup.i](p) [member of] [DELTA]([A.sup.i]) for [a.sup.i.sub.k] given p. Formally, a strategy for player i is a Borel function from [H.sup.*] into [DELTA]([A.sup.i]). It is assumed that the players choose their actions independently. So the conditional distribution of [a.sub.k] given p is the product measure

[sigma](p) = [[sigma].sup.1](p) x [[sigma].sup.2](p) x ... x [[sigma].sup.n](p) (1)

on A.

An n-tuple [sigma] = ([[sigma].sup.1], [[sigma].sup.2], ..., [[sigma].sup.n]) consisting of a strategy for each player is called a profile. A profile [sigma] together with an initial state [s.sub.1] determines the distribution [MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII] of the history h. Note that, by an abuse of notation, we write [sigma](p) for the n-tuple ([[sigma].sup.1](p), [[sigma].sup.2](p), ..., [[sigma].sup.n](p)) as well as for the product measure in (1). The meaning will always be clear from the context.

The stochastic game [GAMMA](f, [s.sub.1]) begins at state [s.sub.1]. The players select strategies to form a profile [sigma] and each player i receives the expected payoff

[E.sub.[sigma]][f.sup.i] = [integral][f.sup.i](h)[P.sub.[sigma]](dh).

We write [E.sub.[sigma]]f for the vector of expected payoffs [E.sub.[sigma]][f.sup.1], [E.sub.[sigma]][f.sup.2], ..., [E.sub.[sigma]][f.sup.n].

To define a subgame of [GAMMA](f, [s.sub.1]), let p = ([s.sub.1], ([a.sub.1], [s.sub.2]), ..., ([a.sub.k-1], [s.sub.k])) be a partial history. The length of p, written lh(p), is defined to be k - 1. (Thus lh(s) = for s [member of] S.) Denote by l(p) the last state [s.sub.k] of p. The subgame [GAMMA](f, p) is the stochastic game with initial state [s.sub.k] = l(p), and payoff functions [f.sup.i]p defined for histories

h'= ([s.sub.k], ([b.sub.1], [t.sub.2]), ([b.sub.2], [t.sub.3]), ...)

to be

([f.sup.i])(h') = [f.sup.i]([s.sub.1], ([a.sub.1], [s.sub.2]), ..., ([a.sub.k-1], [s.sub.k]), ([b.sub.1], [t.sub.2]), ([b.sub.2], [t.sub.3]), ...). (2)

Thus [f.sup.i]p is the section of [f.sup.i] at p. We use [GAMMA](f, *) to denote the collection of all the games [GAMMA](f, p), p [member of] [H.sup.*]. (Notice that the game [GAMMA](f, s) is itself the subgame [GAMMA](f, p) for which p = s.)

For a strategy [[sigma].sup.i] and a partial history p, the conditional strategy given p is written [[sigma].sup.i][p]. If [sigma] = ([[sigma].sup.1], [[sigma].sup.2], ..., [[sigma].sup.n]) is a profile of strategies, the conditional profile [sigma][p] is just the profile of conditional strategies ([[sigma].sup.1], [p], [[sigma].sup.2][p], ..., [[sigma].sup.n][p]).

Now, we can define a subgame-perfect equilibrium (SPE) for F(f, x) as being a profile [sigma] such that, for all p [member of][H.sup.*], the conditional profile [sigma][p] is a Nash equilibrium for the subgame [GAMMA] (f, p).

To prove the existence of an SPE, further conditions on the game are necessary. (Without additional assumptions, there need not exist an SPE even for one-person games.) To state one of the conditions, let [DELTA](S) be the set of probability measures defined on the Borel subsets of the state space S.

CONDITION 1 (VARIATION NORM CONTINUITY). For every fixed s [member of] S, the law of motion q(* | s, a) is a continuous function of a when [DELTA](S) is given the topology induced by the total variation norm.

This is a very strong condition and it would be preferable to assume some sort of weak continuity for the law of motion such as Condition 3 in [section]3 below. However, as is explained in the introduction of Mertens and Parthasarathy [10], it seems difficult to do without Condition 1.

A key assumption on the payoffs needed for our results is inspired by Dubins and Savage [1], who study (finitely additive) probability measures on an infinite product of spaces, each of which is given the discrete topology.

DEFINITION 1.1. Suppose that each of the nonempty sets [X.sub.1], [X.sub.2], ... has the discrete topology and the product Y = [X.sub.1] x [X.sub.2] x ... has the product topology. Then, a continuous function g defined on Y is called DS continuous.

It follows from Definition 1.1 that a function g: Y [??] R is DS continuous if and only if, for each x = ([x.sub.1], [x.sub.2], ...) [member of] Y and [epsilon] > 0, there exists n such that [absolute value of g(x) - g(x')] < [epsilon] for every x' = ([x'.sub.1], [x'.sub.2], ...) such that [x.sub.i] = [x'.sub.i] for 1 [less than or equal to] i [less than or equal to] n. If Y = [X.sub.1] x [X.sub.2] x ... is a product of Borel sets, some of which are uncountable, then continuity in the product of the Borel topologies is a more stringent requirement than DS continuity. The function g is called finitary if it depends only on the coordinates [x.sub.1], [x.sub.2], ..., [x.sub.t], for some stop rule t such that t(x) < [infinity] for all x = ([x.sub.1], [x.sub.2], ...). Finitary functions are clearly DS continuous and, in fact, the DS-continuous functions are precisely those functions that can be uniformly approximated by finitary functions. (A Borel measurable version of this fact is Lemma 3.1 below. See Dubins and Savage [1] for a discussion of the properties of finitary functions.) Functions that depend on the tail of...

View this article FREE - Now for a Limited Time, try Goliath Business News
Free for 3 Days!



More articles from Mathematics of Operations Research
Subsolutions of an Isaacs equation and efficient schemes for importanc..., August 01, 2007
Complex matrix decomposition and quadratic programming., August 01, 2007

Looking for additional articles?
Search our database of over 3 million articles.

Looking for more in-depth information on this industry?
Search our complete database of Industry & Market reports by text, subject, publication name or publication date.

About Goliath
Whether you're looking for sales prospects, competitive information, company analysis or best practices in managing your organization, Goliath can help you meet your business needs.

Our extensive business information databases empower business professionals with both the breadth and depth of credible, authoritative information they need to support their business goals. Whether it be strategic planning, sales prospecting, company research or defining management best practices - Goliath is your leading source for accurate information.