Jan 1-1, 1970
Abstract
This talk concerns a two-agent non-stationary discrete-time stochastic game under the probability criterion, which focuses on the probability that the accumulated rewards of agent 1 (i.e., the costs of agent 2) exceed a prescribed threshold before the first passage into a target set. We first present two illustrative examples. The first shows that, under the probability criterion, a nonzero-sum Nash equilibrium no longer implies a zero-sum saddle point. The second demonstrates that the non-stationary game cannot be transformed into an equivalent stationary one via the standard state augmentation. Because of the non-stationarity, we introduce the notion of the n-th value of the game, that is, the value of the game from time n onwards. Under a mild condition, we prove that the sequence of n-th values is the unique solution of the system of Shapley equations for the probability criterion. From this system of Shapley equations, we establish the existence of the value and of a saddle point for the game, give an iterative algorithm for computing an approximate value and \epsilon-saddle points of the game, and provide an explicit error bound. Finally, a numerical example from energy management is presented to illustrate the theoretical results and the effectiveness of the proposed algorithm.
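To make the idea of an n-th value concrete, the following is a minimal sketch, not the algorithm of the talk. It assumes nonnegative integer rewards, two actions per agent, and a truncation at a finite horizon (after which the value is set to 0); it then computes the truncated n-th values by backward recursion, solving a 2x2 matrix game at each step. All names (`value_2x2`, `truncated_values`, the toy `reward` and `trans` functions, the states "s" and "T") are hypothetical and chosen only for illustration.

```python
from functools import lru_cache

def value_2x2(m):
    """Value of a 2x2 zero-sum matrix game; the row player (agent 1) maximizes."""
    (a, b), (c, d) = m
    maximin = max(min(a, b), min(c, d))
    minimax = min(max(a, c), max(b, d))
    if maximin == minimax:                      # pure saddle point exists
        return maximin
    return (a * d - b * c) / (a + d - b - c)    # value under mixed strategies

def truncated_values(horizon, target, reward, trans):
    """Return V(n, x, lam): truncated probability, from time n in state x,
    that the accumulated reward exceeds the remaining threshold lam before
    the target set is reached. Assumption: rewards are nonnegative integers."""
    @lru_cache(maxsize=None)
    def V(n, x, lam):
        if lam < 0:                  # threshold already exceeded
            return 1.0
        if x in target or n >= horizon:
            return 0.0               # target reached, or truncation (assumed)
        m = [[sum(p * V(n + 1, y, lam - reward(n, x, a, b))
                  for y, p in trans(n, x, a, b).items())
              for b in (0, 1)]
             for a in (0, 1)]
        return value_2x2(m)
    return V

# Hypothetical toy model: one non-target state "s" and one target state "T".
def reward(n, x, a, b):
    return 1 + a * (1 - b)           # made-up stage reward of agent 1

def trans(n, x, a, b):
    p_hit = min(0.9, 0.2 + 0.05 * n + 0.3 * b)   # depends on time n
    return {"s": 1.0 - p_hit, "T": p_hit}

V = truncated_values(horizon=20, target={"T"}, reward=reward, trans=trans)
print(V(0, "s", 3))                  # truncated value at time 0, threshold 3
```

The time index n enters both `reward` and `trans`, which is what makes the toy model non-stationary; the remaining threshold is carried along as part of the argument of `V` rather than folded into the state.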
Department of Mathematics, SUSTech