Explore chapters and articles related to this topic
Strategy Learning
Published in Hamidou Tembine, Distributed Strategic Learning for Wireless Engineers, 2018
Considering all possible signal structures generate all correlated equilibria. If the signals are stochastically independent across the players, it is a Nash equilibrium in mixed or pure strategies of the original game. But the signals could well be correlated, in which case new equilibria can be obtained. Equivalently, a correlated equilibrium is a probability distribution on actions profiles, which can be interpreted as the distribution of play instructions given to the players by some “device” or “referee”. Each player is given privately recommendations for his own play only; the joint distribution is known to all of them. Also, for every possible recommendation that a player receives, the player realizes that the instruction provides a best response to the random estimated play of the other players assuming they all follow their recommendations. In [58], Foster and Vohra (1997) have provided a procedure converging to the set of correlated equilibria. The author in [70] gives a simple procedure called no-regret dynamics that generates trajectories of play that almost surely converge to the set of correlated equilibria. The result has been extended in [149] to convex and compact set in Hausdorff space (separated space)1 .
Decentralization cost in two-machine job-shop scheduling with minimum flow-time objective
Published in IISE Transactions, 2020
The inequalities in the definition of a correlated equilibrium require that machine m, whenever it is supposed to choose its own sequence qm, does not benefit from deviating to any other sequence These conditions can be written as constraints in a Linear Program (LP), as in the following formulation: This LP takes as input the flow-time for all system sequences q. The first set of constraints are the inequalities in the definition of a correlated equilibrium. The remaining constraints require that pq form a probability distribution over system sequences. Notice that the probabilities pq are the only decision variables. These probabilities do not necessarily assume independence between the marginal probabilities the machines assign to their pure strategies, i.e., unlike mixed Nash equilibria, correlated equilibria may have interdependencies between these marginal probabilities. Thus, any pure/mixed Nash equilibrium is a correlated equilibrium, but not vice versa. The purpose of the objective function in the LP is to find the best equilibrium by minimizing the system objective among all equilibria. The solutions were obtained again using the CPLEX solver. For each instance, the centralized optimal solution(s) was calculated via exhaustive search (with no additional computational effort, as the input is needed anyway for the LP). Due to scalability limitations, only small-sized instances were studied (for example, in a six-job instance, the number system sequences is so the number entries in the LP matrix is ). This allowed obtaining the highest DC values and corresponding processing durations for up to five jobs. Still such settings may be found within real-life organizations.