Foundations of Risk and Decision Theory
Published in C. Ariel Pinto, Paul R. Garvey, Advanced Risk Analysis in Engineering Enterprise Systems, 2016
A value function is a real-valued mathematical function defined over an evaluation criterion (or attribute) that represents an option's measure of "goodness" across the levels of that criterion. This measure of "goodness" reflects the decision maker's judged value of an option's (or alternative's) performance at each level of the criterion.
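As a minimal sketch (not from the text), a single-criterion value function can be a simple mapping from criterion levels to a score in [0, 1]; the criterion, its range, and the linear shape below are illustrative assumptions.

```python
# Hypothetical value function over one evaluation criterion, e.g. system
# response time in seconds, where lower is better. The anchor points
# (best = 1 s, worst = 10 s) are illustrative assumptions.

def value(level, worst=10.0, best=1.0):
    """Linear value function: v(best) = 1, v(worst) = 0, clamped to [0, 1]."""
    v = (worst - level) / (worst - best)
    return max(0.0, min(1.0, v))

# A level halfway between the anchors scores 0.5:
# value(5.5) -> 0.5
```

In practice the shape need not be linear; concave or piecewise-linear forms are common when the decision maker's judged value changes non-uniformly across levels.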
Transaction selection policy in tier-to-tier SBSRS by using Deep Q-Learning
Published in International Journal of Production Research, 2022
Bartu Arslan, Banu Yetkin Ekren
Any RL problem includes the following components (Sutton and Barto 2015):

- Agent: an autonomous entity (e.g. a robot) that controls the target of concern.
- Environment: everything the agent interacts with; it is built to expose the problem to the agent as in a real-world case.
- States: representations of the current world or environment.
- Rewards: a score of how well the algorithm performs with respect to the environment. Maximising the cumulative reward is the agent's goal in an RL problem. The reward is obtained by mapping each state-action pair of the environment to a single value.
- Policy: the algorithm the agent uses to decide its actions. It can be model-based or model-free. A model-based algorithm uses the transition and reward functions to compute the optimal policy. A model-free algorithm estimates the optimal policy without the environment's dynamics, such as transition probabilities: it learns a 'value function' from experience, without a transition or reward model. Here, the value function evaluates the action taken in a state, for all states, and the policy can be derived from that value function.
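The model-free case described above can be sketched with tabular Q-learning, the method Deep Q-Learning generalises with a neural network. The toy environment below (a short one-dimensional chain with a terminal reward) and all its parameters are illustrative assumptions, not from the article; the point is that the agent learns an action-value function from experience alone and derives its policy from it.

```python
import random

N_STATES = 5          # states 0..4; state 4 is terminal and pays reward +1
ACTIONS = [-1, +1]    # move left or right along the chain

def step(state, action):
    """Environment dynamics: return (next_state, reward, done)."""
    nxt = min(max(state + action, 0), N_STATES - 1)
    if nxt == N_STATES - 1:
        return nxt, 1.0, True
    return nxt, 0.0, False

def q_learning(episodes=500, alpha=0.5, gamma=0.9, eps=0.1, seed=0):
    rng = random.Random(seed)
    # Action-value function: one entry per state-action pair
    Q = {(s, a): 0.0 for s in range(N_STATES) for a in ACTIONS}
    for _ in range(episodes):
        s, done = 0, False
        while not done:
            # Epsilon-greedy policy derived from the current value function
            if rng.random() < eps:
                a = rng.choice(ACTIONS)
            else:
                a = max(ACTIONS, key=lambda x: Q[(s, x)])
            s2, r, done = step(s, a)
            # Model-free update: no transition probabilities are used,
            # only the sampled (state, action, reward, next-state) experience
            best_next = 0.0 if done else max(Q[(s2, x)] for x in ACTIONS)
            Q[(s, a)] += alpha * (r + gamma * best_next - Q[(s, a)])
            s = s2
    return Q

Q = q_learning()
# The policy is read off the learned value function by acting greedily
policy = {s: max(ACTIONS, key=lambda a: Q[(s, a)]) for s in range(N_STATES - 1)}
```

Deep Q-Learning replaces the dictionary `Q` with a neural network that maps a state representation to action values, which is what makes the approach tractable for large state spaces such as the SBSRS transaction-selection problem.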