Signature Generation Algorithms for Polymorphic Worms
Published in Automatic Defense Against Zero-day Polymorphic Worms in Communication Networks, 2016
Mohssen Mohammed, Al-Sakib Khan Pathan
Richard Bellman first used the term dynamic programming in the 1940s to describe the process of solving problems in which one needs to find the best decisions one after another. By 1953, he had refined this to the modern meaning, referring specifically to nesting smaller decision problems inside larger decisions, and the field was thereafter recognized by the IEEE (Institute of Electrical and Electronics Engineers) as a systems analysis and engineering topic. Bellman's contribution is remembered in the name of the Bellman equation, a central result of dynamic programming that restates an optimization problem in recursive form.
On the equivalence of the integral and differential Bellman equations in impulse control problems
Published in International Journal of Control, 2022
Francois Dufour, Alexey Piunovskiy, Alexander Plakhov
The dynamic programming approach in Avrachenkov et al. (2015) was based on the differential Bellman equation. We leave aside the conditions under which this equation has a solution coincident with the Bellman function (the minimal discounted cost when starting from a given state), as well as the properties of the function W and of the infima in Equation (30). The interested reader can find a meaningful example of successfully solving Equation (30) and constructing the optimal control strategy in Avrachenkov et al. (2015). We only intend to illustrate how one can apply Theorem 2.1 in this situation. Note that the flow is not fixed so far.
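For orientation only, a generic differential (Hamilton–Jacobi–Bellman-type) Bellman equation for a discounted continuous-time problem can be sketched as below. This is an illustrative form, not Equation (30) of Avrachenkov et al. (2015), whose impulse-control version also involves an intervention term; the symbols c, f, A, and the discount rate α are assumed here for the sketch.

```latex
% Generic differential Bellman equation for a discounted continuous-time
% control problem; c, f, A, and \alpha are illustrative, not taken from
% Avrachenkov et al. (2015).
\[
  \alpha\, W(x) \;=\; \inf_{a \in A} \Bigl\{ c(x,a) \;+\; \nabla W(x) \cdot f(x,a) \Bigr\},
\]
% where W is the Bellman function (minimal discounted cost from state x),
% c(x,a) is the running cost, f(x,a) is the controlled flow, and
% \alpha > 0 is the discount rate.
```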
Reinforcement learning for dynamic condition-based maintenance of a system with individually repairable components
Published in Quality Engineering, 2020
Nooshin Yousefi, Stamatis Tsianikas, David W. Coit
Dynamic programming is a technique for solving complex problems by breaking them into distinct stages handled with recursive functions. The solution of each stage, or sub-problem, is stored and reused to find the overall optimal solution of the problem. In this article, dynamic programming is used to find the best policy of a Markov decision process in reinforcement learning. The Bellman equation decomposes the overall optimal value into the optimal decision at the current step and the optimal value of the remaining steps. The value function can be used to store and retrieve the solution of each sub-problem.
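As a concrete illustration of this decomposition, the following minimal sketch applies the Bellman optimality backup (value iteration) to a small two-state, two-action MDP. The transition probabilities and rewards are invented for this sketch and are not taken from the article.

```python
import numpy as np

# Illustrative two-state, two-action MDP (numbers are made up for this sketch).
# P[a][s][s'] = transition probability, R[a][s] = expected immediate reward.
P = np.array([
    [[0.9, 0.1], [0.2, 0.8]],   # action 0
    [[0.5, 0.5], [0.6, 0.4]],   # action 1
])
R = np.array([
    [1.0, 0.0],                 # action 0
    [0.5, 2.0],                 # action 1
])
gamma = 0.95                    # discount factor

# Value iteration: repeatedly apply the Bellman optimality backup.
# Each backup decomposes the optimal value into the best immediate reward
# plus the discounted optimal value of the remaining steps.
V = np.zeros(2)
for _ in range(1000):
    Q = R + gamma * (P @ V)     # Q[a, s] = R[a, s] + gamma * sum_s' P[a, s, s'] * V[s']
    V_new = Q.max(axis=0)       # optimal value: best action at each state
    if np.max(np.abs(V_new - V)) < 1e-8:
        V = V_new
        break
    V = V_new

policy = Q.argmax(axis=0)       # greedy policy recovered from the stored values
print("Optimal values:", V)
print("Greedy policy:", policy)
```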
POMDP and MOMDP solutions for structural life-cycle cost minimization under partial and mixed observability
Published in Structure and Infrastructure Engineering, 2018
Konstantinos G. Papakonstantinou, Charalampos P. Andriotis, Masanobu Shinozuka
where γ is the typical discount factor, which expresses the fact that immediate rewards are preferred over future ones. This formulation can be applied to either stationary or nonstationary environments and to higher-order Markov models under proper state augmentation procedures (Papakonstantinou & Shinozuka, 2014a). A solution of the Bellman equation can be readily obtained through value iteration, policy evaluation, or linear programming (Bertsekas, 2005).
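To make the last point concrete, here is a minimal sketch of policy evaluation for a fixed policy, obtained by solving the linear Bellman system (I - γ P_π) V = R_π directly. The three-state transition matrix and rewards are invented for illustration and do not come from the cited works.

```python
import numpy as np

# Policy evaluation for a fixed policy pi: solve (I - gamma * P_pi) V = R_pi.
# The 3-state transition matrix and rewards below are illustrative only.
gamma = 0.9
P_pi = np.array([                 # P_pi[s, s'] = P(s' | s, pi(s))
    [0.7, 0.2, 0.1],
    [0.1, 0.8, 0.1],
    [0.0, 0.3, 0.7],
])
R_pi = np.array([1.0, 0.0, 5.0])  # expected immediate reward under pi

# Solve the Bellman expectation equation as a linear system.
V = np.linalg.solve(np.eye(3) - gamma * P_pi, R_pi)
print("Value of the fixed policy:", V)
```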