Numerical solution of optimal programming and control problems
Published in Arthur E. Bryson, Yu-Chi Ho, Applied Optimal Control, 2018
Differential dynamic programming. Another version of the backward-sweep algorithm has been proposed by Jacobson (1968);† he calls it “differential dynamic programming” (DDP). In DDP, the Hamiltonian is first minimized with respect to u, holding x and λ fixed, yielding an improved control u°. Then variations in x and λ, and the corresponding further variations in u away from u°, are considered; a second-order expansion in δx and δu is minimized with respect to δu, giving rise to a linear feedback δu = −C(t)δx. The new control on the next iteration is then u° − C(t)δx. This is a refinement of the “min-H” algorithms discussed in Section 7.4. It permits large changes in u and overcomes the difficulty with nonconvex nominal solutions that may occur with the backward-sweep algorithm.
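The control update described above can be sketched in a few lines. The following is a minimal illustration on a hypothetical scalar linear system x_{k+1} = a·x_k + b·u_k; the nominal trajectory, the improved control u°, and the gains C(t) are illustrative placeholders, not values produced by an actual min-H step or backward sweep.

```python
# Hypothetical scalar system x_{k+1} = a*x_k + b*u_k over N stages.
a, b, N = 0.9, 0.5, 5
x_nom = [0.0] * (N + 1)        # nominal state trajectory (placeholder)
u_star = [0.1] * N             # improved open-loop control u° (min-H step)
C = [0.5] * N                  # feedback gains from the backward sweep

x = 0.2                        # perturbed initial state
for k in range(N):
    dx = x - x_nom[k]          # state deviation from the nominal
    u = u_star[k] - C[k] * dx  # new control: u° - C(t)*δx
    x = a * x + b * u          # propagate the dynamics
```

Because the update combines the open-loop improvement u° with linear feedback on δx, the new trajectory remains close to the region where the second-order expansion is valid even when u° differs substantially from the previous control.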
Differential Dynamic Programming
Published in Larry W. Mays, Optimal Control of Hydrosystems, 1997
This section presents the differential dynamic programming method for discrete-time optimal control problems. The term differential dynamic programming (DDP), as used by Jacobson and Mayne (1970), broadly refers to stagewise nonlinear programming procedures. Earlier works developed DDP procedures for unconstrained discrete-time control problems, including those by Bellman and Dreyfus (1962), Mayne (1966), Gershwin and Jacobson (1970), Dyer and McReynolds (1970), Jacobson and Mayne (1970), Yakowitz and Rutherford (1984), Yakowitz (1989), Ohno (1978), and Murray and Yakowitz (1981); these works have also contributed to DDP techniques for constrained optimal control problems.
Groundwater Planning and Management
Published in Mohammad Karamouz, Azadeh Ahmadi, Masih Akhbari, Groundwater Hydrology, 2020
Different successive approximation algorithms, such as differential dynamic programming, discrete differential dynamic programming, and state incremental dynamic programming, were developed to overcome this problem. In applying any of these methods, the user must provide an initial estimate of the optimal policy, which is improved at each step until convergence occurs. However, the limit may be only a local optimum. The survey paper by Yakowitz (1982) summarizes the different versions of DP, with applications to water resources planning and management.
A recursive Riccati interior-point method for chance-constrained stochastic model predictive control
Published in SICE Journal of Control, Measurement, and System Integration, 2023
Jingyu Zhang, Toshiyuki Ohtsuka
Approximate dynamic programming (ADP) methods, such as differential dynamic programming (DDP) [12] and iterative LQR (iLQR) [13], which yield suboptimal solutions to the Bellman equation using a Riccati-based algorithm, have received considerable attention in recent research. To avoid computing the exact solution of the original Bellman equation, ADP decouples the nested optimization: it decomposes a large problem over a control sequence into a recursive series of small single-stage problems, which are then solved one at a time. This approach is computationally efficient compared with collocation methods [14] and shooting methods, because the size of each subproblem is independent of time and the computational complexity grows only linearly with the prediction horizon, which is very attractive for real-time computing. Moreover, ADP is well suited to SMPC because a solution of stochastic dynamic programming (SDP) provides a more direct account of closed-loop control performance than other heuristic formulation methods.
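The stagewise decomposition behind these Riccati-based methods can be illustrated with a generic finite-horizon LQR backward recursion; the system matrices below are placeholders, not taken from [12] or [13]. Each backward step solves one subproblem whose size is fixed by the state and input dimensions, so the total work grows linearly with the horizon N.

```python
import numpy as np

# Hypothetical 2-state, 1-input discrete-time system with quadratic costs.
A = np.array([[1.0, 0.1], [0.0, 1.0]])
B = np.array([[0.0], [0.1]])
Q = np.eye(2)                 # stage state cost
R = np.array([[1.0]])         # stage input cost
N = 50                        # prediction horizon

P = Q.copy()                  # terminal cost-to-go Hessian
gains = []
for _ in range(N):            # backward pass over the horizon
    S = R + B.T @ P @ B       # 1x1 subproblem; size independent of N
    K = np.linalg.solve(S, B.T @ P @ A)       # stage feedback gain
    P = Q + A.T @ P @ A - A.T @ P @ B @ K     # Riccati update
    gains.append(K)
gains.reverse()               # gains[k] is the feedback gain at stage k
```

Doubling N doubles the number of loop iterations but leaves each iteration's cost unchanged, which is the linear-in-horizon property the excerpt highlights.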
Optimal Sewer Network Design for Cities in Hilly Regions
Published in Urban Water Journal, 2023
Juan Saldarriaga, Juana Herrán, Pedro L. Iglesias-Rey
Many of the proposed approaches tried to sequentially solve the two problems of optimal sewer network design. For example, Steele et al. (2016) presented a model formulated as a mixed-integer nonlinear programming problem for the layout selection and used the simulated annealing method for the hydraulic design. Moeini and Afshar (2012, 2017) employed the Tree Growing Algorithm to generate layouts and the Ant Colony Optimization Algorithm combined with nonlinear programming (NLP) for the design of pipe diameters and slopes. Navin and Mathur (2016) used the generation of spanning trees to create a predefined number of layouts, and the modified particle swarm optimization algorithm to find the optimal size of the layouts. Alfaisal and Mays (2021) modelled the optimization problem using integer NLP and solved it with the General Algebraic Modelling System. Li and Matthew (1990) used the searching direction method for layout selection and discrete differential dynamic programming for the hydraulic design. Haghighi and Bakhshipour (2015) applied the Loop-by-Loop Cutting Algorithm to select the layout, followed by tabu search to optimize the cost function of the hydraulic design. Duque et al. (2020) used mixed integer programming (MIP) to find the layout of the network and a shortest path algorithm to find the optimal hydraulic design for the given layout. This methodology has been extended by Saldarriaga et al. (2021), who included topographic criteria in the layout selection, which proved to be effective in finding layouts that resulted in lower-cost hydraulic designs.
Space splitting convexification: a local solution method for nonconvex optimal control problems
Published in International Journal of Control, 2023
Tadeas Sedlacek, Dirk Odenthal, Dirk Wollherr
Differential dynamic programming initially linearises the system equations around a nominal trajectory. Starting from the terminal state, a backward pass identifies the optimal controller gains of a linear quadratic regulator at each time step using the linearised system. Subsequently, a forward pass computes new nominal trajectories via numerical integration using the controller gains previously computed in the backward pass. This procedure is repeated until convergence. Iterative linear quadratic regulators are a variant of differential dynamic programming (Mayne, 1973). The main difference is that only first-order derivatives, rather than second-order ones, are used to approximate the system dynamics via Taylor expansion, which decreases computation time (Tassa et al., 2012). Constrained versions of this optimisation technique have been presented in Tassa et al. (2014), Xie et al. (2017) and Chen et al. (2017).
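The backward/forward-pass structure described above can be sketched as a minimal iLQR-style iteration. The example below uses a hypothetical scalar system x_{k+1} = x_k + Δt·(u_k − x_k³) with quadratic costs; all dynamics, cost weights, and tolerances are illustrative, and standard safeguards such as line search and regularisation are omitted for brevity.

```python
import numpy as np

dt, N = 0.1, 20          # step size and horizon (illustrative)
q, r = 1.0, 0.1          # quadratic state and control cost weights

def f(x, u):             # hypothetical nonlinear dynamics
    return x + dt * (u - x**3)

x0 = 1.0
u = np.zeros(N)                      # initial nominal control trajectory
for it in range(30):
    # Rollout of the nominal trajectory under the current controls.
    x = np.empty(N + 1); x[0] = x0
    for k in range(N):
        x[k + 1] = f(x[k], u[k])
    # Backward pass on the linearised system (first-order derivatives only).
    Vx, Vxx = q * x[N], q            # terminal cost-to-go expansion
    K = np.empty(N); d = np.empty(N)
    for k in reversed(range(N)):
        fx = 1.0 - 3.0 * dt * x[k]**2    # df/dx along the nominal
        fu = dt                          # df/du
        Qx = q * x[k] + fx * Vx
        Qu = r * u[k] + fu * Vx
        Qxx = q + fx * Vxx * fx
        Quu = r + fu * Vxx * fu
        Qux = fu * Vxx * fx
        K[k] = Qux / Quu                 # feedback gain
        d[k] = Qu / Quu                  # feedforward correction
        Vx = Qx - K[k] * Quu * d[k]      # propagate cost-to-go expansion
        Vxx = Qxx - K[k] * Quu * K[k]
    # Forward pass: integrate with feedback around the nominal trajectory.
    x_new = x0; u_new = np.empty(N)
    for k in range(N):
        u_new[k] = u[k] - d[k] - K[k] * (x_new - x[k])
        x_new = f(x_new, u_new[k])
    if np.max(np.abs(u_new - u)) < 1e-6:  # repeat until convergence
        break
    u = u_new
```

Keeping only the first-order terms fx and fu, as in the excerpt's description of iLQR, avoids forming second derivatives of the dynamics; the full DDP backward pass would add those second-order tensor terms to Qxx, Quu, and Qux.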