A Real-Time Gradient Method for Nonlinear Model Predictive Control

Model predictive control (MPC) is a modern control scheme that relies on the solution of an optimal control problem (OCP) on a receding horizon. MPC schemes have been developed in various formulations (regarding continuous/discrete-time systems, finite/infinite horizon length, terminal set/equality constraints, etc.). Comprehensive overviews and references on MPC can, for instance, be found in Diehl et al. (2009); Grune & Pannek (2011); Kothare & Morari (2000); Mayne et al. (2000); Rawlings & Mayne (2009).

shown that exponential stability of the closed-loop system as well as exponential decay of the suboptimality can be guaranteed if the number of iterations per sampling step satisfies al o w e rb o u n d . The decay of the suboptimality also illustrates the incremental improvement of the MPC scheme.
Based on these theoretical considerations , this chapter presents a real-time MPC scheme that relies on the gradient method in optimal control (Dunn, 1996;Nikol'skii, 2007). This algorithm is particularly suited for a real-time implementation, as it takes full advantage of the MPC formulation without terminal constraints. In addition, the gradient method allows for a memory and time efficient computation of the single iterations, which is of importance in order to employ the MPC scheme for fast dynamical systems.
In this chapter, the gradient-based MPC algorithm is described for continuous-time nonlinear systems subject to control constraints. Starting from the general formulation of the MPC problem, the stability properties in the optimal MPC case are summarized before the suboptimal MPC strategy is discussed. As a starting point for the derivation of the gradient method, the necessary optimality conditions for the underlying OCP formulation without terminal constraints are derived from Pontryagin's Maximum Principle. Based on the optimality conditions, the gradient algorithm is described and its particular implementation within a real-time MPC scheme is detailed. The algorithm as well as its properties and incremental improvement in the MPC scheme are numerically investigated for the double pendulum on a cart, which is a benchmark in nonlinear control. The simulation results as well as the CPU time requirements reveal the efficiency of the gradient-based MPC scheme.

MPC formulation
We consider a nonlinear continuous-time system of the forṁ with the state x ∈ R n and the control u ∈ R m subject to the control constraints Without loss of generality, we assume that the origin is an equilibrium of the system (1)with f (0, 0)=0. Moreover, the system function f is supposed to be continuously differentiable in its arguments. This section summarizes the MPC formulation as well as basic assumptions and basic results for the stability of the MPC scheme in closed-loop.

Optimal control problem
For stabilizing the origin of the system (1), an MPC scheme based on the following optimal control problem (OCP) is used min u∈U [0,T] J(x k ,ū)=V(x(T)) + The initial condition x(t k )=x k in (3b) denotes the measured (or observed) state of the system (1)a tt i m et k = t 0 + k∆t with the sampling time ∆t. The bared variablesx(τ), u(τ) represent internal variables of the controller with the MPC prediction time coordinate τ ∈ [0, T] and the horizon length T ≥ ∆t.
The integral and the terminal cost functions in (3a) are assumed to be continuously differentiable and to satisfy the quadratic bounds To obtain a stabilizing MPC feedback law on the sampling interval [t k , t k+1 ),thefirstpartof the optimal controlū * k (τ) is used as control input for the system (1) which can be interpreted as a nonlinear "sampled" control law with κ(0; x k )=0. In the next MPC step at time t k+1 ,O C P( 3) is solved again with the new initial condition x k+1 .I nt h e absence of model errors and disturbances, the next point x k+1 is given by x k+1 =x * k (∆t) and the closed-loop trajectories are

Domain of attraction and stability
The following lines summarize important results for the "optimal" MPC case without terminal constraints, i.e. when the optimal solution (6)o fO C P( 3) is assumed to be known in each sampling step. These results are the basis for the suboptimal MPC case treated in Section 3. Some basic assumptions are necessary to proceed: For every x 0 ∈ R n and u ∈U [0,T] , the system (1) has a bounded solution over [0, T].
Assumption 2. OCP (3) has an optimal solution (6) for all x k ∈ R n .
Since u is constrained, Assumption 1 is always satisfied for systems without finite escape time. Moreover, note that the existence of a solution of OCP (3) in Assumption 2 is not very restrictive as no terminal constraints are considered and all functions are assumed to be continuously differentiable. 1 .

Will-be-set-by-IN-TECH
An MPC formulation without terminal constraints has been subject of research by several authors, see for instance ; Ito & Kunisch (2002); Jadbabaie et al. (2001); Limon et al. (2006); Parisini & Zoppoli (1995). Instead of imposing a terminal constraint, it is often assumed that the terminal cost V represents a (local) Control Lyapunov Function (CLF) on an invariant set S β containing the origin.

Assumption 3. There exists a compact non-empty set S
There exist several approaches in the literature for constructing a CLF as terminal cost, for instance Chen & Allgöwer (1998);Primbs (1999). In particular, V(x) can be designed as a quadratic function V(x)=x T Px with the symmetric and positive definite matrix P following from a Lyapunov or Riccati equation provided that the linearization of the system (1)a b o u t the origin is stabilizable.
An important requirement for the stability of an MPC scheme without terminal constraints is to ensure that the endpoint of the optimal state trajectoryx * k (T) reaches the CLF region S β . The following theorem states this property more clearly and relates it to the overall stability of the (optimal) MPC scheme.

The optimal cost satisfies
4. The origin of the system (1) under the optimal MPC law (7) is asymptotically stable in the sense that the closed-loop trajectories (8) satisfy lim t→∞ ||x(t)|| = 0.
The single statements 1-4 in Theorem 1 are discussed in the following: 1. The sublevel set Γ α defines the domain of attraction for the MPC scheme without terminal constraints Limon et al., 2006). The proof of this statement is given in Appendix A.
2. Although α in (10) leads to a rather conservative estimate of Γ α due to the nature of the proof (see Appendix A), it nevertheless reveals that Γ α can be enlarged by increasing the horizon length T. 3. The decrease condition (11) for the optimal cost at the next point x k+1 =x * k (∆t) follows from the CLF property (9)o nt h es e tS β (Jadbabaie et al., 2001). Indeed, consider the trajectorieŝ is the state trajectory that results from applying the local CLF lawū q (τ)=q(x q (τ)).N o t et h a tx q (τ) ∈ S β for all τ ≥ 0, i.e. S β ist positive invariant due to the definition of S β and the CLF inequality (9) that can be expressed in the form Hence, the following estimates hold 4. Based on (11), Barbalat's Lemma allows one to conclude that the closed-loop trajectories (8) satisfy lim t→∞ ||x(t)|| = 0, see e.g. Chen & Allgöwer (1998);Fontes (2001). Note that this property is weaker than asymptotic stability in the sense of Lyapunov, which can be proved if the optimal cost J * (x k ) is continuously differentiable (Findeisen, 2006;Fontes et al., 2007).

Suboptimal MPC for real-time feasibility
In practice, the exact solution of the receding horizon optimal control problem is typically approximated by a sufficiently accurate numerical solution of a suitable optimization algorithm. If the sampling time ∆t is large enough, this numerical approximation will be sufficiently close to the optimal MPC case considered in the last section. However, for large-scale systems or highly dynamical systems, an accurate near-optimal solution often cannot be determined fast enough. This problem, often encountered in practice, gives rise to suboptimal MPC strategies, where an approximate solution is computed in each sampling step. This section develops the necessary changes and differences to the ideal case due to an incremental solution of the underlying OCP solution for a class of optimization algorithms.

Suboptimal solution strategy
Several suboptimal MPC strategies were already mentioned in the introduction DeHaan & Guay, 2007;Diehl et al., 2005;Lee et al., 2002;Michalska & Mayne, 1993; S c o k a e r te ta l . , 1999). Moreover, a suboptimal MPC scheme without terminal constraints -as considered in this chapter -was investigated in .

www.intechopen.com
Instead of relying on one particular optimization method, it is assumed in  that an optimization algorithm exists that computes a control and state trajectorȳ in each iteration j while satisfying a linear rate of convergence with a convergence rate p ∈ (0, 1) and the limit lim j→∞ J(x k ,ū In the spirit of a real-time feasible MPC implementation, the optimization algorithm is stopped after a fixed number of iterations, j = N, and the first part of the suboptimal control trajectoryū to the system (1). In the absence of model errors and disturbances the next point x k+1 is given by x k+1 =x (N) k (∆t) and the closed-loop trajectories are Compared to the "optimal" MPC case, where the optimal trajectories (6) are computed in each MPC step k, the trajectories (14) are suboptimal, which can be characterized by the optimization error In the next MPC step, the last controlū where the last part ofū (0) k+1 is determined by the local CLF feedback law. The goal of the suboptimal MPC strategy therefore is to successively reduce the optimization error ∆J (N) (x k ) in order to improve the MPC scheme in terms of optimality. Figure 1 illustrates this context.

Stability and incremental improvement
Several further assumptions are necessary to investigate the stability and the evolution of the optimization error for the suboptimal MPC scheme. (7) is locally Lipschitz continuous.  Fig. 1. Illustration of the suboptimal MPC implementation.

Assumption 4. The optimal control law in
Assumption 6 is always satisfied for linear systems with quadratic cost functional as proved in Appendix B. In general, the quadratic growth property in Assumption 6 represents a smoothness assumption which, however, is weaker than assuming strong convexity (it is well known that strong convexity on a compact set implies quadratic growth, see, e.g., Allaire (2007) and Appendix B). 2 The stability analysis for the suboptimal MPC case is more involved than in the "optimal" MPC case due to the non-vanishing optimization error ∆J (N) (x k ). An important question in this context is under which conditions the CLF region S β can be reached by the suboptimal state trajectoryx The following theorem addresses this question and also gives sufficient conditions for the stability of the suboptimal MPC scheme.
Theorem 2 (Stability of MPC scheme -suboptimal case). Suppose that Assumptions 1-6 are satisfied and consider the subset of the domain (10) Then, there exists a minimum number of iterationsN ≥ 1 and a maximum admissible optimization error ∆Ĵ ≥ 0, such that for all x 0 ∈ Γα and all initial control trajectoriesū The proof of Theorem 2 consists of several intermediate lemmas and steps that are given in details in . The statements 1-4 in Theorem 2 summarize several important points that deserve some comments.
is not convex on this interval.

A Real-Time Gradient Method for Nonlinear Model Predictive Control
www.intechopen.com 1. The reduced size of Γα compared to Γ α is the necessary "safety" margin to account for the suboptimality of the trajectories (6) characterized by ∆J (N) (x k ). Thus, the domain of attraction Γα together with an admissible upper bound on the optimization error guarantees the reachability of the CLF region S β .
2. An interesting fact is that it can still be guaranteed that Γα is at least as large as the CLF region S β provided that the horizon time T satisfies a lower bound that depends on the quadratic estimates (5) of the integral and terminal cost functions. It is apparent from the bound T ≥ ( 4M V m V − 1) M V m l that the more dominant the terminal cost V(x) is with respect to the integral cost function l(x, u), the larger this bound on the horizon length T will be.
3. The minimum number of iterationsN for which stability can be guaranteed ensuresroughly speaking -that the numerical speed of convergence is faster than the system dynamics. In the proof of the theorem , the existence of the lower boundN is shown by means of Lipschitz estimates, which usually are too conservative to be used for design purposes. For many practical problems, however, one or two iterations per MPC step are sufficient to ensure stability and a good control performance.
4. The exponential reduction of the optimization error ∆J (N) (x k ) follows as part of the proof of stability and reveals the incremental improvement of the suboptimal MPC scheme over the MPC runtime.

Gradient projection method
The efficient numerical implementation of the MPC scheme is of importance to guarantee the real-time feasibility for fast dynamical systems. This section describes the well-known gradient projection in optimal control as well as its suboptimal implementation in the context of MPC.

Optimality conditions and algorithm
The MPC formulation without terminal constraints has particular advantages for the structure of the optimality conditions of the OCP (3). To this end, we define the Hamiltonian with the adjoint state λ ∈ R n . Pontryagin's Maximum Principle 3 states that ifū * k (τ), τ ∈ [0, T] is an optimal control for OCP (3), then there exists an adjoint trajectoryλ * andū * k (τ) minimizes the Hamiltonian for all times τ ∈ [0, T],i.e.
A Real-Time Gradient Method for Nonlinear Model Predictive Control 9 The functions H x and V x denote the partial derivatives of H and V with respect to x.T h e minimization condition (24) also allows one to conclude that the partial derivative H u = [H u,1 ,...,H u,m ] T of the Hamiltonian with respect to the control u =[u 1 ,...,u m ] T has to satisfy The adjoint dynamics in (23) possess n terminal conditions which is due to the free endpoint formulation of OCP (3). This property is taken advantage of by the gradient method, which solves the canonical BVP (22)-(23) iteratively forward and backward in time. Table 1 summarizes the algorithm of the gradient (projection) method.  -Set convergence tolerance ε J (e.g. ε J = 10 −6 ) -Choose initial control trajectoryū

The search directions
2) Gradient step: While j ≤ N Do -Integrate backward in timėλ -Compute the search direction -Compute the step size α (j) k by (approximately) solving the line search problem α -Compute the new control trajectorȳ -Integrate forward in timė Otherwise set j ← j + 1 and return to 2).  (28) represents a projection function of the form which guarantess the adherence of the input constraints [u − , u + ]. For the real-time implementation within a suboptimal MPC scheme, the line search problem (28)canbesolved in an approximate manner (see Section 4.2). Finally, the control trajectoryū follows from evaluating (29)withs (j) k (τ) and the step size α (j) k . The convergence properties of the gradient (projection) method are investigated, for instance, in Dunn (1996); Leese (1977);Nikol'skii (2007). In particular, Dunn (1996)p r o v e du n d e r certain convexity and regularity assumptions that the gradient method exhibits a linear rate of convergence of the form (15).

Adaptive line search
The line search (28) represents a scalar optimization problem that is often solved approximately. The most straightforward way is to use a fixed step size α throughout all gradient iterations. This, however, usually leads to a slow rate of convergence.
An attractive alternative to a constant step size is to use a polynomial approximation with an underlying interval adaptation. To this end, the cost functional J x k , ψ(ū (28) is evaluated at three sample points that are used to construct a quadratic polynomial approximation g(α) of the form The coefficients c 0 , c 1 , c 2 are obtained by solving the set of equations with the explicit solution

A Real-Time Gradient Method for Nonlinear Model Predictive Control 11
If c 2 > 0, then the polynomial g(α) has a minimum at the point If in additionα lies inside the interval [α 1 , α 3 ],t h e nα = α (j) k approximately solves the line search problem (28). Otherwise, α (j) k is set to one of the interval bounds α 1 or α 3 .I n t h i s case, the interval [α 1 , α 3 ] can be adapted by a scaling factor to track the minimum point of the line search problem (28) over the single gradient iterations. Table 2 summarizes the overall algorithm for the approximate line search and the interval adaptation.
In general, the gradient method in Table 1 is stopped if the convergence criterion is fulfilled for some tolerance ε J > 0. In practice this can lead to a large number of iterations that moreover varies from one MPC iteration to the next. In order to ensure a real-time feasible MPC implementation, the gradient algorithm is stopped after N iterations and the re-initialization of the algorithm is done as outlined in Section 3.1.
-Adapt the line search interval [α 1 , α 3 ] for the next gradient iteration according to

Example -Inverted double pendulum
The inverted double pendulum on a cart is a benchmark problem in control theory due to its highly nonlinear and nonminimum-phase dynamics and its instability in the upward (inverted) position. The double pendulum in Figure 2 consists of two links with the lengths l i and the angles φ i , i = 1, 2 to the vertical direction. The displacement of the cart is given by x c . The mechanical parameters are listed in Table 3 together with their corresponding values (Graichen et al., 2007). The double pendulum is used in this section as benchmark example for the suboptimal MPC scheme and the gradient algorithm in order to show its performance for a real-time MPC implementation.

Equations of motion and MPC formulation
Applying the Lagrangian formalism to the double pendulum leads to the equations of motion (Graichen et al., 2007) with the generalized coordinates φ =[φ 1 , φ 2 ] T and the functions The acceleration of the cartẍ c serves as control input u. Thus, the overall model of the double pendulum can be written as the second-order ordinary differential equations (ODE) The acceleration of the cart is limited by the constraints  Table 3. Mechanical parameters of the double pendulum in Figure 2.
The CLF condition in Assumption 3 is approximately satisfied by solving the Riccati equation where A = to account for the fast dynamics of the double pendulum and the highly unstable behavior in the inverted position.

Simulation results
The suboptimal MPC scheme together with the gradient method were implemented as Cmex functions under MATLAB. The functions that are required in the gradient method are  The considered simulation scenario consists of an initial error around the origin (x SP = 0, u SP = 0) and a subsequent setpoint step of 1 m in the cart position at time t = 2s( x SP = [1 m,0,0,0,0,0] T , u SP = 0). Figure 3 shows the simulation results for a two-stage scenario (initial error and setpoint change at t = 2 s). Already the case of one gradient iteration per sampling step (N = 1) leads to a good control performance and a robust stabilization of the double pendulum. Increasing N results in a more aggressive control behavior and a better exploitation of the control constraints (43).
The lower plots in Figure 3 show the (discrete-time) profiles of the cost value J(x k ,ū (N) k ) and of the optimization error ∆J (N) . In order to determine ∆J (N) (x k ), the optimal cost J * (x k ) was computed in each step x k by solving the OCP (3) for the double pendulum with a collocation-based optimization software. It is apparent from the respective plots in Figure 3 that the cost as well as the optimization error rapidly converge to zero which illustrates the exponential stability of the double pendulum in closed-loop and the incremental improvement of the algorithm. It is also seen in these plots that the performance improvement between N = 1andN = 10 iterations per sampling step are comparatively small compared to the increase of numerical load.
To investigate this point more precisely, Table 4 lists the required CPU time for different MPC settings. The computations were performed on a computer with an Intel i7 CPU (M620, 2.67 GHz) 5 , 4 GB of memory, and the operating system MS Windows 7 (64 bit). The overall MPC scheme compiled as Cmex function under MATLAB 2010b (64 bit). All evaluated tests in Table 4 show that the required CPU times are well below the actual sampling time of ∆t = 1 ms. The CPU times are particularly remarkable in view of the high complexity of the nonlinear pendulum model (40)- (42), which illustrates the real-time feasibility of the suboptimal MPC scheme. The last column in Table 4 shows the average cost value that is obtained by integrating the cost profiles in Figure 3 and dividing through the simulation time of 5 s. This index indicates that the MPC scheme increases in terms of control performance for larger numbers of N.
From these numbers and the simulation profiles in Figure 3, the conclusion can be drawn that N = 2 gradient iterations per MPC step represents a good compromise between control performance and the low computational demand of approximately 100 µsperMPCstep.

Conclusions
Suboptimal solution strategies are efficient means to reduce the computational load for a real-time MPC implementation. The suboptimal solution from the previous MPC step is used for a warm-start of the optimization algorithm in the next run with the objective to reduce the suboptimality over the single MPC steps. Section 3 provides theoretical justifications for a suboptimal MPC scheme with a fixed number of iterations per sampling step.
A suitable optimization algorithm is the gradient method in optimal control, which allows for a time and memory efficient calculation of the single MPC iterations and makes the overall MPC scheme suitable for very fast or high dimensional dynamical systems. The control performance and computational efficiency of the gradient method is illustrated in Section 5 for a highly nonlinear and complex model of a double pendulum on a cart. The suboptimal MPC scheme based on a real-time implementation of the gradient method was also experimentally validated for a laboratory crane  and for a helicopter with three degrees-of-freedom (Graichen et al., 2009), both experiments with sampling times of 1-2 milliseconds. The applicability of the gradient-based MPC scheme to high dimensional systems is demonstrated in (Steinböck et al., 2011) for a reheating furnace in steel industry.
The same simplifications can be used for the state-dependent terms in (53) since the linear dynamics (54) ensures that the superposition of two input signals w(t)= 1 2 u(t)+ 1 2 v(t) yield a corresponding superposed state response x w (t)= 1 2 x u (t)+ 1 2 x v (t) with x w (0)=x 0 .H en c e, the right-hand side of (55) can be written as with ∆x(t)=x u (t) − x v (t) and the constant C = λ min (R)/2. Since J(u) is strongly (and therefore also strictly) convex on the convex set U [0,T] , it follows from standard arguments (Allaire, 2007) that there exists a global and unique minimum point u * ∈U [0,T] . Moreover, since U [0,T] is convex, 1 2 (u + u * ) ∈U [0,T] for all u ∈U [0,T] such that J( 1 2 u + 1 2 u * ) ≥ J(u * ). Hence, the strong convexity inequality (55) can be turned into the quadratic growth property This shows that Assumption 6 is indeed satisfied for linear-quadratic OCPs of the form (53).

Acknowledgements
This work was supported by the Austrian Science Fund under project no. P21253-N22. Model Predictive Control (MPC) usually refers to a class of control algorithms in which a dynamic process model is used to predict and optimize process performance, but it is can also be seen as a term denoting a natural control strategy that matches the human thought form most closely. Half a century after its birth, it has been widely accepted in many engineering fields and has brought much benefit to us. The purpose of the book is to show the recent advancements of MPC to the readers, both in theory and in engineering. The idea was to offer guidance to researchers and engineers who are interested in the frontiers of MPC. The examples provided in the first part of this exciting collection will help you comprehend some typical boundaries in theoretical research of MPC. In the second part of the book, some excellent applications of MPC in modern engineering field are presented. With the rapid development of modeling and computational technology, we believe that MPC will remain as energetic in the future.

How to reference
In order to correctly reference this scholarly work, feel free to copy and paste the following: Knut Graichen and Bartosz Käpernick (2012 Available from: http://www.intechopen.com/books/frontiers-of-model-predictive-control/a-real-time-gradientmethod-for-nonlinear-model-predictive-control