Section IV. DYNAMICS AND CONTROL

Chapter 15. Stabilizing Controlled Dynamical Systems

In this chapter we return to the question of control for a given dynamical system with equations of motion $\dot{x}= f(x,u)$, with $x\in \mathbb{R}^n$ and $u\in \mathbb{R}^m$. We are now faced with both the descriptive question of whether the system under a given control function $u(x)$ is stable, and the prescriptive question of how to generate a stable closed-loop control. This chapter will provide an overview of classical techniques; in the next chapter we shall discuss model-based and predictive methods.

(It should be noted that this book stays entirely in the domain of state-space models, whereas linear control theory texts extensively use the frequency-domain notation induced by the Laplace transform.)

Control philosophies

There are many techniques available for controlling a dynamic system. The philosophical approaches behind these techniques can be roughly categorized along three axes: prescriptive/descriptive, model free/model based, and myopic/predictive.

Descriptive vs prescriptive

Descriptive approaches assume that a controller has been given, and the goal is to analyze whether the controlled system satisfies some desired properties. Examples of such properties could be stability, disturbance rejection, or some other performance metric. Empirical testing could involve simply simulating or running the system under many operational conditions and observing the results, but the more sophisticated techniques described below provide stronger guarantees.

Prescriptive approaches, on the other hand, attempt to synthesize a control function given some desired properties. In general, synthesis is more difficult than analysis, because any method offering performance guarantees on the synthesized controller must also be able to quantitatively describe the level of performance achieved.

Model-free vs model-based

Model-free controllers operate under no assumptions about the form of $f(x,u)$, besides perhaps general notions of the directions of certain effects, e.g., that increasing $u_1$ will generally increase $x_1$. These controllers exhibit some degree of functionality regardless of how accurately the dynamics function has been determined and are hence easier to apply, at first glance, to a poorly characterized system. The use of feedback is essential to guide the state toward a desired point. Usually, they are also less computationally expensive, and they are almost always myopic rather than predictive. However, they typically require extensive tuning to achieve desired levels of performance.

One simple example is a standard thermostat, which operates under the assumption that turning on the heat will increase the temperature of a room, and turning on air conditioning will decrease temperature. It is not necessary to know the exact rate at which heating / cooling affects temperature.

Model-based controllers, by contrast, use a mathematical or computational model of $f(x,u)$ in order to compute the control $u$. These methods can generally attain superior performance to model-free approaches, but at the cost of the greater difficulty of identifying a model of $f$ and greater computational expense. An example would be an advanced thermostat that measures internal and external temperatures and regulates the rate of heating / cooling to minimize energy consumption, e.g., not heating the room when the external temperature is higher than desired. At an extreme, a feedforward, open-loop controller simply calculates a trajectory of controls $u(t)$ and executes them without the use of state feedback.

Myopic vs predictive

Myopic (aka greedy) methods reason about controls only looking at a single instant in time. At most, a myopic method will examine the direction of motion in state space after the control is chosen at the current point in time. The time evolution of the system into the future is not explicitly considered.

Predictive methods attempt to extrapolate the current state into the future under some hypothesized controls. As a result they must have a model of the dynamics in order to simulate potential solution trajectories. Predictive methods are also typically more computationally expensive than myopic methods, but can obtain much better performance. Usually, these methods have a reconfigurable parameter called the horizon, which specifies the duration of time into the future for which predictions will be made. The choice of horizon induces a tradeoff, with longer horizons resulting in increased performance but also greater computational expense.

PID Control

Proportional-Integral-Derivative (PID) control is the workhorse of 1D control systems. It is the most widely studied class of controllers due to its simplicity, and is certainly the first thing one might try on any given new system. Despite the fact that it is model-free and myopic, it can achieve good performance with some amount of manual tuning.

Definition

We are given a 1D problem with 1D control. Perhaps little is known about the dynamics $f(x,u)$ except that a positive control $u$ generally acts to increase $x$, and a negative control acts to decrease it. Suppose the desired state (the setpoint) is $x_d$.

PID control is a sum of three terms, weighted with gain constants $k_P$, $k_I$, and $k_D$. The first proportional term is defined as: $$u_P = -k_P (x - x_d).$$ The purpose of this term is to act as a "spring" that guides the state toward $x_d$ (Fig. 1).


(a) Normal value of $k_P$. (b) $k_P$ reduced by a factor of 3.
Figure 1. Error plotted as a function of time for a damped system under proportional (P) control. Left: The P term acts like a spring to pull the error toward 0. Right: The magnitude of the gain $k_P$ influences the stiffness of the "spring." Here $k_P$ is reduced by a factor of 3, leading to a slower recovery. Arrows indicate the magnitude of the $P$ term.

The term $x - x_d$ is the error of the current state, and the gain $k_P > 0$ dictates the strength of the spring. Larger values of $k_P$ produce stronger (stiffer) springs, which drive the error toward 0 faster. Stiffer systems can be more precise than softer ones. However, stiffer systems can sometimes:

  • Overshoot a desired setpoint if not tuned correctly.

  • Go unstable when used with a discrete-time controller with large time step.

  • Generate extreme controls, which may cause motor damage, loss of comfort in vehicles, and may be more dangerous to humans and objects in the environment.

If the control consists purely of a proportional term, the system runs a risk of arriving at a nonzero steady state error. A steady state is a value $x$ and control $u$ such that the dynamics are 0, that is, $f(x,u) = 0$. We would hope that $x_d$ is a steady state, but this might not hold when there is a biasing effect in the dynamics, such that $f(x_d,0)\neq 0$. For example, a robot arm holding a load will observe a bias due to the force of gravity. More specifically, suppose a proportional controller $u = -k_P x$ with setpoint 0 is applied to the system $\dot{x} = a x + b + u$ where $a$ and $b$ are nonzero constants. Then, substituting $u$ into the dynamics equation and setting $\dot{x}=0$, we observe that a steady state exists at the value $x = b/(k_P-a)$. If we wish to drive this error toward 0, we could perhaps make $k_P$ very large... but there is a better way!
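To see this concretely, the short simulation below (a sketch with arbitrary illustrative constants $a$, $b$, and $k_P$, not values from the text) integrates $\dot{x} = a x + b - k_P x$ with Euler steps and shows the state settling at $b/(k_P-a)$ rather than at the setpoint 0.

```python
# Illustrative constants for the biased system dx/dt = a*x + b + u (not from the text)
a, b = -1.0, 2.0
k_P = 10.0          # proportional gain; the setpoint is x_d = 0

dt, T = 0.001, 5.0  # Euler integration step and total simulation time
x = 1.0             # initial state
for _ in range(int(T / dt)):
    u = -k_P * x               # P control only
    x += dt * (a * x + b + u)  # Euler step of the dynamics

print("final state:", x)
print("predicted steady state b/(k_P - a):", b / (k_P - a))
```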

The integral term is designed to counteract steady state errors (Fig. 2). It is defined as a function of the integral error $I(t)$ as follows: $$\begin{gathered} u_I = -k_I I(t) \\ I(t) = \int_0^t (x(s)-x_d(s))\, ds \end{gathered}$$ Setting $u = u_P + u_I$ helps handle steady state error because the integral error grows positively if the steady state error is positive, and grows negatively if the steady state error is negative. When steady state error is positive, the value of $u_I$ becomes increasingly negative to the point where the error starts to be pushed toward 0. The same reasoning applies when steady state error is negative.


(a) P control only (b) PI control
Figure 2. Left: When the system exhibits bias, using only a P term leads to a steady-state error. Right: Adding an integral (I) term lets the system compensate for bias. Arrows indicate the magnitude of the $P$ term and $I$ term.

It is strange to see an integral in a control function, and at first glance it would appear to contradict our claim that PID control is simple to implement! However, it is straightforward to approximate an integral at a fixed control frequency using a running total $$I(t) = I(t-\Delta t) + \Delta t (x(t) - x_d(t))$$ where $\Delta t$ is the control period (the reciprocal of the control frequency). So, the controller simply stores the current value of $I(t)$ and updates it each control period.


(a) P control only (b) PD control
Figure 3. A controlled system with second-order dynamics. (a) Using P control leads to oscillations, and the system is unstable or lightly damped. (b) The D term adds a damping effect to reduce oscillations. Arrows indicate the magnitudes of the $P$ and $D$ terms.

The final derivative term is designed to accommodate second-order systems, which oscillate under the P and I terms only unless they are naturally damped (Fig. 3). The D term adds a form of damping which helps stabilize such systems. Moreover, we can specify a desired velocity $\dot{x}_d$ so that the D term can help the system track desired changes faster than the P term alone. The expression for the D term is $$u_D = -k_D (\dot{x} - \dot{x}_d).$$

Summing the three terms, $u = u_P + u_I + u_D$, gives the PID controller.

PID gain tuning

Step change

Overshoot

Frequency response

Stability analysis

Now we wish to perform stability analysis of a given controller in order to determine whether the system in the long run exhibits stability (errors stay bounded), divergence (errors grow), or convergence (errors shrink toward 0). The general method for performing stability analysis is to assume that we have a (simple) system model and substitute the definition of the control into the equations of motion to obtain an ordinary differential equation (ODE).

To introduce this approach, we start by considering a first order LTI system without drift $$\dot{x} = a x + b u$$ and a P controller with setpoint at 0 $$u = -k_P x.$$ Substituting the control into the dynamics, we obtain $$\dot{x} = (a-bk_P) x.$$ This is now a linear ODE, which has solutions of the form $c_0 e^{c_1 t}$. Matching the coefficient of the ODE and the initial condition $x(0)$, we obtain the solution trajectory $$x(t) = x(0) e^{t(a-bk_P)}.$$

The stability of the system is therefore determined entirely by the exponent $a-bk_P$. If $a-bk_P > 0$, then the system will diverge from 0 over time. If $a-bk_P = 0$, then the system will not move, and if $a-bk_P < 0$, then the system will converge to 0 over time. Moreover, the speed of convergence depends on its value, with faster convergence for more strongly negative values. This then suggests that if $b$ is positive (matching our alignment assumption) then faster convergence is achieved by making $k_P$ larger.

Let us now study an LTI system with a drift term $$\dot{x} = a x + b u + c$$ with the PI controller $$u = -k_P x - k_I I.$$ Substituting the control into the dynamics, we obtain $$\dot{x} = (a-bk_P) x - b k_I I + c$$ If we take the time derivative of both sides, we obtain $$\ddot{x} = (a-bk_P) \dot{x} - b k_I x$$ where we have noted that $\dot{I}(t) = x(t)$. It turns out that the solution to this ODE is a damped harmonic oscillator, which has generally five possible solution classes: divergent, undamped, overdamped, underdamped, and critically damped.

To derive these solutions, we again can use a basis of exponential functions of the form $e^{c_1 t}$. This requires solving the characteristic equation $$c_1^2 = (a-bk_P) c_1 - b k_I$$ to obtain the two solutions (to a quadratic equation) $$c_1 = d \pm \sqrt{d^2 - b k_I}$$ with $d=\frac{a-bk_P}{2}$. Let us call the two solutions $c_1^-$ and $c_1^+$. The resulting general solution is: $$x(t) = c_0^- e^{t c_1^-} +c_0^+ e^{t c_1^+}$$ with matching initial conditions $$\begin{gathered} x(0) = c_0^- + c_0^+ \\ \dot{x}(0) = (a-bk_P) x(0) + c = c_0^- c_1^- + c_0^+ c_1^+ \end{gathered}$$ which is a linear system of equations in $c_0^-$ and $c_0^+$: $$\begin{bmatrix}{x(0)}\\{\dot{x}(0)}\end{bmatrix} = \begin{bmatrix}{1}&{1}\\{c_1^-}&{c_1^+}\end{bmatrix} \begin{bmatrix}{c_0^-}\\{c_0^+}\end{bmatrix}.$$ This can be solved using matrix inversion except for the case where $c_1^- = c_1^+$, which we shall ignore for just a moment. First, let us examine whether $x(t)$ converges to 0 as $t \rightarrow \infty$, which requires the exponential terms to converge to 0. This requires that $d = \frac{a-bk_P}{2} < 0$. If, on the other hand, $d > 0$ we have a divergent condition, and if $d=0$ we have an undamped condition.

Next, let us examine the overdamped case in which $b k_I < d^2$, giving the two solutions $c_1^- = d - \sqrt{d^2 - b k_I}$ and $c_1^+ = d + \sqrt{d^2 - b k_I}$. We would need $-d > \sqrt{d^2 - b k_I}$ to prevent $c_1^+$ from going positive, but if we assume $b > 0$ then this will certainly hold for all $d < 0$. Overall, in the overdamped case $x(t)$ is a linear combination of two exponentially decreasing functions with different rates.

Returning to the case where $c_1^- = c_1^+$, the ODE has a different solution trajectory class $x(t) = c_0 \exp(t c_1) + c_0^\prime t \exp(c_1 t)$. If $c_1<0$ then the system converges and is known as critically damped.

The last underdamped case where $d < 0$ and $b k_I > d^2$ is interesting because the term inside the square root becomes negative. In this situation, the two solution coefficients $c_1^-$ and $c_1^+$ become complex numbers with an imaginary component. (This may sound strange at first, but eventually everything will work out!)

To derive this, we use the following equation, which follows from Euler's identity: $$e^{(a + bi)t} = e^{at}(\cos (bt) + i \sin (bt)).$$ Hence, setting $\omega = \sqrt{b k_I - d^2}$ and substituting into the general solution, we obtain $$\begin{aligned} x(t) &= c_0^- e^{t d}(\cos (-\omega t) + i \sin(-\omega t)) + c_0^+ e^{t d} (\cos(\omega t) + i \sin (\omega t)) \\ &= e^{t d} ((c_0^-+c_0^+) \cos (\omega t) + i (c_0^+-c_0^-) \sin (\omega t)) \end{aligned}$$ Since the initial condition has 0 imaginary component, the imaginary component of $c_0^-+c_0^+$ must be 0, while the real component of $c_0^+-c_0^-$ must be 0. Ultimately, matching the initial conditions, we have the solution $$x(t) = e^{t d} \left(x(0) \cos (\omega t) + \frac{\dot{x}(0) - d\, x(0)}{\omega} \sin(\omega t)\right).$$ This oscillatory behavior means that for any nonzero starting state, the oscillation will overshoot the setpoint. Moreover, the frequency of oscillation $\omega^2 = bk_I - (a-bk_P)^2/4$ depends on both gain coefficients and the coefficients of the system. Lower values of $k_I$ will reduce and eventually eliminate oscillation, at the cost of potentially slower recovery from steady state error.
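As a quick numerical check of these solution classes, the sketch below (with illustrative coefficients of our own choosing) computes $d$ and the discriminant $d^2 - bk_I$ for the PI-controlled system and reports which class applies.

```python
import numpy as np

def classify_PI(a, b, k_P, k_I):
    """Classify the closed-loop behavior of dx/dt = a x + b u + c under u = -k_P x - k_I I."""
    d = (a - b * k_P) / 2.0
    disc = d**2 - b * k_I          # discriminant of the characteristic equation
    if d > 0:
        return "divergent"
    if d == 0:
        return "undamped"
    if disc > 0:
        return "overdamped"
    if disc == 0:
        return "critically damped"
    omega = np.sqrt(b * k_I - d**2)  # oscillation frequency in the underdamped case
    return f"underdamped (omega = {omega:.3f})"

# Illustrative coefficients and gains (not from the text)
print(classify_PI(a=0.5, b=1.0, k_P=2.0, k_I=0.1))   # expected: overdamped
print(classify_PI(a=0.5, b=1.0, k_P=2.0, k_I=2.0))   # expected: underdamped
```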

Applying a similar analysis to PD control of a second-order system also leads to a damped harmonic oscillator, and is left as an exercise. The analysis for the PID case is a bit more complex, involving a third-order linear system. We shall see more general methods for solving these systems in Sec. 4.1.

Summary

For a system with measured state $x$ and derivative $\dot{x}$, desired setpoint $x_d$ and derivative $\dot{x}_d$ (optional; can be set to 0), and fixed control frequency $1/\Delta t$, a PID controller calculates the control $u$ using the following update equations: $$\begin{gathered} I \gets I + \Delta t (x - x_d) \\ u \gets -k_P (x - x_d) - k_I I - k_D (\dot{x} - \dot{x}_d). \end{gathered}$$ Here $k_P > 0$, $k_I \geq 0$, and $k_D \geq 0$ are the control gains, which are chosen to be positive under the alignment assumption.
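These update equations translate directly into code. Below is a minimal sketch of a discrete-time PID controller; the class name and interface are our own illustrative choices, not a standard API. An instance stores the running integral $I$ between calls, exactly as described above.

```python
class PID:
    """Minimal discrete-time PID controller implementing the update equations above."""

    def __init__(self, k_P, k_I, k_D, dt):
        self.k_P, self.k_I, self.k_D = k_P, k_I, k_D
        self.dt = dt          # control period (reciprocal of the control frequency)
        self.I = 0.0          # running integral error

    def update(self, x, dx, x_d=0.0, dx_d=0.0):
        """Return the control u for measured state x, derivative dx,
        setpoint x_d, and (optional) desired derivative dx_d."""
        self.I += self.dt * (x - x_d)
        return (-self.k_P * (x - x_d)
                - self.k_I * self.I
                - self.k_D * (dx - dx_d))
```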

Practical issues include:

  • Control saturation

  • Derivative estimation

  • Integral wind-up

  • Finite control frequency

  • Time delay

Control saturation occurs when the control is subject to bounds $u_{min} \leq u \leq u_{max}$, like minimum/maximum torque or minimum/maximum throttle. If the PID control produces a value of $u$ outside the bounds, it should be clamped to produce a valid value. This clamping will slow down convergence and, in some cases, cause instability.

Derivative estimation errors are a problem because derivatives may be approximated using finite differencing: $\dot{x} \approx \frac{x(t)-x(t-\Delta t)}{\Delta t}$. However, since $\Delta t$ is small and in the denominator, the derivative estimate is more sensitive to measurement noise than the position estimate. This causes the $D$ term to vary, which in turn causes the control to track less precisely and in an irregular fashion. To improve derivative estimation accuracy, a low-pass filter is typically applied to the signal $x(t)$.

Integral wind-up occurs when the control cannot be made large enough to counteract a steady-state error, such as when the control is saturated. In this case, $I$ proceeds toward infinity, which then causes delays responding to changes in setpoint. Suppose the system reaches a steady state with positive error due to control limits, causing $I$ to grow large over time. If the setpoint changes to a more reasonable value that can actually be reached, the control still remains dominated by $I$, and the system will not budge until the high positive value of $I$ is "erased" by negative errors $x-x_d$. This problem is typically addressed by capping the magnitude of $I$ to ensure bounded controls and sufficiently fast response.

Finite control frequency can cause instability when gains are stiff in a manner similar to errors in Euler integration. The control computed at one time step is applied for $\Delta t$ time in the continuous dynamics of the system. The controller cannot respond to changes faster than this rate. Hence, if $\Delta t$ is large, the system could be forced to a state with worse error than the previous time step, leading to instability. Similarly, time delays in state measurement can cause the controller to be counteracting errors at a past time, which may no longer exist in reality.
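The sketch below shows one way these practical fixes might be layered onto the basic PID update; the saturation limits, integral cap, and filter constant are illustrative tuning choices of our own, not prescribed values.

```python
import numpy as np

class PracticalPID:
    """PID update with control saturation, an integral cap (anti-windup),
    and a low-pass-filtered finite-difference derivative estimate."""

    def __init__(self, k_P, k_I, k_D, dt, u_min, u_max, I_max, alpha=0.2):
        self.k_P, self.k_I, self.k_D, self.dt = k_P, k_I, k_D, dt
        self.u_min, self.u_max = u_min, u_max
        self.I_max = I_max        # cap on |I| to limit wind-up (tuning choice)
        self.alpha = alpha        # low-pass filter coefficient for the derivative
        self.I = 0.0
        self.prev_x = None
        self.dx_filt = 0.0

    def update(self, x, x_d=0.0, dx_d=0.0):
        # finite-difference derivative, smoothed by an exponential low-pass filter
        dx_raw = 0.0 if self.prev_x is None else (x - self.prev_x) / self.dt
        self.dx_filt += self.alpha * (dx_raw - self.dx_filt)
        self.prev_x = x
        # integral update with an anti-windup cap
        self.I = np.clip(self.I + self.dt * (x - x_d), -self.I_max, self.I_max)
        u = (-self.k_P * (x - x_d) - self.k_I * self.I
             - self.k_D * (self.dx_filt - dx_d))
        return np.clip(u, self.u_min, self.u_max)   # control saturation
```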

Applying PID to coupled systems

An industrial robot can be considered as an $n$-D second-order dynamical system with an $n$-D control, one control per joint. We can observe that each element of the control is aligned with the corresponding element of the state, which means that it is possible to apply a PID controller separately for each joint. This is not a bad approach for stiff, highly-geared industrial robots.

However, as we have seen in the prior chapter, each movement of a joint has inertial effects on the other joints. This is known as coupling in the dynamics. Due to this coupling, robot arms with softer PID gains may end up performing unstable or even chaotic motion with this approach, and the robot will not necessarily track paths in an optimal way. The PID approach may also perform quite poorly for vehicles when the disturbance from the setpoint is large.

For example, consider the simple coupled LTI system described in Fig. 4, with the initial error $(1,0)$. If the system were decoupled, the controller for $u_1$ would certainly dampen out the error. But due to coupling in the system, the error in $x_2$ starts to increase, which starts to also increase the negative velocity in $x_1$, and so on. When $k_D=2$, the controller dampens out the error quickly, only overshooting once. When $k_D=1$, the controller spirals around the origin multiple times but is ultimately stable. When $k_D=0.5$, the controller ends up in an unstable trajectory.



Figure 4. Element-wise PID control applied to a coupled system $\ddot{x}_1 = -0.4 x_2 + u_1$, $\ddot{x}_2 = 0.4 x_1 + u_2$. Trajectories starting from $x=(1,0)$ are shown for $k_P=1$, $k_I=0.5$, and different values of $k_D$.
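A simulation sketch of the experiment in Fig. 4 is given below, applying element-wise PID to the coupled dynamics; the integration scheme, step size, and final time are arbitrary choices of our own.

```python
import numpy as np

def simulate_coupled(k_P=1.0, k_I=0.5, k_D=2.0, dt=0.01, T=60.0):
    """Element-wise PID on the coupled system of Fig. 4:
       x1'' = -0.4 x2 + u1,   x2'' = 0.4 x1 + u2,  starting from x = (1, 0)."""
    x = np.array([1.0, 0.0])
    v = np.zeros(2)
    I = np.zeros(2)
    traj = [x.copy()]
    for _ in range(int(T / dt)):
        I += dt * x                              # integral of the error (setpoint = 0)
        u = -k_P * x - k_I * I - k_D * v         # independent PID per coordinate
        acc = np.array([-0.4 * x[1] + u[0],
                         0.4 * x[0] + u[1]])
        v += dt * acc                            # semi-implicit Euler step
        x = x + dt * v
        traj.append(x.copy())
    return np.array(traj)

# Final error magnitude for each damping gain; compare with the behaviors described above.
for k_D in (2.0, 1.0, 0.5):
    print(k_D, np.linalg.norm(simulate_coupled(k_D=k_D)[-1]))
```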

Simulation analysis

Frequency response

Step change

Monte-Carlo analysis

Analyzing stability and convergence

First we shall describe methods for analyzing the stability of a dynamic system under a given closed loop control. The most well-understood methods for doing so are in linear time-invariant (LTI) systems, in which solution trajectories can be calculated analytically. In nonlinear systems, solution trajectories are harder to come by, and methods for analyzing stability are usually restricted to the local neighborhood of a single point in state space, and often require some art to apply successfully.

Without loss of generality, we will assume that the desired equilibrium point $x^\star$ lies at the origin, $x^\star =0$. If $x^\star \neq 0$, we can easily recenter the problem by defining the dynamics in terms of the error $e = x - x^\star$. Hence $\dot{e} = g(e,u) \equiv f(e+x^\star,u)$ is a dynamical system that is essentially equivalent to the original one, but has an equilibrium point at 0.

Stability in Linear Time Invariant systems

Recall that LTI systems are given by equations of the form $$\dot{x} = Ax + Bu$$ where $A$ and $B$ are constant matrices of size $n \times n$ and $n \times m$, respectively. We shall consider stability under a closed-loop control function of the form $u=-Kx$, where $K$ is an $m \times n$ gain matrix.

As an aside, one might question whether this is the only appropriate form of the control. Note that it is certainly necessary that $u=0$ when $x=0$, because otherwise equilibrium would not hold at 0. Also, any smooth nonlinear control function $u(x)$ can be approximated by its Taylor series at $x=0$, leading to an expression of the form $u(x) = \frac{\partial u}{\partial x}x + O(\|x\|^2)$. In this case, $K$ is the negative of its Jacobian. Hence, the results derived here will apply approximately in a neighborhood of the equilibrium point.

It may not be clear at first glance, but this class of controller actually subsumes PID control! To observe this, for any given second-order single-input single-output system $\ddot{x} = ax+b\dot{x}+cu$ we build an equivalent 1-input system with the 3-D state $y = (I,x,\dot{x})$ such that

$$ \dot{y} = \begin{bmatrix}x\\ \dot{x}\\ \ddot{x} \end{bmatrix} = \begin{bmatrix} 0 & 1 & 0 \\ 0 & 0 & 1 \\ 0 & a & b \end{bmatrix} y + \begin{bmatrix}0\\0\\c\end{bmatrix} u $$

Then, we can define the $1\times 3$ gain matrix $K$ as $$K=\begin{bmatrix}k_I & k_P & k_D\end{bmatrix}$$ so that $u=-Ky$ is the PID control.

With this form of closed loop control, we can write the velocity of the state in terms of a matrix times $x$: $$\dot{x} = Ax - BKx = (A-BK)x.$$ Hence, the matrix $A-BK$ is of primary importance.

The set of solutions to such linear ODEs are of the form: $$x(t) = \sum_{k=1}^n c_k v_k e^{t\lambda_k}$$ where $\lambda_k \in \mathbb{C}$ is an eigenvalue of $A-BK$ and $v_k \in \mathbb{C}^n$ is an eigenvector. The coefficients $c_k \in \mathbb{C}$ are chosen to match the initial condition $x(0)$. The eigenvalues are also known as the poles of the system.

Recall that an eigenvalue / eigenvector pair $(\lambda_k, v_k)$ satisfies the conditions $(A - BK)v_k = \lambda_k v_k$ and $v_k \neq 0$. Eigenvectors corresponding to distinct eigenvalues are linearly independent, and if a subspace of eigenvectors exists for a single eigenvalue, then a basis for this subspace can be determined. An $n \times n$ matrix can have at most $n$ distinct eigenvalues.

To verify the solution, it is a simple matter to see that $$\dot{x}(t) = \sum_{k=1}^n c_k \lambda_k v_k e^{t\lambda_k} = \sum_{k=1}^n c_k (A-BK)v_k e^{t\lambda_k} = (A-BK) x(t)$$ as desired. (If all eigenvalues are distinct, this equation is an expression of all solution trajectories. The process of proving this is a bit more involved, and hence we omit it here.)

Note that because $A-BK$ is not generally a symmetric matrix, all of these quantities may be complex numbers. This is at first glance an odd result, since state space trajectories should not have imaginary components! Ultimately, we shall see that the form of the eigenvalues permits real-valued solution trajectories.

Let us examine a single time-varying basis function $e^{t\lambda}$. Since $\lambda$ is potentially complex, we can write it in terms of its real component $a$ and its imaginary component $b$, i.e., $\lambda = a + b i$. Then, $e^{t\lambda} = e^{ta}e^{itb}$. Using Euler's formula, $e^{ix} = \cos x + i \sin x$, we see that $$e^{t\lambda} = e^{ta}(\cos tb + i \sin tb)$$ The portion of this expression in parentheses oscillates at frequency $b$ and has magnitude 1, so the exponential term $e^{ta}$ is the one that regulates the convergence of the basis function as $t \rightarrow \infty$. The first thing to note is that if $a > 0$, then the basis function is unstable. If $a = 0$, then the basis function oscillates forever. If $a < 0$, then the basis function decays to 0, and moreover the more negative $a$ is, the faster the decay. As a result, we can state the following principle:

A control $u=-Kx$ of an LTI system is convergent if all poles of $A-BK$ have negative real component; is stable if all poles have non-positive real component; and is unstable if any pole has positive real component.

As a result, it is customary to plot the poles of an LTI system as points on the complex plane. Verifying that the system is stable is a matter of checking whether the poles lie on the left half of the complex plane. The speed of convergence is also dictated by the distance from the vertical axis. It is also simple to visually inspect the oscillatory behavior of solutions by examining their imaginary components: as the imaginary magnitude increases, so does the frequency of oscillation.
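The sketch below checks stability numerically for the PID-as-state-feedback construction above: it builds $A$ and $B$ for $\ddot{x} = ax + b\dot{x} + cu$, forms $K = [k_I\ k_P\ k_D]$, and inspects the real parts of the eigenvalues of $A - BK$. The system coefficients and gains are illustrative choices of our own.

```python
import numpy as np

# Illustrative coefficients of the SISO system xdd = a x + b xd + c u (not from the text)
a, b, c = 2.0, -0.5, 1.0
k_I, k_P, k_D = 1.0, 6.0, 3.0

A = np.array([[0.0, 1.0, 0.0],
              [0.0, 0.0, 1.0],
              [0.0,   a,   b]])
B = np.array([[0.0], [0.0], [c]])
K = np.array([[k_I, k_P, k_D]])        # 1x3 gain matrix

poles = np.linalg.eigvals(A - B @ K)   # closed-loop poles
print("poles:", poles)
print("convergent" if np.all(poles.real < 0) else "not convergent")
```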

Returning to the question of complex eigenvalues, it can be shown that for all complex eigenvalues $\lambda_k = a_k + i b_k$, another eigenvalue will be the complex conjugate $\lambda_j = a_k - i b_k$ formed by negating all of the imaginary components. Moreover, if $v_k$ is the eigenvector corresponding to $\lambda_k$, the eigenvector $v_j$ corresponding to $\lambda_j$ will be defined with entries equal to the complex conjugate of every entry of $v_k$.

We can observe that both the real parts and the imaginary parts of these solutions are real-valued solutions to the original ODE. If $$x(t) = Re(x(t)) + i\cdot Im(x(t))$$ is a complex solution trajectory then $$Re(\dot{x}(t)) + i\cdot Im(\dot{x}(t)) = \dot{x}(t) = Ax(t) = A\cdot Re(x(t)) + i A\cdot Im(x(t)).$$ Since $A$ is real, both the real and imaginary terms must match between the left and right hand sides of the equation. That is, $Re(\dot{x}(t)) = A\cdot Re(x(t))$ and $Im(\dot{x}(t)) = A\cdot Im(x(t))$. As a result, if we find complex coefficients to match the real-valued initial condition, then the solution trajectory will be real-valued for all $t$.

Note that LTI systems with a drift term $$\dot{x} = Ax + Bu + c,$$ with $c \neq 0$ a constant vector do not satisfy the requirements for the above analysis to apply directly. Examples of systems with drift include fixed-wing aircraft, satellites in orbit, and robots in a potential field. However, systems with drift can be rewritten as a driftless LTI system in terms of a modified dynamics equation with a state variable $y = (x,1)$ augmented by a constant 1: $$\dot{y} = \tilde{A} y + \tilde{B} u$$ with $$\tilde{A} = \begin{bmatrix} A & c \\ 0_{1\times n} & 0 \end{bmatrix}$$ and $$\tilde{B} = \begin{bmatrix}{B}\\{0_{1\times m}}\end{bmatrix}$$ (using block matrix notation).

These systems may be stable, but will not be convergent under a control of the form $u=-Kx$, because the $(n+1)$'th dimension contributes an eigenvalue of 0. In some systems we may augment the control with a constant offset, $u = -Kx - j$, which may indeed lead to convergence if the offset cancels out the drift term, i.e., satisfies $Bj=c$. However, if the drift term cannot be canceled out by the control, then the system is unable to converge, and the best that can be hoped for is convergence to a periodic trajectory (called an orbit, discussed briefly below).
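For concreteness, the augmentation can be assembled in a few lines of code; the example matrices below are illustrative only.

```python
import numpy as np

# Example LTI system with drift (values are illustrative only)
A = np.array([[0.0, 1.0], [-2.0, -1.0]])
B = np.array([[0.0], [1.0]])
c = np.array([0.5, -0.3])

n, m = A.shape[0], B.shape[1]
A_tilde = np.block([[A, c.reshape(n, 1)],
                    [np.zeros((1, n)), np.zeros((1, 1))]])
B_tilde = np.vstack([B, np.zeros((1, m))])
# A_tilde is (n+1)x(n+1) and B_tilde is (n+1)xm; the augmented state is y = (x, 1)
```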

Stability in nonlinear systems

Unlike the LTI case, stability analysis for nonlinear systems is largely limited to a neighborhood of the equilibrium point. In other words, we can only hope to prove stability / convergence locally; once the state is disturbed out of that neighborhood no stability guarantees can be made.

Linearization

The simplest method for examining the stability of a nonlinear system is to linearize the system about the equilibrium point and apply standard LTI control theory. This approach can be effective for some systems, but due to the use of linearization, it assumes second order effects are negligible. As a result it is only approximate and the approximation worsens as the state deviates from the equilibrium point.

The general method linearizes about the 0 state and the 0 control to determine an LTI system as follows: $$A = \frac{\partial f}{\partial x}(0,0)$$ $$B = \frac{\partial f}{\partial u}(0,0).$$ If the system has drift, $f(0,0) \neq 0$, then a drift vector must also be included in the LTI model $$c = f(0,0).$$ Given a gain matrix $K$, the LTI stability methods above can then be used to determine stability of the system.
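If $f$ is available only as a black-box function, the Jacobians can be estimated by finite differences. Below is a sketch; the step size $h$ and the pendulum-like example dynamics are our own choices.

```python
import numpy as np

def linearize(f, x0, u0, h=1e-6):
    """Finite-difference linearization of xdot = f(x, u) about (x0, u0).
    Returns A = df/dx, B = df/du, and the drift c = f(x0, u0)."""
    n, m = len(x0), len(u0)
    c = f(x0, u0)
    A = np.zeros((n, n))
    B = np.zeros((n, m))
    for i in range(n):
        dx = np.zeros(n); dx[i] = h
        A[:, i] = (f(x0 + dx, u0) - c) / h
    for j in range(m):
        du = np.zeros(m); du[j] = h
        B[:, j] = (f(x0, u0 + du) - c) / h
    return A, B, c

# Example: pendulum-like dynamics (illustrative only)
f = lambda x, u: np.array([x[1], -9.8 * np.sin(x[0]) + u[0]])
A, B, c = linearize(f, np.zeros(2), np.zeros(1))
```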

However, oftentimes the approximation errors introduced by linearization are so severe that this method fails to verify that a control is stable / convergent. As an example, consider the 1D system: $$\dot{x} = xu.$$ The linearization has the $A$ and $B$ matrices identically 0, hence the pole is at 0 and the LTI analysis determines the system to be stable but not convergent. However, if we set $u = -x$, we see that $\dot{x} = -x^2$, which for a positive initial condition has the analytical solution $x(t) = 1/(t+C)$ with $C = 1/x(0)$. Obviously this converges to 0.

Lyapunov stability

Lyapunov's method is the most commonly used method for directly studying the stability of nonlinear systems. The main difficulty of nonlinear systems is that it is challenging to determine a closed-form expression for trajectories given an initial state. The idea behind this method is to prove the existence of a function that 1) is minimal at the equilibrium point, and 2) via the dynamics of the system its value will be monotonically decreasing over time.

First, let us suppose a closed-loop control $u(x)$ is given. We can then rewrite the dynamics of the system only in terms of $x$ as $\dot{x} = g(x) \equiv f(x,u(x))$.

Specifically, suppose we are given a Lyapunov function $V(x):\mathbb{R}^n \rightarrow \mathbb{R}$ with the following properties:

  • $V(0) = 0$.

  • $V(x) > 0$ for $x \neq 0$.

We can then examine how the value of the Lyapunov function behaves over time. Taking the time derivative $\frac{d}{dt} V(x(t))$, we get the relation $$\frac{d}{dt} V(x) = \frac{\partial V(x)}{\partial x} \dot{x} = \frac{\partial V(x)}{\partial x} g(x).$$ Now if we can prove that $\frac{\partial V(x)}{\partial x} g(x) \leq 0$ for all $x$ in the neighborhood of the origin $\|x\| \leq \delta$, then the system is Lyapunov stable. Intuitively, for any starting state in the $\delta$-neighborhood of the equilibrium point, $V$ stays bounded, and hence the system state cannot stray too far from the equilibrium point.

If the inequality is replaced by a strict inequality $\frac{\partial V(x)}{\partial x} g(x) < 0$ for all $x$ in the neighborhood $0 < \| x \| \leq \delta$, then the system is convergent.

The Lyapunov function can be considered to act as a sort of energy function that is minimized at equilibrium. Since energy is never added over time, the system will return to an equilibrium state. This often implies that to design a Lyapunov function it suffices to calculate the total energy of the system (kinetic + potential energy). In general, the design of a suitable Lyapunov function for a given dynamic system is somewhat of an art.
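Although finding a Lyapunov function analytically is an art, checking a candidate numerically over a sampled neighborhood is straightforward. The sketch below (using an illustrative damped-pendulum system and its total energy as the candidate) samples points with $\|x\| \leq \delta$ and verifies the non-strict condition $\frac{\partial V}{\partial x} g(x) \leq 0$.

```python
import numpy as np

# Illustrative closed-loop system: damped pendulum near the bottom equilibrium
g_fun = lambda x: np.array([x[1], -9.8 * np.sin(x[0]) - 0.5 * x[1]])

# Candidate Lyapunov function: total (potential + kinetic) energy, and its gradient
V     = lambda x: 9.8 * (1.0 - np.cos(x[0])) + 0.5 * x[1] ** 2
gradV = lambda x: np.array([9.8 * np.sin(x[0]), x[1]])

delta = 0.5
rng = np.random.default_rng(0)
ok = True
for _ in range(10000):
    x = rng.uniform(-delta, delta, size=2)
    if np.linalg.norm(x) < 1e-6 or np.linalg.norm(x) > delta:
        continue                      # restrict samples to the delta-ball around the origin
    if gradV(x) @ g_fun(x) > 0:       # Lyapunov decrease condition violated
        ok = False
        break
print("V is non-increasing on the sampled neighborhood:", ok)
```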

Orbits and Poincaré maps

In the case of systems with drift, in which drift cannot be cancelled out by choosing a suitable control, convergence to an equilibrium point is impossible. Examples of such systems include fixed-wing aircraft and bipedal walking. However, it may be possible to define a cyclic trajectory with favorable behavior, toward which all other trajectories converge (at least, when the starting point is sufficiently close to the trajectory).

A trajectory that returns to a previously encountered state $x(0)$ will cycle forever. Such a trajectory is known as periodic. An example would be a bipedal walking gait in which the same configurations and velocities are reached repeatedly after every two steps. Another would be a holding pattern that an aircraft can repeat an unlimited number of times (or at least, until fuel runs out).

The image $\{ x \mid x=x(t)\text{ for some }t\} \subset \mathbb{R}^n$ of a periodic trajectory $x(t)$ is known as an orbit. A stable orbit is an orbit such that for all starting points in a neighborhood of the orbit, the resulting trajectory never deviates too far from the orbit. An orbit is convergent if for all starting states in a neighborhood of the orbit, the resulting trajectory converges toward the orbit. A convergent orbit in 2D is known as a limit cycle.

A Poincaré map is a method for studying the convergence of periodic trajectories, typically via simulation. The idea is to define an $(n-1)$-D region in state space (the Poincaré section) through which the trajectory passes once per cycle. The points at which the trajectory passes through the section form a sequence of discrete states $x_1,x_2,x_3,\ldots$. If those states converge to a fixed point, then the trajectory can be concluded to be stable. Often it is not possible to determine the sequence of states analytically, and hence this technique is most often applied with numerical simulation.
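As an illustration of the numerical approach, the sketch below applies a Poincaré section to the Van der Pol oscillator, a standard system with a limit cycle; the section $\{x_1 = 0,\ x_2 > 0\}$ and the integration settings are our own choices.

```python
import numpy as np
from scipy.integrate import solve_ivp

# Van der Pol oscillator: a standard system with a limit cycle (illustrative example)
mu = 1.0
def vdp(t, x):
    return [x[1], mu * (1.0 - x[0] ** 2) * x[1] - x[0]]

# Simulate from an off-cycle initial state and record crossings of the section x1 = 0, x2 > 0
sol = solve_ivp(vdp, (0.0, 100.0), [0.5, 0.0], max_step=0.01)
x1, x2 = sol.y
crossings = []
for k in range(1, len(x1)):
    if x1[k - 1] < 0.0 <= x1[k] and x2[k] > 0.0:   # x1 crosses 0 moving upward
        # linearly interpolate x2 at the exact crossing
        s = -x1[k - 1] / (x1[k] - x1[k - 1])
        crossings.append(x2[k - 1] + s * (x2[k] - x2[k - 1]))

print(np.round(crossings, 3))   # if the sequence settles to a fixed value, the orbit is stable
```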

Further study

This chapter has only briefly touched on the field of control theory. Most universities with engineering schools offer courses on control theory, which provide an in-depth study of linear control systems. Nonlinear control theory will discuss chaotic dynamic systems, Lyapunov functions, and Poincaré maps in much more depth, as well as the control design techniques of feedback linearization and sliding mode control. The most widely used software for developing and analyzing control systems is Matlab Control Toolbox (Mathworks, Inc.).

Summary

  1. Model-free control approaches do not require accurate knowledge of the equations of motion, but require more manual effort to tune. Model-based approaches can achieve higher performance, but are more complex and may require more online computation.

  2. PID control is the workhorse of 1D control. Gain tuning is required to ensure stability and desired performance characteristics.

  3. Applying PID directly to higher-dimensional systems can sometimes work adequately (industrial robots), but performance degrades when individual DOFs are coupled (vehicle control).

  4. Simulation is widely used in the study of control stability and performance in response to disturbances.

  5. LTI systems admit many analytic tools for control design, such as eigendecomposition and pole placement.

  6. Analytical methods for control design in nonlinear systems include linearization, Lyapunov functions, and Poincaré maps.

Exercises

  1. Analyze the conditions necessary for convergence of a second-order driftless LTI system $\ddot{x} = ax + b\dot{x} + cu$ under PD control. What conditions must the controller's gain constants satisfy for the system to be overdamped, underdamped, and critically damped?

  2. Implement a PID controller for the pendulum swing-up system with $m=1 kg$, $L=1 m$, $g = 9.8 m/s^2$ and $u_{max} = 20 N \cdot m$. Using simulation, tune the gains to achieve low settling time, low overshoot, and rare saturation of the control. (Don't forget to use the angular difference rather than simple distance!)

  3. Why is there a relationship between the proportional gain and the control frequency $1/\Delta t$ in a PID controller? Analyze stability for an LTI system and determine the maximum gain for which the system does not go unstable.

  4. Using a simulation of the pendulum swing-up problem, empirically test the stability of a PID controller when the pendulum is near the vertical position $\theta \approx \pi/2$. Investigate how far the initial state can deviate from vertical, with the parameters $m=1 kg$, $L=1 m$, $g=9.8 m/s^2$, $k_P=10$, $k_D=2$, and $u_{max} = 5 N \cdot m$.