From Coursera, State Estimation and Localization for Self-Driving Cars by University of Toronto
https://www.coursera.org/learn/state-estimation-localization-self-driving-cars
Linear and Nonlinear Kalman Filters
The (Linear) Kalman Filter
- The Kalman Filter requires the following motion and measurement models:
- Motion model: $x_k=F_{k-1}x_{k-1}+G_{k-1}u_{k-1} + w_{k-1}$
- Measurement model: $y_k = H_kx_k+v_k$
- with the following noise properties: $$v_k \sim \mathcal{N}(0,R_k) \quad w_k \sim \mathcal{N}(0,Q_k)$$
- $v_k$ is the measurement noise, $w_k$ is the process/motion noise
The Kalman filter can be regarded as a recursive least squares estimator that also includes a motion model.
It has two stages: the prediction and the correction.
- Prediction: $$\check x_k = F_{k-1}\hat x_{k-1}+G_{k-1}u_{k-1}$$ $$\check P_k = F_{k-1}\hat P_{k-1}F^T_{k-1} + Q_{k-1}$$
- Optimal gain: $$K_k = \check P_k H^T_k(H_k\check P_k H^T_k + R_k)^{-1}$$
- Correction: $$\hat x_k = \check x_k + K_k(y_k-H_k\check x_k)$$ $$\hat P_k = (I-K_kH_k)\check P_k$$ — see the sketch after this list.
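To make the two stages concrete, here is a minimal NumPy sketch of one predict/correct cycle. The constant-velocity model and all numeric values below are illustrative assumptions, not values from the course:

```python
import numpy as np

# Illustrative 1D constant-velocity model (assumed values, not from the course).
dt = 0.1
F = np.array([[1.0, dt], [0.0, 1.0]])   # state transition (position, velocity)
G = np.array([[0.5 * dt**2], [dt]])     # control input matrix (acceleration input)
H = np.array([[1.0, 0.0]])              # we measure position only
Q = 0.01 * np.eye(2)                    # process noise covariance
R = np.array([[0.05]])                  # measurement noise covariance

def kf_step(x_hat, P_hat, u, y):
    """One Kalman filter cycle: predict with the motion model, correct with y."""
    # Prediction: x_check = F x_hat + G u,  P_check = F P_hat F^T + Q
    x_check = F @ x_hat + G @ u
    P_check = F @ P_hat @ F.T + Q
    # Optimal gain: K = P_check H^T (H P_check H^T + R)^{-1}
    K = P_check @ H.T @ np.linalg.inv(H @ P_check @ H.T + R)
    # Correction: x_hat = x_check + K (y - H x_check),  P_hat = (I - K H) P_check
    x_hat_new = x_check + K @ (y - H @ x_check)
    P_hat_new = (np.eye(2) - K @ H) @ P_check
    return x_hat_new, P_hat_new

x_hat = np.array([[0.0], [0.0]])   # initial state estimate
P_hat = np.eye(2)                  # initial covariance
x_hat, P_hat = kf_step(x_hat, P_hat, u=np.array([[0.1]]), y=np.array([[0.2]]))
```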
Summary:
- The Kalman Filter is very similar to RLS but includes a motion model that tells us how the state evolves over time
- The Kalman Filter updates a state estimate through two stages:
- prediction using the motion model
- correction using the measurement model
Kalman Filter and The Bias BLUEs
Bias
- Bias = $E[\hat p_k] - p_k$, where $p_k$ is the true position
- A filter is unbiased if for all k: $E[\hat e_k] = E[\hat p_k -p_k] = E[\hat p_k] - p_k = 0$
- How to compute bias analytically?
- consider the error dynamics:
- predicted state error: $\check e_k = \check x_k - x_k$
- corrected estimate error: $\hat e_k = \hat x_k - x_k$
- using the Kalman Filter equations we can derive: $$\check e_k = F_{k-1}\hat e_{k-1} - w_{k-1}$$ $$\hat e_k = (I-K_kH_k)\check e_k + K_kv_k$$
- For the Kalman filter, for all k: $$E[\check e_k] = E[F_{k-1}\hat e_{k-1} - w_{k-1}] = F_{k-1} E[\hat e_{k-1}] - E[w_{k-1}] = 0$$ $$E[\hat e_k] = E[(I-K_kH_k)\check e_k +K_kv_k] = (I-K_kH_k)E[\check e_k]+K_kE[v_k]=0 $$
- unbiased predictions and corrections!
- so long as $E[\hat e_0]=0$ and the noise is white, uncorrelated, and zero-mean ($E[v]=0$, $E[w]=0$)
Consistency
- The filter is consistent if for all k: $E[\hat e^2_k]=E[(\hat p_k-p_k)^2]=\hat P_k$
- that is to say: for all time steps k, the filter covariance $\hat P_k$ matches the expected value of the square of our error.
- Practically, this means that our filter is neither overconfident nor underconfident in the estimate it has produced:
- A filter that is overconfident, and hence inconsistent, will report a covariance that is optimistic: the filter will essentially place too much emphasis on its own estimate and will be less sensitive to future measurement updates.
- So long as the initial estimate is consistent and we have white zero-mean noise, then all the estimates will be consistent (see the Monte Carlo sketch below)
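One way to sanity-check consistency empirically is a Monte Carlo experiment: run the filter over many simulated trials and compare the sample mean of the squared error against the variance the filter reports. A minimal sketch, assuming a scalar random-walk system (all values illustrative):

```python
import numpy as np

rng = np.random.default_rng(0)
Q, R = 0.1, 0.5                       # assumed process/measurement noise variances
n_trials, n_steps = 5000, 20
sq_errors = np.zeros(n_trials)

for t in range(n_trials):
    x_hat, P = 0.0, 1.0               # filter state; the initialization is consistent
    x = rng.normal(x_hat, np.sqrt(P)) # draw the true initial state from N(x_hat, P)
    for k in range(n_steps):
        x = x + rng.normal(0.0, np.sqrt(Q))   # true random walk: x_k = x_{k-1} + w
        y = x + rng.normal(0.0, np.sqrt(R))   # measurement: y_k = x_k + v
        x_check, P_check = x_hat, P + Q       # predict (F = 1, no input)
        K = P_check / (P_check + R)           # Kalman gain
        x_hat = x_check + K * (y - x_check)   # correct
        P = (1.0 - K) * P_check
    sq_errors[t] = (x_hat - x) ** 2

# Consistency: the sample mean of e^2 should match the filter's reported P
# (both come out around 0.18 here).
print(np.mean(sq_errors), P)
```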
The Kalman Filter is the Best Linear Unbiased Estimator (BLUE)
- Given white, uncorrelated, zero-mean noise, the Kalman Filter is unbiased and consistent.
- In general, if we have white, uncorrelated, zero-mean noise, the Kalman Filter is the best (i.e., lowest-variance) unbiased estimator that uses only a linear combination of measurements
The Extended Kalman Filter
EKF: Uses first-order linearization to turn a nonlinear problem into a linear one
- As we introduced, the Kalman filter is actually the best of all possible estimators for linear systems
- However, linear systems don’t exist in reality.
- Therefore, we need a kind of Kalman filter that can be applied to nonlinear systems: the Extended Kalman Filter (EKF)
- The key concept in the Extended Kalman Filter is the idea of linearizing a nonlinear system
Role of Jacobian matrices in the EKF and how to compute them
Linearizing a nonlinear system
- Linearization means choosing an operating point $a$ and finding a linear approximation to the nonlinear function in the neighborhood of $a$
- for a one-dimensional function, this means finding the tangent line to $f(x)$ at $x = a$.
- Mathematically, we do this by taking the Taylor series expansion of the function: $$f(x)\approx f(a) + \left.\frac{\partial f(x)}{\partial x}\right|_{x=a}(x-a) + \frac{1}{2!}\left.\frac{\partial^2 f(x)}{\partial x^2}\right|_{x=a}(x-a)^2 + \dots$$
- For linearization, we’re only interested in the first order terms of the Taylor series expansion.
Pick the most recent state estimate as the operating point:
- we can get the linearized motion and measurement models
- these linearized models involve Jacobian matrices:
- In vector calculus, a Jacobian matrix is the matrix of all first-order partial derivatives of a vector-valued function
- Intuitively, the Jacobian matrix tells you how fast each output of the function is changing along each input dimension (see the sketch after this list)
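For instance, for a hypothetical range-bearing measurement of a 2D position (an assumed example, not course code), the Jacobian can be derived by hand and spot-checked with finite differences:

```python
import numpy as np

def h(x):
    """Hypothetical measurement model: range and bearing to a target at x = [px, py]."""
    px, py = x
    return np.array([np.hypot(px, py), np.arctan2(py, px)])

def H_jacobian(x):
    """Analytic Jacobian of h: each row is the gradient of one output w.r.t. [px, py]."""
    px, py = x
    r2 = px**2 + py**2
    r = np.sqrt(r2)
    return np.array([[px / r,   py / r],
                     [-py / r2, px / r2]])

# Spot-check against a numerical (central finite-difference) Jacobian.
x0 = np.array([3.0, 4.0])
eps = 1e-6
H_num = np.column_stack([(h(x0 + eps * e) - h(x0 - eps * e)) / (2 * eps)
                         for e in np.eye(2)])
print(np.allclose(H_jacobian(x0), H_num, atol=1e-6))   # True
```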
Apply the EKF to a simple nonlinear tracking problem
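A minimal sketch of one EKF predict/correct cycle for such a problem, assuming a random-walk motion model and the same range-bearing measurement model as above (repeated here so the block stands alone); every model choice is an illustrative assumption:

```python
import numpy as np

def f(x, u):
    """Assumed motion model: a simple random walk driven by the input."""
    return x + u

def h(x):
    """Assumed measurement model: range and bearing to the origin."""
    return np.array([np.hypot(x[0], x[1]), np.arctan2(x[1], x[0])])

def H_jac(x):
    px, py = x
    r2 = px**2 + py**2
    r = np.sqrt(r2)
    return np.array([[px / r, py / r], [-py / r2, px / r2]])

def ekf_step(x_hat, P, u, y, Q, R):
    # Predict: pass the mean through the nonlinear f; propagate the
    # covariance with the motion Jacobian F = df/dx (identity here).
    x_check = f(x_hat, u)
    F = np.eye(2)
    P_check = F @ P @ F.T + Q
    # Correct: linearize h about the predicted state.
    # (A real implementation would also wrap the bearing innovation to [-pi, pi].)
    H = H_jac(x_check)
    K = P_check @ H.T @ np.linalg.inv(H @ P_check @ H.T + R)
    return x_check + K @ (y - h(x_check)), (np.eye(2) - K @ H) @ P_check

x_hat, P = np.array([2.5, 3.5]), np.eye(2)
x_hat, P = ekf_step(x_hat, P, u=np.zeros(2), y=np.array([5.0, 0.9]),
                    Q=0.01 * np.eye(2), R=np.diag([0.1, 0.01]))
```

Note that the innovation $y_k - h(\check x_k)$ uses the full nonlinear measurement model; the Jacobian $H$ appears only in the gain and covariance computations.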
The Error State Extended Kalman Filter (ES-EKF)
Error-state formulation of the Extended Kalman Filter
- We can think of the vehicle state as composed of two parts: a large part called the nominal state, $\hat x$, and a small part called the error state, $\delta x$: $x=\hat x+\delta x$
- We can think of the error state as the place where all of these modelling errors and process noise accumulate over time, so that the error state is just the difference between the nominal state and the true state at any given time.
- If we can figure out what the error state is, we can actually use it as a correction to the nominal state to bring us closer to the true state.
- So in the ES-EKF, instead of doing Kalman filtering on the full state which might have lots of complicated non-linear behaviors, we’re going to use the EKF to estimate the error state instead, and then use the estimate of the error state as a correction to the nominal state
- Mathematically, we’re going to rearrange our linearized motion model so that we now have an equation that can tell us how the difference between the true state at time, k, and our predicted state at time, k, is related to the same difference at time, $k-1$.
- that means we can build a set of equations called the error state kinematics.
Loop:
- update the nominal state using the non-linear motion model and the current best estimate of the state.
- keep track of the state covariance, which grows as we integrate more and more process noise from the motion model.
- repeat the loop updating the nominal state and the error state covariance for as long as we like until we receive the measurement and want to do a correction.
- then compute the Kalman gain
- compute the best estimate of the error state using the Kalman gain, the measurement, and our nonlinear measurement model
- update the nominal state by adding our estimate of the error state to the nominal state to get the corrected estimate of the full state
- finally update the state covariance using the usual equations (the whole loop is sketched below).
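In code form, the loop above might look like the following sketch, assuming a plain vector state for which the composition operator is ordinary addition (for rotations it would be a proper group composition):

```python
import numpy as np

def es_ekf(x_nom, P, inputs, measurements, f, F_jac, h, H_jac, Q, R):
    """Sketch of the ES-EKF loop: nominal state + error-state covariance."""
    for u, y in zip(inputs, measurements):
        # 1. Propagate the nominal state with the full nonlinear motion model,
        #    and the error-state covariance with the motion Jacobian F.
        F = F_jac(x_nom, u)          # linearize about the current nominal state
        x_nom = f(x_nom, u)
        P = F @ P @ F.T + Q          # covariance grows with process noise
        if y is None:                # no measurement yet: keep predicting
            continue
        # 2. On a measurement, compute the gain and estimate the (small) error state.
        H = H_jac(x_nom)
        K = P @ H.T @ np.linalg.inv(H @ P @ H.T + R)
        dx = K @ (y - h(x_nom))      # best estimate of the error state
        # 3. Inject the error state into the nominal state. For constrained
        #    quantities (e.g., 3D rotations) this '+' becomes a generalized
        #    composition operator.
        x_nom = x_nom + dx
        P = (np.eye(P.shape[0]) - K @ H) @ P
    return x_nom, P
```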
Advantages of the Error-state EKF over the vanilla EKF:
- Better performance compared to the vanilla EKF
- The "small" error state is more amenable to linear filtering than the "large" nominal state, which can be integrated nonlinearly
- Easy to work with constrained quantities (e.g., rotations in 3D)
- We can also break down the state using a generalized composition operator
Limitations of the EKF
Linearization error
- The EKF works by linearizing the nonlinear motion and measurement models to update the mean and covariance of the state
- The difference between the linear approximation and the nonlinear function is called linearization error
- In general, the linearization error depends on two things:
- how nonlinear the original function is to begin with. If our nonlinear function varies very slowly or is quite flat much of the time, a linear approximation is going to be a pretty good fit.
- how far away from the operating point the linear approximation is being used. The further away you move from the operating point, the more likely the linear approximation is to diverge from the true function (see the sketch below).
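As a quick illustration of the second point, the sketch below linearizes $f(x)=\sin(x)$ at $a=0$ and prints the approximation error at increasing distances from the operating point:

```python
import numpy as np

# First-order Taylor (tangent-line) approximation of f(x) = sin(x) at a = 0:
# f(x) ~ f(a) + f'(a)(x - a) = x.
f = np.sin
a = 0.0
f_lin = lambda x: np.sin(a) + np.cos(a) * (x - a)

for dx in (0.1, 0.5, 1.0, 2.0):            # distance from the operating point
    x = a + dx
    print(f"|x - a| = {dx:3}: linearization error = {abs(f(x) - f_lin(x)):.4f}")
# Prints errors 0.0002, 0.0206, 0.1585, 1.0907 -- tiny near the operating
# point, but rapidly worse farther away.
```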
Computing Jacobians
- Analytical differentiation is prone to human error
- Numerical differentiation can be slow and unstable
- Automatic differentiation (e.g., at compile time) can also behave unpredictably
An Alternative to the EKF - The Unscented Kalman Filter
- An alternative approach to nonlinear Kalman filtering that relies on something called the unscented transform to pass probability distributions through nonlinear functions.
Use the Unscented transform to pass a probability distribution through a nonlinear function
- The basic idea in the unscented transform has three steps.
- First, we choose a set of sample points from our input distribution. These aren’t random samples, but deterministic samples chosen to be a certain number of standard deviations away from the mean.
- these samples are called sigma points, and the unscented transform is sometimes called the sigma point transform.
- After getting the sigma points, pass each sigma point through the nonlinear function, producing a new set of sigma points belonging to the output distribution.
- Finally, compute the sample mean and covariance of the output sigma points with some carefully chosen weights, and these will give us a good approximation of the mean and covariance of the true output distribution.
- The unscented transform - Choosing Sigma Points
- In general, for an $n$ dimensional probability distribution, we need $2n+1$ sigma points, one for the mean and the rest symmetrically distributed about the mean.
- The first step in determining where the sigma points should be is taking the Cholesky decomposition of the covariance matrix associated with the input distribution: $LL^T=\Sigma_{xx}$ (where $L$ is lower-triangular)
- In fact, if the input PDF is one-dimensional, the Cholesky decomposition is just the square root of the variance, which is the standard deviation.
- The unscented transform - Transforming and recombining
- pass each of the sigma points through the nonlinear function to get a new set of transformed sigma points.
- And finally, compute the mean and covariance of the output PDF: each of the points gets a specific weight in the mean and covariance calculations, and that weight depends on the parameter $\kappa$ and the dimension $N$ of the input distribution (a sketch of the full transform follows this list).
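Putting the three steps together, here is a minimal sketch of the unscented transform: sigma points from the Cholesky factor, weights parameterized by $\kappa$, then the sample mean and covariance of the transformed points. The polar-to-Cartesian test function is an assumed example:

```python
import numpy as np

def unscented_transform(mu, Sigma, g, kappa=1.0):
    """Pass N(mu, Sigma) through the nonlinear function g via 2n+1 sigma points."""
    n = mu.size
    L = np.linalg.cholesky(Sigma)                 # L L^T = Sigma
    # Sigma points: the mean, plus points at +/- sqrt(n + kappa) * columns of L.
    pts = [mu] + [mu + np.sqrt(n + kappa) * L[:, i] for i in range(n)] \
               + [mu - np.sqrt(n + kappa) * L[:, i] for i in range(n)]
    # Weights: kappa/(n+kappa) for the mean point, 1/(2(n+kappa)) for the rest.
    w = np.full(2 * n + 1, 1.0 / (2 * (n + kappa)))
    w[0] = kappa / (n + kappa)
    # Transform each sigma point, then recombine with the weighted sample
    # mean and covariance formulas.
    ys = np.array([g(p) for p in pts])
    mu_y = w @ ys
    Sigma_y = sum(w[i] * np.outer(ys[i] - mu_y, ys[i] - mu_y)
                  for i in range(2 * n + 1))
    return mu_y, Sigma_y

# Example: polar-to-Cartesian conversion, a classic nonlinear test case.
g = lambda p: np.array([p[0] * np.cos(p[1]), p[0] * np.sin(p[1])])
mu_y, Sigma_y = unscented_transform(np.array([1.0, 0.5]), np.diag([0.01, 0.05]), g)
```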
How the Unscented Kalman Filter (UKF) uses the unscented transform in the prediction and correction steps
- We can easily use the Unscented Transform in our Kalman Filtering framework with nonlinear models:
- Prediction step:
- To propagate the state from time k-1 to time k, apply the Unscented Transform using the current best guess for the mean and covariance
- decompose the estimated state covariance from time k-1 and compute sigma points centered around the estimated mean state from time k-1
- propagate our sigma points through our nonlinear motion model to get a new set of sigma points for the predicted state at time k.
- finally, calculate the predicted mean and covariance for the state at time k. At this point it’s important to account for the process noise by adding its covariance to the covariance of the transformed sigma points to get the final predicted covariance.
- Correction step
- To correct the state estimate using measurements at time k, use the nonlinear measurement model and the sigma points from the prediction step to predict the measurements
- First, redraw our sigma points using the predicted covariance matrix. We need to do this a second time because we added process noise at the end of the last step, and this will modify the positions of some of the sigma points.
- plug these new sigma points one by one into our nonlinear measurement model to get another set of sigma points for the predicted measurements, then we can estimate the mean and covariance of the predicted measurements using the sample mean and covariance formulas.
- To compute the Kalman gain, we’re also going to need the cross covariance between the predicted state and the predicted measurements, which tells us how the measurements are correlated with the state.
- Then use the Kalman gain to optimally correct the mean and covariance of the predicted state (both steps are sketched below)
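Here is a condensed sketch of both UKF steps (additive-noise form, with the same sigma-point weights as above); the models `f` and `h` are passed in, and everything here is an illustrative assumption rather than course code:

```python
import numpy as np

def ukf_step(x_hat, P, u, y, f, h, Q, R, kappa=1.0):
    """One UKF predict/correct cycle for nonlinear models f(x, u) and h(x)."""
    n = x_hat.size
    w = np.full(2 * n + 1, 1.0 / (2 * (n + kappa)))
    w[0] = kappa / (n + kappa)

    def sigma_points(mu, Sigma):
        L = np.linalg.cholesky(Sigma)          # L L^T = Sigma
        scaled = np.sqrt(n + kappa) * L
        return np.vstack([mu, mu + scaled.T, mu - scaled.T])   # (2n+1, n)

    # Prediction: propagate sigma points through the motion model, then add
    # the process noise covariance to get the final predicted covariance.
    X = np.array([f(p, u) for p in sigma_points(x_hat, P)])
    x_check = w @ X
    P_check = (w[:, None] * (X - x_check)).T @ (X - x_check) + Q

    # Correction: redraw sigma points (P_check now includes Q), then pass
    # them through the measurement model to predict the measurement.
    Xc = sigma_points(x_check, P_check)
    Y = np.array([h(p) for p in Xc])
    y_hat = w @ Y
    P_yy = (w[:, None] * (Y - y_hat)).T @ (Y - y_hat) + R   # innovation covariance
    P_xy = (w[:, None] * (Xc - x_check)).T @ (Y - y_hat)    # state-measurement cross-cov.

    K = P_xy @ np.linalg.inv(P_yy)                          # Kalman gain
    return x_check + K @ (y - y_hat), P_check - K @ P_yy @ K.T

# Example call, reusing the range-bearing model from the EKF sketch:
# x_hat, P = ukf_step(x_hat, P, u, y, f=lambda x, u: x + u, h=h, Q=Q, R=R)
```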
Advantages of the UKF over the EKF
- The UKF is generally recommended over the EKF
- It addresses the EKF's main limitations: it avoids analytical linearization and does not require computing Jacobians
Apply the UKF to a simple nonlinear tracking problem