Model description
The HMM-MAR has two key ingredients:
- the Hidden Markov Model (HMM):
- An HMM describes a time series as a sequence of states, where each state has its own model of the observed data (i.e., the observation model), and the probability of being in a given state at each time point depends on the state assignment at the previous time point.
- The transition probabilities (how likely it is to be in state j if we were in state i just before) are also estimated by the model.
- and the Multivariate Autoregressive model (MAR):
- the observation model corresponds to a MAR model
- A MAR model characterises the behaviour of the time series through linear historical interactions between the signals observed from different brain regions.
- In other words, a MAR model aims to predict the multichannel signal value at each time point as a linear combination of signal values at previous time points.
- MARs are able to characterise the frequency structure of the data, and by making the model multivariate, are able to capture interactions (e.g., coherence) between multiple brain regions.
- Therefore, each state is related to a different set of multi-region autoregression coefficients describing the neural oscillations.
- When the order of the MAR model is set to zero, the states are modelled using Gaussian distributions, in which case the segmentation will be based on instantaneous activation and connectivity.
- Inference of the model parameters is carried out using variational Bayes.
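To make the MAR observation model concrete, here is a minimal Python sketch (not part of the toolbox; the function `mar_predict` and its signature are made up for illustration) of how a MAR of order P predicts each time point as a linear combination of the P previous time points:

```python
import numpy as np

def mar_predict(X, A):
    """Illustrative sketch, not the toolbox implementation.
    X: (T, n) multichannel time series; A: list of P (n, n) coefficient
    matrices, one per lag. Returns predictions for t = P .. T-1."""
    P = len(A)
    T, n = X.shape
    pred = np.zeros((T - P, n))
    for t in range(P, T):
        # prediction at t is the sum over lags of A_p applied to x_{t-p}
        pred[t - P] = sum(X[t - p - 1] @ A[p].T for p in range(P))
    return pred
```

With order P = 1 and an identity coefficient matrix, the prediction at each time point is simply the previous sample, which matches the intuition that the coefficients encode how the past linearly drives the present.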
MAR parametrisation
- order parameter P: defines how far we go back in time to predict each time point.
- When the number of regions is high, the MAR can easily overfit the data.
- That is, a relatively simple MAR model can already explain a lot of variance, which induces the HMM-MAR inference process to drop most of the states by letting a few (or even a single) dominant states control the entire time series, thus hindering the discovery of quasi-stationary connectivity networks.
- To address this:
- Use a low order, for example 3, and, if needed, increase it and observe how the segmentation changes.
- Work on a low-dimensional version of the data using PCA (see below).
- Avoid using the cross-channel terms, such that each state is characterised by its spectra but not the cross-spectra.
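To illustrate the PCA option above, the following is a hedged sketch (the function `pca_reduce` is hypothetical, not a toolbox routine) of projecting the data onto its leading principal components before fitting the HMM-MAR, which reduces the number of channels and with it the risk of overfitting:

```python
import numpy as np

def pca_reduce(X, n_components):
    """Illustrative sketch: project a (T, n) time series onto its top
    n_components principal components, returning a (T, n_components) array."""
    Xc = X - X.mean(axis=0)                       # centre each channel
    U, S, Vt = np.linalg.svd(Xc, full_matrices=False)
    return Xc @ Vt[:n_components].T               # scores on top components
```

The returned components are mutually uncorrelated, so a MAR fitted to them has far fewer cross-channel terms to estimate than one fitted to the full set of regions.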
Model selection
- To determine the number of HMM states, K, one possible strategy is to fix a maximum number and let the Bayesian inference process discard those states for which there is insufficient evidence in the data.
- The free energy, a quantity that approximates the model evidence, can also be used to choose K.
Spectral description of the states
- A MAR model has a direct representation in the frequency domain.
- Hence the state segmentation is driven by the spectral content of each separate channel and the spectral relations between channels.
- Once the HMM-MAR model is trained and we have the state time courses, we can estimate the spectral content of the model, including the following measures:
- Power spectral density (PSD)
- Coherence
- Partial directed coherence (PDC)
- Phase
- The estimation of these quantities can be done by one of these two methods:
- A parametric estimation, using the MAR coefficients.
- A nonparametric estimation, using a novel statewise version of the multitaper. This is the appropriate approach if, for example, the states were specified to be Gaussian distributions (i.e., MAR order equal to zero).
- In the parametric case
- we can use the MAR coefficients as returned by the model.
- These coefficients correspond to a maximum a posteriori estimation (which means that they have been regularised by the prior), and, as described above, can have an incomplete parametrisation.
- For the sake of spectral estimation, we prefer to re-estimate the MAR coefficients using maximum likelihood, as is standard in the signal processing literature, and use a complete parametrisation, i.e. using lags (t-1,t-2,…,t-P).
- In the non-parametric case:
- We take advantage of the inferred HMM state time courses, using them as temporal windows for estimating the state-specific spectral properties with a weighted version of the multitaper. PDC does not have a closed-form solution in the nonparametric case, so we need to use the so-called Wilson factorisation.
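The parametric route can be sketched as follows, assuming the standard signal-processing formulas: the transfer function H(f) = (I − Σ_p A_p e^(−i2πfp/fs))⁻¹ and the cross-spectral density S(f) = H(f) Σ H(f)ᴴ, where A_p are the MAR coefficients and Σ is the residual covariance. The function name `mar_spectra` is illustrative, not the toolbox API:

```python
import numpy as np

def mar_spectra(A, Sigma, freqs, fs):
    """Illustrative sketch: parametric cross-spectral density from MAR
    coefficients. A: list of P (n, n) lag matrices; Sigma: (n, n) residual
    covariance; freqs: frequencies in Hz; fs: sampling rate in Hz.
    Returns a (len(freqs), n, n) complex array of cross-spectral matrices;
    the diagonal holds each channel's power spectral density."""
    n = Sigma.shape[0]
    S = np.zeros((len(freqs), n, n), dtype=complex)
    for i, f in enumerate(freqs):
        Af = np.eye(n, dtype=complex)
        for p, Ap in enumerate(A, start=1):
            Af -= Ap * np.exp(-2j * np.pi * f * p / fs)
        H = np.linalg.inv(Af)                 # transfer function H(f)
        S[i] = H @ Sigma @ H.conj().T         # S(f) = H(f) Sigma H(f)^H
    return S
```

As a sanity check, for a univariate AR(1) with coefficient 0.5 and unit noise variance, the spectrum at frequency 0 is 1/|1 − 0.5|² = 4; with an empty coefficient list (order 0), the spectrum reduces to the flat white-noise covariance.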
Analysis of results
- The mean pattern of activity of each state. The function getMean(hmm,k) outputs the mean of the Gaussian distribution, assuming that the option zeromean was set to 0. Note that this is mostly useful for the Gaussian observation model, but less so for the MAR model.
- The functional connectivity of each state. The function getFuncConn(hmm,k) outputs the covariance and correlation matrices of state k, when the states are distributed Gaussian (order=0). In the case of a MAR state model, these refer to the covariance matrix of the residual.
- The transition probabilities. The function getTransProbs(hmm) returns the transition probabilities from any state to any other state, without considering the persistence probabilities (i.e. the probability to remain in the same state). The transition probability matrix including the persistence probabilities is contained in hmm.P.
- The transition probabilities for a subset of the subjects. The function getMaskedTransProbMats returns the transition probabilities (including persistencies) for subsets of the data as specified in the parameter Masks. This is useful for example if we wish to find differences in the state transitions between groups.
- The state time fractional occupancies, using getFractionalOccupancy(Gamma,T,dim). This can refer to either (i) how much time the HMM spends on each state at each time point on average (across trials), or (ii) how much time each subject/trial/session spends in each state (i.e. the average state probability across time, per session or subject). The former, useful for task, is computed when dim=1; the latter, useful to investigate differences in occupancies between subjects, is computed when dim=2.
- The state switching rates. The function getSwitchingRate provides a measure of the state switching rate for each session/subject, and can be understood as a measure of stability per subject.
- The state life times. The function getStateLifeTimes returns, per subject/session/trial, a vector of life times per state, containing the number of time points per visit. This reflects the temporal stability of the states.
- The state interval times. The function getStateIntervalTimes returns, per subject/session/trial, a vector of interval times per state, containing the number of time points between visits. The interval times, the life times, the fractional occupancies and the switching rates are sometimes referred to collectively as the “chronnectome”.
- The maximum fractional occupancy, used to measure whether the HMM is effectively capturing the dynamics in the data. The function getMaxFractionalOccupancy finds the maximum fractional occupancy for each subject/trial. This is a useful statistic to diagnose whether the HMM solution is “mixing well” or whether states are assigned to describe entire subjects (in which case the HMM is doing a poor job at finding the data dynamics).
- The onset times of the states for each session. The function getStateOnsets provides, per session and state, a vector indicating when each state becomes active. This can be used to relate the states to task information.
- The evoked state probability relative to a stimulus. The function evokedStateProbability() can be used to examine how the stimulus modulates the state probabilities.
- The similarity between two HMM runs. There are multiple ways to measure the similarity between HMM runs (on the same or comparable data). One of them is to measure the statistical dependence between the state time courses of the two runs. This is done with the function getGammaSimilarity(gamma1,gamma2), which measures the overlap between the state probabilities after optimally reordering the states.
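As a rough illustration of two of the quantities above, here is a minimal Python sketch of the fractional occupancy and a simple switching-rate proxy computed from a state probability time course. These are not the toolbox implementations of getFractionalOccupancy or getSwitchingRate, which may differ in details:

```python
import numpy as np

def fractional_occupancy(gamma):
    """Illustrative sketch. gamma: (T, K) state probability time course for
    one session. Returns a length-K vector: the fraction of time spent in
    each state (the average state probability across time)."""
    return gamma.mean(axis=0)

def switching_rate(gamma):
    """Illustrative proxy: mean absolute change in the state probabilities
    between consecutive time points; higher values indicate a session in
    which the model switches between states more often."""
    return np.abs(np.diff(gamma, axis=0)).mean()
```

For a session in which a single state is active the whole time, the fractional occupancy of that state is 1 and the switching rate is 0, which is the “poor mixing” scenario that the maximum fractional occupancy statistic is designed to flag.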
Reference:
https://github.com/OHBA-analysis/HMM-MAR/wiki/User-Guide