6.7.1. scikits.learn.hmm.GaussianHMM
- class scikits.learn.hmm.GaussianHMM(n_states=1, cvtype='diag', startprob=None, transmat=None, startprob_prior=None, transmat_prior=None, means_prior=None, means_weight=0, covars_prior=0.01, covars_weight=1)
Hidden Markov Model with Gaussian emissions
Representation of a hidden Markov model probability distribution. This class allows for easy evaluation of, sampling from, and maximum-likelihood estimation of the parameters of an HMM.
See also
- GMM: Gaussian mixture model
Examples
>>> from scikits.learn.hmm import GaussianHMM
>>> GaussianHMM(n_states=2)
GaussianHMM(cvtype='diag', n_states=2, means_weight=0, startprob_prior=1.0,
            startprob=array([ 0.5, 0.5]),
            transmat=array([[ 0.5, 0.5],
                            [ 0.5, 0.5]]),
            transmat_prior=1.0, means_prior=None, covars_weight=1,
            covars_prior=0.01)
Attributes
cvtype : Covariance type of the model.
n_states : Number of states in the model.
transmat : Matrix of transition probabilities.
startprob : Initial probability of each state.
means : Mean parameters for each state.
covars : Return covars as a full matrix.
n_features : int (read-only). Dimensionality of the Gaussian emissions.
Methods
eval(X) : Compute the log likelihood of X under the HMM.
decode(X) : Find most likely state sequence for each point in X using the Viterbi algorithm.
rvs(n=1) : Generate n samples from the HMM.
init(X) : Initialize HMM parameters from X.
fit(X) : Estimate HMM parameters from X using the Baum-Welch algorithm.
predict(X) : Like decode, find most likely state sequence corresponding to X.
score(X) : Compute the log likelihood of X under the model.
- __init__(n_states=1, cvtype='diag', startprob=None, transmat=None, startprob_prior=None, transmat_prior=None, means_prior=None, means_weight=0, covars_prior=0.01, covars_weight=1)
Create a hidden Markov model with Gaussian emissions.
Initializes parameters such that every state has zero mean and identity covariance.
Parameters : n_states : int
Number of states.
cvtype : string
String describing the type of covariance parameters to use. Must be one of ‘spherical’, ‘tied’, ‘diag’, ‘full’. Defaults to ‘diag’.
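To illustrate what the four cvtype values mean, the sketch below expands each parameterization into one full covariance matrix per state. The helper name expand_covars and the input layouts are assumptions made for this illustration (one scalar per state for 'spherical', one variance vector per state for 'diag', one shared matrix for 'tied'); it is not the library's internal code.

```python
def expand_covars(covars, cvtype, n_states, n_features):
    """Return a list of n_states full (n_features x n_features) matrices.

    Hypothetical helper. Assumed input layout for each cvtype:
      'spherical': list of n_states scalars (one variance per state)
      'diag':      list of n_states lists of n_features variances
      'tied':      one n_features x n_features matrix shared by all states
      'full':      list of n_states n_features x n_features matrices
    """
    def diag_matrix(variances):
        # Put the variances on the diagonal, zeros elsewhere.
        return [[variances[i] if i == j else 0.0
                 for j in range(n_features)]
                for i in range(n_features)]

    if cvtype == 'spherical':
        return [diag_matrix([v] * n_features) for v in covars]
    if cvtype == 'diag':
        return [diag_matrix(list(v)) for v in covars]
    if cvtype == 'tied':
        # Every state gets its own copy of the shared matrix.
        return [[list(row) for row in covars] for _ in range(n_states)]
    if cvtype == 'full':
        return [[list(row) for row in m] for m in covars]
    raise ValueError(
        "cvtype must be one of 'spherical', 'tied', 'diag', 'full'")
```

Going from 'spherical' to 'full' increases the number of free parameters per state, which trades flexibility against the amount of training data needed.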
- covars
Return covars as a full matrix.
- cvtype
Covariance type of the model.
Must be one of ‘spherical’, ‘tied’, ‘diag’, ‘full’.
- decode(obs, maxrank=None, beamlogprob=-inf)
Find most likely state sequence corresponding to obs.
Uses the Viterbi algorithm.
Parameters : obs : array_like, shape (n, n_features)
List of n_features-dimensional data points. Each row corresponds to a single data point.
maxrank : int
Maximum rank to evaluate for rank pruning. If not None, only consider the top maxrank states in the inner sum of the forward algorithm recursion. Defaults to None (no rank pruning). See The HTK Book for more details.
beamlogprob : float
Width of the beam-pruning beam in log-probability units. Defaults to -numpy.Inf (no beam pruning). See The HTK Book for more details.
Returns : viterbi_logprob : float
Log probability of the maximum likelihood path through the HMM
states : array_like, shape (n,)
Index of the most likely states for each observation
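As a rough illustration of the Viterbi recursion that decode is described as using, here is a minimal pure-Python sketch in log space. It applies no rank or beam pruning, and the function name viterbi and the precomputed framelogprob input (log emission density of each observation under each state) are assumptions of this example, not the library's implementation.

```python
import math


def viterbi(log_startprob, log_transmat, framelogprob):
    """Return (best_logprob, states) for one observation sequence.

    framelogprob[t][j] is the log emission density of observation t
    under state j (assumed to be computed beforehand).
    """
    n_states = len(log_startprob)
    # delta[j]: log probability of the best path ending in state j.
    delta = [log_startprob[j] + framelogprob[0][j] for j in range(n_states)]
    backptr = []
    for t in range(1, len(framelogprob)):
        new_delta, ptr = [], []
        for j in range(n_states):
            best_i = max(range(n_states),
                         key=lambda i: delta[i] + log_transmat[i][j])
            ptr.append(best_i)
            new_delta.append(delta[best_i] + log_transmat[best_i][j]
                             + framelogprob[t][j])
        delta = new_delta
        backptr.append(ptr)
    # Trace the best path backwards from the best final state.
    last = max(range(n_states), key=lambda j: delta[j])
    states = [last]
    for ptr in reversed(backptr):
        states.append(ptr[states[-1]])
    states.reverse()
    return delta[last], states


# Toy two-state example; all numbers are made up for illustration.
log = math.log
start = [log(0.6), log(0.4)]
trans = [[log(0.9), log(0.1)], [log(0.1), log(0.9)]]
frames = [[log(0.9), log(0.1)], [log(0.9), log(0.1)],
          [log(0.1), log(0.9)], [log(0.1), log(0.9)]]
best_logprob, states = viterbi(start, trans, frames)
print(states)  # → [0, 0, 1, 1]
```

Working in log space avoids numerical underflow on long sequences, since the path probability is a product of many small numbers.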
- eval(obs, maxrank=None, beamlogprob=-inf)
Compute the log probability under the model and compute posteriors
Implements rank and beam pruning in the forward-backward algorithm to speed up inference in large models.
Parameters : obs : array_like, shape (n, n_features)
Sequence of n_features-dimensional data points. Each row corresponds to a single point in the sequence.
maxrank : int
Maximum rank to evaluate for rank pruning. If not None, only consider the top maxrank states in the inner sum of the forward algorithm recursion. Defaults to None (no rank pruning). See The HTK Book for more details.
beamlogprob : float
Width of the beam-pruning beam in log-probability units. Defaults to -numpy.Inf (no beam pruning). See The HTK Book for more details.
Returns : logprob : float
Log probability of the sequence obs.
posteriors : array_like, shape (n, n_states)
Posterior probabilities of each state for each observation
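The posteriors described above can be obtained with the forward-backward algorithm. The following is a minimal sketch in pure Python, without the rank and beam pruning mentioned in the text. The names forward_backward and logsumexp, and the precomputed framelogprob input, are assumptions of this example rather than the library's code.

```python
import math


def logsumexp(values):
    """Numerically stable log(sum(exp(v) for v in values))."""
    m = max(values)
    if m == -math.inf:
        return -math.inf
    return m + math.log(sum(math.exp(v - m) for v in values))


def forward_backward(log_startprob, log_transmat, framelogprob):
    """Return (logprob, posteriors) for one sequence, without pruning.

    framelogprob[t][j] is the log emission density of observation t
    under state j (assumed to be computed beforehand).
    """
    n_obs, n_states = len(framelogprob), len(log_startprob)
    S = range(n_states)
    # Forward pass: fwd[t][j] = log p(obs[0..t], state_t = j)
    fwd = [[log_startprob[j] + framelogprob[0][j] for j in S]]
    for t in range(1, n_obs):
        fwd.append([
            logsumexp([fwd[t - 1][i] + log_transmat[i][j] for i in S])
            + framelogprob[t][j]
            for j in S])
    # Backward pass: bwd[t][i] = log p(obs[t+1..] | state_t = i)
    bwd = [[0.0] * n_states for _ in range(n_obs)]
    for t in range(n_obs - 2, -1, -1):
        for i in S:
            bwd[t][i] = logsumexp([
                log_transmat[i][j] + framelogprob[t + 1][j] + bwd[t + 1][j]
                for j in S])
    logprob = logsumexp(fwd[-1])
    # Posterior of state j at time t is proportional to fwd * bwd.
    posteriors = [[math.exp(fwd[t][j] + bwd[t][j] - logprob) for j in S]
                  for t in range(n_obs)]
    return logprob, posteriors
```

Each row of posteriors sums to 1, and zero probabilities should be passed in as -math.inf rather than math.log(0), which raises an error.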
- fit(obs, n_iter=10, thresh=0.01, params='ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz', init_params='ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz', maxrank=None, beamlogprob=-inf, **kwargs)
Estimate model parameters.
An initialization step is performed before entering the EM algorithm. To skip this step, set the keyword argument init_params to the empty string ‘’. Conversely, to run only the initialization, call this method with n_iter=0.
Parameters : obs : list
List of array-like observation sequences (shape (n_i, n_features)).
n_iter : int, optional
Maximum number of EM iterations to perform.
thresh : float, optional
Convergence threshold on the change in log probability between successive iterations.
params : string, optional
Controls which parameters are updated in the training process. Can contain any combination of ‘s’ for startprob, ‘t’ for transmat, ‘m’ for means, and ‘c’ for covars, etc. Defaults to all parameters.
init_params : string, optional
Controls which parameters are initialized prior to training. Can contain any combination of ‘s’ for startprob, ‘t’ for transmat, ‘m’ for means, and ‘c’ for covars, etc. Defaults to all parameters.
maxrank : int, optional
Maximum rank to evaluate for rank pruning. If not None, only consider the top maxrank states in the inner sum of the forward algorithm recursion. Defaults to None (no rank pruning). See “The HTK Book” for more details.
beamlogprob : float, optional
Width of the beam-pruning beam in log-probability units. Defaults to -numpy.Inf (no beam pruning). See “The HTK Book” for more details.
Notes
In general, logprob should be non-decreasing unless aggressive pruning is used. A decreasing logprob is generally a sign of overfitting (e.g. a covariance parameter getting too small). You can fix this by getting more training data, or by increasing covars_prior.
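How n_iter and thresh interact can be sketched as a plain training loop. This is an illustration only: run_em and em_step are hypothetical names, and the stopping rule (stop once the change in log probability between two iterations falls below thresh) is an assumption about how the convergence threshold is applied.

```python
def run_em(em_step, n_iter=10, thresh=0.01):
    """Call em_step() at most n_iter times and return the logprob history.

    em_step is a hypothetical callable that performs one E-step and
    M-step and returns the log probability of the training data.
    The loop stops early once the change in log probability between
    two successive iterations is smaller than thresh.
    """
    history = []
    for _ in range(n_iter):
        history.append(em_step())
        if len(history) > 1 and abs(history[-1] - history[-2]) < thresh:
            break
    return history


def is_non_decreasing(history):
    """True if logprob never dropped from one iteration to the next."""
    return all(b >= a for a, b in zip(history, history[1:]))
```

With n_iter=0 the loop body never runs, which matches the idea of calling fit only for its initialization step. Checking is_non_decreasing on the history is one way to spot the decreasing-logprob symptom described in the notes.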
- means
Mean parameters for each state.
- n_states
Number of states in the model.
- predict(obs, **kwargs)
Find most likely state sequence corresponding to obs.
Parameters : obs : array_like, shape (n, n_features)
List of n_features-dimensional data points. Each row corresponds to a single data point.
maxrank : int
Maximum rank to evaluate for rank pruning. If not None, only consider the top maxrank states in the inner sum of the forward algorithm recursion. Defaults to None (no rank pruning). See The HTK Book for more details.
beamlogprob : float
Width of the beam-pruning beam in log-probability units. Defaults to -numpy.Inf (no beam pruning). See The HTK Book for more details.
Returns : states : array_like, shape (n,)
Index of the most likely states for each observation
- predict_proba(obs, **kwargs)
Compute the posterior probability for each state in the model
Parameters : obs : array_like, shape (n, n_features)
List of n_features-dimensional data points. Each row corresponds to a single data point.
See eval() for a list of accepted keyword arguments.
Returns : T : array-like, shape (n, n_states)
Returns the probability of the sample for each state in the model.
- rvs(n=1)
Generate random samples from the model.
Parameters : n : int
Number of samples to generate.
Returns : obs : array_like, length n
List of samples
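Sampling from an HMM amounts to drawing a state path from startprob and transmat, then drawing one observation from the Gaussian of each visited state. The sketch below shows this for diagonal covariances, using only the standard library. The function name sample_hmm, the seed argument, and the fact that it also returns the hidden states are choices made for this example, not part of the rvs interface.

```python
import random


def sample_hmm(startprob, transmat, means, variances, n=1, seed=None):
    """Draw n observations from an HMM with diagonal-Gaussian emissions.

    Returns (obs, states): obs is a list of n feature lists and states
    is the hidden state index that generated each one.
    """
    rng = random.Random(seed)

    def draw_state(probs):
        # Inverse-CDF sampling from a discrete distribution.
        u, cumulative = rng.random(), 0.0
        for state, p in enumerate(probs):
            cumulative += p
            if u < cumulative:
                return state
        return len(probs) - 1  # guard against rounding error

    obs, states = [], []
    for t in range(n):
        # First state comes from startprob, later ones from transmat.
        probs = startprob if t == 0 else transmat[states[-1]]
        state = draw_state(probs)
        states.append(state)
        obs.append([rng.gauss(mu, var ** 0.5)
                    for mu, var in zip(means[state], variances[state])])
    return obs, states
```

Passing a fixed seed makes the draw reproducible, which is convenient when comparing fitted models against known ground truth.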
- score(obs, maxrank=None, beamlogprob=-inf)
Compute the log probability under the model.
Parameters : obs : array_like, shape (n, n_features)
Sequence of n_features-dimensional data points. Each row corresponds to a single data point.
maxrank : int
Maximum rank to evaluate for rank pruning. If not None, only consider the top maxrank states in the inner sum of the forward algorithm recursion. Defaults to None (no rank pruning). See The HTK Book for more details.
beamlogprob : float
Width of the beam-pruning beam in log-probability units. Defaults to -numpy.Inf (no beam pruning). See The HTK Book for more details.
Returns : logprob : float
Log probability of the sequence obs.
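For intuition about what a score computation involves, the sketch below evaluates diagonal-Gaussian emission densities and runs the forward algorithm in log space, returning a single log probability for the sequence. It applies no rank or beam pruning. The names hmm_score and log_gaussian_diag and the argument layout are assumptions of this example, not the library's implementation.

```python
import math


def log_gaussian_diag(x, mean, variances):
    """Log density of x under a Gaussian with diagonal covariance."""
    return -0.5 * sum(math.log(2.0 * math.pi * v) + (xi - m) ** 2 / v
                      for xi, m, v in zip(x, mean, variances))


def logsumexp(values):
    """Numerically stable log(sum(exp(v) for v in values))."""
    m = max(values)
    if m == -math.inf:
        return -math.inf
    return m + math.log(sum(math.exp(v - m) for v in values))


def hmm_score(obs, startprob, transmat, means, variances):
    """Log probability of the sequence obs (forward algorithm)."""
    S = range(len(startprob))

    def safe_log(p):
        # log(0) is treated as -inf instead of raising an error.
        return math.log(p) if p > 0 else -math.inf

    # alpha[j] = log p(obs[0..t], state_t = j), updated for each t.
    alpha = [safe_log(startprob[j])
             + log_gaussian_diag(obs[0], means[j], variances[j])
             for j in S]
    for x in obs[1:]:
        alpha = [logsumexp([alpha[i] + safe_log(transmat[i][j]) for i in S])
                 + log_gaussian_diag(x, means[j], variances[j])
                 for j in S]
    return logsumexp(alpha)
```

When every state has the same emission distribution, the result no longer depends on startprob or transmat, which makes a convenient sanity check.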
- startprob
Initial probability of each state.
- transmat
Matrix of transition probabilities.