scikits.learn.hmm.MultinomialHMM

class scikits.learn.hmm.MultinomialHMM(n_states=1, nsymbols=1, startprob=None, transmat=None, startprob_prior=None, transmat_prior=None, labels=None, emissionprob=None)

Hidden Markov Model with multinomial (discrete) emissions

See also

GaussianHMM
HMM with Gaussian emissions

Examples

>>> from scikits.learn.hmm import MultinomialHMM
>>> MultinomialHMM(n_states=2, nsymbols=3) 
MultinomialHMM(n_states=2,
        emissionprob=array([[ ...],
       [ ...]]),
        labels=[None, None], startprob_prior=1.0,
        startprob=array([ 0.5,  0.5]),
        transmat=array([[ 0.5,  0.5],
       [ 0.5,  0.5]]), nsymbols=3,
        transmat_prior=1.0)

Attributes

n_states
nsymbols
transmat
startprob

Methods

eval(X) Compute the log likelihood of X under the HMM.
decode(X) Find most likely state sequence for each point in X using the Viterbi algorithm.
rvs(n=1) Generate n samples from the HMM.
init(X) Initialize HMM parameters from X.
fit(X) Estimate HMM parameters from X using the Baum-Welch algorithm.
predict(X) Like decode, find most likely state sequence corresponding to X.
score(X) Compute the log likelihood of X under the model.
__init__(n_states=1, nsymbols=1, startprob=None, transmat=None, startprob_prior=None, transmat_prior=None, labels=None, emissionprob=None)

Create a hidden Markov model with multinomial emissions.

Parameters :

n_states : int

Number of states.

decode(obs, maxrank=None, beamlogprob=-inf)

Find most likely state sequence corresponding to obs.

Uses the Viterbi algorithm.

Parameters :

obs : array_like, shape (n, n_dim)

List of n_dim-dimensional data points. Each row corresponds to a single data point.

maxrank : int

Maximum rank to evaluate for rank pruning. If not None, only consider the top maxrank states in the inner sum of the forward algorithm recursion. Defaults to None (no rank pruning). See The HTK Book for more details.

beamlogprob : float

Width of the beam-pruning beam in log-probability units. Defaults to -numpy.Inf (no beam pruning). See The HTK Book for more details.

Returns :

viterbi_logprob : float

Log probability of the maximum likelihood path through the HMM

states : array_like, shape (n,)

Index of the most likely states for each observation
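
As an illustration of the Viterbi recursion that decode is described as using, here is a minimal pure-Python sketch. It is not the library's implementation: the function name viterbi, the plain nested-list parameters, and the encoding of obs as a list of integer symbol indices are assumptions made for this example, and the maxrank and beamlogprob pruning options are left out.

```python
import math


def viterbi(obs, startprob, transmat, emissionprob):
    """Return (log probability of the best path, best state sequence).

    Hypothetical helper for illustration; obs is a list of symbol indices.
    """
    def log(p):
        return math.log(p) if p > 0 else float("-inf")

    n_states = len(startprob)
    # delta[i]: best log probability of any path that ends in state i
    delta = [log(startprob[i]) + log(emissionprob[i][obs[0]])
             for i in range(n_states)]
    backpointers = []
    for symbol in obs[1:]:
        step, new_delta = [], []
        for j in range(n_states):
            # Best predecessor of state j at this time step
            best_i = max(range(n_states),
                         key=lambda i: delta[i] + log(transmat[i][j]))
            step.append(best_i)
            new_delta.append(delta[best_i] + log(transmat[best_i][j])
                             + log(emissionprob[j][symbol]))
        backpointers.append(step)
        delta = new_delta

    # Trace the best path backwards from the best final state
    last = max(range(n_states), key=lambda i: delta[i])
    states = [last]
    for step in reversed(backpointers):
        states.append(step[states[-1]])
    states.reverse()
    return delta[last], states


# Toy model with 2 states and 3 symbols (values chosen for illustration)
startprob = [0.6, 0.4]
transmat = [[0.7, 0.3], [0.4, 0.6]]
emissionprob = [[0.5, 0.4, 0.1], [0.1, 0.3, 0.6]]
viterbi_logprob, states = viterbi([0, 1, 2], startprob, transmat, emissionprob)
print(states)  # [0, 0, 1]
```

For this toy model the best path has probability 0.6 * 0.5 * 0.7 * 0.4 * 0.3 * 0.6 = 0.01512, so viterbi_logprob equals log(0.01512).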

See also

eval
Compute the log probability under the model and posteriors
score
Compute the log probability under the model

emissionprob

Emission probability distribution for each state.

eval(obs, maxrank=None, beamlogprob=-inf)

Compute the log probability under the model and compute posteriors

Implements rank and beam pruning in the forward-backward algorithm to speed up inference in large models.

Parameters :

obs : array_like, shape (n, n_dim)

Sequence of n_dim-dimensional data points. Each row corresponds to a single point in the sequence.

maxrank : int

Maximum rank to evaluate for rank pruning. If not None, only consider the top maxrank states in the inner sum of the forward algorithm recursion. Defaults to None (no rank pruning). See The HTK Book for more details.

beamlogprob : float

Width of the beam-pruning beam in log-probability units. Defaults to -numpy.Inf (no beam pruning). See The HTK Book for more details.

Returns :

logprob : array_like, shape (n,)

Log probabilities of the sequence obs

posteriors : array_like, shape (n, n_states)

Posterior probabilities of each state for each observation
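
The relationship between the log probability and the per-observation posteriors can be sketched with a scaled forward-backward pass. This is a hedged, self-contained illustration rather than the library's code: forward_backward is a hypothetical name, obs is assumed to be a list of integer symbol indices, and no rank or beam pruning is applied.

```python
import math


def forward_backward(obs, startprob, transmat, emissionprob):
    """Return (log likelihood of obs, per-step state posteriors)."""
    n, k = len(obs), len(startprob)
    alphas, scales = [], []
    prev = None
    # Forward pass, normalising at every step to avoid underflow
    for t, symbol in enumerate(obs):
        if t == 0:
            alpha = [startprob[i] * emissionprob[i][symbol] for i in range(k)]
        else:
            alpha = [sum(prev[i] * transmat[i][j] for i in range(k))
                     * emissionprob[j][symbol] for j in range(k)]
        scale = sum(alpha)
        prev = [a / scale for a in alpha]
        alphas.append(prev)
        scales.append(scale)
    # The log likelihood is the sum of the log scaling factors
    logprob = sum(math.log(s) for s in scales)

    # Backward pass, reusing the forward scaling factors
    beta = [1.0] * k
    posteriors = [None] * n
    for t in range(n - 1, -1, -1):
        gamma = [alphas[t][i] * beta[i] for i in range(k)]
        total = sum(gamma)
        posteriors[t] = [g / total for g in gamma]
        if t > 0:
            beta = [sum(transmat[i][j] * emissionprob[j][obs[t]] * beta[j]
                        for j in range(k)) / scales[t] for i in range(k)]
    return logprob, posteriors


# Toy model with 2 states and 3 symbols (values chosen for illustration)
startprob = [0.6, 0.4]
transmat = [[0.7, 0.3], [0.4, 0.6]]
emissionprob = [[0.5, 0.4, 0.1], [0.1, 0.3, 0.6]]
logprob, posteriors = forward_backward([0, 1, 2], startprob, transmat,
                                       emissionprob)
```

Each row of posteriors sums to one, and for this toy model the sequence likelihood is 0.03628, so logprob equals log(0.03628).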

See also

score
Compute the log probability under the model
decode
Find most likely state sequence corresponding to obs

fit(obs, n_iter=10, thresh=0.01, params='ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz', init_params='ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz', maxrank=None, beamlogprob=-inf, **kwargs)

Estimate model parameters.

An initialization step is performed before entering the EM algorithm. To skip this step, set the keyword argument init_params to the empty string ''. Likewise, to perform only the initialization, call this method with n_iter=0.

Parameters :

obs : list

List of array-like observation sequences (shape (n_i, n_dim)).

n_iter : int, optional

Number of iterations to perform.

thresh : float, optional

Convergence threshold.

params : string, optional

Controls which parameters are updated in the training process. Can contain any combination of ‘s’ for startprob, ‘t’ for transmat, ‘m’ for means, and ‘c’ for covars, etc. Defaults to all parameters.

init_params : string, optional

Controls which parameters are initialized prior to training. Can contain any combination of ‘s’ for startprob, ‘t’ for transmat, ‘m’ for means, and ‘c’ for covars, etc. Defaults to all parameters.

maxrank : int, optional

Maximum rank to evaluate for rank pruning. If not None, only consider the top maxrank states in the inner sum of the forward algorithm recursion. Defaults to None (no rank pruning). See “The HTK Book” for more details.

beamlogprob : float, optional

Width of the beam-pruning beam in log-probability units. Defaults to -numpy.Inf (no beam pruning). See “The HTK Book” for more details.

Notes

In general, logprob should be non-decreasing unless aggressive pruning is used. Decreasing logprob is generally a sign of overfitting (e.g. a covariance parameter getting too small). You can fix this by getting more training data, or decreasing covars_prior.
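
The non-decreasing behaviour of logprob can be seen in a small pure-Python sketch of EM (Baum-Welch) for a discrete-emission HMM. This is an illustration under stated assumptions, not the library's training code: the helper names are hypothetical, observations are lists of integer symbol indices, all parameters are always updated, no priors are used, and no pruning is applied.

```python
import math


def forward_backward(obs, start, trans, emit):
    """Scaled forward-backward; returns (logprob, gamma, summed xi)."""
    n, k = len(obs), len(start)
    alphas, scales = [], []
    for t, sym in enumerate(obs):
        if t == 0:
            a = [start[i] * emit[i][sym] for i in range(k)]
        else:
            a = [sum(alphas[-1][i] * trans[i][j] for i in range(k))
                 * emit[j][sym] for j in range(k)]
        s = sum(a)
        alphas.append([x / s for x in a])
        scales.append(s)
    betas = [[1.0] * k for _ in range(n)]
    for t in range(n - 2, -1, -1):
        betas[t] = [sum(trans[i][j] * emit[j][obs[t + 1]] * betas[t + 1][j]
                        for j in range(k)) / scales[t + 1] for i in range(k)]
    # gamma[t][i]: posterior probability of state i at time t
    gamma = [[alphas[t][i] * betas[t][i] for i in range(k)] for t in range(n)]
    # xi_sum[i][j]: expected number of i -> j transitions
    xi_sum = [[sum(alphas[t][i] * trans[i][j] * emit[j][obs[t + 1]]
                   * betas[t + 1][j] / scales[t + 1] for t in range(n - 1))
               for j in range(k)] for i in range(k)]
    return sum(math.log(s) for s in scales), gamma, xi_sum


def normalize(row):
    total = sum(row)
    return [x / total for x in row]


def baum_welch(sequences, start, trans, emit, n_iter=10, thresh=0.01):
    """Run EM; history holds the total log likelihood before each update."""
    k, m = len(start), len(emit[0])
    history = []
    for _ in range(n_iter):
        new_start = [0.0] * k
        new_trans = [[0.0] * k for _ in range(k)]
        new_emit = [[0.0] * m for _ in range(k)]
        total = 0.0
        for obs in sequences:  # E-step: accumulate expected counts
            logprob, gamma, xi_sum = forward_backward(obs, start, trans, emit)
            total += logprob
            for i in range(k):
                new_start[i] += gamma[0][i]
                for j in range(k):
                    new_trans[i][j] += xi_sum[i][j]
                for t, sym in enumerate(obs):
                    new_emit[i][sym] += gamma[t][i]
        history.append(total)
        if len(history) > 1 and abs(history[-1] - history[-2]) < thresh:
            break  # converged: keep the current parameters
        # M-step: renormalise the expected counts
        start = normalize(new_start)
        trans = [normalize(row) for row in new_trans]
        emit = [normalize(row) for row in new_emit]
    return start, trans, emit, history


sequences = [[0, 0, 1, 2, 2, 1, 0, 0], [2, 2, 1, 0, 0, 0, 1, 2]]
start, trans, emit, history = baum_welch(
    sequences,
    [0.6, 0.4],
    [[0.7, 0.3], [0.4, 0.6]],
    [[0.5, 0.4, 0.1], [0.1, 0.3, 0.6]],
)
```

Without pruning, each entry of history should be at least as large as the one before it, up to floating-point rounding.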

n_states

Number of states in the model.

predict(obs, **kwargs)

Find most likely state sequence corresponding to obs.

Parameters :

obs : array_like, shape (n, n_dim)

List of n_dim-dimensional data points. Each row corresponds to a single data point.

maxrank : int

Maximum rank to evaluate for rank pruning. If not None, only consider the top maxrank states in the inner sum of the forward algorithm recursion. Defaults to None (no rank pruning). See The HTK Book for more details.

beamlogprob : float

Width of the beam-pruning beam in log-probability units. Defaults to -numpy.Inf (no beam pruning). See The HTK Book for more details.

Returns :

states : array_like, shape (n,)

Index of the most likely states for each observation

rvs(n=1)

Generate random samples from the model.

Parameters :

n : int

Number of samples to generate.

Returns :

obs : array_like, length n

List of samples
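
Sampling from a discrete-emission HMM can be sketched as follows: draw an initial state from startprob, then alternately emit a symbol and move to the next state. The function sample_hmm, its seed argument, and the choice to also return the hidden states are assumptions for this illustration and do not describe the behaviour of rvs itself.

```python
import random


def sample_hmm(n, startprob, transmat, emissionprob, seed=None):
    """Draw n symbols from a discrete HMM; returns (symbols, states)."""
    rng = random.Random(seed)

    def draw(probs):
        # Inverse-CDF sampling from a discrete distribution
        u, cumulative = rng.random(), 0.0
        for index, p in enumerate(probs):
            cumulative += p
            if u < cumulative:
                return index
        return len(probs) - 1  # guard against rounding error

    symbols, states = [], []
    state = draw(startprob)
    for _ in range(n):
        states.append(state)
        symbols.append(draw(emissionprob[state]))
        state = draw(transmat[state])
    return symbols, states


# Toy model with 2 states and 3 symbols (values chosen for illustration)
startprob = [0.6, 0.4]
transmat = [[0.7, 0.3], [0.4, 0.6]]
emissionprob = [[0.5, 0.4, 0.1], [0.1, 0.3, 0.6]]
symbols, states = sample_hmm(5, startprob, transmat, emissionprob, seed=0)
```

Passing the same seed reproduces the same sample, which is convenient in tests.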

score(obs, maxrank=None, beamlogprob=-inf)

Compute the log probability under the model.

Parameters :

obs : array_like, shape (n, n_dim)

Sequence of n_dim-dimensional data points. Each row corresponds to a single data point.

maxrank : int

Maximum rank to evaluate for rank pruning. If not None, only consider the top maxrank states in the inner sum of the forward algorithm recursion. Defaults to None (no rank pruning). See The HTK Book for more details.

beamlogprob : float

Width of the beam-pruning beam in log-probability units. Defaults to -numpy.Inf (no beam pruning). See The HTK Book for more details.

Returns :

logprob : array_like, shape (n,)

Log probabilities of each data point in obs
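
The forward recursion behind a log-probability score can be sketched in log space as below. This is an illustration only: the function names are hypothetical, obs is assumed to be a list of integer symbol indices, and the maxrank handling shows one plausible reading of rank pruning (keep only the maxrank states with the highest forward log probability at the previous step), which may differ from what the library does. Beam pruning is omitted.

```python
import math


def logsumexp(values):
    """Numerically stable log(sum(exp(v) for v in values))."""
    top = max(values)
    if top == float("-inf"):
        return top
    return top + math.log(sum(math.exp(v - top) for v in values))


def log_likelihood(obs, startprob, transmat, emissionprob, maxrank=None):
    """Forward algorithm in log space with optional rank pruning."""
    def log(p):
        return math.log(p) if p > 0 else float("-inf")

    k = len(startprob)
    fwd = [log(startprob[i]) + log(emissionprob[i][obs[0]]) for i in range(k)]
    for symbol in obs[1:]:
        # Assumed pruning rule: keep the top-maxrank previous states
        # (a slice with None keeps every state)
        active = sorted(range(k), key=lambda i: fwd[i], reverse=True)[:maxrank]
        fwd = [logsumexp([fwd[i] + log(transmat[i][j]) for i in active])
               + log(emissionprob[j][symbol]) for j in range(k)]
    return logsumexp(fwd)


# Toy model with 2 states and 3 symbols (values chosen for illustration)
startprob = [0.6, 0.4]
transmat = [[0.7, 0.3], [0.4, 0.6]]
emissionprob = [[0.5, 0.4, 0.1], [0.1, 0.3, 0.6]]
full = log_likelihood([0, 1, 2], startprob, transmat, emissionprob)
pruned = log_likelihood([0, 1, 2], startprob, transmat, emissionprob,
                        maxrank=1)
```

Because pruning drops terms from the inner sum, the pruned value can only be less than or equal to the unpruned one; here full is log(0.03628) and pruned is log(0.021).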

See also

eval
Compute the log probability under the model and posteriors
decode
Find most likely state sequence corresponding to obs

startprob

Initial state probabilities, one per state.

transmat

Matrix of transition probabilities.
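
Both startprob and each row of transmat are probability distributions, so their entries should be non-negative and sum to one. The check below is a small sketch using a hypothetical helper, is_stochastic, applied to the uniform values shown in the Examples output above for n_states=2.

```python
def is_stochastic(rows, tol=1e-8):
    """True if every row is a valid probability distribution."""
    return all(min(row) >= 0.0 and abs(sum(row) - 1.0) <= tol for row in rows)


n_states = 2
# Uniform values, matching the startprob and transmat in the example output
startprob = [1.0 / n_states] * n_states
transmat = [[1.0 / n_states] * n_states for _ in range(n_states)]

print(is_stochastic([startprob]))  # True
print(is_stochastic(transmat))     # True
```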