gmm

swordfish.function.gmm()

Trains a Gaussian Mixture Model (GMM) on the given data set.

Parameters:
  • X (Constant) – The training data set. For univariate data, X is a vector; for multivariate data, X is a matrix or table in which each column is a sample.

  • k (Constant) – An integer indicating the number of independent Gaussian submodels in the mixture model.

  • maxIter (Constant, optional) – A positive integer indicating the maximum number of EM iterations to perform. The default value is 300.

  • tolerance (Constant, optional) – A floating-point number indicating the convergence tolerance. EM iterations stop when the average gain in the lower bound falls below this threshold. The default value is 1e-4.

  • randomSeed (Constant, optional) – The random seed given to the method.

  • mean (Constant, optional) –

    A vector or matrix indicating the initial means.

    • For univariate data, it is a vector of length k;

    • For multivariate data, it is a matrix with k columns and as many rows as there are variables in X;

    • If mean is unspecified, k values are randomly selected from X as the initial means.

  • sigma (Constant, optional) –

    The initial variance or covariance of each submodel. It can be:

    • a vector of length k, indicating the initial variance of each submodel, if X is univariate data;

    • a tuple of length k, indicating the initial covariance matrix of each submodel, if X is multivariate data.

    • If sigma is unspecified, it defaults to a vector whose elements are all 1 (univariate data) or to identity matrices (multivariate data).
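
To show how the parameters above interact, here is a minimal pure-Python sketch of an EM loop for univariate data. It is an illustration only, not the swordfish implementation: the function name gmm_1d_sketch, the snake_case argument names, the small variance floor, and the exact form of the stopping test (absolute change in the average log-likelihood) are all assumptions made for this example.

```python
import math
import random


def gmm_1d_sketch(x, k, max_iter=300, tolerance=1e-4,
                  random_seed=None, mean=None, sigma=None):
    """Illustrative univariate EM loop (hypothetical helper, not swordfish)."""
    rng = random.Random(random_seed)
    n = len(x)
    # Defaults follow the documented behavior: k values drawn from X as the
    # initial means, and a vector of ones as the initial variances.
    mean = list(mean) if mean is not None else rng.sample(list(x), k)
    sigma = list(sigma) if sigma is not None else [1.0] * k
    prior = [1.0 / k] * k
    prev_bound = None

    for _ in range(max_iter):
        # E-step: responsibility of each submodel for each observation.
        resp = []
        bound = 0.0
        for xi in x:
            w = [
                prior[j]
                * math.exp(-(xi - mean[j]) ** 2 / (2.0 * sigma[j]))
                / math.sqrt(2.0 * math.pi * sigma[j])
                for j in range(k)
            ]
            total = max(sum(w), 1e-300)  # guard against underflow to zero
            resp.append([wj / total for wj in w])
            bound += math.log(total)
        bound /= n  # average log-likelihood, used as the convergence measure

        # M-step: re-estimate prior, mean and variance of each submodel.
        for j in range(k):
            nj = max(sum(r[j] for r in resp), 1e-12)
            prior[j] = nj / n
            mean[j] = sum(r[j] * xi for r, xi in zip(resp, x)) / nj
            var = sum(r[j] * (xi - mean[j]) ** 2 for r, xi in zip(resp, x)) / nj
            sigma[j] = max(var, 1e-6)  # assumed floor to keep variances positive

        # Stop early once the gain drops below the tolerance.
        if prev_bound is not None and abs(bound - prev_bound) < tolerance:
            break
        prev_bound = bound

    return {
        "modelName": "Gaussian Mixture Model",
        "prior": prior,
        "mean": mean,
        "sigma": sigma,
    }


# Two well-separated clusters with explicit initial means.
data = [-0.2, -0.1, 0.0, 0.1, 0.2, 9.8, 9.9, 10.0, 10.1, 10.2]
model = gmm_1d_sketch(data, 2, mean=[0.0, 10.0])
```

Passing explicit initial means makes the run deterministic. When they are omitted, the result depends on the random seed, because the starting means are sampled from the data.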

Returns:

A dictionary with the following keys:

  • modelName: the string “Gaussian Mixture Model”

  • prior: the prior probability of each submodel

  • mean: the expectation of each submodel

  • sigma: the variance of each submodel if X is univariate data, or the covariance matrix of each submodel if X is multivariate data
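
As a sketch of how these keys fit together in the univariate case, the snippet below computes the posterior probability that a value belongs to each submodel. It assumes the result has already been converted to a plain Python dict of lists; that conversion is not shown. The helper component_posteriors and the numeric values are hypothetical.

```python
import math


def component_posteriors(model, x):
    """Posterior probability of each submodel for a scalar x (univariate case)."""
    weighted = [
        p * math.exp(-(x - m) ** 2 / (2.0 * v)) / math.sqrt(2.0 * math.pi * v)
        for p, m, v in zip(model["prior"], model["mean"], model["sigma"])
    ]
    total = sum(weighted)
    return [w / total for w in weighted]


# Hypothetical fitted values, laid out with the documented keys.
model = {
    "modelName": "Gaussian Mixture Model",
    "prior": [0.5, 0.5],
    "mean": [0.0, 10.0],
    "sigma": [1.0, 1.0],  # variances, since the data is univariate
}

midpoint = component_posteriors(model, 5.0)   # equally likely under both submodels
near_first = component_posteriors(model, 0.0)  # dominated by the first submodel
```

Each submodel's density is weighted by its prior and then normalized, so the returned values sum to 1.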

Return type:

Constant