gaussianKde

Syntax

gaussianKde(X,[weights],[bwMethod="scott"])

Arguments

X is a numeric vector, matrix, tuple, or table containing the input dataset. Each row of X is one data point. All data points must have the same number of dimensions, and that number must be at least 2. The dataset must contain more rows than columns. Distributed tables are currently not supported.

weights (optional) is a numeric vector indicating the weight of each data point. By default, all data points are equally weighted. The values in weights must be non-negative and not all zeros. The length of weights must be the same as the number of rows in X.
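As a minimal sketch of a weighted fit, the snippet below builds a synthetic 100 x 2 matrix and a weight vector of matching length. The synthetic data and the variable names X and w are illustrative assumptions, not part of the original example.

```
// Synthetic data (assumption for illustration): 100 points in 2 dimensions
X = matrix(norm(0.0, 1.0, 100), norm(0.0, 1.0, 100))
// One non-negative weight per row of X; later rows count more
w = double(1..100)
// Fit with explicit weights and the default "scott" bandwidth
model = gaussianKde(X, w)
// Inspect the weights stored in the returned model
model.weights
```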

bwMethod (optional) indicates the method for generating the bandwidth. It can be:

  • A STRING scalar, "scott" (default) or "silverman"

  • A numeric scalar indicating the bandwidth size

  • A function used to calculate the bandwidth based on X and return a numeric scalar.
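The three forms of bwMethod might be passed as sketched below. The sketch assumes the usual DolphinDB convention of leaving an optional positional argument (here weights) empty. The helper myBandwidth is hypothetical, and the source does not state in what form X reaches a user-defined function, so the sketch assumes it arrives as the matrix that was passed in.

```
// Synthetic data (assumption for illustration): 100 points in 2 dimensions
X = matrix(norm(0.0, 1.0, 100), norm(0.0, 1.0, 100))

// 1. Named rule
m1 = gaussianKde(X, , "silverman")

// 2. Fixed numeric bandwidth
m2 = gaussianKde(X, , 0.5)

// 3. User-defined function (hypothetical helper): takes X, returns a numeric scalar
def myBandwidth(X){
    return 0.5 * pow(rows(X), -0.2)
}
m3 = gaussianKde(X, , myBandwidth)
```

In each case the bandwidth that was actually used can be read back from the bandwidth key of the returned dictionary.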

Details

Estimates the probability density function of a random variable from the sample X, using kernel density estimation (KDE) with a Gaussian kernel.

The generated model can be used as the input for the gaussianKdePredict function.
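This page does not spell out the "scott" and "silverman" rules. Assuming they follow the conventional definitions of these rules for n equally weighted data points of dimension d, the bandwidth factor h would be:

```latex
\text{scott: } h = n^{-\frac{1}{d+4}}
\qquad
\text{silverman: } h = \left(\frac{n\,(d+2)}{4}\right)^{-\frac{1}{d+4}}
```

For n = 10 and d = 2 both expressions reduce to 10^(-1/6), roughly 0.681, which is in line with the bandwidth shown in the example output on this page.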

Return value: A dictionary with the following keys:

  • X is a floating-point vector or matrix indicating the input dataset X.

  • cov is a floating-point matrix holding the Cholesky factor of the covariance matrix computed from weights, X, and bandwidth.

  • weights is a floating-point vector indicating the corresponding weight of each data point.

  • predict is a function pointer to the corresponding prediction function. Call it as model.predict(X), which is equivalent to gaussianKdePredict(model, X). For details, see gaussianKdePredict.

  • bandwidth is a floating-point scalar indicating the generated bandwidth.
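A short sketch of reading the returned dictionary follows. It reuses the file names from the examples on this page and assumes gaussianKdePredict takes the model followed by the data to evaluate.

```
trainData = loadText("trainset.txt"," ")
testData = loadText("testset.txt"," ")
model = gaussianKde(trainData)

// Read individual entries of the returned dictionary by key
model.bandwidth
model.cov

// Evaluate the density through the stored function pointer...
model.predict(testData)
// ...or by passing the model to the standalone function
gaussianKdePredict(model, testData)
```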

Examples

Estimate the probability density of the data in the input file trainset.txt.

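// Load the space-delimited training data; each row is one data point
trainData = loadText("trainset.txt"," ");
// Fit the model with equal weights and the default "scott" bandwidth
model = gaussianKde(trainData);
// Display the returned model dictionary
model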

Output:

X->
#0      #1     
0.1460  -0.1659
-1.3717 -1.6650
-1.6957 -1.1680
-0.7976 0.6081 
0.1088  2.5113 
-0.0724 -0.8210
-1.7548 -0.3485
1.1202  0.9004 
1.0234  0.7907 
-0.4256 0.7169 

predict->gaussianKdePredict
cov->
#0     #1    
0.7040 0.0   
0.4921 0.6700

weights->[0.1000,0.1000,0.1000,0.1000,0.1000,0.1000,0.1000,0.1000,0.1000,0.1000]
bandwidth->0.6812

The model returned by gaussianKde can then be passed to gaussianKdePredict to evaluate the estimated probability density at the data points in another input file, testset.txt:

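// Load the space-delimited points at which to evaluate the density
testData = loadText("testset.txt"," ");
// Evaluate the fitted density at each row of testData
model.predict(testData)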

Output:

->[0.0623,0.0730,0.0336,0.0030,0.0001,0.0552....]

Related Function: gaussianKdePredict