piecewiseLinFit
Syntax
piecewiseLinFit(X, Y, numSegments, [XC], [YC], [bounds],
[lapackDriver='gelsd'], [degree=1], [weights], [method='de'], [maxIter],
[initialGuess], [seed])
Arguments
X is a numeric vector indicating the data point locations of x. NULL value is not allowed.
Y is a numeric vector indicating the data point locations of y. NULL value is not allowed.
numSegments is a positive integer indicating the desired number of line segments.
XC (optional) is a numeric vector indicating the x locations of the data points that the piecewise linear function will be forced to go through. It only takes effect when method='de'.
YC (optional) is a numeric vector indicating the y locations of the data points that the piecewise linear function will be forced to go through. It only takes effect when method='de'.
bounds (optional) is a numeric matrix of shape (numSegments-1, 2), indicating the bounds for each breakpoint location within the optimization.
lapackDriver (optional) is a string indicating which LAPACK driver is used to solve the least-squares problem. It can be 'gelsd' (default), 'gelsy', or 'gelss'.
degree (optional) is a non-negative integer indicating the degree of polynomial to use. The default is 1 for linear models. Use 0 for constant models.
weights (optional) is a numeric vector indicating the weights used in least-squares algorithms. Typically, weights[i] is the reciprocal of the standard deviation of the ith data point. NULL value is not allowed.
method (optional) is a string indicating the optimization algorithm used to locate the breakpoints. It can be:
- 'nm': Nelder-Mead simplex algorithm.
- 'bfgs': BFGS algorithm.
- 'lbfgs': LBFGS algorithm.
- 'slsqp': Sequential Least Squares Programming algorithm.
- 'de' (default): Differential Evolution algorithm.
maxIter (optional) is an integral scalar or vector indicating the maximum number of iterations for the optimization algorithm during the fitting process.
initialGuess (optional) is a numeric vector indicating the initial guess for the parameters that optimize the function. Its length is numSegments-1.
seed (optional) is an integer indicating the random number seed used in the differential evolution algorithm to ensure the reproducibility of results. It only takes effect when method='de' or initialGuess is NULL. If not specified, a non-deterministic random number generator is used.
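The shapes expected by weights and bounds can be illustrated outside DolphinDB with a small Python sketch. The helper names make_weights and make_bounds are hypothetical, and using the data range as the bound for every breakpoint is only an assumption for illustration, not a documented default of piecewiseLinFit.

```python
def make_weights(sigma):
    """weights[i] = 1 / sigma[i], the reciprocal of each point's standard deviation."""
    return [1.0 / s for s in sigma]


def make_bounds(x, num_segments):
    """One [lower, upper] row per interior breakpoint: shape (numSegments-1, 2).

    Assumption: every breakpoint may lie anywhere in the range of x.
    """
    lo, hi = min(x), max(x)
    return [[lo, hi] for _ in range(num_segments - 1)]


x = [0.0, 1.0, 2.0, 3.0]
sigma = [0.5, 0.5, 1.0, 2.0]      # made-up standard deviations

weights = make_weights(sigma)     # noisier points get smaller weights
bounds = make_bounds(x, 3)        # 3 segments -> 2 interior breakpoints
print(weights)  # [2.0, 2.0, 1.0, 0.5]
print(bounds)   # [[0.0, 3.0], [0.0, 3.0]]
```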
Details
Fits a continuous piecewise linear function with a specified number of line segments. By default, differential evolution is used to find the breakpoint locations that minimize the sum of squared errors. Note: Because differential evolution is randomized, the results of this function may vary slightly between runs unless seed is specified.
The fitted model can be used as an input for function pwlfPredict.
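The quantity being minimized can be sketched in Python, outside DolphinDB. The sketch assumes the common hinge-function formulation of a continuous piecewise linear model (the one used by the Python pwlf library, which this function appears to mirror). For fixed breakpoints it solves an ordinary least-squares problem and returns the sum of squared errors, and an optimizer such as differential evolution would then search over the breakpoints. The function names are hypothetical, and the normal equations are solved by plain Gaussian elimination rather than a LAPACK driver.

```python
def design_matrix(x, breaks):
    """Rows [1, x - b0, max(x - b1, 0), ...], one hinge per interior breakpoint."""
    interior = breaks[1:-1]
    return [[1.0, xi - breaks[0]] + [max(xi - b, 0.0) for b in interior] for xi in x]


def solve_linear(a, b):
    """Solve a @ beta = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    beta = [0.0] * n
    for r in range(n - 1, -1, -1):  # back substitution
        s = sum(m[r][c] * beta[c] for c in range(r + 1, n))
        beta[r] = (m[r][n] - s) / m[r][r]
    return beta


def fit_with_breaks(x, y, breaks):
    """Least-squares beta and sum of squared errors for fixed breakpoints."""
    a = design_matrix(x, breaks)
    k = len(a[0])
    ata = [[sum(row[i] * row[j] for row in a) for j in range(k)] for i in range(k)]
    aty = [sum(row[i] * yi for row, yi in zip(a, y)) for i in range(k)]
    beta = solve_linear(ata, aty)
    residuals = [yi - sum(c * bj for c, bj in zip(row, beta)) for row, yi in zip(a, y)]
    return beta, sum(r * r for r in residuals)


# Synthetic data with a true kink at x = 0.5 (slope 1, then slope -1).
x = [i / 10 for i in range(11)]
y = [xi if xi <= 0.5 else 1.0 - xi for xi in x]

_, sse_true = fit_with_breaks(x, y, [0.0, 0.5, 1.0])
_, sse_off = fit_with_breaks(x, y, [0.0, 0.3, 1.0])
print(sse_true < sse_off)  # True: the correct breakpoint gives the smaller error
```

Because the error depends on the breakpoints in a non-smooth way, a global, randomized search such as differential evolution is a natural choice for the outer loop.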
Return value: A dictionary with the following keys:
- breaks: A floating-point vector indicating the breakpoint locations.
- beta: A floating-point vector indicating the beta parameter for the linear fit.
- xData: A floating-point vector indicating the input data point locations of x.
- yData: A floating-point vector indicating the input data point locations of y.
- XC: A floating-point vector indicating the x locations of the data points that the piecewise linear function will be forced to go through.
- YC: A floating-point vector indicating the y locations of the data points that the piecewise linear function will be forced to go through.
- weights: A floating-point vector indicating the weights used in least-squares algorithms.
- degree: A non-negative integer indicating the degree of polynomial.
- lapackDriver: A string indicating the LAPACK driver used to solve the least-squares problem.
- numParameters: An integer indicating the number of parameters.
- predict: The function used for prediction. The method is called by model.predict(X, [beta], [breaks]). See pwlfPredict.
- modelName: A string "Piecewise Linear Regression" indicating the model name.
Examples
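// Helper similar to NumPy's linspace: with endpoint=true it returns a (step, values) tuple, otherwise only the values
def linspace(start, end, num, endpoint=true){
	if(endpoint) return (end-start)$DOUBLE\(num-1), start + (end-start)$DOUBLE\(num-1)*0..(num-1)
	else return start + (end-start)$DOUBLE\(num-1)*0..(num-1)
}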
X = linspace(0.0, 1.0, 10)[1]
Y = [0.41703981, 0.80028691, 0.12593987, 0.58373723, 0.77572962, 0.41156172, 0.72300284, 0.32559528, 0.21812564, 0.41776427]
model = piecewiseLinFit(X, Y, 3)
model;
Output:
breaks->[0.0,0.258454644769,0.366954310101,1.000000000000]
numParameters->4
degree->1
xData->[0.0,0.111111111111,0.222222222222,0.333333333333,0.444444444444,0.555555555555,0.666666666666,0.777777777777,0.888888888888,1.000000000000]
predict->pwlfPredict
yData->[0.417039810000,0.800286910000,0.125939870000,0.583737230000,0.775729620000,0.411561720000,0.723002840000,0.325595280000,0.218125640000,0.417764270000]
yC->
xC->
weights->
beta->[0.593305500750,-1.309949743583,5.703647584013,-5.105351630664]
lapackDriver->gelsd
piecewiseLinFit can be used with pwlfPredict for prediction based on the model:
xHat = linspace(0.0, 1.0, 20)[1]
model.predict(xHat)
// Output: [0.593305499919518 0.524360777381737 0.455416054843957 0.386471332306177 0.317526609768396 0.368043438179296 0.529813781212159 0.691584124245021 0.69295837868457 0.655502915538459 0.618047452392347 0.580591989246236 0.543136526100125 0.505681062954014 0.468225599807903 0.430770136661792 0.393314673515681 0.35585921036957 0.318403747223459 0.280948284077348]
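The link between breaks, beta, and the predictions can be checked by hand. The Python sketch below (an illustration outside DolphinDB, with a hypothetical predict helper) assumes the hinge-function form y = beta[0] + beta[1]*(x - breaks[0]) + sum of beta[k+1]*max(x - breaks[k], 0) over the interior breakpoints. With the breaks and beta printed in the first output, it matches the first two and the last predicted values listed above to within 1e-6. The values in the middle segment of that listing do not follow from the printed breaks, presumably because the two listings came from separate runs of the randomized fit.

```python
def predict(x_values, breaks, beta):
    """Evaluate the continuous piecewise linear model at each x."""
    out = []
    for x in x_values:
        y = beta[0] + beta[1] * (x - breaks[0])
        # Each interior breakpoint adds a change in slope to its right.
        for b, slope_change in zip(breaks[1:-1], beta[2:]):
            y += slope_change * max(x - b, 0.0)
        out.append(y)
    return out


# Values copied from the first example output.
breaks = [0.0, 0.258454644769, 0.366954310101, 1.0]
beta = [0.593305500750, -1.309949743583, 5.703647584013, -5.105351630664]

x_hat = [i / 19 for i in range(20)]  # 20 evenly spaced points on [0, 1]
y_hat = predict(x_hat, breaks, beta)
print([round(v, 4) for v in (y_hat[0], y_hat[1], y_hat[-1])])
# -> [0.5933, 0.5244, 0.2809]
```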
Related function: pwlfPredict