movingValid

Syntax

movingValid(func, funcArgs, window, [minPeriods], [combined=true])

Details

Constructs a sliding window based on the most recent window valid(non-null) values, and calls the aggregate function func on each window to compute a result.

If funcArgs contains an array vector or a columnar tuple, the function starts from the last element of this row and backtracks along the expanded data sequence, and then takes the most recent window non-null elements for calculation.

Parameters

func is an aggregate function.
Note:
When using this parameter, the keyword used to define the corresponding aggregation function is defg. For details, refer to Tutorial: User Defined Aggregate Functions.

funcArgs are the parameters of func . They can be vectors/array vectors/columnar tuples/dictionaries/tables/matrices. It is a tuple if there are more than one parameter of func , and all parameters must have the same size. If combined is true, the number of elements in the corresponding row of each parameter must be the same.

window is the moving window size.

minPeriods (optional) is a integer indicating the minimum number of observations in a window in order to generate a result. The default value is window .

combined (optional) is an Boolean value indicating how valid values are determined when there are more than one parameter of func .

  • true(default): After expansion, only parameter combinations where the values at the same position are all non-null are considered valid values.

  • false: Each parameter is traced independently, and the most recent window valid values are taken for calculation.

Returns

A vector with the same length as the input parameter.

Examples

Example 1: Calculate the moving effective average of a vector with a window of 3.

x = 1 2 NULL 3 4 NULL NULL NULL 5 6 
movingValid(func=avg, funcArgs=x, window=3, minPeriods=3)
// output:[,,,2,3,3,3,3,4,5]

Example 2: Calculate the myFactor at each moment based on the last 3 valid values from the array vector column.

defg myFactor(x){
    return (max(x)-min(x))*2
}
timeCol = 09:30:00 + 1..13
value = arrayVector(3 4 5 6 9 11 12 15 16 17 18 21 23,9 1 15 NULL NULL NULL 16 18 4 1 11 NULL 2 14 8 NULL NULL NULL 15 9 16 18 10)
t = table(timeCol, value)
timeCol value
09:30:01 [9, 1, 15]
09:30:02 [null]
09:30:03 [null]
09:30:04 [null]
09:30:05 [16, 18, 4]
09:30:06 [1, 11]
09:30:07 [null]
09:30:08 [2, 14, 8]
09:30:09 [null]
09:30:10 [null]
09:30:11 [null]
09:30:12 [15, 9, 16]
09:30:13 [18, 10]
select movingValid(myFactor,value,3,3) from t

movingValid_myFactor --- 28 28 28 28 28 20 20 24 24 24 24 14 16

Example 3: The impact of different minPeriods on the results:

val = 1 2 4 NULL 5 NULL 7 9 10
movingValid(func=min, funcArgs=val, window=3, minPeriods=3)
// output: [,,1,1,2,2,4,5,7]
movingValid(func=min, funcArgs=val, window=3, minPeriods=2)
// output: [,1,1,1,2,2,4,5,7]

Example 4: The impact of different valid data determination methods on results in multi-parameter scenarios:

When combined is true:

x = arrayVector(2 5 6 8 10 12, 10 20 30 40 NULL NULL 50 NULL NULL 60 70 80)
y = arrayVector(2 5 6 8 10 12, 1 NULL 2 3 NULL 5 1 NULL 2 7 8 5)
movingValid(func=covar, funcArgs=[x,y], window=2, minPeriods=2, combined=true)
x [10, 20] [30, 40, null] [null] [50, null] [null, 60] [70, 80]
y [1, null] [2, 3, null] [5] [1, null] [2, 7] [8, 5]
窗口计算 covar([30,40], [2,3]) covar([30,40], [2,3]) covar([40,50], [3,1]) covar([50,60],[1,7]) covar([70,80],[8,5])
结果 null 5 5 -10 30 -15

When combined is false:

x = arrayVector(2 5 6 8 10 12, 10 20 30 40 NULL NULL 50 NULL NULL 60 70 80)
y = arrayVector(2 5 6 8 10 12, 1 NULL 2 3 NULL 5 1 NULL 2 7 8 5)
movingValid(func=covar, funcArgs=[x,y], window=2, minPeriods=2, combined=false)
x

[10, 20]

[30, 40, null] [null] [50, null] [null, 60] [70, 80]
y [1, null] [2, 3, null] [5] [1, null] [2, 7] [8, 5]
窗口计算 covar([30,40], [2,3]) covar([30,40],[3,5]) covar([40,50],[5,1]) covar([50,60],[2,7]) covar([70,80],[8,5])
结果 null 5 10 -20 25 -15