Time-Based Moving TopN Functions (tmTopN functions)

DolphinDB provides time-based moving TopN (tmTopN) functions, which perform calculations on the top N elements in a time-based sliding window.

Introduction

Syntax templates for tmTopN functions:

tmTopN(T, X, S, window, top, [ascending=true], [tiesMethod='latest'])
tmTopN(T, X, Y, S, window, top, [ascending=true], [tiesMethod='latest'])

Parameters:

T is a non-strictly increasing vector of temporal or integral type.

X (Y) is a numeric vector or matrix.

S is a numeric or temporal vector or matrix by which X is sorted. NULL values in S are ignored.

window is an integer greater than 1 or a DURATION scalar, indicating the size of the sliding window, measured in time.

top is a positive integer or a floating-point number in (0,1) that specifies how many top-ranked elements of X are selected after X is sorted by S.
  • If top is an integer, the first top elements are obtained.

  • If top is a floating-point number, it represents the proportion of top-ranked elements. The number of elements selected is max(1, floor(n*top)), where n is the number of elements in the window: the product is rounded down, and at least one element is always selected.
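The two cases of top can be sketched in plain Python. This is an illustrative model of the rule stated above, not DolphinDB code; the helper name top_count is hypothetical, and capping an integer top at the number of elements in the window is an assumption of the sketch.

```python
import math

def top_count(n_in_window, top):
    """Number of elements selected from a window that holds n_in_window elements."""
    if isinstance(top, int):
        # Integer top: the first `top` elements (or all of them if fewer exist).
        return min(top, n_in_window)
    # Fractional top in (0, 1): round the product down, but select at least one.
    return max(1, math.floor(n_in_window * top))

print(top_count(5, 3))      # 3
print(top_count(2, 3))      # 2
print(top_count(10, 0.25))  # 2
print(top_count(3, 0.2))    # 1
```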

ascending (optional) is a Boolean value indicating whether to sort S in ascending order. The default value is true.

tiesMethod (optional) is a string that specifies how to select elements when, after sorting X within a sliding window, more elements share the same value of S than there are spots left in the top N. It can be:
  • 'oldest': select elements starting from the earliest entry into the window;

  • 'latest': select elements starting from the latest entry into the window;

  • 'all': select all elements.

Windowing Logic

For tmTopN functions, window can be of integral or DURATION type, and the window size is measured in time. For each element Ti in T, the window covers the range (temporalAdd(Ti, -window), Ti].
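To make the half-open range concrete, here is a minimal Python sketch that lists which positions fall into each window. It models T as plain integers (minutes since midnight), so temporalAdd(Ti, -window) becomes Ti - window. The helper name window_members is hypothetical, and considering only positions up to and including i is an assumption of the sketch that matters only when T contains duplicate timestamps.

```python
def window_members(T, window):
    """For each position i, list the positions j <= i with T[i] - window < T[j] <= T[i]."""
    members = []
    for i, t in enumerate(T):
        # T is non-decreasing, so T[j] <= t holds for every j <= i;
        # only the open left bound needs to be checked.
        members.append([j for j in range(i + 1) if t - window < T[j]])
    return members

# 13:30, 13:34, 13:36, 13:37, 13:38 expressed as minutes since midnight
T = [810, 814, 816, 817, 818]
print(window_members(T, 4))
# [[0], [1], [1, 2], [1, 2, 3], [2, 3, 4]]
```

Because the left bound is open, the 13:34 window excludes 13:30, and the 13:38 window excludes 13:34.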

Within each sliding window (whose length is measured in time), the function stably sorts X (or X and Y) by S in the order specified by ascending, then takes the first top elements for calculation.
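The sort-then-select step for a single window can be sketched as follows. This is a plain-Python illustration rather than DolphinDB's implementation; select_top is a hypothetical name, and the sample values of S are distinct so that tiesMethod plays no role.

```python
def select_top(X, S, top, ascending=True):
    """Return the X values of the first `top` elements after a stable sort by S."""
    # Python's sorted() is stable: elements with equal keys keep their original order.
    order = sorted(range(len(S)), key=lambda j: S[j], reverse=not ascending)
    return [X[j] for j in order[:top]]

X = [2, 1, 5, 3, 4]
S = [5, 8, 1, 9, 7]
print(select_top(X, S, 3))                   # [5, 2, 4]
print(select_top(X, S, 3, ascending=False))  # [3, 1, 4]
```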

The following example illustrates the calculation rules:

T=13:30m 13:34m 13:36m 13:37m 13:38m
S = 5 8 1 9 7
X = 2 1 5 3 4
tmsumTopN(T, X, S, window=4, top=3)
// output
[2,1,6,9,12]

For the first top windows, all elements are used in the calculation. Therefore, the figure illustrates the rules starting from the (top + 1)-th window.

The following examples show the usage of parameter tiesMethod:

T=2021.01.03+1..7
X = [2, 1, 4, 3, 4, 3, 1]
S = [5, 8, 1, 1, 1, 1, 3]
// For the second-to-last window, S contains four elements of value 1
// As tiesMethod is not specified, the default 'latest' is used, so the latest 3 occurrences of 1 (corresponding to 3, 4, 3 of X) are selected
tmsumTopN(T,X,S,4,3)
// output
[2,3,7,9,11,10,10]

// As tiesMethod is set to 'oldest', the first 3 occurrences of 1 (corresponding to 4, 3, 4 of X) are selected
tmsumTopN(T,X,S,4,3,tiesMethod=`oldest)
// output
[2,3,7,9,11,11,10]

// As tiesMethod is set to 'all', all the occurrences of 1 (corresponding to 4, 3, 4, 3 of X) are selected
tmsumTopN(T,X,S,4,3,tiesMethod=`all)
// output
[2,3,7,9,11,14,10]
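
The three results can be cross-checked with a small plain-Python model of the sum variant. It is one reading of the rules described in this section, not DolphinDB's implementation: the function name tm_sum_top_n is hypothetical, T is modeled as integer day offsets, and NULL handling, fractional top, and matrix inputs are left out.

```python
def tm_sum_top_n(T, X, S, window, top, ascending=True, ties_method="latest"):
    """Model of a time-based moving top-N sum for integer T, window and top."""
    result = []
    for i, t in enumerate(T):
        # Positions inside (t - window, t], looking only at j <= i (assumption).
        idx = [j for j in range(i + 1) if t - window < T[j]]
        if len(idx) <= top:
            chosen = idx  # not enough elements to filter: use them all
        else:
            ranked = sorted(idx, key=lambda j: S[j], reverse=not ascending)
            cutoff = S[ranked[top - 1]]  # S value holding the last top-N spot
            if ascending:
                better = [j for j in idx if S[j] < cutoff]
            else:
                better = [j for j in idx if S[j] > cutoff]
            tied = [j for j in idx if S[j] == cutoff]  # in order of entry
            spots = top - len(better)  # always at least 1
            if ties_method == "oldest":
                chosen = better + tied[:spots]
            elif ties_method == "latest":
                chosen = better + tied[-spots:]
            elif ties_method == "all":
                chosen = better + tied
            else:
                raise ValueError("ties_method must be 'oldest', 'latest' or 'all'")
        result.append(sum(X[j] for j in chosen))
    return result

T = [1, 2, 3, 4, 5, 6, 7]  # consecutive days as integer offsets
X = [2, 1, 4, 3, 4, 3, 1]
S = [5, 8, 1, 1, 1, 1, 3]
print(tm_sum_top_n(T, X, S, 4, 3))                        # [2, 3, 7, 9, 11, 10, 10]
print(tm_sum_top_n(T, X, S, 4, 3, ties_method="oldest"))  # [2, 3, 7, 9, 11, 11, 10]
print(tm_sum_top_n(T, X, S, 4, 3, ties_method="all"))     # [2, 3, 7, 9, 11, 14, 10]
```

Only the second-to-last window differs between the three calls, because it is the only window in which the tied values of S outnumber the available spots.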