resample

Syntax

resample(X, rule, func, [closed], [label], [origin='start_day'])

Parameters

X is a matrix or series with row labels. The row labels must be non-null values of a temporal type and must be in increasing order (not necessarily strictly increasing).

rule is a string that can take the following values:

Values of parameter "rule"    Corresponding DolphinDB function
"B"                           businessDay
"W"                           weekEnd
"WOM"                         weekOfMonth
"LWOM"                        lastWeekOfMonth
"M"                           monthEnd
"MS"                          monthBegin
"BM"                          businessMonthEnd
"BMS"                         businessMonthBegin
"SM"                          semiMonthEnd
"SMS"                         semiMonthBegin
"Q"                           quarterEnd
"QS"                          quarterBegin
"BQ"                          businessQuarterEnd
"BQS"                         businessQuarterBegin
"REQ"                         FY5253Quarter
"A"                           yearEnd
"AS"                          yearBegin
"BA"                          businessYearEnd
"BAS"                         businessYearBegin
"RE"                          FY5253
"D"                           date
"H"                           hourOfDay
"min"                         minuteOfHour
"S"                           secondOfMinute
"L"                           millisecond
"U"                           microsecond
"N"                           nanosecond
"SA"                          semiannualEnd
"SAS"                         semiannualBegin

Each of the strings above can be prefixed with a positive integer. For example, "2M" means the end of every two months. In addition, rule can be the identifier of a trading calendar, such as the Market Identifier Code of an exchange or a user-defined calendar name. A positive integer prefix also works with these identifiers. For example, "2XNYS" means every two trading days of the New York Stock Exchange.
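To make the "integer plus identifier" form concrete, here is a minimal Python sketch that splits a rule string into its multiplier and its base identifier. The helper name split_rule is hypothetical and is not part of DolphinDB. It also assumes that identifiers start with a letter, which may not hold for every user-defined calendar name.

```python
import re
from typing import Tuple

# Hypothetical helper (not a DolphinDB function): an optional run of digits
# followed by an identifier that is assumed to start with a letter.
_RULE_PATTERN = re.compile(r"^(\d*)([A-Za-z]\w*)$")


def split_rule(rule: str) -> Tuple[int, str]:
    """Split a rule such as "2M" into (2, "M"); a missing prefix means 1."""
    match = _RULE_PATTERN.match(rule)
    if match is None:
        raise ValueError(f"not a valid rule string: {rule!r}")
    digits, base = match.groups()
    multiplier = int(digits) if digits else 1
    if multiplier < 1:
        raise ValueError("the multiplier must be a positive integer")
    return multiplier, base


print(split_rule("2M"))      # every two month ends
print(split_rule("3min"))    # every three minutes
print(split_rule("2XNYS"))   # every two XNYS trading days
```

This sketch only illustrates how the string is read. It does not check that the base identifier is one of the listed values or a known trading calendar.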

func is an aggregate function.

closed (optional) is a string indicating which boundary of the interval is closed.
  • The default value is 'left' for all values of rule except 'M', 'A', 'Q', 'BM', 'BA', 'BQ', and 'W', which default to 'right'.

  • The default is 'right' if origin is 'end' or 'end_day'.

label (optional) is a string indicating which boundary is used to label the interval.
  • The default value is 'left' for all values of rule except 'M', 'A', 'Q', 'BM', 'BA', 'BQ', and 'W', which default to 'right'.

  • The default is 'right' if origin is 'end' or 'end_day'.
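The default rules for closed and label can be summarized in a few lines of Python. This is an illustrative sketch of the description above, not DolphinDB code. The name default_side is hypothetical, and the sketch assumes that an integer prefix such as the 2 in "2M" does not change the default.

```python
# Rules that are closed and labeled on the right by default, as listed above.
RIGHT_BY_DEFAULT = {"M", "A", "Q", "BM", "BA", "BQ", "W"}


def default_side(rule: str, origin="start_day") -> str:
    """Return the documented default for closed/label (illustrative only)."""
    # An origin of 'end' or 'end_day' yields 'right' for every rule.
    if origin in ("end", "end_day"):
        return "right"
    # Assumption: a leading multiplier does not affect the default.
    base = rule.lstrip("0123456789")
    return "right" if base in RIGHT_BY_DEFAULT else "left"


print(default_side("M"))             # right
print(default_side("3min"))          # left
print(default_side("3min", "end"))   # right
```

Because closed and label share the same default rule, one function covers both parameters. An explicitly supplied value always replaces the default.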

origin (optional) is a string or a scalar of the same data type as X, indicating the timestamp where the intervals start. It can be 'epoch', 'start', 'start_day', 'end', 'end_day' or a user-defined time object. The default value is 'start_day'.
  • 'epoch': origin is 1970-01-01

  • 'start': origin is the first value of the timeseries

  • 'start_day': origin is 00:00 of the first day of the timeseries

  • 'end': origin is the last value of the timeseries

  • 'end_day': origin is 24:00 of the last day of the timeseries
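The following Python sketch shows how each origin option maps to a concrete timestamp for a given series of row labels. It is a model of the list above built on the standard datetime module, not DolphinDB's implementation, and resolve_origin is a hypothetical name.

```python
from datetime import datetime, timedelta


def resolve_origin(timestamps, origin="start_day"):
    """Map an origin option to a timestamp (illustrative model only)."""
    first, last = timestamps[0], timestamps[-1]
    if isinstance(origin, datetime):
        return origin                      # user-defined time object
    if origin == "epoch":
        return datetime(1970, 1, 1)
    if origin == "start":
        return first                       # first value of the time series
    if origin == "start_day":
        return datetime(first.year, first.month, first.day)
    if origin == "end":
        return last                        # last value of the time series
    if origin == "end_day":
        # 24:00 of the last day is midnight at the start of the next day.
        return datetime(last.year, last.month, last.day) + timedelta(days=1)
    raise ValueError(f"unsupported origin: {origin!r}")


# Row labels 00:01 through 00:08 on 2022-01-01.
labels = [datetime(2022, 1, 1, 0, minute) for minute in range(1, 9)]
for option in ("epoch", "start", "start_day", "end", "end_day"):
    print(option, resolve_origin(labels, option))
```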

Details

Apply func to X based on the frequency (or the trading calendar) specified in rule. Note that when rule is the identifier of a trading calendar, data generated on a non-trading day is included in the calculation for the previous trading day.
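The trading-calendar behavior can be pictured with a short Python sketch in which each observation is assigned to its own date if that date is a trading day, and otherwise to the closest earlier trading day. The calendar here is a made-up set of dates, and both function names are hypothetical. DolphinDB's own calendars and internals are not modeled.

```python
from datetime import date, timedelta


def previous_trading_day(day, trading_days):
    """Return day if it is a trading day, else the closest earlier one."""
    earliest = min(trading_days)
    current = day
    while current not in trading_days:
        if current < earliest:
            raise ValueError("no trading day on or before the given date")
        current -= timedelta(days=1)
    return current


def sum_by_trading_day(observations, trading_days):
    """Sum (date, value) pairs after rolling non-trading days back."""
    totals = {}
    for day, value in observations:
        key = previous_trading_day(day, trading_days)
        totals[key] = totals.get(key, 0) + value
    return totals


# Assumed calendar: Friday 2024-01-05 and Monday 2024-01-08 are trading days,
# while the weekend in between is not.
calendar = {date(2024, 1, 5), date(2024, 1, 8)}
data = [(date(2024, 1, 5), 1), (date(2024, 1, 6), 2),
        (date(2024, 1, 7), 3), (date(2024, 1, 8), 4)]
print(sum_by_trading_day(data, calendar))
```

In this sketch the weekend values are added to Friday's total, so Friday sums to 6 and Monday to 4.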

Both DolphinDB’s resample and pandas’ resample are used for time series resampling and bucket-based aggregation, but their interface design and usage differ in several aspects:

  • DolphinDB’s resample adopts a functional interface, where the aggregation function is passed directly as an argument. The input object must be a matrix or series with monotonically increasing time-based row labels, and the function supports resampling rules based on built-in trading calendars.
  • pandas’ resample returns a Resampler object and is typically used in a chained style such as df.resample("3min").sum() or .agg(). The input can use a DatetimeIndex, PeriodIndex, or TimedeltaIndex, and a time column can also be specified through the on or level parameter. This interface is more suitable for DataFrame-oriented workflows and natively supports post-resampling filling operations such as asfreq, ffill, bfill, and interpolate.

Examples

index = [2000.01.01, 2000.01.31, 2000.02.15, 2000.02.20, 2000.03.12, 2000.04.16, 2000.05.06, 2000.08.30]
s = indexedSeries(index, 1..8)
s.resample("M", sum);

           col1
2000.01.31 3
2000.02.29 7
2000.03.31 5
2000.04.30 6
2000.05.31 7
2000.06.30
2000.07.31
2000.08.31 8

s.resample("2M", last);

           col1
2000.01.31 2
2000.03.31 5
2000.05.31 7
2000.07.31
2000.09.30 8

index = temporalAdd(2022.01.01 00:00:00,1..8,`m)
s = indexedSeries(index, 1..8)
s.resample(rule=`3min, func=sum);

label               col1
2022.01.01T00:00:00 3
2022.01.01T00:03:00 12
2022.01.01T00:06:00 21

s.resample(rule=`3min, func=sum, closed=`right);

label               col1
2022.01.01T00:00:00 6
2022.01.01T00:03:00 15
2022.01.01T00:06:00 15

s.resample(rule=`3min, func=sum, closed=`left,origin=`end);

label               col1
2022.01.01T00:02:00 1
2022.01.01T00:05:00 9
2022.01.01T00:08:00 18
2022.01.01T00:11:00 8

s.resample(rule=`3min, func=sum,origin=2022.10.01 00:00:10)

label               col1
2022.01.01T00:00:10 6
2022.01.01T00:03:10 15
2022.01.01T00:06:10 15
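
The four 3-minute results above can be reproduced with a small pure-Python model of fixed-width bucketing. This is a sketch for reasoning about closed, label, and origin. It is not DolphinDB's algorithm, the name resample_sum is hypothetical, and unlike the monthly examples it does not emit empty buckets.

```python
from datetime import datetime, timedelta


def resample_sum(timestamps, values, step, origin, closed="left", label="left"):
    """Sum values in step-wide buckets aligned to origin (illustrative model)."""
    totals = {}
    for ts, value in zip(timestamps, values):
        # Whole steps between origin and ts, floored (also for negative offsets).
        bucket, remainder = divmod(ts - origin, step)
        # With right-closed intervals, a point on a boundary belongs to the
        # bucket that ends there, not the one that starts there.
        if closed == "right" and remainder == timedelta(0):
            bucket -= 1
        left_edge = origin + bucket * step
        key = left_edge if label == "left" else left_edge + step
        totals[key] = totals.get(key, 0) + value
    return sorted(totals.items())


step = timedelta(minutes=3)
times = [datetime(2022, 1, 1, 0, minute) for minute in range(1, 9)]
vals = list(range(1, 9))
start_day = datetime(2022, 1, 1)

# Defaults: origin at 00:00 of the first day, closed and labeled on the left.
left_closed = resample_sum(times, vals, step, start_day)
# closed='right' moves boundary points into the earlier bucket.
right_closed = resample_sum(times, vals, step, start_day, closed="right")
# origin='end' aligns buckets to the last label; label then defaults to 'right'.
from_end = resample_sum(times, vals, step, times[-1], closed="left", label="right")
# A user-defined origin shifts every bucket edge by its 10-second offset.
custom = resample_sum(times, vals, step, datetime(2022, 10, 1, 0, 0, 10))

for name, result in [("left", left_closed), ("right", right_closed),
                     ("end", from_end), ("custom", custom)]:
    print(name, [(key.strftime("%H:%M:%S"), total) for key, total in result])
```

In this model the user-defined origin lies after the data, so the offsets are negative and the floor division still places each point in the bucket whose left edge is at or before it.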

X can also be a matrix whose row labels are increasing.

m = matrix(1..5, 1..5)
// The row labels are non-strictly increasing.
index = temporalAdd(2000.01.01, [1, 1, 2, 2, 3], "d")
m.rename!(index, `A`B);
m.resample(rule=`D, func=sum);

label      A B
2000.01.02 3 3
2000.01.03 7 7
2000.01.04 5 5