accumulate

Syntax

accumulate(func, X, [init], [assembleRule|consistent=false])

[init] <operator>:A X or func:A([init], X), where assembleRule is not specified and the default value is used.

[init] <operator>:AC X or func:AC([init], X), where, as an example, assembleRule is set to C (the Consistent rule).

Arguments

func is a function for iteration.

X is the data to iterate over or the iteration rule.

init is the initial value to be passed to func.

assembleRule (optional) indicates how the results of sub-tasks are merged into the final result. It accepts either an integer or a string, with the following options:

0 (or "D"): The default value, which indicates the DolphinDB rule. This means the data type and form of the final result are determined by all sub results. If all sub results have the same data type and form, scalars will be combined into a vector, vectors into a matrix, matrices into a tuple, and dictionaries into a table. Otherwise, all sub results are combined into a tuple.
1 (or "C"): The Consistent rule, which assumes all sub results match the type and form of the first sub result. This means the first sub result determines the data type and form of the final output. The system will attempt to convert any subsequent sub results that don't match the first sub result. If conversion fails, an exception is thrown. This rule should only be used when the sub results' types and forms are known to be consistent. This rule avoids having to cache and check each sub result individually, improving performance.
2 (or "U"): The Tuple rule, which directly combines all sub results into a tuple without checking for consistency in their types or forms.
3 (or "K"): The kdb+ rule. Like the DolphinDB rule, it checks all sub results to determine the final output. However, under the kdb+ rule, if any sub result is a vector, the final output will be a tuple. In contrast, under the DolphinDB rule, if all sub results are vectors of the same length, the final output will be a matrix. In all other cases, the output of the kdb+ rule is the same as the DolphinDB rule.
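The four rules above can be illustrated with a toy model. The sketch below is written in Python rather than DolphinDB script, and everything in it is an assumption made for illustration: the names `signature` and `assemble` are hypothetical, sub results are limited to scalars and flat lists, and the result is returned as a `(form, value)` pair. It does not reproduce the engine's actual type conversion.

```python
def signature(r):
    """Describe a sub result as (form, length, element type)."""
    if isinstance(r, list):
        return ("vector", len(r), type(r[0]) if r else None)
    return ("scalar", 1, type(r))


def assemble(sub_results, rule="D"):
    """Toy model of the D/C/U/K merge rules; returns (form, value)."""
    if rule not in {"D", "C", "U", "K"}:
        raise ValueError("unknown rule: " + repr(rule))
    if not sub_results:
        return ("tuple", [])
    if rule == "U":
        # Tuple rule: combine without any consistency check.
        return ("tuple", list(sub_results))
    if rule == "C":
        # Consistent rule: the first sub result fixes type and form;
        # later sub results are converted, and a failed conversion raises.
        form, length, elem_type = signature(sub_results[0])
        converted = []
        for r in sub_results:
            if form == "scalar":
                if isinstance(r, list):
                    raise ValueError("cannot convert a vector to a scalar")
                converted.append(elem_type(r))
            else:
                if not isinstance(r, list) or len(r) != length:
                    raise ValueError("form differs from the first sub result")
                converted.append([elem_type(e) for e in r])
        return ("vector" if form == "scalar" else "matrix", converted)
    # D and K both inspect every sub result before deciding.
    sigs = {signature(r) for r in sub_results}
    if len(sigs) != 1:
        return ("tuple", list(sub_results))
    if next(iter(sigs))[0] == "scalar":
        return ("vector", list(sub_results))
    # All sub results are vectors of the same type and length.
    if rule == "K":
        return ("tuple", list(sub_results))
    return ("matrix", list(sub_results))


print(assemble([1, 2, 3], "D"))         # ('vector', [1, 2, 3])
print(assemble([[1, 2], [3, 4]], "D"))  # ('matrix', [[1, 2], [3, 4]])
print(assemble([[1, 2], [3, 4]], "K"))  # ('tuple', [[1, 2], [3, 4]])
print(assemble([1, "2"], "C"))          # ('vector', [1, 2])
```

In this model only D and K differ on equal-length vectors, and only C converts values instead of falling back to a tuple.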

Note:

The assembleRule parameter was introduced in version 2.00.15/3.00.3. It covers all the functionality of the original consistent parameter and offers additional options for combining results.

The consistent parameter is a boolean value that defaults to false, which is equivalent to setting assembleRule="D". When set to true, it’s equivalent to assembleRule="C". For backward compatibility, users can still use the consistent parameter. If both assembleRule and consistent are specified in the same operation, the value of consistent takes precedence.
assembleRule can also be specified after a function pattern symbol, represented by the characters D/C/U/K (e.g., sub:PU(X)). If not specified, the default value D is used.

Details

The accumulate template applies func to init and X in an accumulating iteration (i.e., the result of each iteration is passed forward to the next). Unlike the reduce template, which returns only the last result, accumulate outputs the result of every iteration.

When func is a unary function, X can be a non-negative integer, a unary function or NULL. In all these cases, init must be specified.

The function first returns init as the initial value, then applies func iteratively until a certain condition specified in X is satisfied.
- If X is an integer, func iterates X times, outputting (X+1) elements. Note that when X is negative, it is treated as 0.
- If X is unspecified or NULL, the iteration continues until the next output is the same as the current one.
- If X is a unary function, it must return a Boolean value that determines whether iteration continues. func is applied to the current output as long as X returns true for it; the first output for which X returns false is the last element of the result.
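The three termination rules above can be modeled in a few lines. This is a Python sketch of the described behavior, not DolphinDB code; the name `accumulate_unary` is hypothetical, and `None` stands in for NULL.

```python
def accumulate_unary(func, x, init):
    """Model of accumulate for a unary func; x is None, a predicate, or an int."""
    out = [init]  # init is always the first element of the result
    if x is None:
        # Stop when the next output equals the current one.
        # Like the rule it models, this loops forever if no fixed point is reached.
        while True:
            nxt = func(out[-1])
            if nxt == out[-1]:
                break
            out.append(nxt)
    elif callable(x):
        # Keep iterating while the predicate holds for the current output.
        while x(out[-1]):
            out.append(func(out[-1]))
    else:
        # Integer: iterate x times; a negative x is treated as 0.
        for _ in range(max(x, 0)):
            out.append(func(out[-1]))
    return out


def triple_or_add(v):
    return v * 3 if v < 5 else v + 3


print(accumulate_unary(triple_or_add, 5, 1))                # [1, 3, 9, 12, 15, 18]
print(accumulate_unary(triple_or_add, lambda v: v < 9, 1))  # [1, 3, 9]
print(accumulate_unary(triple_or_add, -2, 1))               # [1]
```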

When func is a binary function, X can be a vector, matrix, or table.

The function first applies func to init and X[0], and then iterates over the current output and the next element in X. If init is unspecified, the first output is X[0].

accumulate is equivalent to the execution of the pseudo code below:
```
result[0]=iif(init==NULL, X[0], func(init, X[0]));
for(i:1~size(X)-1){
    result[i]=func(result[i-1], X[i]);
}
return result;
```
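For readers more familiar with Python, the pseudocode above corresponds closely to `itertools.accumulate`. The sketch below is a model of the vector case only; `accumulate_binary` is a hypothetical name. One difference is worth noting: Python's `initial` argument is emitted as the first output, whereas the pseudocode starts with func(init, X[0]), so the model drops that first element.

```python
from itertools import accumulate as py_accumulate
import operator


def accumulate_binary(func, xs, init=None):
    """Model of accumulate for a binary func over a vector xs."""
    if init is None:
        # Without init, the first output is xs[0] itself.
        return list(py_accumulate(xs, func))
    # With init, Python yields init first; the pseudocode does not, so skip it.
    return list(py_accumulate(xs, func, initial=init))[1:]


x = [1, 2, 3]
print(accumulate_binary(operator.add, x))     # [1, 3, 6]
print(accumulate_binary(operator.add, x, 1))  # [2, 4, 7]
print(accumulate_binary(operator.sub, x, 2))  # [1, -1, -4]
print(accumulate_binary(operator.mul, x))     # [1, 2, 6]
```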
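When func is a ternary function, X must be a tuple with 2 elements. The iteration rule is the same as that of a binary function: the previous result is passed as the first argument, and the i-th elements of X[0] and X[1] are passed as the second and third arguments.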
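The ternary case can be modeled the same way. This Python sketch assumes init is supplied (the behavior without init is not described here), and `accumulate_ternary` is a hypothetical name.

```python
def accumulate_ternary(func, pair, init):
    """Model of accumulate for a ternary func; pair holds two equal-length vectors."""
    ys, zs = pair
    out = []
    state = init
    for y, z in zip(ys, zs):
        # The previous result is the first argument; y and z come from the pair.
        state = func(state, y, z)
        out.append(state)
    return out


print(accumulate_ternary(lambda a, b, c: a + b + c, ([1, 2, 3], [10, 10, 10]), 5))
# [16, 28, 41]
```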

Examples

Example 1. func is a unary function:

//define a unary function
def func1(x){
    if(x<5){
        return x*3
    }
    else{
        return x+3
    }
}

//when X is an integer, the size of the result is X + 1
accumulate(func1, 5, 1)
// output: [1,3,9,12,15,18]

//when X is a unary function (condition), iteration stops once condition returns false. condition(9) is false, so no 3rd iteration runs and the output holds init plus the results of the first two iterations.
def condition(x){
    return x<9
}
accumulate(func1, condition, 1)
// output: [1,3,9]

//when X is NULL or unspecified, iteration stops when the output no longer changes. Define func2 to illustrate this case.
def func2(x){
    if(x<5){
        return x*3
    }
    else{
        return 6
    }
}

//The 3rd and 4th iterations both return 6, so iteration stops and the output holds init plus the results of the first three iterations.
accumulate(func2,NULL,1)
// output: [1,3,9,6]

Example 2. When func is a binary function, accumulate on a vector:

x = 1 2 3;
accumulate(add, 1 2 3);
// output: [1,3,6]
// equivalent to [1, 1+2, 3+3]

1 +:A x;
// output: [2,4,7]
// equivalent to [1+1, 2+2, 4+3]

accumulate(-, x, 2);
// output: [1,-1,-4]
// equivalent to [2-1, 1-2, -1-3]

accumulate(mul, x);
// output: [1,2,6]
// equivalent to [1, 1*2, 2*3]

def facts(a) {return 1*:A 1..a;};
facts 5;
// output: [1,2,6,24,120]
// calculate cumulative factorials

def f1(a,b): a+log(b);
accumulate(f1, 1..5, 0);
// output: [0,0.693147,1.791759,3.178054,4.787492]
// the example above calculates cumulative sum of log(1) to log(i)
// note the result from the previous step will be given to the first parameter of the function.
// 0+log(1)=0, 0+log(2)=0.693147, 0.693147+log(3)=1.791759, ......

accumulate(f1, 1..5);
// output: [1,1.693147,2.791759,4.178053,5.787491]
// since init is omitted here, the data type of the first element of the input vector determines the data type of the result.

accumulate on a matrix:

x=1..12$3:4;
x;


col1	col2	col3	col4
1	4	7	10
2	5	8	11
3	6	9	12

+ :A x;


col1	col2	col3	col4
1	5	12	22
2	7	15	26
3	9	18	30

Example 3. When func is a ternary function:

def fun3(x,y,z){
  return x+y+z
}
accumulate(fun3,[[1,2,3],[10,10,10]],5)
// output: [16,28,41]

Example 4. Compute a state column based on two flag columns.

Suppose there is a table with two flag columns, flag1 and flag2. We want to construct a state column according to the following logic:

The initial value is 0;
When state == 0 and flag1 == 1, switch the state to 1;
When state == 1 and flag2 == 1, switch the state to 0;
Otherwise, keep the current state unchanged.

id = 1..10
flag1 = [0, 1, 1, 0, 0, 0, 0, 0, 0, 0]
flag2 = [0, 0, 0, 0, 0, 1, 1, 0, 0, 0]
t = table(id, flag1, flag2)

def updateStateByFlags(state, f1, f2): iif(state==0 && f1==1, 1, iif(state==1 && f2==1, 0, state))

select *, 
    accumulate(updateStateByFlags, [flag1, flag2], 0) as state
from t


id	flag1	flag2	state
1	0	0	0
2	1	0	1
3	1	0	1
4	0	0	1
5	0	0	1
6	0	1	0
7	0	1	0
8	0	0	0
9	0	0	0
10	0	0	0
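
The state column above can be cross-checked outside DolphinDB. The following Python sketch replays the same transition logic over the two flag columns; `update_state` is a hypothetical helper that mirrors updateStateByFlags.

```python
def update_state(state, f1, f2):
    """Mirror of updateStateByFlags: flag1 switches 0 -> 1, flag2 switches 1 -> 0."""
    if state == 0 and f1 == 1:
        return 1
    if state == 1 and f2 == 1:
        return 0
    return state


flag1 = [0, 1, 1, 0, 0, 0, 0, 0, 0, 0]
flag2 = [0, 0, 0, 0, 0, 1, 1, 0, 0, 0]

states = []
state = 0  # initial value
for f1, f2 in zip(flag1, flag2):
    state = update_state(state, f1, f2)
    states.append(state)

print(states)  # [0, 1, 1, 1, 1, 0, 0, 0, 0, 0]
```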