byColumn

Syntax

byColumn(func, X, [Y])

Or

func:V(X)

Or

func:V(X, [Y])

Arguments

func is a unary function. When function with multiple parameters is specified for func, partial application is used to fix part of the parameters. It can be a vector function (where the input vector and output vector are of equal length) or an aggregate function.

X is a matrix/table/tuple/array vector/columnar tuple.

Y is a matrix/table/tuple/array vector/columnar tuple.

Details

If func is a unary function, apply the specified function to each column of X; if func is a binary function, apply func(Xi, Yi) to each column of X and Y.

byColumn can be used in a reactive state engine.

Calculation rules:

  • If X/Y is a matrix, table, or tuple, byColumn applies func to each column of X/Y.

  • If X/Y is an array vector or columnar tuple, byRow applies func to each row of the transpose of X/Y.

    • If func is a vector function, byColumn returns the transpose of the result.

    • If func is an aggregate function, byColumn directly returns a vector. Certain aggregate functions in DolphinDB are optimized to work natively by column, requiring no transpose of the input X/Y. These include: sum, sum2, avg, min, max, count, imax, imin, imaxLast, iminLast, prod, std, stdp, var, varp, skew, kurtosis, any, all, corr, covar, wavg, wsum, beta, euclidean, dot, tanimoto.

Return values:

  • If func is an aggregate function

    • If X/Y is a matrix, array vector, or columnar tuple, byColumn returns a vector of the same size as the number of columns in X/Y.

    • If X/Y is a tuple, byColumn returns a tuple.

    • If X/Y is a table, byColumn returns a table.

  • If func is a vector function, byColumn returns a result with the same form and dimension as X/Y.

Examples

When func is a unary function that does not support matrix operations, the byColumn function is equivalent to each.

def myvfunc(x): var(x).log()
m = matrix(1.1 2.3 2.1 3.5 4.2, 3.3 2.5 4.2 5.1 0, -1 3.3 2 1.7 2.3)
byColumn(myvfunc, m)
// output
[0.3974329364109,1.334211281249665,0.945072533299607]

To specify a function with multiple parameters for func:

byColumn(add{2}, m)
// output
col1 col2 col3
3.1  5.3  1
4.3  4.5  5.3
4.1  6.2  4
5.5  7.1  3.7
6.2  2    4.3

byColumn(add{1 2 3 4 5}, m)
// output
col1 col2 col3
2.1  4.3  0
4.3  4.5  5.3
5.1  7.2  5
7.5  9.1  5.7
9.2  5    7.3

When func is a user-defined function:

def my_func(x){
return iif(x > 0, 1, -1)
}
m = matrix(3 -6 5 0, 2 -9 -4 5)
byColumn(my_func, m)
// output
col1 col2
1    1
-1   -1
1    -1
-1    1

When func is a nested function:

m = matrix(1 5 3 , 7 5 2)
byColumn(accumulate{def (x, y):iif(x > 5, y-1, y+1), ,1}, m)
// output
col1 col2
2    8
6    4
2    3

If func is a multivariate function, it is necessary to fix part of the parameters using partial application.

byColumn(autocorr{,2},m)
// output
[-0.05,-0.28,-0.06]

X/Y is a matrix.

m=matrix([1 3 4 2,1 2 2 1])
max:V(m)
// output
[4,2]

cummax:V(m)
// output
col1	col2
1	1
3	2
4	2
4	2

n=matrix([11 5 9 2,8 5 3 2])
corr:V(m,n)
// output
[-0.09,-0.21]

X/Y is a table.

qty1 = 2200 1900 2100 3200 6800 5400 1300 2500 8800
qty2 = 2100 1800  6800 5400 1300 2400 8500 4100 3200
t = table(qty1, qty2);
max:V(t) 
// output
qty1	qty2
8,800	8,500

cummax:V(t) 
// output
qty1	qty2
2,200	2,100
2,200	2,100
2,200	6,800
3,200	6,800
6,800	6,800
6,800	6,800
6,800	8,500
6,800	8,500
8,800	8,500


qty3 = 7800 5400 5300 2500 1800 2200 3900 3100 1200
qty4 = 3200 2800 6400 8300 2300 3800 2900 1600 2900
t1 = table(qty3, qty4);
corr:V(t,t1)   
// output
qty1	qty2
-0.7267	0.4088

X/Y is a tuple.

tp=[1 3 4 2,1 2 2 1]
sum:V(tp)  
// output
(10,6) 

cummax:V(tp) 
// output
([1,3,4,4],[1,2,2,2]) 

tp1=[11 23 14 21,10 12 32 21]
corr:V(tp,tp1)
// output  
(0.25,0.37)

X/Y is an array vector.

a=array(INT[], 0, 10).append!([1 2 3, 4 5 4, 6 7 8, 1 9 10]);
sum:V(a) 
// output
[12,23,25] 

cummax:V(a)
// output
[[1,2,3],[4,5,4],[6,7,8],[6,9,10]]

b=array(DOUBLE[], 0, 10).append!([11.8 21.2 23.9, 83.3 90.2 78.2, 86.5 52 36.5, 10.1 12.4 16.8])
corr:V(a,b)  
// output
[0.95,-0.13,-0.46]

X/Y is a columnar tuple.

ctp=[1 3 4 2,1 2 2 1]
ctp.setColumnarTuple!()
sum:V(ctp) 
// output 
[2,5,6,3]

cummax:V(ctp)
// output
([1,3,4,2],[1,3,4,2])

ctp1=[11 23 14 21,10 12 32 21]
ctp1.setColumnarTuple!()
corr:V(ctp,ctp1) 
// output
[,1,-1,]

Related function: byRow