# regroup {#regroup}

**Parent topic:**[Functions](../../Functions/category.md)

## Syntax {#syntax}

`regroup(X, label, func, [byRow=true])`

## Arguments {#arguments}

**X** is a matrix.

**label** is a vector indicating the column/row labels based on which the matrix is grouped and aggregated. When *byRow* = true, the length of label must match the number of rows of *X*. Otherwise, it must match the number of columns of *X*.

**func** is a unary aggregate function called on each group of the matrix. It can be built-in or user-defined.

**byRow** \(optional\) is a Boolean. The default value is true, indicating that the matrix will be grouped and aggregated by rows. False means to group and aggregate matrix by columns.

## Details {#details}

Group the data of a matrix based on user-specified column/row labels and apply aggregation on each group.

`regroup` is similar to the SQL keyword "group by", except that "group by" is applied only on *tables* whereas this function is applied on *matrices*.

Note: It is recommended that the *func* parameter be specified as a built-in aggregate function as built-in functions are optimized internally for optimal performance. \(see Example 2\)

## Examples {#examples}

Example 1. Perform grouped aggregation on a matrix by row/column labels.

```
m = rand(20, 4:5)
m;
```

|col1|col2|col3|col4|col5|
|----|----|----|----|----|
|11|6|6|10|4|
|6|7|5|2|16|
|2|16|14|19|9|
|17|6|13|10|2|

```
// by column labels
label = `A`A`B`A`B
regroup(X=m, label=label, func=firstNot, byRow=false)
```

|A|B|
|---|---|
|11|6|
|6|5|
|2|14|
|17|13|

```
// by row labels
label = 1 2 1 2
regroup(X=m, label=label, func=firstNot, byRow=true)
```

|label|col1|col2|col3|col4|col5|
|-----|----|----|----|----|----|
|1|11|6|6|10|4|
|2|6|7|5|2|16|

Example 2. Compare the performance between applying a built-in function and a user-defined function.

```
m = rand(1000.0, 10000)$100:100
defg my_avg(v):avg(v)

timer(1000) regroup(m, take(1 2 3 4 5, 100), avg)
// Time elapsed 176.175 ms

timer(1000) regroup(m, take(1 2 3 4 5, 100), my_avg)
// Time elapsed 1062.553 ms
```

Example 3. Aggregate the panel data by minute

```
n=1000
timestamp = 09:00:00 + rand(10000, n).sort!()
id = take(`st1`st2`st3, n)
vol = 100 + rand(10.0, n)
vt = table(timestamp, id, vol)
m = exec vol from vt pivot by timestamp, id
regroup(m, minute(m.rowNames()), avg)
```

