# array vector {#array-vector}

In DolphinDB, an array vector is a two-dimensional vector where each row is a variable-length vector. With array vectors, multiple columns with the same data type, such as columns of a stock's best ask prices, can be stored in one single column. This can significantly reduce the complexity of some common queries and calculations. If there are a lot of duplicate data in multiple columns, it is also recommended to use an array vector to store these data as it improves data compression ratio and improves query performance.

Binary operations can be performed between an array vector and a scalar, a vector or another array vector.

Currently, array vectors can only be used in in-memory tables and DFS tables created with the TSDB storage engine. The OLAP storage engine does not support array vectors. Also, columns holding array vectors cannot be specified as the partitioning columns or sort columns \(TSDB engine\) of a partitioned table.

## Creating an Array Vector {#creating-an-array-vector}

Use functions [array](../../../../Functions/a/array.md) or [bigarray](../../../../Functions/b/bigarray.md) to initialize an empty array vector. When specifying the parameter *dataType*, append a pair of square brackets \("\[\]"\) to the data type to indicate the creation of an array vector, for example, INT\[\]. The parameter *initialSize* must be 0.

Note that only the following data type categories are allowed: Logical, Integral \(excluding INT128 and COMPRESSED\), Floating, DECIMAL, Temporal.

## Conversion between Vectors and Array Vectors {#conversion-between-vectors-and-array-vectors}

-   From vectors to array vectors

    1.  Use `arrayVector` to convert a vector to an array vector.

    2.  Use `fixedLengthArrayVector` to combine multiple vectors into one array vector.

-   From array vectors to vectors:

    Use [flatten](../../../../Functions/f/flatten.md) to convert an array vector to a vector.

    ``` {#codeblock_vsm_gnz_czb}
    a=array(INT[], 0, 10).append!([1 2 3, 4 5, 6 7 8, 9 10]);
    a
    // output: [[1,2,3],[4,5],[6,7,8],[9,10]]
    
    flatten(a)
    // output: [1,2,3,4,5,6,7,8,9,10]
    ```


## Convert an Array Vectors into a matrix {#convert-an-array-vectors-into-a-matrix}

Use function [matrix](../../../../Functions/m/matrix.md) to convert array vectors with each element of same length into matrices for further calculations.

```
bid = array(DOUBLE[], 0, 20).append!([1.4799 1.479 1.4787 1.4784 1.4667, 1.4796 1.479 1.4782 1.4781 1.4783, 1.4791 1.479 1.4785 1.4698 1.4720, 1.4699 1.469 1.4707 1.4704 1.4697, 1.4789 1.477 1.4780 1.4724 1.4669])
ask = array(DOUBLE[], 0, 20).append!([1.4821 1.4825 1.4828 1.4900 1.4792, 1.4818 1.482 1.4821 1.4818 1.4829, 1.4814 1.4818 1.482 1.4825 1.4823, 1.4891 1.4885 1.4898 1.4901 1.4799, 1.4811 1.4815 1.4818 1.4800 1.4799])

TradeDate = 2022.01.01 + 1..5
SecurityID = rand(`APPL`AMZN`IBM, 5)
t = table(SecurityID as `sid, TradeDate as `date, bid as `bid, ask as `ask)
result = select toArray(matrix(bid).corrMatrix()) as corr_bid,  toArray(matrix(ask).corrMatrix()) as corr_ask from t group by sid
corrMatrix_bid = result.corr_bid.matrix().avg().reshape(5:5)
corrMatrix_ask = result.corr_ask.matrix().avg().reshape(5:5)
corrMatrix_bid;
```

|col1|col2|col3|col4|col5|
|----|----|----|----|----|
|1|0.9995|0.9984|0.7417|\(0.1308\)|
|0.9995|1|0.9997|0.7214|\(0.1466\)|
|0.9984|0.9997|1|0.7061|\(0.1589\)|
|0.7417|0.7214|0.7061|1|\(0.0264\)|
|\(0.1308\)|\(0.1466\)|\(0.1589\)|\(0.0264\)|1|

```
corrMatrix_ask;
```

|col1|col2|col3|col4|col5|
|----|----|----|----|----|
|1|0.9999|0.9997|0.9963|\(0.9861\)|
|0.9999|1|0.9999|0.9974|\(0.9883\)|
|0.9997|0.9999|1|0.9981|\(0.9899\)|
|0.9963|0.9974|0.9981|1|\(0.9968\)|
|\(0.9861\)|\(0.9883\)|\(0.9899\)|\(0.9968\)|1|

## Appending Data to an Array Vector {#appending-data-to-an-array-vector}

Use [append!](../../../../Functions/a/append!.md) to append scalar, vector or array vector to an array vector. Updates and deletes are currently not supported for array vectors.

```
// append a scalar
a=array(DOUBLE[], 0, 10).append!(2.5);
print a;
// output: [[2.5]]]

// append a tuple
a=array(DATE[], 0, 10).append!((2021.10.15, 2021.10.16, 2021.10.17));
print a;
// output: [[2021.10.15],[2021.10.16],[2021.10.17]]

// append an array vector
a=array(INT[], 0, 10).append!([1 2 3, 4 5, 6 7 8, 9 10]);
print a;
// output: [[1,2,3],[4,5],[6,7,8],[9,10]]
```

**Note:** When using `append!` to append data to array vectors, if an empty row \[\] is inserted, it is filled with \[null\]. The size of \[\] is 0, while \[null\] is 1.

``` {#codeblock_ebt_qq1_y2c}
a=array(INT[]).append!([[1,2,3],[],[9,10]]);
a
// output: [[1,2,3],[null],[9,10]]
```

## Retrieving Data from Array Vectors {#retrieving-data-from-array-vectors}

-   Array Vector

    ``` {#codeblock_ot2_cnz_czb}
    a =array(DOUBLE[], 0, 10).append!([1.0, 2.1 4.1 6.8, 0.5 2.2 2]);
    print a
    // output: [[1],[2.1,4.1,6.8],[0.5,2.2,2]]
    typestr(a)
    // output: FAST DOUBLE[] VECTOR
    ```

-   Tuple

    ``` {#codeblock_x5n_cnz_czb}
    tp = [[1],[2.1,4.1,6.8],[0.5,2.2,2]]
    typestr(tp)
    // output: ANY VECTOR
    ```


Despite some similarities, the way data is retrieved from a tuple and from an array vector is different.

Similarities:

You can use the [row](../../../../Functions/r/row.md) function to retrieve rows from both an array vector and a tuple.

Differences:

Array vectors only have column index. Therefore, to identify a column in the array vector, use a\[index\] where the index starts from 0. To identify a row in the array vector, we must use the row function via a.row\(index\).

To retrieve an individual element from a row in the array vector, first identify the column then the row via a\[index1\]\[indexe2\]; or first identify the row then the column via a.row\(index1\)\[index2\].

```
a[1]
// output: [,4.1,2.2]

tp[,1]
// output: (,4.1,2.2)

a.row(1)
// output: [2.1,4.1,6.8]

tp[1]
// output: [2.1,4.1,6.8]
```

## Modifying Data from Array Vectors {#topic_acr_g2f_m1c}

Specific elements in array vectors can be modified by referring to their row and column indices: `obj[rowIndex, colIndex] = newValue`.

**rowIndex**\(optional\) is a scalar or pair. If it is a pair and the lower bound is not specified, the range starts from the first row; if the upper bound is not specified, the range ends at the last row. If *rowIndex* is not specified, all rows of data will be modified.

**colIndex** is a scalar or pair. If it is a pair, its lower and upper bounds must be specified.

**newValue** is a scalar, vector, or pair:

-   If *newValue* is a scalar, all data specified by *rowIndex* and *colIndex* will be modified to *newValue*;

-   If *newValue* is a vector or tuple with the same length as *colIndex*, the data specified by *colIndex* in each row will be modified to *newValue*;

-   If *newValue* is a vector or tuple with the same length as *rowIndex* and contains elements that are scalars or vectors matching the length of *colIndex*, the data in each row will be modified to the corresponding element in *newValue*.


``` {#codeblock_p5y_32f_m1c}
// Create an array vector
a = array(DOUBLE[], 0, 10).append!([1 2 3 4, 5 6 7 8 , 9 10 11 12]);
print a 
// output: [[1,2,3,4],[5,6,7,8],[9,10,11,12]]

// Modify the data in the first row and second column to 50
a[0,1] = 50
print a 
// output: [[1,50,3,4],[5,6,7,8],[9,10,11,12]]

// Modify the data in the first to second rows and second to third columns to 100 200
a[0:2, 1:3] = 100 200
print a
// output: [[1,100,200,4],[5,100,200,8],[9,10,11,12]]

// Modify the data in the first to second rows and second to third columns to 300 400, with the lower bound of colIndex unspecified
a[:2, 1:3] = 300 400
print a
// output: [[1,300,400,4],[5,300,400,8],[9,10,11,12]]

// Modify the data in the second to third columns of the first row to 500 600, and that of the second row to 700 800
a[0:2, 1:3] = (500 600, 700 800)
print a
// output: [[1,500,600,4],[5,700,800,8],[9,10,11,12]]

// Modify the data in the second to third columns of all rows to 900 1000
a[, 1:3] = 900 1000
print a
// output: [[1,900,1000,4],[5,900,1000,8],[9,900,1000,12]]
```

## Importing Columns as Array Vectors \(2.00.4 and later versions\) {#importing-columns-as-array-vectors-2004-and-later-versions}

Currently, DolphinDB doesn't support importing multiple columns from a text file into a single column. To import array vectors, combine the associated columns into one and separate the values in each row with a user-specified delimiter in the text file. For example, combine the 5 best ask prices \(populated in 5 columns\) into one column by "1.4799\|1.479\|1.4787\|1.4796\|1.479".

When using `loadText` \(`ploadText`\) or `loadTextEx` to import data, specify the parameter arrayDelimiter \(which is "\|" in the above example\) to separate the values in the column holding the array vector, so the values in the column can be recognized as array vector during the import.

Note: Before the import, use the schema parameter to update the data type of the columns holding the array vectors with the corresponding array vector data type.

```
bid = array(DOUBLE[], 0, 20).append!([1.4799 1.479 1.4787, 1.4796 1.479 1.4784, 1.4791 1.479 1.4784])
ask = array(DOUBLE[], 0, 20).append!([1.4821 1.4825 1.4828, 1.4818 1.482 1.4821, 1.4814 1.4818 1.482])
TradeDate = 2022.01.01 + 1..3
SecurityID = rand(`APPL`AMZN`IBM, 3)
t = table(SecurityID as `sid, TradeDate as `date, bid as `bid, ask as `ask)
t;
saveText(t,filename="/home/t.csv",delimiter=',',append=true)
path = "/home/t.csv"
schema=extractTextSchema(path);
update schema set type = "DOUBLE[]" where name="bid" or name ="ask"
t = loadText(path, schema=schema, arrayDelimiter=",")
t;
```

|sid|date|bid|ask|
|---|----|---|---|
|AMZN|2022.01.02|\[1.4799,1.479,1.4787\]|\[1.4821,1.4825,1.4828\]|
|AMZN|2022.01.03|\[1.4796,1.479,1.4784\]|\[1.4818,1.482,1.4821\]|
|IBM|2022.01.04|\[1.4791,1.479,1.4784\]|\[1.4814,1.4818,1.482\]|

## Combining Columns into an Array Vector {#combining-columns-into-an-array-vector}

Use `fixedLengthArrayVector` to store multiple columns from a table as a single column.

```
syms="A"+string(1..30)
datetimes=2019.01.01T00:00:00..2019.01.31T23:59:59
n=200
if(existsDatabase("dfs://stock")) {
dropDatabase("dfs://stock")
}
db=database("dfs://stock",RANGE,cutPoints(syms,3));
t=table(take(datetimes,n) as trade_time, take(syms,n) as sym,take(500+rand(10.0,n), n) as bid1, take(500+rand(20.0,n),n) as bid2)
quotes=db.createPartitionedTable(t,`quotes,`sym).append!(t)
t1=select sym, fixedLengthArrayVector(bid1,bid2) as bid from quotes
```

## Concatenating Grouped Values into an Array Vector {#concatenating-grouped-values-into-an-array-vector}

The toArray function is used in conjunction with the `group by` clause in a SQL query. toArray concatenates the values in each group to create a row of the array vector, so the values of each group can be presented in one row.

```
ticker = `AAPL`IBM`IBM`AAPL`AMZN`AAPL`AMZN`IBM`AMZN
volume = 106 115 121 90 130 150 145 123 155;
t = table(ticker, volume);
t;

t1 = select toArray(volume) as volume_all from t group by ticker;
t1;
```

|ticker|volume\_all|
|------|-----------|
|AAPL|\[106,90,150\]|
|AMZN|\[130,145,155\]|
|IBM|\[115,121,123\]|

## Applying Row-Based Functions on Array Vectors {#applying-row-based-functions-on-array-vectors}

\(1\) Aggregate calculation

Support row-based functions, such as [rowMax](../../../../Functions/r/rowMax.md), [rowStd](../../../../Functions/r/rowStd.md) , etc.

Support higher-order function [byRow](../../../../Functions/Templates/byRow.md).

```
a=array(DOUBLE[], 0, 10).append!([8.3 1.2 5.6, 1.8 3.3, 0.1 2.4 6.8]);
rowMax(a);
// output: [8.3, 3.3, 6.8]


bid = array(DOUBLE[], 0, 20).append!([1.4799 1.479 1.4787, 1.4796 1.479 1.4784, 1.4791 1.479 1.4784])
ask = array(DOUBLE[], 0, 20).append!([1.4821 1.4825 1.4828, 1.4818 1.482 1.4821, 1.4814 1.4818 1.482])
TradeDate = 2022.01.01 + 1..3
SecurityID = rand(`APPL`AMZN`IBM, 3)
t = table(SecurityID as `sid, TradeDate as `date, bid as `bid, ask as `ask)
t;
```

|sid|date|bid|ask|
|---|----|---|---|
|IBM|2022.01.02|\[1.4799,1.479,1.4787\]|\[1.4821,1.4825,1.4828\]|
|APPL|2022.01.03|\[1.4796,1.479,1.4784\]|\[1.4818,1.482,1.4821\]|
|APPL|2022.01.04|\[1.4791,1.479,1.4784\]|\[1.4814,1.4818,1.482\]|

```
select SecurityID, TradeDate, bid, ask,  rowAvg(bid) as bid_avg, rowAvg(ask) as ask_avg from t
```

|SecurityID|TradeDate|bid|ask|bid\_avg|ask\_avg|
|----------|---------|---|---|--------|--------|
|IBM|2022.01.02|\[1.4799,1.479,1.4787\]|\[1.4821,1.4825,1.4828\]|1.4792|1.4825|
|APPL|2022.01.03|\[1.4796,1.479,1.4784\]|\[1.4818,1.482,1.4821\]|1.479|1.482|
|APPL|2022.01.04|\[1.4791,1.479,1.4784\]|\[1.4814,1.4818,1.482\]|1.4788|1.4817|

\(2\) Window functions \(moving window, cumulative window, etc.\)

Use the higher-order function [byRow](../../../../Functions/Templates/byRow.md) to perform window functions on an array vector.

```
select SecurityID, TradeDate, bid, ask, byRow(cumavg, bid) as bid_cum, byRow(cumavg, ask) as ask_cum from t
```

|SecurityID|TradeDate|bid|ask|bid\_cum|ask\_cum|
|----------|---------|---|---|--------|--------|
|IBM|2022.01.02|\[1.4799,1.479,1.4787\]|\[1.4821,1.4825,1.4828\]|\[1.4799,1.47945,1.4792\]|\[1.4821,1.4823,1.482467\]|
|APPL|2022.01.03|\[1.4796,1.479,1.4784\]|\[1.4818,1.482,1.4821\]|\[1.4796,1.4793,1.479\]|\[1.4818,1.4819,1.481967\]|
|APPL|2022.01.04|\[1.4791,1.479,1.4784\]|\[1.4814,1.4818,1.482\]|\[1.4791,1.47905,1.478833\]|\[1.4814,1.4816,1.481733\]|

