# sql {#sql}

**Parent topic:**[Functions](../../Functions/category.md)

## Syntax {#syntax}

`sql(select, from, [where], [groupBy], [groupFlag], [csort], [ascSort], [having], [orderBy], [ascOrder], [limit], [hint], [exec=false], [map=false])`

## Arguments {#arguments}

**select** is metacode indicating the columns to be selected. Each column is generated by either function [sqlCol](sqlCol.md) or [sqlColAlias](sqlColAlias.md) . Use a tuple to select multiple columns.

**from** is a table object or table name.

**where** \(optional\) indicates the "where" conditions. In case of multiple "where" conditions, use an ANY vector with each element corresponding to the metacode of a condition.

**groupBy** \(optional\) indicates "group by" or "context by" column\(s\). In case of multiple "group by" columns, use an ANY vector with each element corresponding to the metacode of a column name.

**groupFlag** \(optional\) 1 means "group by"; 0 means "context by"; 2 means "pivot by". The default value is 1.

**csort** \(optional\) is metacode or a tuple of metacode that specifies the column name\(s\) followed by `csort`. This parameter only works when *contextBy*is specified.

**ascSort**\(optional\) is a scalar or vector indicating whether each *csort*column is sorted in ascending or descending order. 1 \(default\) means ascending and 0 means descending.

**having**\(optional\) is metacode or a tuple of metacode that specifies the `having` condition\(s\). This parameter only works when *contextBy*is specified.

**orderBy** \(optional\) indicates "order by" column\(s\). In case of multiple "order by" columns, use a tuple with each element corresponding to the metacode of a column name.

**ascOrder** \(optional\) is a scalar or vector indicating whether each "order by" column is sorted in ascending or descending order. 1 means sorting in ascending order; 0 means sorting in descending order. The default value is 1.

**limit** \(optional\) is an integer or an integral pair indicating the number of rows to select from the result starting from the first row. If *groupBy* is specified and *groupFlag*=0, select limit rows from each group starting from the first row in each group. It corresponds to "top" clause in "select" statements.

**limit** \(optional\) is an integer or an integer pair that specifies the range of rows to return. It corresponds to "top" clause in "select" statements.

-   If *limit* is a integer n, it returns the first n rows.

-   If *limit* is a pair "start:end", it returns rows from the start-th row \(inclusive\) to the end-th row \(exclusive\).


If *groupBy* is specified and *groupFlag* = 0, the *limit* is applied within each group independently.

**hint** \(optional\) is a constant that can take the following values:

-   HINT\_HASH: use Hashing algorithm to execute "group by" statements.

-   HINT\_SNAPSHOT: query data from snapshot engine.

-   HINT\_KEEPORDER: the records in the result after executing "context by" statements are in the same order as in the input data.


**exec** indicates whether to use the exec clause. The default value is false. If set to be true, a scalar or a vector will be generated. If the "pivot by" is used in the exec clause, a matrix can be generated.

**map**\(optional\) is a Boolean scalar specifying whether to use the `map` keyword. The default value is false.

## Details {#details}

Create a SQL statement dynamically. To execute the generated SQL statement, use function [eval](../e/eval.md).

## Examples {#examples}

```
symbol = take(`GE,6) join take(`MSFT,6) join take(`F,6)
date=take(take(2017.01.03,2) join take(2017.01.04,4), 18)
price=31.82 31.69 31.92 31.8  31.75 31.76 63.12 62.58 63.12 62.77 61.86 62.3 12.46 12.59 13.24 13.41 13.36 13.17
volume=2300 3500 3700 2100 1200 4600 1800 3800 6400 4200 2300 6800 4200 5600 8900 2300 6300 9600
t1 = table(symbol, date, price, volume);

t1;
```

|symbol|date|price volume|
|------|----|------------|
|GE|2017.01.03|31.82 2300|
|GE|2017.01.03|31.69 3500|
|GE|2017.01.04|31.92 3700|
|GE|2017.01.04|31.8 2100|
|GE|2017.01.04|31.75 1200|
|GE|2017.01.04|31.76 4600|
|MSFT|2017.01.03|63.12 1800|
|MSFT|2017.01.03|62.58 3800|
|MSFT|2017.01.04|63.12 6400|
|MSFT|2017.01.04|62.77 4200|
|MSFT|2017.01.04|61.86 2300|
|MSFT|2017.01.04|62.3 6800|
|F|2017.01.03|12.46 4200|
|F|2017.01.03|12.59 5600|
|F|2017.01.04|13.24 8900|
|F|2017.01.04|13.41 2300|
|F|2017.01.04|13.36 6300|
|F|2017.01.04|13.17 9600|

```
x=5000
whereConditions = [<symbol=`MSFT>,<volume>x>]
havingCondition = <sum(volume)>200>;

sql(sqlCol("*"), t1);
// output: < select * from t1 >

sql(sqlCol("*"), t1, whereConditions);
// output: < select * from t1 where symbol == "MSFT",volume > x >

sql(select=sqlColAlias(<avg(price)>), from=t1, where=whereConditions, groupBy=sqlCol(`date));
// output: < select avg(price) as avg_price from t1 where symbol == "MSFT",volume > x group by date >

sql(select=sqlColAlias(<avg(price)>), from=t1, groupBy=[sqlCol(`date),sqlCol(`symbol)]);
// output: < select avg(price) as avg_price from t1 group by date,symbol >

sql(select=(sqlCol(`symbol),sqlCol(`date),sqlColAlias(<cumsum(volume)>, `cumVol)), from=t1, groupBy=sqlCol(`date`symbol), groupFlag=0);
// output: < select symbol,date,cumsum(volume) as cumVol from t1 context by date,symbol >

sql(select=(sqlCol(`symbol),sqlCol(`date),sqlColAlias(<cumsum(volume)>, `cumVol)), from=t1, where=whereConditions, groupBy=sqlCol(`date), groupFlag=0);
// output: < select symbol,date,cumsum(volume) as cumVol from t1 where symbol == "MSFT",volume > x context by date >

sql(select=(sqlCol(`symbol),sqlCol(`date),sqlColAlias(<cumsum(volume)>, `cumVol)), from=t1, where=whereConditions, groupBy=sqlCol(`date), groupFlag=0, csort=sqlCol(`volume), ascSort=0);
// output: < select symbol,date,cumsum(volume) as cumVol from t1 where symbol == "MSFT",volume > x context by date csort volume desc >

sql(select=(sqlCol(`symbol),sqlCol(`date),sqlColAlias(<cumsum(volume)>, `cumVol)), from=t1, where=whereConditions, groupBy=sqlCol(`date), groupFlag=0, having=havingCondition);
// output: < select symbol,date,cumsum(volume) as cumVol from t1 where symbol == "MSFT",volume > x context by date having sum(volume) > 200 >

sql(select=sqlCol("*"), from=t1, where=whereConditions, orderBy=sqlCol(`date), ascOrder=0);
// output: < select * from t1 where symbol == "MSFT",volume > x order by date desc >

sql(select=sqlCol("*"), from=t1, limit=1);
// output: < select top 1 * from t1 >

sql(select=sqlCol("*"), from=t1, groupBy=sqlCol(`symbol), groupFlag=0, limit=1);
// output: < select top 1 * from t1 context by symbol >

sql(select=(sqlCol(`symbol),sqlCol(`date),sqlColAlias(<cumsum(volume)>, `cumVol)), from=t1, groupBy=sqlCol(`date`symbol), groupFlag=0, hint=HINT_KEEPORDER);
// output: < select [128] symbol,date,cumsum(volume) as cumVol from t1 context by date,symbol >

whereConditions1 = <symbol=`MSFT or volume>x>
sql(select=sqlCol("*"), from=t1, where=whereConditions1, orderBy=sqlCol(`date), ascOrder=0);
// output: < select * from t14059d76a00000000 where symbol == "MSFT" or volume > x order by date desc > 

sql(select=sqlCol("*"), from=t1, where=whereConditions, orderBy=sqlCol(`date), ascOrder=0, map=true);
// output: < select [256] * from t17092a30500000000 where symbol == "MSFT",volume > x order by date desc map >
```

A convenient and flexible way to generate complicated queries dynamically is to define a function that calls function `sql`.

```
def f1(t, sym, x){
whereConditions=[<symbol=sym>,<volume>x>]
return sql(sqlCol("*"),t,whereConditions).eval()
};

f1(t1, `MSFT, 5000);
```

|symbol|date|price volume|
|------|----|------------|
|MSFT|2017.01.04|63.12 6400|
|MSFT|2017.01.04|62.3 6800|

```
f1(t1, `F, 9000);
```

|symbol|date|price volume|
|------|----|------------|
|F|2017.01.04|13.17 9600|

```
def f2(t, sym, colNames, filterColumn, filterValue){
 whereConditions=[<symbol=sym>,expr(sqlCol(filterColumn),>,filterValue)]
    return sql(sqlCol(colNames),t,whereConditions).eval()
};
```

```
f2(t1,`GE, `symbol`date`volume, `volume, 3000);
```

|symbol|date|volume|
|------|----|------|
|GE|2017.01.03|3500|
|GE|2017.01.04|3700|
|GE|2017.01.04|4600|

```
f2(t1,`F, `symbol`date`volume,`price,13.2);
```

|symbol|date|volume|
|------|----|------|
|F|2017.01.04|8900|
|F|2017.01.04|2300|
|F|2017.01.04|6300|

Set the parameter *exec*=true and use `exec` clause with the pivot by statement to generate a matrix:

```
date = 2020.09.21 + 0 0 0 0 1 1 1 1
sym = `MS`MS`GS`GS`MS`MS`GS`GS$SYMBOL
factorNum = 1 2 1 2 1 2 1 2
factorValue = 1.2 -3.4 -2.5 6.3 1.1 -3.2 -2.1 5.6
t = table(date, sym, factorNum, factorValue);
sql(select=sqlCol(`factorValue), from=t, groupBy=[sqlCol(`date), sqlCol(`sym)], groupFlag=2, exec=true)

// output: < exec factorValue from t pivot by date,sym >
```

