sql
Syntax
sql(select, from, [where], [groupBy], [groupFlag], [csort], [ascSort],
[having], [orderBy], [ascOrder], [limit], [hint], [exec=false],
[map=false])
Arguments
select is metacode indicating the columns to be selected. Each column is generated by either function sqlCol or sqlColAlias . Use a tuple to select multiple columns.
from is a table object or table name.
where (optional) indicates the "where" conditions. In case of multiple "where" conditions, use an ANY vector with each element corresponding to the metacode of a condition.
groupBy (optional) indicates "group by" or "context by" column(s). In case of multiple "group by" columns, use an ANY vector with each element corresponding to the metacode of a column name.
groupFlag (optional) 1 means "group by"; 0 means "context by"; 2 means "pivot by". The default value is 1.
csort (optional) is metacode or a tuple of metacode that specifies the column
name(s) followed by csort
. This parameter only works when
contextBy is specified.
ascSort (optional) is a scalar or vector indicating whether each csort column is sorted in ascending or descending order. 1 (default) means ascending and 0 means descending.
having (optional) is metacode or a tuple of metacode that specifies the
having
condition(s). This parameter only works when
contextBy is specified.
orderBy (optional) indicates "order by" column(s). In case of multiple "order by" columns, use a tuple with each element corresponding to the metacode of a column name.
ascOrder (optional) is a scalar or vector indicating whether each "order by" column is sorted in ascending or descending order. 1 means sorting in ascending order; 0 means sorting in descending order. The default value is 1.
limit (optional) is an integer or an integral pair indicating the number of rows to select from the result starting from the first row. If groupBy is specified and groupFlag=0, select limit rows from each group starting from the first row in each group. It corresponds to "top" clause in "select" statements.
hint (optional) is a constant that can take the following values:
-
HINT_HASH: use Hashing algorithm to execute "group by" statements.
-
HINT_SNAPSHOT: query data from snapshot engine.
-
HINT_KEEPORDER: the records in the result after executing "context by" statements are in the same order as in the input data.
exec indicates whether to use the exec clause. The default value is false. If set to be true, a scalar or a vector will be generated. If the "pivot by" is used in the exec clause, a matrix can be generated.
map (optional) is a Boolean scalar specifying whether to use the
map
keyword. The default value is false.
Details
Create a SQL statement dynamically. To execute the generated SQL statement, use function eval.
Examples
symbol = take(`GE,6) join take(`MSFT,6) join take(`F,6)
date=take(take(2017.01.03,2) join take(2017.01.04,4), 18)
price=31.82 31.69 31.92 31.8 31.75 31.76 63.12 62.58 63.12 62.77 61.86 62.3 12.46 12.59 13.24 13.41 13.36 13.17
volume=2300 3500 3700 2100 1200 4600 1800 3800 6400 4200 2300 6800 4200 5600 8900 2300 6300 9600
t1 = table(symbol, date, price, volume);
t1;
symbol | date | price volume |
---|---|---|
GE | 2017.01.03 | 31.82 2300 |
GE | 2017.01.03 | 31.69 3500 |
GE | 2017.01.04 | 31.92 3700 |
GE | 2017.01.04 | 31.8 2100 |
GE | 2017.01.04 | 31.75 1200 |
GE | 2017.01.04 | 31.76 4600 |
MSFT | 2017.01.03 | 63.12 1800 |
MSFT | 2017.01.03 | 62.58 3800 |
MSFT | 2017.01.04 | 63.12 6400 |
MSFT | 2017.01.04 | 62.77 4200 |
MSFT | 2017.01.04 | 61.86 2300 |
MSFT | 2017.01.04 | 62.3 6800 |
F | 2017.01.03 | 12.46 4200 |
F | 2017.01.03 | 12.59 5600 |
F | 2017.01.04 | 13.24 8900 |
F | 2017.01.04 | 13.41 2300 |
F | 2017.01.04 | 13.36 6300 |
F | 2017.01.04 | 13.17 9600 |
x=5000
whereConditions = [<symbol=`MSFT>,<volume>x>]
havingCondition = <sum(volume)>200>;
sql(sqlCol("*"), t1);
// output: < select * from t1 >
sql(sqlCol("*"), t1, whereConditions);
// output: < select * from t1 where symbol == "MSFT",volume > x >
sql(select=sqlColAlias(<avg(price)>), from=t1, where=whereConditions, groupBy=sqlCol(`date));
// output: < select avg(price) as avg_price from t1 where symbol == "MSFT",volume > x group by date >
sql(select=sqlColAlias(<avg(price)>), from=t1, groupBy=[sqlCol(`date),sqlCol(`symbol)]);
// output: < select avg(price) as avg_price from t1 group by date,symbol >
sql(select=(sqlCol(`symbol),sqlCol(`date),sqlColAlias(<cumsum(volume)>, `cumVol)), from=t1, groupBy=sqlCol(`date`symbol), groupFlag=0);
// output: < select symbol,date,cumsum(volume) as cumVol from t1 context by date,symbol >
sql(select=(sqlCol(`symbol),sqlCol(`date),sqlColAlias(<cumsum(volume)>, `cumVol)), from=t1, where=whereConditions, groupBy=sqlCol(`date), groupFlag=0);
// output: < select symbol,date,cumsum(volume) as cumVol from t1 where symbol == "MSFT",volume > x context by date >
sql(select=(sqlCol(`symbol),sqlCol(`date),sqlColAlias(<cumsum(volume)>, `cumVol)), from=t1, where=whereConditions, groupBy=sqlCol(`date), groupFlag=0, csort=sqlCol(`volume), ascSort=0);
// output: < select symbol,date,cumsum(volume) as cumVol from t1 where symbol == "MSFT",volume > x context by date csort volume desc >
sql(select=(sqlCol(`symbol),sqlCol(`date),sqlColAlias(<cumsum(volume)>, `cumVol)), from=t1, where=whereConditions, groupBy=sqlCol(`date), groupFlag=0, having=havingCondition);
// output: < select symbol,date,cumsum(volume) as cumVol from t1 where symbol == "MSFT",volume > x context by date having sum(volume) > 200 >
sql(select=sqlCol("*"), from=t1, where=whereConditions, orderBy=sqlCol(`date), ascOrder=0);
// output: < select * from t1 where symbol == "MSFT",volume > x order by date desc >
sql(select=sqlCol("*"), from=t1, limit=1);
// output: < select top 1 * from t1 >
sql(select=sqlCol("*"), from=t1, groupBy=sqlCol(`symbol), groupFlag=0, limit=1);
// output: < select top 1 * from t1 context by symbol >
sql(select=(sqlCol(`symbol),sqlCol(`date),sqlColAlias(<cumsum(volume)>, `cumVol)), from=t1, groupBy=sqlCol(`date`symbol), groupFlag=0, hint=HINT_KEEPORDER);
// output: < select [128] symbol,date,cumsum(volume) as cumVol from t1 context by date,symbol >
whereConditions1 = <symbol=`MSFT or volume>x>
sql(select=sqlCol("*"), from=t1, where=whereConditions1, orderBy=sqlCol(`date), ascOrder=0);
// output: < select * from t14059d76a00000000 where symbol == "MSFT" or volume > x order by date desc >
sql(select=sqlCol("*"), from=t1, where=whereConditions, orderBy=sqlCol(`date), ascOrder=0, map=true);
// output: < select [256] * from t17092a30500000000 where symbol == "MSFT",volume > x order by date desc map >
A convenient and flexible way to generate complicated queries dynamically is to
define a function that calls function sql
.
def f1(t, sym, x){
whereConditions=[<symbol=sym>,<volume>x>]
return sql(sqlCol("*"),t,whereConditions).eval()
};
f1(t1, `MSFT, 5000);
symbol | date | price volume |
---|---|---|
MSFT | 2017.01.04 | 63.12 6400 |
MSFT | 2017.01.04 | 62.3 6800 |
f1(t1, `F, 9000);
symbol | date | price volume |
---|---|---|
F | 2017.01.04 | 13.17 9600 |
def f2(t, sym, colNames, filterColumn, filterValue){
whereConditions=[<symbol=sym>,expr(sqlCol(filterColumn),>,filterValue)]
return sql(sqlCol(colNames),t,whereConditions).eval()
};
f2(t1,`GE, `symbol`date`volume, `volume, 3000);
symbol | date | volume |
---|---|---|
GE | 2017.01.03 | 3500 |
GE | 2017.01.04 | 3700 |
GE | 2017.01.04 | 4600 |
f2(t1,`F, `symbol`date`volume,`price,13.2);
symbol | date | volume |
---|---|---|
F | 2017.01.04 | 8900 |
F | 2017.01.04 | 2300 |
F | 2017.01.04 | 6300 |
Set the parameter exec=true and use exec
clause with the
pivot by statement to generate a matrix:
date = 2020.09.21 + 0 0 0 0 1 1 1 1
sym = `MS`MS`GS`GS`MS`MS`GS`GS$SYMBOL
factorNum = 1 2 1 2 1 2 1 2
factorValue = 1.2 -3.4 -2.5 6.3 1.1 -3.2 -2.1 5.6
t = table(date, sym, factorNum, factorValue);
sql(select=sqlCol(`factorValue), from=t, groupBy=[sqlCol(`date), sqlCol(`sym)], groupFlag=2, exec=true)
// output: < exec factorValue from t pivot by date,sym >