Metaprogramming
Metaprogramming is a programming technique in which computer programs are treated as data. It means that a program can read, generate, analyze or transform other programs, or even modify itself while running.
DolphinDB supports metaprogramming for dynamic expression generation and delayed evaluation. With metaprogramming, users can generate SQL statements and evaluate them dynamically.
Metacode is objects or expressions within “<” and “>”.
Related functions
1. Function expr generates metacode from objects, operators, or other metacode.
Syntax: expr(X1, X2, ……) where X’s are objects, operators, or other metacode.
$ expr(6,<,8);
< 6 < 8 >
$ expr(sum, 1 2 3);
< sum [1,2,3] >
$ a=6;
$ expr(a,+,1);
< 6 + 1 >
$ expr(<a>,+,1);
< a + 1 >
$ expr(<a>,+,<b>);
< a + b >
$ expr(a+7,*,8);
< 13 * 8 >
$ expr(<a+7>,*,8);
< (a + 7) * 8 >
$ expr(not, < a >);
< ! a >
2. Function eval evaluates the given metacode.
Syntax: eval(<X>) where X is metacode.
$ eval(<1+2>);
3
$ eval(<1+2+3=10>);
0
$ eval(expr(6,<,8));
1
$ eval(expr(sum, 1 2 3));
6
$ a=6; b=9;
$ eval(expr(<a>,+,<b>));
15
3. Function sqlCol converts column names into metaexpressions.
Syntax: sqlCol(colNames), where colNames can be one column name or a vector of column names. Each of these column names need to be quoted.
$ sqlCol(`PRC);
< PRC >
$ sqlCol(["PRC", "RET", "DATE", "TICKER"]);
(< PRC >,< RET >,< DATE >,< TICKER >)
4. Function sqlColAlias uses metacode and an optional alias name to define a column. It is often used for calculated columns.
Syntax: sqlColAlias(<metacode>, [aliasName])
$ sqlColAlias(<x>, `y);
< x as y >
$ sqlColAlias(<avg(PRC)>, `avgPRC);
< avg(PRC) as avgPRC >
$ sqlColAlias(<avg(PRC)>);
< avg(PRC) as avg_PRC >
5. Function sql creates a SQL statement dynamically.
Syntax: sql(select, from, [where], [groupBy], [groupFlag], [csort], [ascSort], [having], [orderBy], [ascOrder], [limit], [hint])
$ symbol = take(`GE,6) join take(`MSFT,6) join take(`F,6)
$ date=take(take(2017.01.03,2) join take(2017.01.04,4), 18)
$ price=31.82 31.69 31.92 31.8 31.75 31.76 63.12 62.58 63.12 62.77 61.86 62.3 12.46 12.59 13.24 13.41 13.36 13.17
$ volume=2300 3500 3700 2100 1200 4600 1800 3800 6400 4200 2300 6800 4200 5600 8900 2300 6300 9600
$ t1 = table(symbol, date, price, volume);
$ t1;
symbol |
date |
price |
volume |
---|---|---|---|
GE |
2017.01.03 |
31.82 |
2300 |
GE |
2017.01.03 |
31.69 |
3500 |
GE |
2017.01.04 |
31.92 |
3700 |
GE |
2017.01.04 |
31.8 |
2100 |
GE |
2017.01.04 |
31.75 |
1200 |
GE |
2017.01.04 |
31.76 |
4600 |
MSFT |
2017.01.03 |
63.12 |
1800 |
MSFT |
2017.01.03 |
62.58 |
3800 |
MSFT |
2017.01.04 |
63.12 |
6400 |
MSFT |
2017.01.04 |
62.77 |
4200 |
MSFT |
2017.01.04 |
61.86 |
2300 |
MSFT |
2017.01.04 |
62.3 |
6800 |
F |
2017.01.03 |
12.46 |
4200 |
F |
2017.01.03 |
12.59 |
5600 |
F |
2017.01.04 |
13.24 |
8900 |
F |
2017.01.04 |
13.41 |
2300 |
F |
2017.01.04 |
13.36 |
6300 |
F |
2017.01.04 |
13.17 |
9600 |
$ x=5000
$ whereConditions = [<symbol=`MSFT>,<volume>x>]
$ havingCondition = <sum(volume)>200>;
$ sql(sqlCol("*"), t1);
< select * from t1 >
$ sql(sqlCol("*"), t1, whereConditions);
< select * from t1 where symbol == "MSFT",volume > x >
$ sql(select=sqlColAlias(<avg(price)>), from=t1, where=whereConditions, groupBy=sqlCol(`date));
< select avg(price) as avg_price from t1 where symbol == "MSFT",volume > x group by date >
$ sql(select=sqlColAlias(<avg(price)>), from=t1, groupBy=[sqlCol(`date),sqlCol(`symbol)]);
< select avg(price) as avg_price from t1 group by date,symbol >
$ sql(select=(sqlCol(`symbol),sqlCol(`date),sqlColAlias(<cumsum(volume)>, `cumVol)), from=t1, groupBy=sqlCol(`date`symbol), groupFlag=0);
< select symbol,date,cumsum(volume) as cumVol from t1 context by date,symbol >
$ sql(select=(sqlCol(`symbol),sqlCol(`date),sqlColAlias(<cumsum(volume)>, `cumVol)), from=t1, where=whereConditions, groupBy=sqlCol(`date), groupFlag=0);
< select symbol,date,cumsum(volume) as cumVol from t1 where symbol == "MSFT",volume > x context by date >
$ sql(select=(sqlCol(`symbol),sqlCol(`date),sqlColAlias(<cumsum(volume)>, `cumVol)), from=t1, where=whereConditions, groupBy=sqlCol(`date), groupFlag=0, csort=sqlCol(`volume), ascSort=0);
< select symbol,date,cumsum(volume) as cumVol from t1 where symbol == "MSFT",volume > x context by date csort volume desc >
$ sql(select=(sqlCol(`symbol),sqlCol(`date),sqlColAlias(<cumsum(volume)>, `cumVol)), from=t1, where=whereConditions, groupBy=sqlCol(`date), groupFlag=0, having=havingCondition);
< select symbol,date,cumsum(volume) as cumVol from t1 where symbol == "MSFT",volume > x context by date having sum(volume) > 200 >
$ sql(select=sqlCol("*"), from=t1, where=whereConditions, orderBy=sqlCol(`date), ascOrder=0);
< select * from t1 where symbol == "MSFT",volume > x order by date desc >
$ sql(select=sqlCol("*"), from=t1, limit=1);
< select top 1 * from t1 >
$ sql(select=sqlCol("*"), from=t1, groupBy=sqlCol(`symbol), groupFlag=0, limit=1);
< select top 1 * from t1 context by symbol >
$ sql(select=(sqlCol(`symbol),sqlCol(`date),sqlColAlias(<cumsum(volume)>, `cumVol)), from=t1, groupBy=sqlCol(`date`symbol), groupFlag=0, hint=HINT_KEEPORDER);
< select [128] symbol,date,cumsum(volume) as cumVol from t1 context by date,symbol >
A convenient and flexible way to generate complicated queries dynamically is to define a function that calls function sql
.
$ def f1(t, sym, x){
$ whereConditions=[<symbol=sym>,<volume>x>]
$ return sql(sqlCol("*"),t,whereConditions).eval()
$ };
$ f1(t1, `MSFT, 5000);
symbol |
date |
price |
volume |
---|---|---|---|
MSFT |
2017.01.04 |
63.12 |
6400 |
MSFT |
2017.01.04 |
62.3 |
6800 |
$ f1(t1, `F, 9000);
symbol |
date |
price |
volume |
---|---|---|---|
F |
2017.01.04 |
13.17 |
9600 |
$ def f2(t, sym, colNames, filterColumn, filterValue){
$ whereConditions=[<symbol=sym>,expr(sqlCol(filterColumn),>,filterValue)]
$ return sql(sqlCol(colNames),t,whereConditions).eval()
$ };
$ f2(t1,`GE, `symbol`date`volume, `volume, 3000);
symbol |
date |
volume |
---|---|---|
GE |
2017.01.03 |
3500 |
GE |
2017.01.04 |
3700 |
GE |
2017.01.04 |
4600 |
$ f2(t1,`F, `symbol`date`volume,`price,13.2);
symbol |
date |
volume |
---|---|---|
F |
2017.01.04 |
8900 |
F |
2017.01.04 |
2300 |
F |
2017.01.04 |
6300 |
6. Function partial creates a partial application.
Syntax: partial(func, args…)
$ partial(add,1)(2);
3
$ def f(a,b):a pow b
$ g=partial(f, 2)
$ g(3);
8
7. Function makeCall calls a function with the specified parameters to generate a piece of script. The difference between template functions call and makecall is that makecall doesn’t execute the script.
In the following example, we create a function generateReport that reports selected columns from a given table with specified format.
$ def generateReport(tbl, colNames, colFormat): sql(sqlColAlias(each(makeCall{format},sqlCol(colNames),colFormat),colNames), tbl).eval()
$ t = table(1..100 as id, (1..100 + 2018.01.01) as date, rand(100.0, 100) as price, rand(10000, 100) as qty, rand(`A`B`C`D`E`F`G, 100) as city);
$ t;
id |
date |
price |
qty |
city |
---|---|---|---|---|
1 |
2018.01.02 |
99.662328 |
6002 |
D |
2 |
2018.01.03 |
24.965461 |
5836 |
B |
3 |
2018.01.04 |
77.006418 |
5 |
G |
4 |
2018.01.05 |
93.930603 |
5141 |
G |
5 |
2018.01.06 |
70.994914 |
5778 |
C |
6 |
2018.01.07 |
11.221769 |
9403 |
G |
7 |
2018.01.08 |
59.387621 |
4632 |
G |
8 |
2018.01.09 |
86.830752 |
4574 |
C |
9 |
2018.01.10 |
53.928317 |
8397 |
G |
10 |
2018.01.11 |
21.123212 |
6615 |
B |
… |
$ generateReport(t, ["id", "date","price","qty"], ["0", "MM/dd/yyyy", "0.00", "#,###"]);
id |
date |
price |
qty |
---|---|---|---|
1 |
01/02/2018 |
16.32 |
9,972 |
2 |
01/03/2018 |
63.10 |
7,785 |
3 |
01/04/2018 |
30.43 |
3,629 |
4 |
01/05/2018 |
33.57 |
2,178 |
5 |
01/06/2018 |
61.12 |
9,406 |
6 |
01/07/2018 |
43.29 |
6,955 |
7 |
01/08/2018 |
54.97 |
9,914 |
8 |
01/09/2018 |
86.20 |
6,696 |
9 |
01/10/2018 |
90.82 |
6,141 |
10 |
01/11/2018 |
4.29 |
3,774 |
… |
The equivalent SQL script for the previous execution is:
$ def generateReportSQL(tbl, colNames, colFormat): sql(sqlColAlias(each(makeCall{format},sqlCol(colNames),colFormat),colNames), tbl)
$ generateReportSQL(t, ["id", "date","price","qty"], ["0", "MM/dd/yyyy", "0.00", "#,###"]);
< select format(id, "0") as id,format(date, "MM/dd/yyyy") as date,format(price, "0.00") as price,format(qty, "#,###") as qty from t >
3 Ways to construct a function to execute a SQL statement:
Example 1:
$ def f1(t, sym, x): select * from t where name=sym, vol>x;
$ f1(t1, `MSFT, 7000);
name |
date |
PRC |
vol |
---|---|---|---|
MSFT |
2017.01.03 |
63.12 |
8800 |
MSFT |
2017.01.03 |
62.58 |
7800 |
Alternatively, we can use the function eval with a block of metacode.
Example 2:
$ def f2(t, sym, x): eval(<select * from t where name=`MSFT, vol>x >);
$ f2(t1,`MSFT, 7000);
name |
date |
PRC |
vol |
---|---|---|---|
MSFT |
2017.01.03 |
63.12 |
8800 |
MSFT |
2017.01.03 |
62.58 |
7800 |
We can also define a function to generate a SQL statement with function sql. The SQL statement will be executed when the function is called. For more complicated query generation, this is much more convenient and flexible.
Example 3
$ def f3(t, sym, x){
$ whereConditions=[<name=sym>,<vol>x>]
$ return sql(sqlCol("*"),t,whereConditions).eval()
$ };
$ f3(t1, `MSFT, 7000);
name |
date |
PRC |
vol |
---|---|---|---|
MSFT |
2017.01.03 |
63.12 |
8800 |
MSFT |
2017.01.03 |
62.58 |
7800 |
$ f3(t1, `F, 9000);
name |
date |
PRC |
vol |
---|---|---|---|
F |
2017.01.04 |
13.17 |
Among the 3 methods, function sql is the most flexible. It can be used dynamically to construct SQL statements.
$ def f4(t, sym, colNames, filterColumn, filterValue){
$ whereConditions=[<name=sym>,expr(sqlCol(filterColumn),>,filterValue)]
$ return sql(sqlCol(colNames),t,whereConditions).eval()
$ };
$ f4(t1,`MSFT, `name`date`vol, `vol, 7000);
name |
date |
vol |
---|---|---|
MSFT |
2017.01.03 |
8800 |
MSFT |
2017.01.03 |
7800 |
$ f4(t1,`F, `name`date`vol,`PRC,13.2);
name |
date |
vol |
---|---|---|
F |
2017.01.04 |
8900 |
F |
2017.01.04 |
2300 |
F |
2017.01.04 |
6300 |
8. Function binaryExpr generates metacode of binary expression.
Syntax: binaryExpr(X, Y, optr)
$ t = table(reshape(rand(1.0, 200), 20:10));
$ n=10;
$ weights = array(ANY, n).fill!(0..(n-1), 1..n)\n;
$ names = t.colNames();
$ tp = binaryExpr(sqlCol(names), weights, *);
$ tp;
(< col0 * 0.1 >,< col1 * 0.2 >,< col2 * 0.3 >,< col3 * 0.4 >,< col4 * 0.5 >,< col5 * 0.6 >,< col6 * 0.7 >,< col7 * 0.8 >,< col8 * 0.9 >,< col9 * 1 >)
9. Function unifiedExpr generates metacode of multivariate expression.
Syntax: unifiedExpr(objs, optrs)
The following example generates metacode of SQL select statement and calculates the weighted sum of each column.
$ selects = sqlColAlias(unifiedExpr(binaryExpr(sqlCol(names), weights, *), take(+, n-1)), "weightedSum");
$ re = sql(selects, t);
$ re;
< select (col0 * 0.1) + (col1 * 0.2) + (col2 * 0.3) + (col3 * 0.4) + (col4 * 0.5) + (col5 * 0.6) + (col6 * 0.7) + (col7 * 0.8) + (col8 * 0.9) + (col9 * 1) as weightedSum from tc0ccbd6a00000000 >
$ re.eval()
weightedSum |
---|
2.1239 |
3.5725 |
3.5009 |
3.3487 |
2.8268 |
1.9753 |
2.4287 |
2.9222 |
3.0469 |
2.9751 |
2.8848 |
2.993 |
2.7185 |
2.2578 |
2.6231 |
2.3871 |
3.0311 |
2.0183 |
2.897 |
2.5982 |
10. Function makeUnifiedCall generates metacode for function call. Different from higher-order functions unifiedCall and makeUnifiedCall , makeUnifiedCall doesn’t execute the metacode.
Syntax: makeUnifiedCall(func, args)
The following example generates metacode equivalent to that of Example 9:
$ selects = sqlColAlias(makeCall(flatten, makeCall(dot, makeUnifiedCall(matrix, sqlCol(names)), 1..n\n)), "weightedSum");
$ re = sql(selects, t);
$ re;
< select flatten(dot(matrix(col0, col1, col2, col3, col4, col5, col6, col7, col8, col9), [0.1,0.2,0.3,0.4,0.5,0.6,0.7,0.8,0.9,1])) as weightedSum from tc0ccbd6a00000000 >
$ re.eval();
weightedSum |
---|
2.1239 |
3.5725 |
3.5009 |
3.3487 |
2.8268 |
1.9753 |
2.4287 |
2.9222 |
3.0469 |
2.9751 |
2.8848 |
2.993 |
2.7185 |
2.2578 |
2.6231 |
2.3871 |
3.0311 |
2.0183 |
2.897 |
2.5982 |
Use Metaprogramming in reporting
We can use function format for reporting. It has 2 arguments: a table column and the corresponding format string. The format string follows number and datetime formatting in Java. We can use 6 symbols (0/#/,/./%/E) to represent the format of a number and 9 symbols (y/M/d/H/h/m/s/S/n) to represent the format of a temporal object. For examples:
Input |
Format |
Output |
---|---|---|
2012.12.05T15:30:59.125 |
yyyy-MM-dd HH:mm |
2012-12-05 15:30 |
2012.12.05 |
MMMddyyyy |
DEC052012 |
0.1356 |
0.0% |
13.6% |
1234567.42 |
#,###.0 |
1,234,567.4 |
1234567.42 |
0.00####E0 |
1.234567E6 |
The example below generates a report dynamically given a table, a list of column names, and a list of column formats. Metaprogramming functions sqlCol
, sqlColAlias
, sql
, and makeCall
are used to generate a SQL statement and then function eval
is called to generate a report. makeCall
produces a piece of code that makes a function call. The first parameter is a function and the other parameters are required arguments.
$ def generateReport(tbl, colNames, colFormat){
$ colCount = colNames.size()
$ colDefs = array(ANY, colCount)
$ for(i in 0:colCount){
$ if(colFormat[i] == "")
$ colDefs[i] = sqlCol(colNames[i])
$ else
$ colDefs[i] = sqlCol(colNames[i], format{,colFormat[i]})
$ }
$ return sql(colDefs, tbl).eval()
$ }
Generate a sample table with 100 rows and 4 columns (id, date, price, and qty) and then call function generateReport
to generate a report.
$ t = table(1..100 as id, (1..100 + 2018.01.01) as date, rand(100.0, 100) as price, rand(10000, 100) as qty);
$ t;
id |
date |
price |
qty |
---|---|---|---|
1 |
2018.01.02 |
61.483376 |
8733 |
2 |
2018.01.03 |
21.254616 |
8365 |
3 |
2018.01.04 |
37.408301 |
444 |
4 |
2018.01.05 |
70.608384 |
9944 |
5 |
2018.01.06 |
80.361811 |
7185 |
… |
… |
… |
… |
$ generateReport(t, ["id", "date","price","qty"], ["0", "MM/dd/yyyy", "0.00", "#,###"]);
id |
date |
price |
qty |
---|---|---|---|
1 |
01/02/2018 |
61.48 |
8,733 |
2 |
01/03/2018 |
21.25 |
8,365 |
3 |
01/04/2018 |
37.41 |
444 |
4 |
01/05/2018 |
70.61 |
9,944 |
5 |
01/06/2018 |
80.36 |
7,185 |
… |
… |
… |
… |
We can use metaprogramming to generate a filtering condition (i.e., where
clause in a SQL statement).
$ def generateReportWithFilter(tbl, colNames, colFormat, filter){
$ colCount = colNames.size()
$ colDefs = array(ANY, colCount)
$ for(i in 0:colCount){
$ if(colFormat[i] == "")
$ colDefs[i] = sqlCol(colNames[i])
$ else
$ colDefs[i] = sqlCol(colNames[i], format{,colFormat[i]})
$ }
$ return sql(colDefs, tbl, filter).eval()
$ }
$ generateReportWithFilter(t, ["id","date","price","qty"], ["","MM/dd/yyyy", "00.00", "#,###"], < id>10 and price<20 >);
id |
date |
price |
qty |
---|---|---|---|
11 |
01/12/2018 |
11.21 |
7,033 |
18 |
01/19/2018 |
09.97 |
6,136 |
29 |
01/30/2018 |
06.33 |
3,834 |
31 |
02/01/2018 |
05.52 |
6,851 |
38 |
02/08/2018 |
14.55 |
7,482 |