multiTableRepartitionDS
Syntax
multiTableRepartitionDS(query, [column], [partitionType], [partitionScheme], [local=true])
Arguments
query is the metacode of a SQL statement, or a tuple of metacode of SQL statements.
column is a string indicating a column name in query. multiTableRepartitionDS delineates data sources based on column.
partitionType is the partition type. It can be VALUE or RANGE.
partitionScheme is a vector indicating the partitioning scheme. For details please refer to DistributedComputing.
local is a Boolean value indicating whether to move the data sources to the local node for computing. The default value is true.
Details
Generate a tuple of data sources from multiple tables with a new partitioning design.
If query is metacode of SQL statements, the parameter column must be specified. partitionType and partitionScheme can be unspecified for a partitioned table with a COMPO domain. In this case, the data sources will be determined based on the original partitionType and partitionScheme of column.
If query is a tuple of metacode of SQL statements, column, partitionType and partitionScheme should be unspecified. The function returns a tuple with the same length as query. Each element of the result is a data source corresponding to a piece of metacode in query.
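As a minimal sketch of the first case described above, the call below passes a single piece of metacode together with column only. It assumes pt2 is the COMPO-partitioned table created in the Examples section, so the data sources would follow the table's original partitioning on sym. The variable name dsSym is illustrative.

```dolphindb
// Assumes pt2 is the COMPO-partitioned table (by date and sym) built in the Examples section.
pt2 = loadTable("dfs://compo", "pt2")

// Single metacode query: column is required. partitionType and partitionScheme
// are omitted, so the original partitioning of sym is expected to be reused.
dsSym = multiTableRepartitionDS(<select * from pt2>, `sym)

// Number of data sources produced
size(dsSym)
```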
Examples
// pt1: DFS table in a database that is VALUE-partitioned by date
n=100000
date=rand(2019.06.01..2019.06.05,n)
sym=rand(`AAPL`MSFT`GOOG,n)
price=rand(1000.0,n)
t1=table(date,sym,price)
db=database("dfs://value",VALUE,2019.06.01..2019.06.05)
db.createPartitionedTable(t1,`pt1,`date).append!(t1);
// pt2: DFS table in a COMPO database (VALUE by date, then VALUE by sym)
n=100000
date=rand(2019.06.01..2019.06.05,n)
sym=rand(`AAPL`MSFT`GOOG,n)
price=rand(1000.0,n)
qty=rand(500,n)
t2=table(date,sym,price,qty)
db1=database("",VALUE,2019.06.01..2019.06.05)
db2=database("",VALUE,`AAPL`MSFT`GOOG)
db=database("dfs://compo",COMPO,[db1,db2])
db.createPartitionedTable(t2,`pt2,`date`sym).append!(t2);
// Load handles to both partitioned tables
pt1=loadTable("dfs://value","pt1")
pt2=loadTable("dfs://compo","pt2");
Example 1. Delineate data sources based on the original partitioning scheme. column, partitionType and partitionScheme are unspecified.
ds=multiTableRepartitionDS([<select * from pt1>,<select date,sym,price from pt2>]);
// output
(DataSource< select [7] * from pt1 [partition = /value/20190601] >,DataSource< select [7] * from pt1 [partition = /value/20190602] >, ...... ,DataSource< select [7] date,sym,price from pt2 [partition = /compo/20190605/GOOG] >,DataSource< select [7] date,sym,price from pt2 [partition = /compo/20190605/MSFT] >)
Example 2. Delineate data sources based on stock symbols.
ds=multiTableRepartitionDS([<select * from pt1>,<select date,sym,price from pt2>],`sym,VALUE,`AAPL`MSFT`GOOG);
// output
(DataSource< select [4] * from pt1 where sym == "AAPL" >,DataSource< select [4] * from pt1 where sym == "MSFT" >,DataSource< select [4] * from pt1 where sym == "GOOG" >,DataSource< select [4] date,sym,price from pt2 where sym == "AAPL" >,DataSource< select [4] date,sym,price from pt2 where sym == "MSFT" >,DataSource< select [4] date,sym,price from pt2 where sym == "GOOG" >)
Example 3. Delineate data sources based on dates.
ds=multiTableRepartitionDS([<select * from pt1>,<select date,sym,price from pt2>],`date,RANGE,2019.06.01 2019.06.03 2019.06.05);
// output
(DataSource< select [4] * from pt1 where date >= 2019.06.01,date < 2019.06.03 >,DataSource< select [4] * from pt1 where date >= 2019.06.03,date < 2019.06.05 >,DataSource< select [4] date,sym,price from pt2 where date >= 2019.06.01,date < 2019.06.03 >,DataSource< select [4] date,sym,price from pt2 where date >= 2019.06.03,date < 2019.06.05 >)
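The data sources returned in the examples above can be passed to a distributed computing function. The sketch below assumes the built-in mr function and the pt1 and pt2 tables defined earlier. The names countRows and total are illustrative, and the map and reduce steps shown are one possible choice rather than a prescribed usage.

```dolphindb
// Re-create the data sources from Example 2 (split by stock symbol).
ds = multiTableRepartitionDS([<select * from pt1>,<select date,sym,price from pt2>], `sym, VALUE, `AAPL`MSFT`GOOG)

// Map step (illustrative): count the rows in one data source.
def countRows(t){
    return rows(t)
}

// Reduce step: add up the partial counts from all data sources.
total = mr(ds, countRows, +)
```

Because each data source is processed independently in the map step, the same pattern applies to other per-partition calculations whose partial results can be combined by a reduce function.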
Related function: repartitionDS