repartitionDS

Syntax

repartitionDS(query, [column], [partitionType], [partitionScheme], [local=true])

Arguments

query is metacode of SQL statements or a tuple of metacode of SQL statements.

column (optional) is a string indicating a column name in query. Function repartitionDS splits the data into data sources based on this column.

partitionType (optional) is the partition type. It can be VALUE or RANGE.

partitionScheme (optional) is a vector indicating the partitioning scheme. For details please refer to DistributedComputing.

local (optional) is a Boolean value indicating whether to move the data sources to the local node for computing. The default value is true.

Details

Generate a tuple of data sources from a table with a new partitioning design.

If query is metacode of SQL statements, the parameter column must be specified. partitionType and partitionScheme can be omitted for a partitioned table with a COMPO domain. In this case, the data sources are determined by the original partitionType and partitionScheme of column.

If query is a tuple of metacode of SQL statements, column, partitionType and partitionScheme should be left unspecified. The function returns a tuple with the same length as query. Each element of the result is a data source corresponding to one piece of metacode in query.

Examples

n=1000000
ID=rand(100, n)
dates=2017.08.07..2017.08.11
date=rand(dates, n)
x=rand(10.0, n)
t=table(ID, date, x)

dbDate = database(, VALUE, 2017.08.07..2017.08.11)
dbID = database(, RANGE, 0 50 100)
db = database("dfs://compoDB", COMPO, [dbDate, dbID])
pt = db.createPartitionedTable(t, `pt, `date`ID)
pt.append!(t);

Example 1. query is metacode of SQL statements. partitionType and partitionScheme are specified. As the output shows, each range includes its lower bound and excludes its upper bound.

repartitionDS(<select * from pt>,`date,RANGE,2017.08.07 2017.08.09 2017.08.11);
// output
[DataSource< select [4] * from pt where date >= 2017.08.07,date < 2017.08.09 >,DataSource< select [4] * from pt where date >= 2017.08.09,date < 2017.08.11 >]

Example 2. query is metacode of SQL statements. partitionType and partitionScheme are unspecified.

repartitionDS(<select * from pt>,`ID);
// output
[DataSource< select [4] * from pt [partition = */0_50] >,DataSource< select [4] * from pt [partition = */50_100] >]

Example 3. query is a tuple of metacode of SQL statements.

repartitionDS([<select * from pt where id between 0:50>,<select * from pt where id between 51:100>]);
// output
[DataSource< select [4] * from pt where id between 0 : 50 >,DataSource< select [4] * from pt where id between 51 : 100 >]
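
Example 4. A minimal sketch of how the returned data sources might be consumed. It is not part of the original examples. It assumes the table pt created above and that the data sources are passed to DolphinDB's mr function with a map function and a reduce function; the variable names ds and total are illustrative only.

// build data sources from the original RANGE partitions of ID (same call as Example 2)
ds = repartitionDS(<select * from pt>, `ID)
// map: sum column x within each data source; reduce: add the partial sums
total = mr(ds, def(t): sum(t.x), +)

Because x is generated randomly, the value of total differs from run to run, so no output is shown here.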