dropPartition

Syntax

dropPartition(dbHandle, partitionPaths, tableName, [forceDelete=false], [deleteSchema=false])

Arguments

dbHandle is a DolphinDB database handle.

partitionPaths can be specified in two ways:
  • By path: partitionPaths is a STRING scalar/vector indicating the path of one or multiple partitions. Each string must start with "/". For composite partitions, the path must include all partition levels.

  • By condition: partitionPaths is a scalar or vector indicating the value(s) in the partitioning column. The system will drop all partitions containing these values. For composite partitions, partitionPaths is a tuple where each element is a filtering condition for each partition level (starting from the first level). If you do not want to apply filtering at a certain partition level, leave the corresponding element empty.

tableName is a string indicating a table name. It can be left empty if the database chunk granularity is at DATABASE level (i.e., database: chunkGranularity = 'DATABASE'). Otherwise, it is a required parameter.

forceDelete (optional) is a Boolean value. If set to true, the specified partition(s) will be deleted even if the partition(s) is recovering. The default value is false.

Note: When using the dropPartition function with forceDelete=false, the number of available replicas for the chunks involved in the transaction must be greater than or equal to the configured dfsReplicationFactor.
deleteSchema (optional) is a Boolean value. The default value is false, indicating that only the data in the selected partitions will be deleted, but the partition schema (which you can check with schema().partitionSchema) is kept. When the following conditions are satisfied, you can delete the schema of the selected partitions along with the partition data by setting deleteSchema to true:
  • There's only one table in the database.
  • The partitioning type of the database is VALUE.
  • For composite partitions, the first level of partitioning type must be VALUE, and only the first level of partitions are selected for deletion.

Details

Delete data from one or multiple partitions from a DFS database.

Please note that dropPartition only deletes data from selected partitions. It does not change the partitioning scheme. We do not need to reestablish these partitions if we need to append new data to them.

If tableName is specified: delete one or multiple partitions of the given table.

If tableName is not specified: delete one or multiple partitions of all tables with this partition.

Examples

The script should be executed on a data node or compute node of a cluster.

n=1000000
ID=rand(150, n)
dates=2017.08.07..2017.08.11
date=rand(dates, n)
x=rand(10.0, n)
t=table(ID, date, x)
dbDate = database(, VALUE, 2017.08.07..2017.08.11)
dbID = database(, RANGE, 0 50 100 150)
db = database("dfs://compoDB", COMPO, [dbDate, dbID])
pt = db.createPartitionedTable(t, `pt, `date`ID)
pt.append!(t);

The script above created a database with composite partition. The first level is a value partition with partitioning column of date, and the second level is a range partition with partitioning column of ID.

Example 1. Delete one partition

Use either of the following ways to delete the partition "/20170807/0_50".

(1) Specify the partition path.

dropPartition(db,"/20170807/0_50");

(2) Specify the filtering condition.

dropPartition(db,[2017.08.07, 0]);

Here 0 means the partition of [0, 50). We can choose any number from 0 to 49 to represent this partition.

Example 2. Delete a first level partition

Use either of the following ways to delete the first level partition of 2017.08.08.

(1) Specify the path of all partitions under 2017.08.08.

partitions=["/20170808/0_50","/20170808/50_100","/20170808/100_150"]
dropPartition(db,partitions);

(2) Specify the filtering condition.

dropPartition(db, 2017.08.08, tableName=`pt);

After the deletion, check the table partitioning scheme with schema

schema(db);
// output
partitionSchema->([2017.08.11,2017.08.10,2017.08.09,2017.08.08,2017.08.07],[0,50,100,150])
partitionSites->
partitionTypeName->[VALUE,RANGE]
atomic->TRANS
databaseDir->dfs://compoDB
chunkGranularity->TABLE
engineType->OLAP
partitionType->[1,2]

We can see that "2017.08.08" is still in partitionSchema as dropPartition only deleted the data from this partition but kept its schema.

Example 3. Delete a second level partition

Use either of the following ways to delete the second level partition of [0,50).

(1) Specify the path of all partitions of [0,50).

partitions=["/20170807/0_50","/20170808/0_50","/20170809/0_50","/20170810/0_50","/20170811/0_50"]
dropPartition(db,partitions);

(2) Specify the filtering condition.

dropPartition(db,[,[0]]);

Example 4. Delete multiple same level partitions

To delete the second level partitions of [0,50) and [100,150):

dropPartition(db,[,[0,100]]);

Example 5: Modify data in a distributed table on disk

To revise data in a distributed table, we need to update the entire partitions that the rows to be updated belong to. The following example adds 10 to column x of the rows with date=2017.08.10 and ID=88 in the distributed table pt.

(1) First load the data of the partition with date=2017.08.10 and ID=88 into memory.

tmp=select * from loadTable("dfs://compoDB","pt") where date=2017.08.10 and 50<=ID<100 ;

(2) Then add 10 to column x of table tmp:

update tmp set x=x+10 where date=2017.08.10 and ID=88;

(3) Delete data in the relevant partition:

dropPartition(db,"/20170810/50_100",`pt);

(4) Append table tmp to table pt:

pt.append!(tmp);