delete

Delete existing records in a table.

It can be used not only for an in-memory table, but also for a dfs table(a distributed table or a dimension table). Please note that if the data in a dfs table needs to be deleted, the whole partition which includes the data will be updated after deletion; If the data in a dimension table needs to be updated, the whole table will be updated after deletion. Therefore, the operation to delete a dfs table is only suitable for low-frequency deletion tasks such as minute-level deletion tasks, and not suitable for high-frequency deletion tasks, such as millisecond-level deletion tasks.

The deletion takes a multi-version approach and supports transactions. The system will create a new version to store the new data. Before committing the transaction, other SQL statements still can access the data in the old version. If data in multiple partitions needs to be deleted, as long as the deletion on one partitions fails, the system will roll back all modifications on each partition.

The delete clause supports the map sub-clause in version 1.30.13 and above, which means that the deletion operation will be be executed in each partition separately. If functions such as isDuplicated, first, firstNot, etc. whose results are sensitive to the order of the rows can be used in the where clause of a SQL statement on multiple partitions only if the map clause is used.

Syntax

(1) Delete records from a table

 delete from table_name
    [where condition(s)];

If where conditions are not used, the SQL statement deletes all records from the table.

(2) Delete records from a table, based on conditions involving another table.

delete table_name 
  from table_joiner(table_names)
  [where condition(s)];

where

  • table_joiner can only be ej or lj.

  • table_name must be the left table.

If where conditions are not used, the SQL statement deletes all matched records from the left table.

Note:

  • For nested joins, only the outermost join can involve DFS tables.

  • When joining an in-memory table (left) with a DFS table (right), the right table is loaded into memory.

  • If chunkGranularity of a database is set to "CHUNK", a partitioned table of the database can only be the left table for joining.

  • Joining tables from two different databases is not supported.

Examples

t = table(1 1 1 2 2 2 3 3 3 3 as id, 1..10 as x);
t;
id x
1 1
1 2
1 3
2 4
2 5
2 6
3 7
3 8
3 9
3 10
delete from t where id=1;
t;
id x
2 4
2 5
2 6
3 7
3 8
3 9
3 10
delete from t where id=3, x>8;
t;
id x
2 4
2 5
2 6
3 7
3 8
delete from t;
t;
id x

Example 2. Delete the data in a dfs table

login(`admin, `123456)
n=1000000
ID=rand(10, n)
x=rand(1.0, n)
t=table(ID, x)
db=database("dfs://rangedb124", RANGE,  0 5 10)
pt=db.createPartitionedTable(t, `pt, `ID)$ pt.append!(t)
select count(*) from pt;1000000
delete from pt where ID=5;
select count(*) from pt;

Example 3. To delete data in distributed table, and the result of the function in where clause is sensitive to the order of the rows, the map clause must be used which mean the operation will be executed separately in the partition.

n=10000$ ID=take( 0..4, n)
date=take(2017.08.07..2017.11.11, n)
ts=rand(timestamp(1..n),n)
int=take(1..50,n)
str=rand(string(1..n),n)
sym=rand(symbol(string(1..n)),n)
x=rand(10.0, n)
t=table(ID, date,ts,int,str,sym, x)
if(existsDatabase("dfs://compoDB")){
dropDatabase("dfs://compoDB")
}
db = database("dfs://compoDB", VALUE,2017.08M..2017.10M)
pt = db.createPartitionedTable(t, `pt, `date)$ pt.append!(t);
delete from pt where isDuplicated([ID,int])=false map;

Example 4. Using delete with join

t1 = table(1..5 as id, [1,2,2,1,1] as flag)
id flag
1 1
2 2
3 2
4 1
5 1
t2 = table(3..7 as id, [100,200,100,150,100] as profit)
id profit
3 100
4 200
5 100
6 150
7 100

Delete records from t2 that have a matching id in both t1 and t2, where the flag in t1 is equal to 1.

delete t2 from ej(t2, t1, `id) where flag=1 
t2
id profit
3 100
6 150
7 100