isDuplicated

Syntax

isDuplicated(X, [keep=FIRST])

Details

Return a vector or a tuple of vectors of Boolean values. If an element has no duplicate values, it returns 0.

  • If keep=FIRST, the first duplicate value returns 0 while all other duplicate values return 1.

  • If keep=LAST, the last duplicate value returns 0 while all other duplicate values return 1.

  • If keep=NONE, all duplicate values return 1.

Parameters

X is a vector or a tuple of vectors of same length.

keep can take the value of FIRST, LAST or NONE. It indicates how the system processes duplicate values. The default value is FIRST.

Returns

  • If X is a vector, the result is a Boolean vector with the same length as X.
  • If X is a tuple, the result is a Boolean vector with the same length as each element in X.

Examples

v = [1,3,1,-6,NULL,2,NULL,1]
isDuplicated(v,FIRST);
// output: [false,false,true,false,false,false,true,true]
// 1 appears three time in v, the locations are the 0th, the 2nd and the 7th. Since keep = FIRST, the 0th result is set to false, and the 2nd and 7th results are set to true.

v = [1,3,1,-6,NULL,2,NULL,1]
isDuplicated(v,LAST);
// output: [true,false,true,false,true,false,false,false]
// 1 appears three time in v, the locations are the the 0th, the 2nd and the 7th. Since keep = LAST, the 7th result is set to false, and the 0th and 2nd results are set to true.

v = [1,3,1,-6,NULL,2,NULL,1]
isDuplicated(v,NONE);
// output: [true,false,true,false,true,false,true,true]
// 1 appears three time in v, the locations are the 0th, the 2nd and the 7th. Since keep = NONE, the 0th, the 2nd and the 7th result are all set to true.

To delete duplicate records from a table:

t=table(1 2 4 8 4 2 7 1 as id, 10 20 40 80 40 20 70 10 as val);
t;
id val
1 10
2 20
4 40
8 80
4 40
2 20
7 70
1 10
select * from t where isDuplicated([id,val],FIRST)=false;
// Only keep the first duplicated record and delete the others.
id val
1 10
2 20
4 40
8 80
7 70

Since version 2.00.13/3.00.1, the isDuplicated function supports the BLOB data.

a=[blob("s1"), blob("s2")]
isDuplicated(a)
// output: [false, false]
a1=[blob("s1"), blob("s2"), blob("s1"), blob("s2")]
isDuplicated(a1)
// output: [false, false, true, true]