keyedStreamTable
Syntax
keyedStreamTable(keyColumn, X, [X1], [X2], ...)
or
keyedStreamTable(keyColumn, capacity:size, colNames, colTypes)
Arguments
keyColumn is a STRING scalar or vector specifying the name(s) of the primary key column(s). Each key column must be of INTEGRAL, TEMPORAL, LITERAL or FLOATING type.
X, X1, X2, ... (first syntax) supply the data for the table columns. When Xk is a tuple:
- If the elements of Xk are vectors of equal length, each element of the tuple is treated as a column in the table.
- If Xk contains elements of different types or unequal lengths, it will be treated as a single column in the table (with the column type set to ANY), and each element will correspond to the value of that column in each row.
- capacity is a positive integer indicating the amount of memory (in number of rows) allocated to the table. When the number of rows exceeds capacity, the system first allocates new memory of 1.2 to 2 times the capacity, copies the data to the new memory space, and then releases the original memory. For large tables, these steps may use a significant amount of memory.
- size is a non-negative integer indicating the initial size (in number of rows) of the table. If size=0, an empty table is created. If size>0, the rows are initialized with the following values:
- false for Boolean type;
- 0 for numeric, temporal, IPADDR, COMPLEX, and POINT types;
- NULL for LITERAL and INT128 types.
Note: If colTypes is an array vector, size must be 0.
- colNames is a STRING vector of column names.
- colTypes is a string vector of data types. The non-key columns can be specified as an array vector type or ANY type.
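As a minimal sketch of the second syntax, the script below reserves memory for 100,000 rows and starts with an empty table. size is 0 here because one non-key column is an array vector. The table name, column names and the capacity figure are illustrative assumptions, not values from this reference.

```dolphindb
// Hypothetical table: composite key (sym, ts); bids is a non-key array vector column.
// capacity=100000 reserves memory for 100,000 rows; size=0 creates an empty table.
quotes = keyedStreamTable(`sym`ts, 100000:0, `sym`ts`bids, [SYMBOL, TIMESTAMP, DOUBLE[]])

// Inspect the resulting column names and types.
schema(quotes).colDefs
```

Choosing a capacity close to the expected number of rows avoids the reallocate-and-copy step described for capacity above.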
Details
This function creates a stream table with one or more columns serving as the primary key. It implements idempotent writes to prevent duplicate primary key insertions due to network issues or high-availability writes.
When new records are inserted into a keyed stream table, the system checks their primary key values:
- If the primary key of a new record is identical to an existing one in memory, the new record is not inserted, and the existing record remains unchanged.
- If multiple new records with the same primary key (different from those in memory) are written simultaneously, only the first record is successfully inserted.
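The two rules above can be sketched with a small script. The table and values are hypothetical. The first insert carries the key id=1 twice within one batch, and the second insert re-sends the same batch, as a client might do when retrying after a timeout.

```dolphindb
// Hypothetical table keyed on id.
t = keyedStreamTable(`id, 100:0, `id`x, [INT, INT])

// The batch contains id=1 twice; per the rule above, only the first record (1, 10) is kept.
insert into t values(1 1 2, 10 11 20)

// Re-sending the same batch: every key already exists in memory, so no row is added or changed.
insert into t values(1 1 2, 10 11 20)

select * from t
```

According to the rules above, t should then hold two rows, (1, 10) and (2, 20), no matter how many times the batch is re-sent.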
Note: Primary key uniqueness is enforced only against data held in memory. If persistence is enabled for the keyed stream table, only a limited number of records are kept in memory and older data is persisted to disk, so the primary key of an incoming record may duplicate one that is already on disk.
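For context, a hedged sketch of how such a table might be shared and persisted with enableTableShareAndPersistence. It assumes that a persistence directory (persistenceDir) is configured on the node. The shared table name and the cacheSize value are illustrative choices only.

```dolphindb
// Assumption: persistenceDir is configured on this node; all names are illustrative.
t = keyedStreamTable(`id, 100000:0, `id`x, [INT, INT])

// Share the table and persist it to disk. cacheSize limits how many rows stay in memory,
// so the key check described in the note only sees those in-memory rows.
enableTableShareAndPersistence(table=t, tableName=`keyedDemo, cacheSize=1000000)
```

With a setup like this, a late retry whose key has already moved out of memory would not be recognized as a duplicate, so downstream consumers may still need their own deduplication for long-delayed retries.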
Examples
Example 1
id=`A`B`C`D`E
x=1 2 3 4 5
t1=keyedStreamTable(`id, id, x)
t1;
id | x |
---|---|
A | 1 |
B | 2 |
C | 3 |
D | 4 |
E | 5 |
Example 2
t2=keyedStreamTable(`id,100:0,`id`x, [INT,INT])
insert into t2 values(1 2 3,10 20 30);
t2;
id | x |
---|---|
1 | 10 |
2 | 20 |
3 | 30 |
If a new row has the same primary key value as an existing row, it is not inserted. In the following batch, id=3 already exists, so only the rows with id=4 and id=5 are added:
insert into t2 values(3 4 5,35 45 55)
t2;
id | x |
---|---|
1 | 10 |
2 | 20 |
3 | 30 |
4 | 45 |
5 | 55 |
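The record with id=3 has not been overwritten.
The primary key can also consist of multiple columns. In the following example, only the record whose (sym, id) combination already exists, (A, 5), is discarded: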
t=keyedStreamTable(`sym`id,1:0,`sym`id`val,[SYMBOL,INT,DOUBLE])
insert into t values(`A`B`C`D`E,5 4 3 2 1,52.1 64.2 25.5 48.8 71.9);
insert into t values(`A`B`R`T`Y,5 8 3 2 1,152.3 164.6 125.5 148.8 171.6);
t;
sym | id | val |
---|---|---|
A | 5 | 52.1 |
B | 4 | 64.2 |
C | 3 | 25.5 |
D | 2 | 48.8 |
E | 1 | 71.9 |
B | 8 | 164.6 |
R | 3 | 125.5 |
T | 2 | 148.8 |
Y | 1 | 171.6 |