latestKeyedStreamTable
Syntax
latestKeyedStreamTable(keyColumns, timeColumn, [X1], [X2],
.....)
or
latestKeyedStreamTable(keyColumns, timeColumn, capacity:size, colNames,
colTypes)
Arguments
keyColumns is a string scalar or vector indicating the names of the primary key columns. The primary key columns must be of INTEGRAL, TEMPORAL, LITERAL, or FLOATING types.
latestKeyedStreamTable
has an additional parameter
timeColumncompared to the function
keyedStreamTable.
timeColumn specifies the time column(s) and can be either:
- A string indicates a single column of integral or temporal type, or
- A two-element vector indicates two columns that combine to form a unique timestamp: a DATE column and a TIME, SECOND, or NANOTIME column.
For the first scenario: X, X1, .... are vectors.
For the second scenario:
capacity is the amount of memory (in terms of the number of rows) allocated to the table. When the number of rows exceeds capacity, the system will first allocate memory of 1.2-2 times of capacity, copy the data to the new memory space, and release the original memory. For large tables, these steps may use significant amount of memory.
size can be 0 or 1, indicating the initial size (in terms of the number of rows) of the table. If size=0, create an empty table; If size = 1, create a table with one record, and the initialized values are:
-
false for Boolean type;
-
0 for numeric, temporal, IPADDR, COMPLEX, and POINT types;
-
NULL for Literal, INT128 types.
colNames is a string vector of column names.
colTypes is a string vector of data types. As of 2.00.11.2, the non-key columns can be specified as array vectors.
Details
Create a keyed stream table with one or more columns serving as the primary key.
Compared to the keyedStreamTable
,
latestKeyedStreamTable
maintains the most up-to-date record for
each unique primary key based on a time column. When a new record arrives, the
system compares its primary key to existing records in memory:
- If a match is found, check the timestamps:
- If the new record's timestamp is more recent, it is inserted and replace the existing record.
- If not, the existing record remains unchanged.
- If no matching primary key is found, add the new record to the table. In cases where multiple new records with the same key are written simultaneously, only the record with the most recent timestamp is inserted.
Examples
Example 1: Specifying a single primary key column.
Method 1:
id = `A`B`C`D`E
x = 1 2 3 4 5
timeCol = 2024.09.10T00:00:00.001+0..4
t1 = latestKeyedStreamTable(`id, `timeCol, id, x, timeCol)
t1;
Output:
id | x | timeCol |
---|---|---|
A | 1 | 2024.09.10T00:00:00.001 |
B | 2 | 2024.09.10T00:00:00.002 |
C | 3 | 2024.09.10T00:00:00.003 |
D | 4 | 2024.09.10T00:00:00.004 |
E | 5 | 2024.09.10T00:00:00.005 |
Method 2:
t2=latestKeyedStreamTable(`id, `timeCol, 100:0, `id`x`timeCol, [INT,INT,TIMESTAMP])
insert into t2 values(1 2 3, 10 20 30, [2024.09.10T00:00:00.001,
2024.09.10T00:00:00.002, 2024.09.10T00:00:00.003])
t2
Output:
id | x | timeCol |
---|---|---|
1 | 10 | 2024.09.10T00:00:00.001 |
2 | 20 | 2024.09.10T00:00:00.002 |
3 | 30 | 2024.09.10T00:00:00.003 |
Insert new rows to table t1.
insert into t1 values(`D`E`F, 6 7 8, [2024.09.10T00:00:00.005,
2024.09.10T00:00:00.005, 2024.09.10T00:00:00.005])
t1
Output:
id | x | timeCol |
---|---|---|
A | 1 | 2024.09.10T00:00:00.001 |
B | 2 | 2024.09.10T00:00:00.002 |
C | 3 | 2024.09.10T00:00:00.003 |
D | 4 | 2024.09.10T00:00:00.004 |
E | 5 | 2024.09.10T00:00:00.005 |
D | 6 | 2024.09.10T00:00:00.005 |
F | 8 | 2024.09.10T00:00:00.005 |
We can see from the output table that new rows are inserted into table t1 based on their id and timestamp. Row with id=D is inserted due to its newer timestamp than the existing D record. Row with id=E is discarded because it has the same timestamp as the existing E record. Row with id=F is inserted since there's no existing F record in the table.
Example 2: Specifying multiple primary key columns.
t3 = latestKeyedStreamTable(`id`x, `timeCol, id, x, timeCol)
insert into t3 values(`D`E, 4 5, [2024.09.10T00:00:00.004, 2024.09.10T00:00:00.005])
insert into t3 values(`D`F, 6 7, [2024.09.10T00:00:00.004, 2024.09.10T00:00:00.005])
t3
Output:
id | x | timeCol |
---|---|---|
A | 1 | 2024.09.10T00:00:00.001 |
B | 2 | 2024.09.10T00:00:00.002 |
C | 3 | 2024.09.10T00:00:00.003 |
D | 4 | 2024.09.10T00:00:00.004 |
E | 5 | 2024.09.10T00:00:00.005 |
D | 6 | 2024.09.10T00:00:00.004 |
F | 7 | 2024.09.10T00:00:00.005 |
Example 3: Specifying two time columns.
id = `A`B`C`D`E
dateCol = take(2024.09.10, 5)
timeCol = 00:00:00.001+0..4
t4 = latestKeyedStreamTable(`id, `dateCol`timeCol, id, x, dateCol, timeCol)
t4
Output:
id | x | dateCol | timeCol |
---|---|---|---|
A | 1 | 2024.09.10 | 00:00:00.001 |
B | 2 | 2024.09.10 | 00:00:00.002 |
C | 3 | 2024.09.10 | 00:00:00.003 |
D | 4 | 2024.09.10 | 00:00:00.004 |
E | 5 | 2024.09.10 | 00:00:00.005 |
Insert new rows to table t4 based on their id and timestamp. The timestamp is created
by combining dateCol and timeCol, i.e.,concatDateTime(dateCol,
timeCol)
.
insert into t4 values(`D`E, 4 5, [2024.09.10, 2024.09.11], [00:00:00.004, 00:00:00.005]);
t4
Output:
id | x | dateCol | timeCol |
---|---|---|---|
A | 1 | 2024.09.10 | 00:00:00.001 |
B | 2 | 2024.09.10 | 00:00:00.002 |
C | 3 | 2024.09.10 | 00:00:00.003 |
D | 4 | 2024.09.10 | 00:00:00.004 |
E | 5 | 2024.09.10 | 00:00:00.005 |
E | 5 | 2024.09.11 | 00:00:00.005 |