latestKeyedStreamTable

Syntax

latestKeyedStreamTable(keyColumns, timeColumn, [X1], [X2], .....)

or

latestKeyedStreamTable(keyColumns, timeColumn, capacity:size, colNames, colTypes)

Arguments

keyColumns is a string scalar or vector indicating the names of the primary key columns. The primary key columns must be of INTEGRAL, TEMPORAL, LITERAL, or FLOATING types.

latestKeyedStreamTable has an additional parameter timeColumncompared to the function keyedStreamTable.

timeColumn specifies the time column(s) and can be either:

  • A string indicates a single column of integral or temporal type, or
  • A two-element vector indicates two columns that combine to form a unique timestamp: a DATE column and a TIME, SECOND, or NANOTIME column.

For the first scenario: X, X1, .... are vectors.

For the second scenario:

capacity is the amount of memory (in terms of the number of rows) allocated to the table. When the number of rows exceeds capacity, the system will first allocate memory of 1.2-2 times of capacity, copy the data to the new memory space, and release the original memory. For large tables, these steps may use significant amount of memory.

size can be 0 or 1, indicating the initial size (in terms of the number of rows) of the table. If size=0, create an empty table; If size = 1, create a table with one record, and the initialized values are:

  • false for Boolean type;

  • 0 for numeric, temporal, IPADDR, COMPLEX, and POINT types;

  • NULL for Literal, INT128 types.

colNames is a string vector of column names.

colTypes is a string vector of data types. As of 2.00.11.2, the non-key columns can be specified as array vectors.

Details

Create a keyed stream table with one or more columns serving as the primary key. Compared to the keyedStreamTable, latestKeyedStreamTablemaintains the most up-to-date record for each unique primary key based on a time column. When a new record arrives, the system compares its primary key to existing records in memory:

  • If a match is found, check the timestamps:
    • If the new record's timestamp is more recent, it is inserted and replace the existing record.
    • If not, the existing record remains unchanged.
  • If no matching primary key is found, add the new record to the table. In cases where multiple new records with the same key are written simultaneously, only the record with the most recent timestamp is inserted.

Examples

Example 1: Specifying a single primary key column.

Method 1:

id = `A`B`C`D`E
x = 1 2 3 4 5
timeCol = 2024.09.10T00:00:00.001+0..4
t1 = latestKeyedStreamTable(`id, `timeCol, id, x, timeCol)
t1;

Output:

id x timeCol
A 1 2024.09.10T00:00:00.001
B 2 2024.09.10T00:00:00.002
C 3 2024.09.10T00:00:00.003
D 4 2024.09.10T00:00:00.004
E 5 2024.09.10T00:00:00.005

Method 2:

t2=latestKeyedStreamTable(`id, `timeCol, 100:0, `id`x`timeCol, [INT,INT,TIMESTAMP])
insert into t2 values(1 2 3, 10 20 30, [2024.09.10T00:00:00.001,
2024.09.10T00:00:00.002, 2024.09.10T00:00:00.003])

t2

Output:

id x timeCol
1 10 2024.09.10T00:00:00.001
2 20 2024.09.10T00:00:00.002
3 30 2024.09.10T00:00:00.003

Insert new rows to table t1.

insert into t1 values(`D`E`F, 6 7 8, [2024.09.10T00:00:00.005,
 2024.09.10T00:00:00.005, 2024.09.10T00:00:00.005])
t1

Output:

id x timeCol
A 1 2024.09.10T00:00:00.001
B 2 2024.09.10T00:00:00.002
C 3 2024.09.10T00:00:00.003
D 4 2024.09.10T00:00:00.004
E 5 2024.09.10T00:00:00.005
D 6 2024.09.10T00:00:00.005
F 8 2024.09.10T00:00:00.005

We can see from the output table that new rows are inserted into table t1 based on their id and timestamp. Row with id=D is inserted due to its newer timestamp than the existing D record. Row with id=E is discarded because it has the same timestamp as the existing E record. Row with id=F is inserted since there's no existing F record in the table.

Example 2: Specifying multiple primary key columns.

t3 = latestKeyedStreamTable(`id`x, `timeCol, id, x, timeCol)
insert into t3 values(`D`E, 4 5, [2024.09.10T00:00:00.004, 2024.09.10T00:00:00.005])
insert into t3 values(`D`F, 6 7, [2024.09.10T00:00:00.004, 2024.09.10T00:00:00.005])
t3

Output:

id x timeCol
A 1 2024.09.10T00:00:00.001
B 2 2024.09.10T00:00:00.002
C 3 2024.09.10T00:00:00.003
D 4 2024.09.10T00:00:00.004
E 5 2024.09.10T00:00:00.005
D 6 2024.09.10T00:00:00.004
F 7 2024.09.10T00:00:00.005

Example 3: Specifying two time columns.

id = `A`B`C`D`E
dateCol = take(2024.09.10, 5)
timeCol = 00:00:00.001+0..4
t4 = latestKeyedStreamTable(`id, `dateCol`timeCol, id, x, dateCol, timeCol)
t4

Output:

id x dateCol timeCol
A 1 2024.09.10 00:00:00.001
B 2 2024.09.10 00:00:00.002
C 3 2024.09.10 00:00:00.003
D 4 2024.09.10 00:00:00.004
E 5 2024.09.10 00:00:00.005

Insert new rows to table t4 based on their id and timestamp. The timestamp is created by combining dateCol and timeCol, i.e.,concatDateTime(dateCol, timeCol).

insert into t4 values(`D`E, 4 5, [2024.09.10,  2024.09.11], [00:00:00.004, 00:00:00.005]);
t4

Output:

id x dateCol timeCol
A 1 2024.09.10 00:00:00.001
B 2 2024.09.10 00:00:00.002
C 3 2024.09.10 00:00:00.003
D 4 2024.09.10 00:00:00.004
E 5 2024.09.10 00:00:00.005
E 5 2024.09.11 00:00:00.005