# DStream::snapshotJoinEngine {#DStream_snapshotJoinEngine}


## Syntax {#syntax}

`DStream::snapshotJoinEngine(rightStream, metrics, matchingColumn, [timeColumn], [keepLeftDuplicates=false], [keepRightDuplicates=false], [isInnerJoin=true])`

## Details {#details}

Creates a snapshot join streaming engine. For details, see [createSnapshotJoinEngine](../c/createSnapshotJoinEngine.md).

**Return value**: A DStream object.

## Arguments {#arguments}

**rightStream** is a DStream object indicating the input data source of the right table.

**metrics** is metacode \(can be a tuple\) specifying the calculation formulas. For more information about metacode, refer to [Metaprogramming](../c/../../Programming/Metaprogramming/metaprogramming.md).

-   *metrics* can use one or more expressions, built-in or user-defined functions \(but not aggregate functions\).
-   *metrics* can include functions that return multiple values, in which case the output columns that hold the return values must be specified. For example, ``<func(price) as `col1`col2>``.

To specify a column that exists in both the left and the right tables, use the format *tableName.colName*. By default, the column from the left table is used.

**Note:** The column names specified in *metrics* are not case-sensitive, so their case does not have to match that of the column names in the input tables.
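As an illustration of the forms described above, the following sketch builds a *metrics* tuple that mixes a plain expression, a column reference, and a function returning multiple values. The function `priceBand` and the output column names `lo` and `hi` are hypothetical and exist only for this example.

```
// Hypothetical user-defined function that returns two values per input.
def priceBand(price){
    return price*0.9, price*1.1
}

// A tuple of metacode: one expression, one column, and one multi-value
// function whose two results are written to the columns lo and hi.
metrics = [<val*10>, <qty>, <priceBand(price) as `lo`hi>]
```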

**matchingColumn** is a STRING scalar/vector/tuple indicating the column\(s\) on which the tables are joined. It supports integral, temporal or literal \(except UUID\) types.

-   One column to match: if the column has the same name in both tables, *matchingColumn* is a STRING scalar; otherwise, it is a tuple of two elements. For example, if the column is named "sym" in the left table and "sym1" in the right table, then *matchingColumn* = \[\[\`sym\],\[\`sym1\]\].
-   Multiple columns to match: if every column to match has the same name in both tables, *matchingColumn* is a STRING vector; otherwise, it is a tuple of two elements. For example, if the columns are named "timestamp" and "sym" in the left table but "timestamp" and "sym1" in the right table, then *matchingColumn* = \[\[\`timestamp, \`sym\], \[\`timestamp,\`sym1\]\].
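The four shapes that *matchingColumn* can take, as described in the list above, can be written as follows. This is only a sketch; the column names are the ones used in the descriptions, and each assignment stands on its own.

```
// One matching column, same name in both tables: a STRING scalar.
matchingColumn = `sym

// One matching column, different names: a tuple of two elements,
// the left-table name first and the right-table name second.
matchingColumn = [[`sym], [`sym1]]

// Several matching columns, same names in both tables: a STRING vector.
matchingColumn = `timestamp`sym

// Several matching columns, at least one name differs: a tuple of two vectors.
matchingColumn = [[`timestamp, `sym], [`timestamp, `sym1]]
```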

**timeColumn** \(optional\) is a STRING scalar/vector indicating the name of the time column in the left and right tables. The two time columns must have the same data type. If the time column has the same name in both tables, *timeColumn* is a STRING scalar; otherwise, it is a vector of two strings giving the time column of each table.
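A minimal sketch of the two *timeColumn* forms follows. The names `leftTime` and `rightTime` are hypothetical, and listing the left table first is an assumption that mirrors the order used by *matchingColumn*.

```
// Same time column name in both tables: a single string.
timeColumn = `timestamp

// Different names: a vector of two strings, one per table.
// leftTime and rightTime are hypothetical column names; the left-first
// order is assumed, not confirmed by this page.
timeColumn = `leftTime`rightTime
```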

**keepLeftDuplicates** \(optional\) is a Boolean value indicating whether to match all records in each group of the left table. When set to false \(default\), the engine matches only the latest record in each group. When set to true, the engine matches all records in each group.

**keepRightDuplicates** \(optional\) is a Boolean value indicating whether to match all records in each group of the right table. When set to false \(default\), the engine matches only the latest record in each group. When set to true, the engine matches all records in each group.
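To make "the latest record in each group" concrete, the sketch below emulates the two settings on an ordinary in-memory table with plain SQL. It is not how the engine is implemented, and it assumes that "latest" means the last record appended for each key; the table `t` and its values are made up for the illustration.

```
// Sample left-table rows; (id, sym1) plays the role of the matching key.
t = table(2024.10.10T15:12:01.508 + 0..3 as timestamp, `a`b`a`b as sym1, [1, 1, 1, 1] as id, [101, 108, 107, 104] as val)

// Emulates keepLeftDuplicates=false: keep only the last row of each (id, sym1) group.
latestPerGroup = select * from t context by id, sym1 limit -1

// Emulates keepLeftDuplicates=true: every row of each group is kept.
allPerGroup = select * from t
```

The same reasoning applies to *keepRightDuplicates* and the right table.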

**isInnerJoin** \(optional\) is a Boolean value to determine whether an inner join or full outer join is performed.

-   If *isInnerJoin*=true \(default\), an inner join is performed. Results are only generated when matches are found between both tables.
-   If *isInnerJoin*=false, a full outer join is performed. Results are generated whether or not a match is found. For an unmatched record, the columns that come from the other table are filled with null values.
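The following sketch shows one way to try the full outer join behavior. It assumes the catalog "orca" already exists, as created in the Examples section, and it uses hypothetical graph and table names (`outerJoinEngine`, `leftOuter`, `rightOuter`, `outputOuter`) so that it does not collide with that example. The two appended records have different keys, so according to the description above each should produce a result whose columns from the other table are null.

```
use catalog orca   // assumes the catalog "orca" already exists

g = createStreamGraph('outerJoinEngine')
r = g.source("rightOuter", 1024:0, `timestamp`sym`qty, [TIMESTAMP, SYMBOL, DOUBLE])
g.source("leftOuter", 1024:0, `timestamp`sym`val, [TIMESTAMP, SYMBOL, DOUBLE])
    .snapshotJoinEngine(r, metrics=[<val>, <qty>], matchingColumn=`sym,
        timeColumn=`timestamp, isInnerJoin=false)
    .sink("outputOuter")
g.submit()
go

// One record per side, with keys that do not match each other.
appendOrcaStreamTable("leftOuter", table([2024.10.10T15:12:01.508] as timestamp, ["a"] as sym, [101.0] as val))
appendOrcaStreamTable("rightOuter", table([2024.10.10T15:12:01.509] as timestamp, ["b"] as sym, [208.0] as qty))

select * from orca_table.outputOuter
```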

## Examples {#examples}

``` {#codeblock_zq1_tms_c2c}
if (!existsCatalog("orca")) {
	createCatalog("orca")
}
go
use catalog orca


// If a stream graph with the same name already exists, destroy it first.
// dropStreamGraph('joinEngine')

g = createStreamGraph('joinEngine')
r = g.source("right", 1024:0, `timestamp`sym2`id`price`qty, [TIMESTAMP, SYMBOL, INT, DOUBLE, DOUBLE])
g.source("left", 1024:0, `timestamp`sym1`id`price`val, [TIMESTAMP, SYMBOL, INT, DOUBLE, DOUBLE])
    .snapshotJoinEngine(r, metrics=[<val*10>, <qty>], matchingColumn=[["id","sym1"],["id","sym2"]],
        timeColumn=`timestamp, isInnerJoin=true, keepLeftDuplicates=true, keepRightDuplicates=true)
    .sink("output")
g.submit()
go

timestamp = 2024.10.10T15:12:01.507+1..10
sym = take(["a","b","c","d"],10)
id = [1,1,2,1,5,2,4,4,1,4]
price = [2.53,7.61,8.07,7.87,7.29,9.39,5.98,9.49,9.20,9.17]
val = [101,108,101,109,104,100,108,100,107,104]
tmp1 = table(timestamp as timestamp,sym as sym1,id as id,price as price,val as val)
appendOrcaStreamTable("left", tmp1)

id = [1,2,4,3,5,5,4,2,5,5]
price =  [1.08,9.08,9.97,7.60,1.91,6.77,7.81,8.81,0.61,5.92]
qty =  [208,200,203,202,204,201,206,207,205,205]
tmp2 = table(timestamp as timestamp,sym as sym2,id as id,price as price,qty as qty)
appendOrcaStreamTable("right", tmp2)

select * from orca_table.output
```

|id|sym1|timestamp|right\_timestamp|val\_mul|qty|
|---|----|---------|----------------|--------|---|
|1|a|2024.10.10 15:12:01.508|2024.10.10 15:12:01.508|1,010|208|
|5|a|2024.10.10 15:12:01.512|2024.10.10 15:12:01.512|1,040|204|
|5|a|2024.10.10 15:12:01.512|2024.10.10 15:12:01.516|1,040|205|
|2|b|2024.10.10 15:12:01.513|2024.10.10 15:12:01.509|1,000|200|
|4|c|2024.10.10 15:12:01.514|2024.10.10 15:12:01.510|1,080|203|
|4|c|2024.10.10 15:12:01.514|2024.10.10 15:12:01.514|1,080|206|
|1|a|2024.10.10 15:12:01.516|2024.10.10 15:12:01.508|1,070|208|

