N to N Multi-Table Replay

N-to-N replay writes multiple input tables to multiple output tables. Each output table corresponds to one input table and has the same schema as that input table.

Example 1

The following example demonstrates how to replay multiple tables to target tables at different rates.

  • Create the input and output tables, and write simulated data to the input tables:
    // Create the input table t1 and insert simulated data
    n=50000
    sym = rand(symbol(`IBM`APPL`MSFT`GOOG`GS),n)
    date = take(2012.06.12..2012.06.16,n)
    time = take(13:00:00.000..16:59:59.999,n)
    volume = rand(100,n)
    t1 = table(sym,date,time,volume).sortBy!([`date, `time])
    // Create the input table t2 and insert simulated data
    sym = rand(symbol(`IBM`APPL`MSFT`GOOG`GS),n)
    date = take(2012.06.12..2012.06.16,n)
    time = take(13:00:00.000..16:59:59.999,n)
    price = 100 + rand(10.0,n)
    t2 = table(sym,date,time,price).sortBy!([`date, `time])
    // Create output tables outTable1 and outTable2
    share streamTable(100:0,`sym`date`time`volume,[SYMBOL,DATE,TIME,INT]) as outTable1
    share streamTable(100:0,`sym`date`time`price,[SYMBOL,DATE,TIME,DOUBLE]) as outTable2
  • Replay 10,000 records per second. The two input tables hold 100,000 records in total, so the replay takes about 10 seconds:
    timer replay(inputTables=[t1,t2], outputTables=[outTable1, outTable2], dateColumn=`date, timeColumn=`time, replayRate=10000, absoluteRate=true)
    // Time elapsed: 10001.807 ms
  • Replay at 100,000 times the speed implied by the timestamps of the data. The difference between the start timestamp and the end timestamp in both input tables is 345,650 seconds, so the replay takes about 3.5 seconds:
    timer replay(inputTables=[t1,t2], outputTables=[outTable1, outTable2], dateColumn=`date, timeColumn=`time, replayRate=100000, absoluteRate=false)
    // Time elapsed: 3484.047 ms
  • Replay at the maximum speed:
    timer replay(inputTables=[t1,t2], outputTables=[outTable1, outTable2], dateColumn=`date, timeColumn=`time)
    // Time elapsed: 4.441 ms

Example 2

The following example uses the replayDS function to replay data stored in DFS tables. It reuses the tables t1, t2, outTable1, and outTable2 created in Example 1.

  • Write the input tables to a database:
    if(existsDatabase("dfs://test_stock")){
        dropDatabase("dfs://test_stock")
    }
    db=database("dfs://test_stock",VALUE,2012.06.12..2012.06.16)
    pt1=db.createPartitionedTable(t1,`pt1,`date).append!(t1)
    pt2=db.createPartitionedTable(t2,`pt2,`date).append!(t2)
  • Use the replayDS function to split each DFS table into multiple data sources:
    ds1=replayDS(sqlObj=<select sym, date, time, volume from pt1>,dateColumn=`date,timeColumn=`time,timeRepartitionSchema=[13:00:00.000, 14:00:00.000, 15:00:00.000, 16:00:00.000, 17:00:00.000])
    ds2=replayDS(sqlObj=<select sym, date, time, price from pt2>,dateColumn=`date,timeColumn=`time,timeRepartitionSchema=[13:00:00.000, 14:00:00.000, 15:00:00.000, 16:00:00.000, 17:00:00.000])
    // Check the number of data sources in ds1 (ds2 contains the same number)
    ds1.size()
    // output: 30
  • Use the replay function to replay the split data sources at the maximum speed:
    timer replay(inputTables=[ds1,ds2], outputTables=[outTable1, outTable2], dateColumn=`date, timeColumn=`time)
    // Time elapsed: 450.956 ms
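
The count of 30 data sources reported above is consistent with how the time boundaries divide each day. The sketch below is an illustration only. It assumes that n boundary values in timeRepartitionSchema produce n+1 time ranges per day (one before the first boundary and one after the last), and that each of the five date partitions is split this way. The variable names dates and bounds are illustrative.

```dolphindb
// The five date partitions of dfs://test_stock
dates = 2012.06.12..2012.06.16
// The boundaries passed as timeRepartitionSchema
bounds = [13:00:00.000, 14:00:00.000, 15:00:00.000, 16:00:00.000, 17:00:00.000]
// 5 dates * (5 boundaries + 1) time ranges per day
dates.size() * (bounds.size() + 1)
// 30
```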