t1 = table(2015.01.01+(0 31 59 90 120) as date, 1.2 7.8 4.6 5.1 9.5 as value)
t2 = table(2015.02.01+(0 15 89 89) as date, 1..4 as qty);
t1;
date |
value |
2015.01.01 |
1.2 |
2015.02.01 |
7.8 |
2015.03.01 |
4.6 |
2015.04.01 |
5.1 |
2015.05.01 |
9.5 |
t2;
date |
qty |
2015.02.01 |
1 |
2015.02.16 |
2 |
2015.05.01 |
3 |
2015.05.01 |
4 |
select * from lsj(t1, t2, `date);
date |
value |
qty |
2015.01.01 |
1.2 |
|
2015.02.01 |
7.8 |
1 |
2015.03.01 |
4.6 |
|
2015.04.01 |
5.1 |
|
2015.05.01 |
9.5 |
3 |
select * from aj(t1, t2, `date);
date |
value |
t2_date |
qty |
2015.01.01 |
1.2 |
|
|
2015.02.01 |
7.8 |
2015.02.01 |
1 |
2015.03.01 |
4.6 |
2015.02.16 |
2 |
2015.04.01 |
5.1 |
2015.02.16 |
2 |
2015.05.01 |
9.5 |
2015.05.01 |
4 |
select * from aj(t1, t2, `date) where t1.date>=2015.03.01;
date |
value |
t2_date |
qty |
2015.03.01 |
4.6 |
2015.02.16 |
2 |
2015.04.01 |
5.1 |
2015.02.16 |
2 |
2015.05.01 |
9.5 |
2015.05.01 |
4 |
A common usage of asof join is to join on the time field to retrieve the latest
information. Suppose we have the following 3 tables, where the data have all been
sorted on the column minute .
minute = 09:30m 09:32m 09:33m 09:35m
price = 174.1 175.2 174.8 175.2
t1 = table(minute, price)
minute = 09:30m 09:31m 09:33m 09:34m
price = 29.2 28.9 29.3 30.1
t2 = table(minute, price)
minute =09:30m 09:31m 09:34m 09:36m
price = 51.2 52.4 51.9 52.8
t3 = table(minute, price);
t1;
minute |
price |
09:30m |
174.1 |
09:32m |
175.2 |
09:33m |
174.8 |
09:35m |
175.2 |
t2;
minute |
price |
09:30m |
29.2 |
09:31m |
28.9 |
09:33m |
29.3 |
09:34m |
30.1 |
t3;
minute |
price |
09:30m |
51.2 |
09:31m |
52.4 |
09:34m |
51.9 |
09:36m |
52.8 |
t2 = aj(t2, t3, `minute);
t2;
minute |
price |
t3_minute |
t3_price |
09:30m |
29.2 |
09:30m |
51.2 |
09:31m |
28.9 |
09:31m |
52.4 |
09:33m |
29.3 |
09:31m |
52.4 |
09:34m |
30.1 |
09:34m |
51.9 |
aj(t1, t2, `minute);
minute |
price |
t2_minute |
t2_price |
t3_minute |
t3_price |
09:30m |
174.1 |
09:30m |
29.2 |
09:30m |
51.2 |
09:32m |
175.2 |
09:31m |
28.9 |
09:31m |
52.4 |
09:33m |
174.8 |
09:33m |
29.3 |
09:31m |
52.4 |
09:35m |
175.2 |
09:34m |
30.1 |
09:34m |
51.9 |
Note that there are no matching records for t2 and t3 at 09:32m, so the prices at
09:31m from table t2 and t3, which are the latest, are applied. The price from table
t3 at 09:33m and the prices from table t2 and t3 at 09:35m also use the latest
prices to replace the missing records. This is extremely useful when we populate
data for certain time points. For example, certain information is only updated
weekly or monthly. In designing a daily trading strategy, we can use asof join to
populate the daily datasets with less frequently updated information.
The last joining column is of type UUID:
t1 = table(2015.01.01 2015.02.01 2015.03.01 2015.04.01 2015.05.01 as date, uuid(["5d212a78-cc48-e3b1-4235-b4d91473ee81", "5d212a78-cc48-e3b1-4235-b4d91473ee83", "5d212a78-cc48-e3b1-4235-b4d91473ee85", "5d212a78-cc48-e3b1-4235-b4d91473ee87", "5d212a78-cc48-e3b1-4235-b4d91473ee89"]) as uid)
t2 = table(2015.01.15 2015.01.20 2015.01.25 2015.03.01 as date,uuid(["5d212a78-cc48-e3b1-4235-b4d91473ee81", "5d212a78-cc48-e3b1-4235-b4d91473ee83", "5d212a78-cc48-e3b1-4235-b4d91473ee85", "5d212a78-cc48-e3b1-4235-b4d91473ee87"]) as uid)
select * from aj(t1, t2, `uid);
date |
uid |
t2_date |
t2_uid |
2015.01.01 |
5d212a78-cc48-e3b1-4235-b4d91473ee81 |
2015.01.15 |
5d212a78-cc48-e3b1-4235-b4d91473ee81 |
2015.02.01 |
5d212a78-cc48-e3b1-4235-b4d91473ee83 |
2015.01.20 |
5d212a78-cc48-e3b1-4235-b4d91473ee83 |
2015.03.01 |
5d212a78-cc48-e3b1-4235-b4d91473ee85 |
2015.01.25 |
5d212a78-cc48-e3b1-4235-b4d91473ee85 |
2015.04.01 |
5d212a78-cc48-e3b1-4235-b4d91473ee87 |
2015.03.01 |
5d212a78-cc48-e3b1-4235-b4d91473ee87 |
2015.05.01 |
5d212a78-cc48-e3b1-4235-b4d91473ee89 |
2015.03.01 |
5d212a78-cc48-e3b1-4235-b4d91473ee87 |