Temporal Type Comparison

Temporal types are commonly used in databases, and comparisons between them are frequently required during data processing. However, the rules for comparing vary depending on the scenarios. This tutorial will introduce the following topics:

  • Temporal types
  • Temporal type conversion
  • Temporal type comparison
  • Partition pruning with temporal columns

1. Temporal Types

DolphinDB supports the following temporal types:

Category Data Type Format Example
Date DATE YYYY.MM.dd 2023.01.04
MONTH YYYY.MM.dd (end with “M”) 2023.01M
Time MINUTE HH:mm (end with “m”) 13:30m
SECOND HH:mm:ss 13:30:10
TIME HH:mm:ss.sss 13:30:10.008
NANOTIME HH:mm:ss.nnnnnnnnn 13:30:10.006007008
Datetime DATEHOUR YYYY.MM.ddTHH, YYYY.MM.dd HH

2023.01.04T21

Can only be obtained by function datehour.

DATETIME YYYY.MM.ddTHH:mm:ss, YYYY.MM.dd HH:mm:ss 2023.01.04T13:30:10, 2023.01.04 13:30:10
TIMESTAMP YYYY.MM.ddTHH:mm:ss.sss, YYYY.MM.dd HH:mm:ss.sss 2023.01.04T13:30:10.008, 2023.01.04 13:30:10.008
NANOTIMESTAMP YYYY.MM.ddTHH:mm:ss.sss.nnnnnnnnn, YYYY.MM.dd HH:mm:ss.sss.nnnnnnnnn 2023.01.04T13:30:10.001002003, 2023.01.04 13:30:10.001002003

Temporal values in DolphinDB do not include time zone information. It is up to the users to interpret the time zones of the temporal objects. Users can use localtime, gmtime, and convertTZ functions to handle time zone conversions. Refer to Time Zone and Conversion for the details.

Retrieve the current date with function today:

today()  // 2024.06.03

Retrieve the current timestamp with function now:

now()  // 2024.06.03T09:29:38.390
now(true)  // 2024.06.03T09:31:17.318298137

It returns a timestamp with millisecond precision by default. If you specify nanoSecond=true, it will return the result with nanosecond precision.

2. Temporal Type Conversion

Temporal types can be converted using specialized data type conversion functions or the function cast.

Temporal type casting support in DolphinDB:

From \ To Time Date Datetime
Time × ×
Date ×
Datetime

(1) Temporal types under the same category are convertible. For example:

DATE and MONTH.

date(2012.01M)   // 2012.01.01
month(2012.01.02)  // 2012.01M

MINUTE, SECOND, TIME and NANOTIME.

minute(23:30:00)  // 23:30m
minute(23:30:00.000)  // 23:30m
minute(23:30:00.000000000)  // 23:30m
second(23:30m)  // 23:30:00
second(23:30:00.001)  // 23:30:00
second(23:30:00.000000001)  // 23:30:00
time(23:31m)  // 23:31:00.000
time(23:30:01)  // 23:30:01.000
time(23:30:01.000000001)  // 23:30:01.000
nanotime(23:30m)  // 23:30:00.000000000
nanotime(23:30:31)  // 23:30:31.000000000
nanotime(23:30:31.001)  //23:30:31.001000000

DATEHOUR, DATETIME, TIMESTAMP and NANOTIMESTAMP.

datehour(2020.01.01 13:30:01)  // 2020.01.01T13
datehour(2020.01.01T13:30:01.001)  //  2020.01.01T13
datehour(2020.01.01T13:30:01.001002003)  // 2020.01.01T13
datetime(datehour(2020.01.01 13:00:01))  // 2020.01.01T13:00:00
datetime(2020.01.01T13:30:01.001)  // 2020.01.01T13:30:01
datetime(2020.01.01T13:30:01.001002003)  //  2020.01.01T13:30:01
timestamp(datehour(2020.01.01 13:00:01))  //  2020.01.01T13:00:00.000
timestamp(2020.01.01 13:00:01)  // 2020.01.01T13:00:01.000
timestamp(2020.01.01T13:30:01.001002003)  // 2020.01.01T13:30:01.001
nanotimestamp(datehour(2020.01.01 13:00:01))  //  2020.01.01T13:00:00.000000000
nanotimestamp(2020.01.01T13:30:01)  //  2020.01.01T13:30:01.000000000
nanotimestamp(2020.01.01T13:30:01.001)  // 2020.01.01T13:30:01.001000000

(2) Date and datetime data are convertible.

The system will add the information with time set to 0:00 when converting date data to datetime data.

datehour(2023.01.02)  // 2023.01.02T00
datetime(2023.01.02)  // 2023.01.02T00:00:00
timestamp(2023.01.02)  // 2023.01.02T00:00:00.000
nanotimestamp(2023.01.02)  // 2023.01.02T00:00:00.000000000
datehour(2023.01M)  // 2023.01.01T00
datetime(2023.01M)  // 2023.01.01T00:00:00
timestamp(2023.01M)  //  2023.01.01T00:00:00.000
nanotimestamp(2023.01M)  // 2023.01.01T00:00:00.000000000

The system will discard the information about time when converting datetime data to date data.

date(datehour(2020.01.01 13:00:01)) // 2020.01.01
date(2020.01.01 13:00:01)  // 2020.01.01
date(2020.01.01 13:00:01.001)  // 2020.01.01
date(2020.01.01 13:00:01.001002003)  // 2020.01.01
month(datehour(2020.01.01 13:00:01))  // 2020.01M
month(2020.01.01 13:00:01)  // 2020.01M
month(2020.01.01 13:00:01.001)  // 2020.01M
month(2020.01.01 13:00:01.001002003)  // 2020.01M

(3) Datetime data are convertible to time data.

The system will discard information about time during the conversion.

time(2020.01.01 13:00:01.001002003)  // 13:00:01.001
minute(2020.01.01 13:00:01)  // 13:00m

However, time data cannot be converted to datetime data.

datetime(13:00:01)  // Throw an exception

(4) Time data and date data are unconvertible.

month(13:00:01)  // Throw an exception
minute(2020.01.01)  // The function minute does not support date data

3. Temporal Type Comparison

Temporal values can be compared using comparison operators (>, <, >=, <=, ==, !=) and logical operators (between, in).

When comparing data of different temporal types, DolphinDB attempts to convert the temporal type with a lower precision to the higher one, following the rules in Chapter 2. If the conversion cannot be done, the system will throw an exception.

Note​​: Temporal precision from lowest to highest: MONTH < DATE < DATEHOUR < DATETIME < TIMESTAMP < NANOTIMESTAMP.​

For example, when executing 2023.01.04T13:30:10.001 > 2023.01.04, the system will convert 2023.01.04 to 2023.01.04T00:00:00.000 for comparison and return the result.

2023.01.04T13:30:10.001 > 2023.01.04   // true
2011.01.01T13:00:00 > 2011.01.02   // false
2023.01.04T13:30:10.001 == 2023.01.04   // false
2023.01.04 == 2023.01.04T00:00:00.000  //  true

Note:

  • MONTH values can only be compared with MONTH values.
  • Datetime values cannot be compared with time values even though the conversion is feasible.
  • Logical operator between requires both operands to be of the same type.
    2023.01.04T13:30:10.001 between 2023.01.04T13:30:10.003:2023.01.04T13:30:10.004   // false
    When the operands are of different types, the system will throw an exception:
    2023.01.04 between 2023.01.04T13:30:10.003:2023.01.04T13:30:10.004
    //  between(X, Y). Temporal data comparison should have the same data type.

4. Partition Pruning with Temporal Columns

Temporal columns are often used as partitioning columns in distributed databases. When queries filter on the partitioning columns, partition pruning will be triggered to reduce the partitions to be scanned, optimizing query performance.

In practical scenarios, queries are often performed either by directly filtering on the partitioning columns or by applying an explicit type conversion to the partitioning columns before filtering. The pruning rules differ slightly between these scenarios.

These terms are key to understanding the following pruning rules:

  • Partitioning scheme type: The data type of the parameter partitionScheme of function database. In the following example, the partitioning scheme is [2022.09.01, 2022.09.02, 2022.09.03] of DATE type.
    dbName = "dfs://time_comparison"
    if(existsDatabase(dbName))
    	dropDatabase(dbName)
    db = database(dbName, VALUE, [2022.09.01,2022.09.02, 2022.09.03])
  • Partitioning column type: The data type of parameter partitionColumns of function createPartitionedTable. In the following example, the partitioning column is column “time” of DATETIME type.
    n = 6
    t = table(n:n,[`time,`value],[DATETIME,DOUBLE])
    t[`time] = [2022.09.01T00:00:00, 2022.09.01T12:00:00, 2022.09.02T00:00:00, 2022.09.02T12:00:00, 2022.09.03T00:00:00, 2022.09.03T12:00:00]
    t[`value] = 1..6
    pt = db.createPartitionedTable(t, `pt, `time).append!(t)
  • Temporal object in the comparison. In this example, the partitioning column “time” is compared with 2022.09.01, where 2022.09.01 of DATE type is the temporal object in the comparison.
    select * from pt where time == 2022.09.01
    /*
    time                   value
    -----------------------------------
    2022.09.01T00:00:00    1.00000000
    2022.09.01T12:00:00    2.00000000
    */

4.1 Filtering on the Partitioning Columns

Temporal type comparison rules are the same for distributed queries and in-memory queries. Distributed queries filtering on the temporal partitioning columns can trigger partition pruning. Note that distributed tables cannot be created with the partitioning scheme specified as DATEHOUR or DATETIME types.

In the following example, the partitioning column is of DATETIME type and the data is partitioned by VALUE on a daily basis according to the column.

dbName = "dfs://time_comparison"
if(existsDatabase(dbName))
	dropDatabase(dbName)
	
db = database(dbName, VALUE, [2022.09.01,2022.09.02, 2022.09.03])
n = 6
t = table(n:n,[`time,`value],[DATETIME,DOUBLE])
t[`time] = [2022.09.01T00:00:00, 2022.09.01T12:00:00, 2022.09.02T00:00:00, 2022.09.02T12:00:00, 2022.09.03T00:00:00, 2022.09.03T12:00:00]
t[`value] = 1..6
pt = db.createPartitionedTable(t, `pt, `time).append!(t)

If you want to query data of 2022.09.01, you can directly compare the partitioning column with ​2022.09.01​. Even if the object and the partitioning column are of different types, it still can trigger partition pruning and only the partition 2022.09.01 will be scanned. We can use sqlDS to view how the distributed query is split into subqueries.

sqlDS(<select * from pt where time == 2022.09.01>)

To query data of 2022.09.01T00:00:00.000, you can also directly compare the partitioning column with it. Even if the partitioning column and the timestamp are of different data types, partition pruning will still be triggered and only the partition 2022.09.01 will be scanned.

sqlDS(<select * from pt where time == 2022.09.01T00:00:00.000>)

4.2 Applying an Explicit Type Conversion Before Filtering

It is common to use filters with explicit type conversion in such form: convert_func(col) <operator> constant where:

  • convert_func​ is a conversion function (e.g., date, month);
  • col​ is the partitioning column;
  • <operator>​ is an operator (<, <=, =, ==, >, >=, between, in);
  • constant is the object in the comparison.

It will trigger partition pruning in the following scenarios.

4.2.1 Using Comparison Operators

When the ​<operator>​ is a comparison operator (<, <=, =, ==, >, >=), partition pruning can be triggered ​​when the precision of the data returned by ​convert_func​ is less than or equal to the precision of ​constant​ and the partitioning column.

Here’s an example of a distributed table partitioned by date value.

dbName = "dfs://time_comparison"
if(existsDatabase(dbName))
	dropDatabase(dbName)
	
db = database(dbName, VALUE, [2022.09.01,2022.09.30,2022.10.01,2022.10.02,2022.10.31,2022.11.01,2022.11.02,2022.12.31,2023.01.01])
n = 10
t = table(n:n,[`time,`value],[DATE,DOUBLE])
t[`time] = take([2022.09.01,2022.09.30,2022.10.01,2022.10.02,2022.10.31,2022.11.01,2022.11.02,2022.12.31,2023.01.01],n)
t[`value] = rand(100.0,n)
pt = db.createPartitionedTable(t, `pt, `time).append!(t)

MONTH values can only be compared with MONTH values, for which you cannot directly compare the partitioning column of DATE type with the object of MONTH type. Instead, use the function month to convert the partitioning column to MONTH type for comparison:

select * from pt where month(time) > 2022.10M
/*
time          value
------------------------
2022.11.01  15.00570112
2022.11.02  66.54577804
2022.12.31  48.09958597
2023.01.01  50.57664175
*/

In this scenario, both operands are of MONTH type, data returned by the function month has lower precision than the partitioning column, partitions < 2022.11M are pruned.

To query data of a specific timestamp, you can use function timestamp to convert the partitioning column to TIMESTAMP type:

select * from pt where timestamp(time) = 2022.09.30T00:00:00.000
/*
time          value
------------------------
2022.09.30  19.33508650
*/

The data returned by function timestamp has higher precision than the partitioning column. Therefore, partition pruning cannot be triggered in this scenario​​. Filtering on the partitioning columns directly is more efficient for such cases.

4.2.2 Using between

When ​the <operator>​ is between, partition pruning can be triggered ​only if​ the data returned by ​convert_func​ and ​constant​ are of the same data type.

​For example, to query data of consecutive months, you cannot directly compare the partitioning column with the object of MONTH type.

select * from pt where time between month(2022.10M:2022.11M)
// between(X, Y). Temporal data comparison should have the same data type.

In such cases, you can use function month to convert the partitioning column to MONTH type for comparison with the MONTH object.

select * from pt where month(time) between month(2022.10M:2022.11M)
/*
time          value
------------------------
2022.10.01  24.45175347
2022.10.02  86.05015869
2022.10.31  78.28769609
2022.11.01  15.00570112
2022.11.02  66.54577804
*/

In this scenario, both operands of between are of MONTH type with lower precision than the partitioning column of DATE type. Partitions 2022.10.01, 2022.10.02, 2022.10.31, 2022.11.01, 2022.11.02 are scanned and others are pruned.

4.2.3 Using in

When ​the <operator>​ is in, partition pruning can be triggered when:

  • The data returned by ​convert_func​ and constant​ are of the same data type;
  • The number of continuous segments in ​constant​ is less than 16​​.

​​Here is an example of how to define the number of continuous segments:

For the list [2020.01.02, 2020.01.03, 2020.01.04, 2020.01.06, 2020.01.07, 2020.01.12], there are 3 continuous segments: 2020.01.02–2020.01.04, 2020.01.06–2020.01.07, and 2020.01.12.

To query data of specified months, you can use the function month to convert the partitioning column to MONTH type for comparison with the object of MONTH type.

select * from pt where month(time) in [2022.09M, 2022.11M]
/*
time          value
------------------------
2022.09.01  51.37807030
2022.09.01  50.86722047
2022.09.30  13.91816022
2022.11.01  76.58300183
2022.11.02  74.23354792
*/

In this scenario, both operands are of MONTH type with lower precision than the partitioning column of DATE type. The number of continuous segments in [2022.09M, 2022.11M] is less than 16​. Partitions 2022.09.01, 2022.09.30, 2022.11.01, 2022.11.02 are scanned and others are pruned.

To query data of consecutive timestamps, you can use function timestamp to convert the partitioning column to TIMESTAMP type for comparison with the object of TIMESTAMP type.

select * from pt where timestamp(time) in timestamp(2022.10.31..2022.11.01)
/*
time          value
------------------------
2022.10.31  78.28769609
2022.11.01  15.00570112
*/

In this scenario, partition pruning cannot be triggered because the precision of the data returned by the function timestamp​ is ​higher than the partitioning column.

Summary

The comparison rules for temporal types can be divided into two types: comparisons that involve partition pruning and those that do not. Typically, comparisons of temporal vectors, in-memory queries, and distributed queries filtering on non-partitioning columns do not involve partition pruning. When querying distributed tables with filter conditions on partitioning columns, comparison of temporal values will trigger partition pruning. In practical scenarios, it is recommended to use the WHERE clause that supports partition pruning to improve the query performance. It is especially effective when querying large tables, significantly reducing query time.