Data Type

Table 1. The following table summarizes the atomic data types that DolphinDB supports:
| Data Type Name | ID | Examples | Symbol | Size (bytes) | Category | Range |
|---|---|---|---|---|---|---|
| VOID | 0 | NULL | | 1 | Void | |
| BOOL | 1 | 1b, 0b, true, false | b | 1 | Logical | 0~1 |
| CHAR | 2 | 'a', 97c | c | 1 | Integral | -2^7+1 ~ 2^7-1 |
| SHORT | 3 | 122h | h | 2 | Integral | -2^15+1 ~ 2^15-1 |
| INT | 4 | 21 | i | 4 | Integral | -2^31+1 ~ 2^31-1 |
| LONG | 5 | 22l | l | 8 | Integral | -2^63+1 ~ 2^63-1 |
| DATE | 6 | 2013.06.13 | d | 4 | Temporal | |
| MONTH | 7 | 2012.06M | M | 4 | Temporal | |
| TIME | 8 | 13:30:10.008 | t | 4 | Temporal | |
| MINUTE | 9 | 13:30m | m | 4 | Temporal | |
| SECOND | 10 | 13:30:10 | s | 4 | Temporal | |
| DATETIME | 11 | 2012.06.13 13:30:10 or 2012.06.13T13:30:10 | D | 4 | Temporal | [1901.12.13T20:45:53, 2038.01.19T03:14:07] |
| TIMESTAMP | 12 | 2012.06.13 13:30:10.008 or 2012.06.13T13:30:10.008 | T | 8 | Temporal | |
| NANOTIME | 13 | 13:30:10.008007006 | n | 8 | Temporal | |
| NANOTIMESTAMP | 14 | 2012.06.13 13:30:10.008007006 or 2012.06.13T13:30:10.008007006 | N | 8 | Temporal | [1677.09.21T00:12:43.145224193, 2262.04.11T23:47:16.854775807] |
| FLOAT | 15 | 2.1f | f | 4 | Floating | 6-9 significant figures |
| DOUBLE | 16 | 2.1 | F | 8 | Floating | 15-17 significant figures |
| SYMBOL | 17 | | S | 4 | Literal | |
| STRING | 18 | "Hello" or 'Hello' or `Hello | W | ≤ 65,535 | Literal | |
| UUID | 19 | 5d212a78-cc48-e3b1-4235-b4d91473ee87 | | 16 | Literal | |
| FUNCTIONDEF | 20 | def f1(a,b) {return a+b;} | | | System | |
| HANDLE | 21 | file handle, socket handle, and db handle | | | System | |
| CODE | 22 | <1+2> | | | System | |
| DATASOURCE | 23 | | | | System | |
| RESOURCE | 24 | | | | System | |
| ANY | 25 | (1,2,3) | | | Mixed | |
| COMPRESS | 26 | | | 1 | Integral | -2^7+1 ~ 2^7-1 |
| ANY DICTIONARY | 27 | {a:1,b:2} | | | Mixed | |
| DATEHOUR | 28 | 2012.06.13T13 | | 4 | Temporal | |
| IPADDR | 30 | 192.168.1.13 | | 16 | Literal | |
| INT128 | 31 | e1671797c52e15f763380b45e841ec32 | | 16 | Integral | -2^127+1 ~ 2^127-1 |
| BLOB | 32 | | | ≤ 4,194,304 | Literal | |
| COMPLEX | 34 | 2.3+4.0i | | 16 | | |
| POINT | 35 | (117.60972, 24.118418) | | 16 | | |
| DURATION | 36 | 1s, 3M, 5y, 200ms | | 4 | System | |
| DECIMAL32(S) | 37 | 3.1415926$DECIMAL32(3) | | 4 | Decimal | (-1*10^(9-S), 1*10^(9-S)) |
| DECIMAL64(S) | 38 | 3.1415926$DECIMAL64(3), 3.141P | P | 8 | Decimal | (-1*10^(18-S), 1*10^(18-S)) |
| DECIMAL128(S) | 39 | 3.1415926$DECIMAL128(3) | | 16 | Decimal | (-1*10^(38-S), 1*10^(38-S)) |

Note:

  1. SYMBOL is a special STRING type.
  2. ANY DICTIONARY is the data type in DolphinDB for JSON.
  3. COMPRESS values can only be generated with the function compress.
  4. A DURATION value can be generated with the function duration or by combining an integer with a unit of time (case sensitive): y, M, w, d, B, H, m, s, ms, us, ns. The range of a DURATION value is -2^31+1 ~ 2^31-1. If an overflow occurs, the value is treated as NULL.
  5. DolphinDB follows the IEEE 754 standard for the data types DOUBLE and FLOAT. If an overflow occurs, the value is treated as NULL.
  6. BLOB is supported only in in-memory tables and in DFS tables of TSDB databases.
  7. The "S" in DECIMAL32(S), DECIMAL64(S) and DECIMAL128(S) is the scale, which determines how many digits the fractional part can have. S ranges over [0,9] for DECIMAL32(S), [0,18] for DECIMAL64(S) and [0,38] for DECIMAL128(S). DECIMAL32 is stored as int32_t and takes 4 bytes; DECIMAL64 is stored as int64_t and takes 8 bytes; DECIMAL128 is stored as int128_t and takes 16 bytes. DECIMAL32(0) can represent integers in the range [-999,999,999, 999,999,999], whereas the underlying 4-byte integer covers [-2,147,483,648, 2,147,483,647]. Therefore, a numeric value whose integral part falls outside the valid range of DECIMAL32 but within [-2,147,483,648, 2,147,483,647] can still be converted to DECIMAL32. When a string is converted to DECIMAL32, however, a value outside the valid DECIMAL32 range raises an exception:
decimal32(1000000000, 0)
// output
1000000000

decimal32(`1000000000, 0)
// output
Convert string to DECIMAL failed: Decimal math overflow
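
The two ways of creating a DURATION value described in note 4 can be sketched as follows. This is a minimal illustration rather than an exhaustive reference; the variable names d1 and d2 are arbitrary, and the string form passed to duration is assumed to follow the same integer-plus-unit pattern as the literal.

```dolphindb
// integer combined with a unit of time; units are case sensitive (M = month, m = minute)
d1 = 3M
typestr(d1)

// assumption: duration accepts a string made of an integer followed by a unit
d2 = duration("200ms")
typestr(d2)
```

Both calls to typestr are expected to report DURATION, matching ID 36 in Table 1.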

Type check

Use functions typestr and type to check data types. The function typestr returns a string; the function type returns an integer.
typestr 3l;
// output
LONG

type 3l;
// output
5

x=3;
if(type(x) == INT){y=10};
y;

// output
10
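
As a further illustration, the integer returned by type can be read against the ID column of Table 1. The sketch below only uses literals that appear in the table; the helper name describeType is hypothetical and exists purely for this example.

```dolphindb
type(2013.06.13)    // DATE literal; Table 1 lists DATE with ID 6
type(2.1)           // DOUBLE literal; Table 1 lists DOUBLE with ID 16
type("Hello")       // STRING literal; Table 1 lists STRING with ID 18

// hypothetical helper: combine the type name and its ID in one string
def describeType(x){
    return typestr(x) + " (ID " + string(type(x)) + ")"
}
describeType(3l)
```

Comparing type(x) against a named constant such as INT, as in the example above, is usually more readable than comparing against the raw ID.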

Data Range

The ranges of the integral data types are listed in the table above. For each of them, the minimum allowed value minus 1 represents the corresponding NULL value. For example, -128c is the NULL value of CHAR. For more on NULL values, see Null Value Manipulation.
x=-128c;
x;
// output
00c
typestr x;
// output
CHAR
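
Because the NULL value shares its type with ordinary values, typestr alone cannot detect it. A minimal sketch of testing for NULL explicitly, assuming the built-in functions isNull and isValid (neither is named in this section):

```dolphindb
x = -128c
isNull(x)      // x holds the CHAR NULL value, so this check is expected to succeed
isValid(x)     // logical opposite of isNull

// a vector with one NULL element; isNull is applied element by element
v = 1 NULL 3
isNull(v)
```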

Data type symbols

A data type symbol is a suffix that declares the data type of a constant. In the example below, the number 3 without a data type symbol is stored as an INT by default. To store it as a floating-point number, declare it as 3f (FLOAT) or 3F (DOUBLE).

typestr 3;
// output
INT

typestr 3f;
// output
FLOAT

typestr 3F;
// output
DOUBLE

typestr 3l;
// output
LONG

typestr 3h;
// output
SHORT

typestr 3c;
// output
CHAR

typestr 3b;
// output
BOOL

typestr 3P;   // New in version 2.00.9.4
// output
DECIMAL64

Symbol and String

In some circumstances it might be optimal to save strings as SYMBOL types in DolphinDB. SYMBOL types are stored as integers in DolphinDB to allow more efficient sorting and comparison. Therefore, SYMBOL types could potentially improve operating performance and save storage space. On the other hand, mapping strings to integers (hashing) takes time and the hash table consumes memory.

The following rules could help you decide whether to use SYMBOL types or not:

  • Avoid using SYMBOL types if the data will not be sorted, searched or compared.
  • Avoid using SYMBOL types if there are few duplicate values.

Two specific cases:

  • Stock tickers in a trades or quotes table should use the SYMBOL type, because each stock usually has a large number of rows in these tables and stock tickers are frequently searched and compared.
  • Descriptive fields should not use the SYMBOL type, because descriptions seldom repeat and are rarely searched, sorted or compared.
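
The two cases above can be sketched as a table definition. The table name trades, the column names ticker, price and remark, and the sample rows are all hypothetical and chosen only for illustration.

```dolphindb
// hypothetical in-memory table: ticker repeats often, so it is declared as SYMBOL;
// remark is free text that seldom repeats, so it stays STRING
trades = table(100:0, `ticker`price`remark, [SYMBOL, DOUBLE, STRING])

// illustrative rows only, not real market data
insert into trades values(`IBM`C`IBM, 141.2 48.5 141.3, ["opening print", "block trade", "late report"])

// inspect the declared column types
schema(trades).colDefs
```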

Example 1: Sorting a symbol vector with 3 million records is about 40 times faster than sorting a string vector of the same size.

n=3000000
strs=array(STRING,0,n)
strs.append!(rand(`IBM`C`MS`GOOG, n))
timer sort strs;
// output
Time elapsed: 482.027 ms

n=3000000
syms=array(SYMBOL,0,n)
syms.append!(rand(`IBM`C`MS`GOOG, n))
timer sort syms;
// output
Time elapsed: 12.001 ms

Example 2: Comparing a symbol vector with 3 million records against a constant is almost 15 times as fast as comparing a string vector of the same size. The example reuses the vectors strs and syms from Example 1.

timer(100){strs>`C};
// output
Time elapsed: 4661.26 ms

timer(100){syms>`C};
// output
Time elapsed: 322.655 ms

Symbol vector creation

(1) With function array
// create an empty symbol vector with an initial capacity of 100
syms=array(SYMBOL, 0, 100);

typestr syms;
// output
FAST SYMBOL VECTOR

syms.append!(`IBM`C`MS);
syms;
// output
["IBM","C","MS"]

(2) With type conversion
syms=`IBM`C`MS;
typestr syms;
// output
STRING VECTOR

// convert to a symbol vector; the original vector syms is left unchanged
sym=syms$SYMBOL;

typestr sym;
// output
FAST SYMBOL VECTOR

typestr syms;
// output
STRING VECTOR

(3) With function rand
syms=`IBM`C`MS;
// generate a random SYMBOL vector from a STRING vector
symRand=rand(syms, 10);

symRand;
// output
["IBM","IBM","IBM","MS","C","C","MS","IBM","C","MS"]

typestr symRand;
// output
FAST SYMBOL VECTOR

Note: When given a string vector, the rand function returns a symbol vector. For all other input data types, rand keeps the data type of its input. This exception is intentional: users who generate a random vector from a string vector usually want a symbol vector.
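
For completeness, a symbol vector can also be produced with a conversion function instead of the $ cast shown in (2). The sketch below assumes the built-in functions symbol and string, which are not described in this section; the variable names are arbitrary.

```dolphindb
strs = `IBM`C`MS

// assumption: symbol converts a string vector to a symbol vector
syms = symbol(strs)
typestr(syms)

// assumption: string converts it back to a string vector
back = string(syms)
typestr(back)
```

Converting back to STRING may be useful before operations where the dictionary encoding of SYMBOL offers no benefit, such as fields with few duplicate values.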