Kdb+
kdb+ is a time series database featuring relational and columnar design, which can be used to store, analyze, process, and query large-scale datasets. With the DolphinDB kdb+ plugin, you can load the kdb+ tables on your disk into DolphinDB as in-memory tables. This plugin can be used to load all types of Q data structures in DolphinDB. There are two load options: through the loadTable
method or the loadFile
method.
1. Installation (with installPlugin
)
Required server version: DolphinDB 2.00.10 or higher
Supported OS: Windows x86-64, Linux x86-64, and Linux JIT.
Installation Steps:
(1) Use listRemotePlugins to check plugin information in the plugin repository.
Note: For plugins not included in the provided list, you can install through precompiled binaries or compile from source. These files can be accessed from our GitHub repository by switching to the appropriate version branch.
login("admin", "123456")
listRemotePlugins()
(2) Invoke installPlugin for plugin installation
installPlugin("kdb")
(3) Use loadPlugin to load the plugin before using the plugin methods.
loadPlugin("kdb")
2. Method References
connect
Syntax
connect(host, port, \[usernamePassword])
Details
Establish a connection to the kdb+ server. Return a connection handle.
If the connection fails, an exception is thrown. Possible causes are:
- The username or password is wrong;
- The port number for the server does not exist;
- Timeout.
Parameters
- host: A STRING scalar indicating the IP address where the kdb+ database is running.
- port: An integer indicating the port that the kdb+ database is listening on.
- usernamePassword: A STRING scalar indicating the username and password for the kdb+ database to connect to. Specify the value in this format: "username:password". If the kdb+ database does not require authentication, leave this parameter empty or specify an arbitrary string.
Examples
Suppose the username and password ("admin:123456") are stored in ../passwordfiles/usrs, and the kdb+ server and DolphinDB server are on the same machine:
kdb+ shell:
q -p 5000 -U ../passwordfiles/usrs // "-U" must be capitalized
DolphinDB shell:
handle = kdb::connect("127.0.0.1", 5000, "admin:123456")
If the kdb+ database you're connecting to does not require authentication:
handle = kdb::connect("127.0.0.1", 5000)
loadTable
Syntax
loadTable(handle, tablePath, symPath)
Details
Load data from a connected kdb+ database as an in-memory table in DolphinDB.
In kdb+, symbols containing few distinctive values are saved as integers in sym files through enumeration to reduce storage requirements. Therefore, to load a table containing an enumerated SYMBOL column, the system must load the associated sym file first.
Parameters
- handle: The connection handle returned by
kdb::loadTable
. - tablePath: A STRING scalar indicating the path to the kdb+ table you're loading. If it is a splayed table, partitioned table, or segmented table, specify the directory of the table file. If it is a single object, specify the table file.
- symPath (optional): A STRING scalar indicating the path to the sym file for the table, which is empty by default. Leave this parameter empty only when the table does not contain an enumerated column of type symbol.
Note: It is recommended to separate the paths with a slash ("/").
Examples
// The table contains an enumerated symbol column
DATA_DIR="/path/to/data/kdb_sample"
Txns = kdb::loadTable(handle, DATA_DIR + "/2022.06.17/Txns", DATA_DIR + "/sym")
// There's no symbol data in the splayed table, or the symbol column is not enumerated in a single table
DATA_DIR="/path/to/data/kdb_sample"
Txns = kdb::loadTable(handle, DATA_DIR + "/2022.06.17/Txns", DATA_DIR)
loadFile
Syntax
loadFile(tablePath, symPath)
Details
Directly read the specified kdb+ data files on disk and load the file data to DolphinDB as an in-memory table.
In kdb+, symbols containing few distinctive values are saved as integers in sym files through enumeration to reduce storage requirements. Therefore, to load a table containing an enumerated column of type symbol, the system must load the associated sym file first.
Parameters
- tablePath: A STRING scalar indicating the path to the kdb+ table you're loading. It must be the directory of the table file of a splayed table, partitioned table, or segmented table.
- symPath (optional): A STRING scalar indicating the path to the sym file for the table. Leave this parameter empty only when the table does not contain an enumerated column of type symbol.
Note: It is recommended to separate the paths with a slash ("/").
Examples
// The table contains an enumerated symbol column
DATA_DIR="/path/to/data/kdb_sample"
Txns = kdb::loadFile(handle, DATA_DIR + "/2022.06.17/Txns", DATA_DIR + "/sym")
// There's no symbol data in the splayed table, or the symbol column is not enumerated in a single table
DATA_DIR="/path/to/data/kdb_sample"
Txns = kdb::loadFile(handle, DATA_DIR + "/2022.06.17/Txns", DATA_DIR)
close
Syntax
close(handle)
Details
Close the connection to the kdb+ server.
Parameters
- handle: The connection handle returned by
kdb::loadTable
.
Examples
kdb::close(handle)
3. Usage Examples
loadPlugin("/home/DolphinDBPlugin/kdb/build/PluginKDB.txt")
go
// connect to the kdb+ database
handle = kdb::connect("127.0.0.1", 5000, "admin:123456")
// specify the file path
DATA_DIR="/home/kdb/data/kdb_sample"
// Load data to DolphinDB through loadTable
Daily = kdb::loadTable(handle, DATA_DIR + "/2022.06.17/Daily/", DATA_DIR + "/sym")
Minute = kdb::loadTable(handle, DATA_DIR + "/2022.06.17/Minute", DATA_DIR + "/sym")
Ticks = kdb::loadTable(handle, DATA_DIR + "/2022.06.17/Ticks/", DATA_DIR + "/sym")
Orders = kdb::loadTable(handle, DATA_DIR + "/2022.06.17/Orders", DATA_DIR + "/sym")
Syms = kdb::loadTable(handle, DATA_DIR + "/2022.06.17/Syms/", DATA_DIR + "/sym")
Txns = kdb::loadTable(handle, DATA_DIR + "/2022.06.17/Txns", DATA_DIR + "/sym")
kdb::close(handle)
// DolphinDB Load data to DolphinDB by reading disk files
Daily2 = kdb::loadFile(DATA_DIR + "/2022.06.17/Daily", DATA_DIR + "/sym")
Minute2= kdb::loadFile(DATA_DIR + "/2022.06.17/Minute/", DATA_DIR + "/sym")
Ticks2 = kdb::loadFile(DATA_DIR + "/2022.06.17/Ticks/", DATA_DIR + "/sym")
Orders2 = kdb::loadFile(DATA_DIR + "/2022.06.17/Orders/", DATA_DIR + "/sym")
Syms2 = kdb::loadFile(DATA_DIR + "/2022.06.17/Syms/", DATA_DIR + "/sym")
Txns2 = kdb::loadFile(DATA_DIR + "/2022.06.17/Txns/", DATA_DIR + "/sym")
4. Load Options
4.1 loadTable
For this option, call the plugin methods in the following sequence: connect()
→ loadTable()
→ close()
.
Note:
- Please make sure that the table to be loaded doesn't contain nested columns.
- The tablePath parameter of
loadTable
must be a single object file, or the directory of a splayed table, partitioned table, or segmented table.
4.2 loadFile
For this option, you only need to call the loadFile()
method.
Note:
- This method cannot read the file of a single object.
- This method only reads data compressed by gzip.
- Please make sure that the table to be loaded doesn't contain nested columns.
- It is recommended to use
loadTable
to load tables containing sorted, unique, partitioned, or true index columns.
5. Load Different Types of kdb+ Tables
5.1 Single Object
A single object can only be loaded using the loadTable()
method. For example:
The directory structure:
path/to/data
├── sym
└── table_name
handle = kdb::connect("127.0.0.1", 5000, "username:password");
table = kdb::loadTable(handle, "path/to/data/table_name", "path/to/data/sym");
5.2 Splayed Table
A splayed table can be loaded using the loadTable()
or loadFile()
method.
If the table is compressed using gzip or not compressed as it is written to disk, it is recommended to use loadFile()
for higher efficiency.
For example:
The directory structure:
path/to/data
├── sym
└── table_name
├── date
├── p
└── ti
handle = kdb::connect("127.0.0.1", 5000, "username:password");
table1 = kdb::loadTable(handle, "path/to/data/table_name/", "path/to/data/sym");
table2 = kdb::loadTable("path/to/data/table_name/", "path/to/data/sym");
5.3 Partitioned Table
Currently, the kdb+ plugin doesn't support loading an entire partitioned table or database by specifying the root directory. Alternatively, you can specify the tablePath parameter as the path to the table in each partition to load the tables separately, then combine them into a complete partitioned table in DolphinDB through scripts.
For example:
the directory structure:
path/to/data
├── sym
├── 2019.01.01
│ └── table_name
│ ├── p
│ └── ti
├── 2019.01.02
│ └── table_name
│ ├── p
│ └── ti
└── 2019.01.03
└── table_name
├── p
└── ti
// get the information on all files under the directory
fileRes=files("path/to/data");
// delete the sym files and read the data files
delete from fileRes where filename='sym';
name='table_name';
files = exec filename from fileRes;
// create an in-memory table and specify its schema
table=table(10:0,`p`ti`date, [SECOND,DOUBLE,DATE])
// load the data in each partition
for (file in files) {
t = kdb::loadFile("path/to/data" +'/' + file +'/' + tablename + '/');
// add a column indicating the partition name to the loaded data
addColumn(t, ["date"],[DATE])
length=count(t)
newCol=take(date(file), length)
replaceColumn!(t, "date", newCol)
// append the data to the in-memory table
table.append!(t);
}
5.4 Segmented Table
A segmented table can be loaded to DolphinDB in the same way as a partitioned table.
6. Data Type Mappings
6.1 Basic Data Types of kdb+
The following are the data type mappings when a kdb+ table is imported to DolphinDB.
kdb+ | DolphinDB | Size | Note |
---|---|---|---|
boolean | BOOL | 1 | |
guid | UUID | 16 | |
byte | CHAR | 1 | There is no byte type in DolphinDB. Byte data is converted to CHAR of the same length. |
short | SHORT | 2 | |
int | INT | 4 | |
long | LONG | 8 | |
real | FLOAT | 4 | |
float | DOUBLE | 8 | |
char | CHAR | 1 | There will be no conversion as a null char ("") is treated as a space (" ") in kdb+. |
symbol | SYMBOL | 4 | |
timestamp | NANOTIMESTAMP | 8 | |
month | MONTH | 4 | |
data | DATE | 4 | |
datetime | TIMESTAMP | 8 | |
timespan | NANOTIME | 8 | |
minute | MINUTE | 4 | |
second | SECOND | 4 | |
time | TIME | 4 |
6.2 Nested List Data Types of kdb+
The following are the data type mappings when a kdb+ nested list is imported to DolphinDB.
kdb+ | DolphinDB | Size | Note |
---|---|---|---|
char nested list | STRING | No longer than 65535 | |
boolean | BOOL | ||
guid | UUID | ||
byte | CHAR | ||
short | SHORT | ||
int | INT | ||
long | LONG | ||
real | FLOAT | ||
float | DOUBLE | ||
char | CHAR | ||
symbol | ANY | ||
timestamp | NANOTIMESTAMP | ||
month | MONTH | ||
date | DATE | ||
datetime | TIMESTAMP | ||
timespan | NANOTIME | ||
minute | MINUTE | ||
second | SECOND | ||
time | TIME |
Note:
- CHAR nested lists are commonly used to store strings. The DolphinDB kdb+ plugin supports the conversion from char nested lists to DolphinDB STRING type.
- Other types of nested lists are converted to array vectors of the corresponding type.
- Since there is no STRING array vector in DolphinDB, SYMBOL nested lists are converted to ANY vectors.
- The nested list data in the kdb+ serialized file is also converted to ANY vectors since its data type may be undetermind.