MySQL Plugin
DolphinDB's MySQL plugin offers high-speed import of MySQL datasets or query results into DolphinDB, with automatic data type conversion. Part of the plugin is based on mysqlxx from Yandex ClickHouse.
The DolphinDB MySQL plugin has two branches: release200 and release130. Each branch corresponds to a DolphinDB server version. You are reading the plugin documentation for release200. If you use a different DolphinDB server version, please refer to the corresponding branch of the plugin documentation.
Build
Install a precompiled distribution
Users can load the precompiled MySQL plugin (included in the DolphinDB installation package, under the plugins or bin directory) with the following command in DolphinDB:
In Linux:
loadPlugin("/path/to/plugins/mysql/PluginMySQL.txt")
In Windows:
loadPlugin("C:/path/to/mysql/PluginMySQL.txt")
Note that you must load the plugin with an absolute path and replace "\" with "\\" or "/".
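For example, both of the following are valid ways to write the Windows path above:
loadPlugin("C:\\path\\to\\mysql\\PluginMySQL.txt")
loadPlugin("C:/path/to/mysql/PluginMySQL.txt")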
Compile and install
Install in Linux
For Ubuntu users, just type
$ sudo apt-get install git cmake
Then update the git submodule with the following command, which automatically downloads the mariadb-connector-c source files.
$ git submodule update --init --recursive
Build the project:
mkdir build
cd build
cmake -DCMAKE_BUILD_TYPE=Release ../path_to_mysql_plugin/
make -j`nproc`
Note: Before compiling, please make sure that libDolphinDB.so is on a path that gcc can find. The path can be specified with the environment variable LD_LIBRARY_PATH.
The file libPluginMySQL.so will be generated after compilation.
Install in Windows
To install in Windows, we need to compile with cmake and MinGW.
Please download cmake and MinGW, and make sure to add the MinGW bin directory to the system environment variable "Path".
Build the project:
mkdir build
cp libDolphinDB.dll build # copy libDolphinDB.dll to build directory
cp -r curl build # copy curl headers to build directory
cd build
cmake -DCMAKE_BUILD_TYPE=Release ../path_to_mysql_plugin/ -G "MinGW Makefiles"
mingw32-make -j4
Note: Before compiling, copy libDolphinDB.dll and the curl header folder to the "build" directory.
User API
Note: Use loadPlugin("/path/to/PluginMySQL.txt") to load the MySQL plugin before calling its functions.
mysql::connect
Syntax
mysql::connect(host, port, user, password, db)
Parameters
host: a string indicating the address of the MySQL server.
port: an int indicating the port of the MySQL server.
user: a string indicating the user name of the MySQL server.
password: a string indicating the password of the MySQL server.
db: a string indicating the database name.
Details
Create a connection to the MySQL server. Return a MySQL connection handle, which is used to access the MySQL server in subsequent calls.
Example
conn = mysql::connect(`localhost, 3306, `root, `root, `DolphinDB)
mysql::showTables
Syntax
mysql::showTables(connection)
Parameters
connection: a MySQL connection handle created with mysql::connect.
Details
List all table names in the MySQL database specified in mysql::connect.
Examples
conn = mysql::connect(`localhost, 3306, `root, `root, `DolphinDB)
mysql::showTables(conn)
output:
Tables_in_DolphinDB
-------------------
US
mysql::extractSchema
Syntax
mysql::extractSchema(connection, tableName)
Parameters
connection: a MySQL connection handle created with mysql::connect.
tableName: a string indicating the name of a table in the MySQL server.
Details
Generate the schema of a table.
Examples
conn = mysql::connect(`localhost, 3306, `root, `root, `DolphinDB)
mysql::extractSchema(conn, `US)
output:
name type MySQLType
PERMNO INT int(11)
date DATE date
SHRCD INT int(11)
TICKER SYMBOL varchar(10)
...
PRC DOUBLE double
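The extracted schema can serve as the basis for the 'schema' argument of mysql::load or mysql::loadEx (described below). A minimal sketch, assuming the 'type' column of the extracted schema table can be edited in place to override an inferred type:
conn = mysql::connect(`localhost, 3306, `root, `root, `DolphinDB)
tableSchema = mysql::extractSchema(conn, `US)
// override the inferred type of the TICKER column (SYMBOL by default) to STRING
update tableSchema set type = "STRING" where name = "TICKER"
tb = mysql::load(conn, `US, tableSchema)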
mysql::load
Syntax
mysql::load(connection, table_or_query, [schema], [startRow], [rowNum], [allowEmptyTable])
Parameters
connection: a MySQL connection handle created with mysql::connect.
table_or_query: a string indicating the name of a MySQL server table or a MySQL query.
schema: a table with the names and data types of columns. To change the data type of a column that is automatically determined by the system, modify the schema table and use it as an argument.
startRow: an integer indicating the index of the starting row to read. If unspecified, read from the first row. If 'table_or_query' is a SQL query, then 'startRow' should be unspecified.
rowNum: an integer indicating the number of rows to read. If unspecified, read to the last row. If 'table_or_query' is a SQL query, then 'rowNum' should be unspecified.
Note: If 'table_or_query' is a SQL query, use 'LIMIT' in SQL query to specify 'startRow' and 'rowNum'.
allowEmptyTable: a Boolean indicating whether to allow importing an empty table from MySQL. The default value is false.
Details
Load a MySQL table or SQL query result into a DolphinDB in-memory table.
For details about supported data types and data conversion rules, please refer to the Data Types section below.
Examples
conn = mysql::connect(`192.168.1.18, 3306, `root, `root, `DolphinDB)
tb = mysql::load(conn, `US,,0,123456)
select count(*) from tb
conn = mysql::connect(`127.0.0.1, 3306, `root, `root, `DolphinDB)
tb = mysql::load(conn, "SELECT PERMNO FROM US LIMIT 123456")
select count(*) from tb
mysql::load(conn, "SELECT now(6)");
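A sketch of the 'allowEmptyTable' option, assuming it is passed positionally after 'rowNum' as in the syntax above; the WHERE predicate here is just a hypothetical way to produce an empty result set:
conn = mysql::connect(`127.0.0.1, 3306, `root, `root, `DolphinDB)
// with allowEmptyTable=true, a query that matches no rows yields an empty in-memory table
tb = mysql::load(conn, "SELECT * FROM US WHERE PERMNO < 0",,,,true)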
mysql::loadEx
Syntax
mysql::loadEx(connection, dbHandle, tableName, partitionColumns, table_or_query, [schema], [startRow], [rowNum], [transform])
Parameters
connection: a MySQL connection handle created with mysql::connect.
dbHandle and tableName: if the input data is to be saved into a distributed database, the database handle and table name should be specified.
partitionColumns: a string scalar/vector indicating partitioning column(s).
table_or_query: a string indicating the name of a MySQL server table or a MySQL query.
schema: a table with the names and data types of columns. To change the data type of a column that is automatically determined by the system, modify the schema table and use it as an argument.
startRow: an integer indicating the index of the starting row to read. If unspecified, read from the first row. If 'table_or_query' is a SQL query, then 'startRow' should be unspecified.
rowNum: an integer indicating the number of rows to read. If unspecified, read to the last row. If 'table_or_query' is a SQL query, then 'rowNum' should be unspecified.
Note: If 'table_or_query' is a SQL query, use 'LIMIT' in SQL query to specify 'startRow' and 'rowNum'.
transform: a unary function applied to the table loaded from MySQL before it is saved into the DolphinDB database.
Details
Load a MySQL table or query result as a distributed table. The result is a table object with the metadata of the loaded table.
For details about supported data types as well as data conversion rules, please refer to the Data Types section.
Examples
Load data as a partitioned table on disk.
Load the entire table
dbPath = "C:/..."
db = database(dbPath, RANGE, 0 500 1000)
mysql::loadEx(conn, db, `tb, `PERMNO, `US)
tb = loadTable(dbPath, `tb)
Load via SQL statement
dbPath = "C:/..."
db = database(dbPath, RANGE, 0 500 1000)
mysql::loadEx(conn, db, `tb, `PERMNO, "SELECT * FROM US LIMIT 1000")
tb = loadTable(dbPath, `tb)
Load data as an in-memory partitioned table
Load the entire table
db = database("", RANGE, 0 50000 10000) tb = mysql::loadEx(conn, db,`tb, `PERMNO, `US)
Load via SQL statement
db = database("", RANGE, 0 50000 10000) tb = mysql::loadEx(conn, db,`tb, `PERMNO, "SELECT * FROM US LIMIT 100");
Load data as a DFS partitioned table
Load the entire table
db = database("dfs://US", RANGE, 0 50000 10000) mysql::loadEx(conn, db,`tb, `PERMNO, `US) tb = loadTable("dfs://US", `tb)
Load via SQL statement
db = database("dfs://US", RANGE, 0 50000 10000) mysql::loadEx(conn, db,`tb, `PERMNO, "SELECT * FROM US LIMIT 1000"); tb = loadTable("dfs://US", `tb)
Load and transform data into a DFS partitioned table
db = database("dfs://US", RANGE, 0 50000 10000) def replaceTable(mutable t){ return t.replaceColumn!(`svalue,t[`savlue]-1) } t=mysql::loadEx(conn, db, "",`stockid, 'select * from US where stockid<=1000000',,,,replaceTable)
Data Types
Integral
MySQL type | DolphinDB type |
---|---|
bit(1)-bit(8) | CHAR |
bit(9)-bit(16) | SHORT |
bit(17)-bit(32) | INT |
bit(33)-bit(64) | LONG |
tinyint | CHAR |
tinyint unsigned | SHORT |
smallint | SHORT |
smallint unsigned | INT |
mediumint | INT |
mediumint unsigned | INT |
int | INT |
int unsigned | LONG |
bigint | LONG |
bigint unsigned | (unsupported) LONG |
The numeric types in DolphinDB are all signed. To prevent overflow, each unsigned type is converted to a wider signed type: for example, unsigned CHAR is converted to signed SHORT, unsigned SHORT to signed INT, and so on. 64-bit unsigned types are not supported.
Unsigned 64-bit integers ('unsigned long long') are not supported in DolphinDB; if needed, specify DOUBLE or FLOAT for such columns in the 'schema' parameter.
The smallest value of each integral type in DolphinDB is reserved for the NULL value: -128 for CHAR, -32,768 for SHORT, -2,147,483,648 for INT and -9,223,372,036,854,775,808 for LONG all represent NULL in their respective types.
Floating-point
MySQL type | DolphinDB type |
---|---|
double | DOUBLE |
decimal | DOUBLE |
newdecimal | DOUBLE |
float | FLOAT |
Note: IEEE754 floating-point types are all signed numbers.
All floating-point types can be converted to numeric types (BOOL, CHAR, SHORT, INT, LONG, FLOAT, DOUBLE) in DolphinDB.
Time
MySQL type | DolphinDB type |
---|---|
date | DATE |
time | TIME |
datetime | DATETIME |
timestamp | TIMESTAMP |
year | INT |
All data types above can be converted to temporal data types in DolphinDB (DATE, MONTH, TIME, MINUTE, SECOND, DATETIME, TIMESTAMP, NANOTIME, NANOTIMESTAMP).
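For example, a schema table can be built by hand to load a MySQL datetime column as NANOTIMESTAMP; a minimal sketch, where the table name 'trades' and its columns are hypothetical:
// two-column schema table: column names and target DolphinDB types
tableSchema = table(`ts`price as name, `NANOTIMESTAMP`DOUBLE as type)
tb = mysql::load(conn, "SELECT ts, price FROM trades", tableSchema)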
String
MySQL type | DolphinDB type |
---|---|
char (len <= 10) | SYMBOL |
varchar (len <= 10) | SYMBOL |
char (len > 10) | STRING |
varchar (len > 10) | STRING |
other string types | STRING |
char and varchar types with length less than or equal to 10 are converted to the SYMBOL type in DolphinDB; other string types are converted to STRING.
A MySQL string type can be converted to either STRING or SYMBOL in DolphinDB by specifying the type in the 'schema' parameter.
Enum
MySQL type | DolphinDB type |
---|---|
enum | SYMBOL |
enum type will be converted to SYMBOL type in DolphinDB.
Data Import Performance
Hardware
CPU: i7-7700, 3.60GHz
Hard disk: SSD, read speed 460~500MB/s.
Data
US stock daily data from 1990 to 2016, with 22 fields and 50,591,907 rows. Total size is 6.5GB.
Time consumed for data import
160.5 seconds