Py
The DolphinDB Py plugin is built on the Python C API and lets you call third-party Python libraries from DolphinDB. It uses the pybind11 library.
Installation (with installPlugin)
Required server version: DolphinDB 2.00.10 or higher.
Supported OS: Linux x64 and Linux JIT. Only Python 3.6 is supported.
Installation Steps:
(1) Use listRemotePlugins to check plugin information in the plugin repository.
Note: For plugins not included in the provided list, you can install through precompiled binaries or compile from source. These files can be accessed from our GitHub repository by switching to the appropriate version branch.
login("admin", "123456")
listRemotePlugins(, "http://plugins.dolphindb.com/plugins/")
(2) Invoke installPlugin for plugin installation.
installPlugin("py")
(3) Add the parameter globalDynamicLib. For single-node mode, update the dolphindb.cfg file. For cluster mode, update the cluster.cfg file.
globalDynamicLib=/path_to_libpython3.6m.so/libpython3.6m.so
(4) Use loadPlugin to load the plugin before using the plugin methods.
loadPlugin("py")
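Step (3) needs the full path of the libpython shared library. The following is a minimal sketch for locating it, assuming you run it with the same Python 3.6 interpreter that DolphinDB should use. The helper find_libpython is hypothetical and not part of the plugin, and it may find nothing on distributions that keep the library outside the configured LIBDIR.

```python
import sysconfig
from pathlib import Path
from typing import Optional


def find_libpython() -> Optional[Path]:
    """Return the path of this interpreter's shared libpython, if one is found."""
    libdir = sysconfig.get_config_var("LIBDIR")
    # INSTSONAME is the installed shared-library name on typical Linux builds;
    # LDLIBRARY is tried as a fallback. Both can be None on other platforms.
    for var in ("INSTSONAME", "LDLIBRARY"):
        name = sysconfig.get_config_var(var)
        if libdir and name and ".so" in name:
            candidate = Path(libdir) / name
            if candidate.exists():
                return candidate
    return None


libpython = find_libpython()
if libpython is None:
    print("No shared libpython found in LIBDIR; locate it manually.")
else:
    # Line to copy into dolphindb.cfg or cluster.cfg
    print("globalDynamicLib={}".format(libpython))
```

If nothing is found, the interpreter may be statically linked or the library may live in a distribution-specific directory.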
Note:
- Please install Python libraries numpy and pandas before loading the plugin. Otherwise, it will raise an error:
terminate called after throwing an instance of 'pybind11::error_already_set'
what(): ModuleNotFoundError: No module named 'numpy'
- If you are using Anaconda, follow these steps to avoid conflicts with the higher version of libstdc++.so.6 in Anaconda:
  - Uninstall pandas with pip uninstall pandas.
  - Reinstall pandas with pip install pandas.
  - For other modules linked to libstdc++.so.6, also use pip uninstall and pip install.
  - Please do not use conda install for the installation, as it links to the higher version of libstdc++.so.6. You can also replace the libstdc++.so.6 in the DolphinDB directory with the one from the lib directory in Anaconda (with the original file backed up).
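Because loading the plugin aborts when numpy or pandas is missing, it can help to check for them first. The sketch below is one way to do that, assuming it is run with the Python 3.6 interpreter that DolphinDB links against. The helper missing_modules is a hypothetical name, and the check only confirms that the modules can be located, not that they import without libstdc++ conflicts.

```python
import importlib.util

# Modules the plugin needs before loadPlugin is called (per the note above).
REQUIRED = ("numpy", "pandas")


def missing_modules(names):
    """Return the names that the current interpreter cannot locate."""
    return [name for name in names if importlib.util.find_spec(name) is None]


missing = missing_modules(REQUIRED)
if missing:
    print("Install before loading the plugin: pip3 install " + " ".join(missing))
else:
    print("numpy and pandas are available")
```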
Method References
toPy
Syntax
toPy(obj)
Details
The method converts an object of DolphinDB data type to a Python object. For the data type conversion, see "From DolphinDB to Python".
Parameters
- obj: A DolphinDB object.
Examples
x = 1 2 3 4;
pyArray = py::toPy(x);
y = 2 3 4 5;
d = dict(x, y);
pyDict = py::toPy(d);
fromPy
Syntax
fromPy(obj, [addIndex=false])
Details
The method converts a Python object into a DolphinDB object. For supported data types, see "From Python to DolphinDB". For an example of converting a pandas.DataFrame into a DolphinDB table while keeping the index, see "Convert DataFrame to Table with Index Retained".
Parameters
- obj: A Python object.
- addIndex: A Boolean indicating whether to include the pandas.DataFrame index as the first column of the table. The default is false, in which case the index of obj is discarded during conversion. This parameter takes effect only when obj is a pandas.DataFrame.
Examples
x = 1 2 3 4;
l = py::toPy(x);
re = py::fromPy(l);
re;
importModule
Syntax
importModule(moduleName)
Details
The method imports a Python module (or a submodule). The module must have been installed in your environment. Use the pip3 list command to check all the installed modules and use the pip3 install command to install modules.
Note: For custom modules, copy the module file to the directory listed in the output of sys.path or to the directory where DolphinDB is located.
Parameters
- moduleName: A STRING scalar indicating the name of the module to be imported.
Examples
np = py::importModule ("numpy"); // import numpy
linear_model = py::importModule ("sklearn.linear_model"); // import sklearn submodule linear_model
Note: On Windows, if a problem occurs with the GUI during the initial module loading, you can resolve this by running importModule in the command line. Afterward, the GUI should function normally. To import your own Python module, place the module file under the same directory as the DolphinDB server or under the "lib" directory (which you can check by inspecting sys.path in Python). For examples, see "Import Module and Call Static Methods".
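The rule for custom modules follows from how Python resolves imports: a module is found only if its directory appears in sys.path. The sketch below shows this on the Python side. The module name my_tools and its function double are hypothetical and are written to a temporary directory only to keep the example self-contained.

```python
import importlib
import sys
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as module_dir:
    # Hypothetical custom module, created here so the sketch can run anywhere.
    (Path(module_dir) / "my_tools.py").write_text("def double(x):\n    return 2 * x\n")

    sys.path.insert(0, module_dir)   # make the directory searchable
    importlib.invalidate_caches()    # let the import system see the new file
    try:
        my_tools = importlib.import_module("my_tools")
        result = my_tools.double(21)
    finally:
        # Undo the changes so the demo leaves the interpreter clean.
        sys.path.remove(module_dir)
        sys.modules.pop("my_tools", None)

print(result)  # 42
```

Copying the file into a directory that is already listed in sys.path has the same effect as the sys.path.insert call used here.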
cmd
Syntax
cmd(command)
Details
The method runs a Python script in DolphinDB.
Parameters
- command: A STRING scalar containing a Python script.
Examples
sklearn = py::importModule ("sklearn"); // import sklearn
py::cmd ("from sklearn import linear_model"); // import the linear_model submodule from sklearn
getObj
Syntax
getObj(module, objName)
Details
The method gets an imported submodule of a module, or an attribute of an object, by name. If a submodule is not imported to DolphinDB yet, call cmd to import it from the module.
Parameters
- module: A module that is already imported to DolphinDB, e.g., the return value of importModule.
- objName: A STRING scalar indicating the name of the object.
Examples
np = py::importModule("numpy"); //import numpy
random = py::getObj(np, "random"); //get the random submodule of numpy
sklearn = py::importModule("sklearn"); //import sklearn
py::cmd("from sklearn import linear_model"); //import a submodule from sklearn
linear_model = py::getObj(sklearn, "linear_model"); //get the imported submodule
Note:
- The "random" submodule is automatically imported when numpy is imported, so you can directly get the "random" submodule with getObj.
- When sklearn is imported, its submodules are not imported. Therefore, to get a submodule of sklearn, you must first import the submodule through cmd("from sklearn import linear_model") before calling the getObj method. If you only want the submodule, it's more convenient to simply use the linear_model=py::importModule("sklearn.linear_model") statement.
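The difference between numpy and sklearn described above is ordinary Python behavior: importing a package does not import its submodules unless the package does so itself. The sketch below reproduces this with a hypothetical package demo_pkg created in a temporary directory. The helper get_obj is an illustrative Python analog of a dotted lookup such as "linear_model.LinearRegression", not the plugin's implementation.

```python
import importlib
import sys
import tempfile
from functools import reduce
from pathlib import Path


def get_obj(module, dotted_name):
    """Follow a dotted attribute path, one getattr per component."""
    return reduce(getattr, dotted_name.split("."), module)


with tempfile.TemporaryDirectory() as root:
    # Hypothetical package "demo_pkg" with a single submodule "sub".
    pkg_dir = Path(root) / "demo_pkg"
    pkg_dir.mkdir()
    (pkg_dir / "__init__.py").write_text("")
    (pkg_dir / "sub.py").write_text("class Model:\n    name = 'model'\n")

    sys.path.insert(0, root)
    importlib.invalidate_caches()
    try:
        pkg = importlib.import_module("demo_pkg")
        visible_before = hasattr(pkg, "sub")     # submodule not loaded yet
        importlib.import_module("demo_pkg.sub")  # like "from demo_pkg import sub"
        visible_after = hasattr(pkg, "sub")      # now attached to the package
        model_name = get_obj(pkg, "sub.Model").name
    finally:
        sys.path.remove(root)
        for name in ("demo_pkg.sub", "demo_pkg"):
            sys.modules.pop(name, None)

print(visible_before, visible_after, model_name)  # False True model
```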
getFunc
Syntax
getFunc(module, funcName, [convert=true])
Details
The method retrieves a static function from a Python module and returns a function object that can be called directly in DolphinDB. Keyword arguments are currently not supported. With convert=true, the function's return value is converted to a DolphinDB object when conversion is possible; otherwise it remains a Python object. With convert=false, the function always returns a Python object.
Parameters
- module: The Python module imported previously, for example, the return value of importModule or getObj.
- funcName: A STRING scalar indicating the name of the function to be obtained.
- convert: A boolean indicating whether to convert the results of the function into DolphinDB data types automatically. The default is true.
Examples
np = py::importModule("numpy"); //import numpy
eye = py::getFunc(np, "eye"); //get function eye
np = py::importModule("numpy"); //import numpy
random = py::getObj(np, "random"); //get submodule random
randint = py::getFunc(random, "randint"); //get function randint
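Since functions obtained with getFunc cannot receive keyword arguments, one possible workaround (an assumption, not something the plugin documentation prescribes) is to put thin positional wrappers in a custom module and retrieve those instead. The wrapper names below are hypothetical.

```python
# Hypothetical contents of a custom module, e.g. saved as wrappers.py.

def sort_desc(values):
    """Positional-only wrapper around sorted(values, reverse=True)."""
    return sorted(values, reverse=True)


def sort_by_abs(values):
    """Positional-only wrapper around sorted(values, key=abs)."""
    return sorted(values, key=abs)


print(sort_desc([3, 1, 2]))      # [3, 2, 1]
print(sort_by_abs([-3, 1, -2]))  # [1, -2, -3]
```

Each wrapper fixes the keyword arguments inside Python, so the caller only has to pass positional values.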
getInstanceFromObj
Syntax
getInstanceFromObj(obj, [args])
Details
The method constructs an instance from a Python object obtained previously. You can access the attributes and methods of the instance with ".". The result is returned as a DolphinDB value if it is convertible; otherwise, a Python object is returned.
Parameters
- obj: The Python object obtained previously, for example, the return value of getObj.
- args (optional): The arguments to be passed to the instance.
Examples
sklearn = py::importModule("sklearn");
py::cmd("from sklearn import linear_model");
linearR = py::getObj(sklearn,"linear_model.LinearRegression")
linearInst = py::getInstanceFromObj(linearR);
getInstance
Syntax
getInstance(module, objName, [args])
Details
The method gets an instance from a Python module. You can access the attributes and methods of the instance with ".". The result is returned as a DolphinDB value if it is convertible; otherwise, a Python object is returned.
Parameters
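- module: The Python module imported previously, for example, the return value of importModule or getObj.
- objName: A STRING scalar indicating the name of the object (such as a class) from which the instance is created.
- args (optional): The arguments to be passed to the instance.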
Examples
linear_model = py::importModule ("sklearn.linear_model"); // import submodule linear_model
linearInst = py::getInstance(linear_model,"LinearRegression")
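In plain Python terms, both instance-creating methods amount to looking up a class and calling it with the given arguments. The sketch below is an illustrative analog using the standard-library fractions module. The helper names get_instance and get_instance_from_obj are hypothetical Python functions, not the plugin's code.

```python
import fractions


def get_instance_from_obj(obj, *args):
    """Analog of getInstanceFromObj: call an already retrieved class."""
    return obj(*args)


def get_instance(module, obj_name, *args):
    """Analog of getInstance: look the class up by name, then call it."""
    return get_instance_from_obj(getattr(module, obj_name), *args)


half = get_instance(fractions, "Fraction", 3, 6)
same = get_instance_from_obj(fractions.Fraction, 1, 2)

# Attributes and methods are then reached with "." on the instance.
print(half.numerator, half.denominator)  # 1 2
print(half == same)                      # True
```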
Note: The method getFunc obtains the static methods from a module. To call an instance method, please use getInstanceFromObj or getInstance to obtain the instance and access the method with ".".
reloadModule
Syntax
reloadModule(module)
Details
If a module is modified after being imported, execute reloadModule instead of importModule to use the modified module.
Parameters
- module: The Python module imported previously, for example, the return value of importModule.
Examples
model = py::importModule("fibo"); //fibo is the module implemented in "Import Module and Call Static Methods"
model = py::reloadModule(model); //reload the module if fibo.py is modified
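The reason a second import does not pick up changes is that Python caches imported modules. The sketch below shows the same behavior with importlib, assuming reloadModule relies on a comparable reload mechanism (this is not confirmed by the plugin documentation). The module fibo_demo and its function limit are hypothetical.

```python
import importlib
import sys
import tempfile
from pathlib import Path

old_flag = sys.dont_write_bytecode
sys.dont_write_bytecode = True  # avoid stale bytecode caches in this quick demo

with tempfile.TemporaryDirectory() as tmp:
    src = Path(tmp) / "fibo_demo.py"
    src.write_text("def limit():\n    return 10\n")
    sys.path.insert(0, tmp)
    importlib.invalidate_caches()
    try:
        mod = importlib.import_module("fibo_demo")
        before = mod.limit()

        # Modify the module on disk after it has been imported.
        src.write_text("def limit():\n    return 100\n")

        # Importing again returns the cached module, so the change is not seen.
        cached = importlib.import_module("fibo_demo").limit()

        # Reloading re-executes the source file.
        mod = importlib.reload(mod)
        after = mod.limit()
    finally:
        sys.path.remove(tmp)
        sys.modules.pop("fibo_demo", None)
        sys.dont_write_bytecode = old_flag

print(before, cached, after)  # 10 10 100
```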
Usage Examples
Load Plugin
loadPlugin("/path/to/plugin/PluginPy.txt");
use py;
Data Type Conversion to and from Python
x = 1 2 3 4;
y = 2 3 4 5;
d = dict(x, y);
pyDict = py::toPy(d);
Dict = py::fromPy(pyDict);
Dict;
Print Default Module Search Path Using Built-In Python Module
sys = py::importModule("sys");
path = py::getObj(sys, "path");
dpath = py::fromPy(path);
dpath;
Import NumPy and Call Static Methods
np = py::importModule("numpy"); //import numpy
eye = py::getFunc(np, "eye"); //get the eye function of numpy
re = eye(3); //create a 3x3 identity matrix using eye
re;
random = py::getObj(np, "random"); //get the random submodule of numpy
randint = py::getFunc(random, "randint"); //get the randint function of random
re = randint(0,1000,[2,3]); //execute randint
re;
Import sklearn and Call Instance Methods
//one way to get a LinearRegression instance
linear_model = py::importModule("sklearn.linear_model"); //import the linear_model submodule from sklearn
linearInst = py::getInstance(linear_model,"LinearRegression")
//the other way
sklearn = py::importModule("sklearn"); //import sklearn
py::cmd("from sklearn import linear_model"); //import the linear_model submodule from sklearn
linearR = py::getObj(sklearn,"linear_model.LinearRegression")
linearInst = py::getInstanceFromObj(linearR);
X = [[0,0],[1,1],[2,2]];
Y = [0,1,2];
linearInst.fit(X, Y); //call the fit function
linearInst.coef_; //output:[0.5,0.5]
linearInst.intercept_; // output: 1.110223E-16 ~ 0
Test = [[3,4],[5,6],[7,8]];
re = linearInst.predict(Test); //call the predict function
re; //output: [3.5, 5.5, 7.5]
datasets = py::importModule("sklearn.datasets");
load_iris = py::getFunc(datasets, "load_iris"); //get the static function load_iris
iris = load_iris(); //call load_iris
datasets = py::importModule("sklearn.datasets");
decomposition = py::importModule("sklearn.decomposition");
PCA = py::getInstance(decomposition, "PCA");
py_pca=PCA.fit_transform(iris['data'].row(0:3)); //fit PCA on the first three rows of iris['data'] and transform them
py_pca.row(0); //output:[0.334781147691283, -0.011991887788418, 2.926917846106032e-17]
Note: Use the row function to access the rows of a matrix in DolphinDB. As shown in the above example, iris['data'].row(0:3) retrieves the first three rows from iris['data']. To retrieve the first three columns, use iris['data'][0:3].
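The predicted values in the linear regression example above can be checked by hand: each prediction is the dot product of the fitted coefficients with a test row, plus the intercept. The sketch below redoes that arithmetic in plain Python, treating the reported intercept of about 1.1E-16 as 0. The function predict is an illustrative helper, not a call into sklearn.

```python
coef = [0.5, 0.5]   # fitted coefficients reported in the example
intercept = 0.0     # the reported 1.110223E-16 is floating-point noise around 0
test_rows = [[3, 4], [5, 6], [7, 8]]


def predict(rows, coef, intercept):
    """Linear model prediction: dot(coef, row) + intercept for each row."""
    return [sum(c * x for c, x in zip(coef, row)) + intercept for row in rows]


predictions = predict(test_rows, coef, intercept)
print(predictions)  # [3.5, 5.5, 7.5]
```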
Import Module and Call Static Methods
In this case we have implemented a Python module with two static methods: fib(n) prints the Fibonacci numbers less than n; fib2(n) returns the Fibonacci numbers less than n as a list.
We save the module as fibo.py and copy it to the directory where DolphinDB server is located (or to a library path listed in sys.path).
def fib(n):    # write Fibonacci series up to n
    a, b = 0, 1
    while a < n:
        print(a, end=' ')
        a, b = b, a+b
    print()

def fib2(n):   # return Fibonacci series up to n
    result = []
    a, b = 0, 1
    while a < n:
        result.append(a)
        a, b = b, a+b
    return result
Then load the plugin and use the module after importing it:
loadPlugin("/path/to/plugin/PluginPy.txt"); //load the Py plugin
fibo = py::importModule("fibo"); //import the module fibo
fib = py::getFunc(fibo,"fib"); //get the fib function
fib(10); //call fib function, print 0 1 1 2 3 5 8
fib2 = py::getFunc(fibo,"fib2"); //get the fib2 function in the module
re = fib2(10); //call the fib2 function
re; //output: 0 1 1 2 3 5 8
Convert DataFrame to Table with Index Retained
When calling Python functions that return a DataFrame, set convert=false in the getFunc function if you want to keep the index as the first column of the converted table. Then use the fromPy function to convert the result to a table (set addIndex=true).
Implement a function that returns a pandas.DataFrame. Save the script as demo.py and copy it to the directory where DolphinDB server is located (or to a library path listed in sys.path).
import pandas as pd
import numpy as np

def createDF():
    index=pd.Index(['a','b','c'])
    df=pd.DataFrame(np.random.randint(1,10,(3,3)),index=index)
    return df
Load the plugin and import the module demo. Get the function with convert=false, then pass its result to fromPy with addIndex=true to keep the index of the DataFrame as the first column of the result.
loadPlugin("/path/to/plugin/PluginPy.txt"); //load the Py plugin
model = py::importModule("demo");
func1 = py::getFunc(model, "createDF", false)
tem = func1()
re = py::fromPy(tem, true)
Data Type Mappings
From DolphinDB to Python
DolphinDB Form | Python Form |
---|---|
BOOL | bool |
CHAR | int64 |
SHORT | int64 |
INT | int64 |
LONG | int64 |
DOUBLE | float64 |
FLOAT | float64 |
STRING | String |
DATE | datetime64[D] |
MONTH | datetime64[M] |
TIME | datetime64[ms] |
MINUTE | datetime64[m] |
SECOND | datetime64[s] |
DATETIME | datetime64[s] |
TIMESTAMP | datetime64[ms] |
NANOTIME | datetime64[ns] |
NANOTIMESTAMP | datetime64[ns] |
DATEHOUR | datetime64[s] |
vector | NumPy.array |
matrix | NumPy.array |
set | Set |
dictionary | Dictionary |
table | pandas.DataFrame |
- DolphinDB CHAR types are converted into Python int64 type.
- Vectors and matrices are converted to numpy.array, and time types are converted to the time types used in Python's pandas library. It is necessary to have both the numpy and pandas modules installed in your Python environment.
- As the temporal types in Python pandas are datetime64[ns], all DolphinDB temporal types are converted into datetime64[ns] type, see also: https://github.com/pandas-dev/pandas/issues/6741#issuecomment-39026803 . MONTH type such as 2012.06M is converted into 2012-06-01 (the first day of the month). TIME, MINUTE, SECOND and NANOTIME types do not include information about date. 1970-01-01 is automatically added during conversion. For example, 13:30m is converted into 1970-01-01 13:30:00.
- NULLs of logical, temporal and numeric types are converted into NaN or NaT; NULLs of string types are converted into empty strings. If a vector contains NULL values, the data type may change. For example, if a vector of Boolean type contains NULL values, NULL will be converted to NaN. As a result, the vector's data type will be converted to float64, and the TRUE and FALSE values will be converted to 1 and 0, respectively.
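The conversion notes above can be pictured with the Python standard library. The sketch below only models the described results (the 1970-01-01 date added to date-less times, MONTH mapped to the first day of the month, and a BOOL vector with NULLs becoming floats). The function names are hypothetical, and the plugin itself produces numpy and pandas types rather than these stdlib objects.

```python
import math
from datetime import date, datetime, time

EPOCH_DATE = date(1970, 1, 1)


def minute_to_datetime(hour, minute):
    """Model of a date-less value such as 13:30m gaining the 1970-01-01 date."""
    return datetime.combine(EPOCH_DATE, time(hour, minute))


def month_to_date(year, month):
    """Model of a MONTH value such as 2012.06M mapping to the first of the month."""
    return date(year, month, 1)


def bool_vector_with_nulls(values):
    """Model of a BOOL vector with NULLs (None here): floats, with NaN for NULL."""
    return [math.nan if v is None else float(v) for v in values]


print(minute_to_datetime(13, 30))                   # 1970-01-01 13:30:00
print(month_to_date(2012, 6))                       # 2012-06-01
print(bool_vector_with_nulls([True, None, False]))  # [1.0, nan, 0.0]
```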
From Python to DolphinDB
Python Form | DolphinDB Form |
---|---|
bool | BOOL |
int8 | CHAR |
int16 | SHORT |
int32 | INT |
int64 | LONG |
float32 | FLOAT |
float64 | DOUBLE |
String | STRING |
datetime64[M] | MONTH |
datetime64[D] | DATE |
datetime64[m] | MINUTE |
datetime64[s] | DATETIME |
datetime64[h] | DATEHOUR |
datetime64[ms] | TIMESTAMP |
datetime64[us] | NANOTIMESTAMP |
datetime64[ns] | NANOTIMESTAMP |
Tuple | vector |
List | vector |
Dictionary | dictionary |
Set | set |
NumPy.array | vector (1d) /matrix (2d) |
pandas.DataFrame | table |
- The numpy.array will be converted to DolphinDB vector (1d) or matrix (2d) based on its dimension.
- As the only temporal data type in Python pandas is datetime64, all temporal columns of a DataFrame are converted into NANOTIMESTAMP type.
- When a pandas.DataFrame is converted to a DolphinDB table, any column name that is not supported in DolphinDB is adjusted based on the following rules:
  - Any character other than a letter, digit, or underscore in a column name is converted to an underscore.
  - If the first character is not a letter, "c" is added as the first character of the column name.
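The two renaming rules can be sketched as a small Python function. This is an illustration of the rules as stated, not the plugin's code: the name sanitize_column_name is hypothetical, and treating only ASCII letters as letters and applying the rules in the listed order are both assumptions.

```python
import re


def sanitize_column_name(name):
    """Apply the two renaming rules described above to one column name."""
    # Rule 1: anything other than a letter, digit, or underscore becomes "_".
    cleaned = re.sub(r"[^A-Za-z0-9_]", "_", name)
    # Rule 2: a name that does not start with a letter gets a leading "c".
    if not re.match(r"[A-Za-z]", cleaned):
        cleaned = "c" + cleaned
    return cleaned


for original in ["price(usd)", "2020", "_x", "a b"]:
    print(original, "->", sanitize_column_name(original))
# price(usd) -> price_usd_
# 2020 -> c2020
# _x -> c_x
# a b -> a_b
```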