Py

The DolphinDB Py plugin is built on the Python C API and lets you call third-party Python libraries from DolphinDB. It uses the pybind11 library.

Installation (with installPlugin)

Required server version: DolphinDB 2.00.10 or higher.

Supported OS: Linux JIT.

Python 3.6 is supported.

Installation Steps:

(1) Use listRemotePlugins to check plugin information in the plugin repository.

Note: For plugins not included in the provided list, you can install through precompiled binaries or compile from source. These files can be accessed from our GitHub repository by switching to the appropriate version branch.

login("admin", "123456")
listRemotePlugins(, "http://plugins.dolphindb.com/plugins/")

(2) Invoke installPlugin for plugin installation.

installPlugin("py")

(3) Add the configuration parameter globalDynamicLib and set it to the path of libpython3.6m.so. For single-node mode, add it to the dolphindb.cfg file. For cluster mode, add it to the cluster.cfg file.

globalDynamicLib=/path_to_libpython3.6m.so/libpython3.6m.so

(4) Use loadPlugin to load the plugin before using the plugin methods.

loadPlugin("py")
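
For step (3), one way to find a value for globalDynamicLib is to ask the Python 3.6 interpreter where its shared library lives. The following is a minimal sketch, assuming a CPython build that exposes the LIBDIR, INSTSONAME and LDLIBRARY configuration variables (some builds, such as static ones, do not ship a shared libpython at all). The helper name libpython_candidates is hypothetical and not part of the plugin.

```python
import os
import sysconfig


def libpython_candidates():
    """Return likely paths of the libpython library for this interpreter."""
    libdir = sysconfig.get_config_var("LIBDIR")
    if not libdir:
        return []
    paths = []
    for key in ("INSTSONAME", "LDLIBRARY"):
        name = sysconfig.get_config_var(key)
        if name:
            path = os.path.join(libdir, name)
            if path not in paths:
                paths.append(path)
    return paths


if __name__ == "__main__":
    for path in libpython_candidates():
        # Only an existing shared library is a usable value for globalDynamicLib.
        print(path, "(exists)" if os.path.exists(path) else "(not found)")
```

Run the script with the same Python 3.6 interpreter whose numpy and pandas installation the plugin is expected to use.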

Note:

  • Install the Python libraries numpy and pandas before loading the plugin. Otherwise, loading fails with an error such as:
 terminate called after throwing an instance of 'pybind11::error_already_set'
   what():  ModuleNotFoundError: No module named 'numpy'
  • If you are using Anaconda, follow these steps to avoid conflicts with the higher version of libstdc++.so.6 in Anaconda:
    • Uninstall pandas with pip uninstall pandas.
    • Reinstall pandas with pip install pandas.
    • For other modules linked to libstdc++.so.6, also use pip uninstall and pip install.
    • Do not use conda install for these modules, as packages installed that way link to the higher version of libstdc++.so.6. Alternatively, back up the libstdc++.so.6 in the DolphinDB directory and replace it with the one from the lib directory in Anaconda.
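
The numpy and pandas prerequisite can be checked from the Python interpreter before loadPlugin is called. This is a minimal sketch using only the standard library; the helper name missing_modules is hypothetical.

```python
import importlib.util


def missing_modules(names=("numpy", "pandas")):
    """Return the modules from `names` that this interpreter cannot find."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_modules()
    if missing:
        print("Install before loading the plugin:", ", ".join(missing))
    else:
        print("numpy and pandas are available")
```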

Method References

toPy

Syntax

toPy(obj)

Details

The method converts an object of DolphinDB data type to a Python object. For the data type conversion, see From DolphinDB to Python.

Parameters

  • obj: A DolphinDB object.

Examples

 x = 1 2 3 4;
 pyArray = py::toPy(x);

 y = 2 3 4 5;
 d = dict(x, y);
 pyDict = py::toPy(d);

fromPy

Syntax

fromPy(obj, [addIndex=false])

Details

The method converts a Python object into a DolphinDB object. For supported data types, see From Python to DolphinDB. For an example of converting a pandas.DataFrame into a DolphinDB table while keeping the index, see Convert DataFrame to Table with Index Retained.

Parameters

  • obj: A Python object.
  • addIndex: A Boolean indicating whether to keep the pandas.DataFrame index as the first column of the table. The default is false, which means the index of obj is discarded in the conversion. This parameter takes effect only when obj is a pandas.DataFrame.

Examples

 x = 1 2 3 4;
 l = py::toPy(x);
 re = py::fromPy(l);
 re;

importModule

Syntax

importModule(moduleName)

Details

The method imports a Python module (or a submodule). The module must have been installed in your environment. Use the pip3 list command to check all the installed modules and use the pip3 install command to install modules.

Note: For custom modules, copy the module file to the directory listed in the output of sys.path or to the directory where DolphinDB is located.

Parameters

  • moduleName: A STRING scalar indicating the name of the module to be imported.

Examples

np = py::importModule("numpy"); //import numpy

linear_model = py::importModule("sklearn.linear_model"); //import sklearn submodule linear_model

Note: On Windows, if the GUI does not function normally during the initial module loading, run importModule from the command line first. Afterward, the GUI should function normally. To import your own Python module, place the module file in the same directory as the DolphinDB server or in the "lib" directory (you can check the search directories by inspecting sys.path in Python). For examples, see Import Module and Call Static Methods.
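
Whether a custom module can be found is easy to check in plain Python first. The sketch below writes a hypothetical module (mymod_demo, an illustrative name only) to a temporary directory and shows that it resolves only once that directory is on sys.path. With the plugin, the corresponding step is copying the module file into one of the directories described above.

```python
import importlib
import importlib.util
import os
import sys
import tempfile

MODULE_NAME = "mymod_demo"  # hypothetical module name, for illustration only

with tempfile.TemporaryDirectory() as module_dir:
    with open(os.path.join(module_dir, MODULE_NAME + ".py"), "w") as f:
        f.write("def greet():\n    return 'hello'\n")

    # The directory is not on sys.path yet, so the module cannot be located.
    found_before = importlib.util.find_spec(MODULE_NAME) is not None

    sys.path.insert(0, module_dir)
    importlib.invalidate_caches()
    try:
        found_after = importlib.util.find_spec(MODULE_NAME) is not None
        greeting = importlib.import_module(MODULE_NAME).greet()
    finally:
        # Undo the changes so the sketch leaves the interpreter as it found it.
        sys.path.remove(module_dir)
        sys.modules.pop(MODULE_NAME, None)

print(found_before, found_after, greeting)
```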

cmd

Syntax

cmd(command)

Details

The method runs a Python script in DolphinDB.

Parameters

  • command: A STRING scalar containing the Python script to run.

Examples

 sklearn = py::importModule("sklearn"); //import sklearn
 py::cmd("from sklearn import linear_model"); //import the linear_model submodule from sklearn

getObj

Syntax

getObj(module, objName)

Details

The method gets all imported submodules of a module or gets the attributes of an object. If a submodule is not imported to DolphinDB yet, call cmd to import it from the module.

Parameters

  • module: a module which is already imported to DolphinDB, e.g., the return of importModule.
  • objName: a STRING scalar indicating the name of the object.

Examples

np = py::importModule("numpy"); //import numpy
random = py::getObj(np, "random"); //get the random submodule of numpy

sklearn = py::importModule("sklearn"); //import sklearn
py::cmd("from sklearn import linear_model"); //import a submodule from sklearn
linear_model = py::getObj(sklearn, "linear_model"); //get the imported submodule

Note:

  • The "random" submodule is automatically imported when numpy is imported, so you can directly get the "random" submodule with getObj.
  • When sklearn is imported, its submodules are not imported. Therefore, to get a submodule of sklearn, you must first import it through cmd("from sklearn import linear_model") before calling getObj. If you only want the submodule, it is more convenient to use the statement linear_model=py::importModule("sklearn.linear_model").
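
The difference between numpy and sklearn described in these notes comes from ordinary Python import behavior: importing a package does not import a submodule unless the package's own __init__.py (or something it imports) does so. Here is a minimal sketch with a hypothetical package (pkg_demo, created in a temporary directory purely for illustration):

```python
import importlib
import os
import sys
import tempfile

PKG = "pkg_demo"  # hypothetical package name, for illustration only

with tempfile.TemporaryDirectory() as root:
    pkg_dir = os.path.join(root, PKG)
    os.mkdir(pkg_dir)
    # Empty __init__.py: the package does not import "sub" by itself.
    open(os.path.join(pkg_dir, "__init__.py"), "w").close()
    with open(os.path.join(pkg_dir, "sub.py"), "w") as f:
        f.write("VALUE = 42\n")

    sys.path.insert(0, root)
    importlib.invalidate_caches()
    try:
        pkg = importlib.import_module(PKG)
        has_sub_before = hasattr(pkg, "sub")   # submodule not loaded yet
        importlib.import_module(PKG + ".sub")  # explicit submodule import
        has_sub_after = hasattr(pkg, "sub")    # now bound on the package
        value = pkg.sub.VALUE
    finally:
        sys.path.remove(root)
        for name in (PKG, PKG + ".sub"):
            sys.modules.pop(name, None)

print(has_sub_before, has_sub_after, value)
```

The explicit import in the middle plays the role that cmd("from sklearn import linear_model") plays in the DolphinDB example.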

getFunc

Syntax

getFunc(module, funcName, [convert=true])

Details

The method retrieves a static function from a Python module and returns a function object that can be called directly in DolphinDB. Keyword arguments are currently not supported. With convert=true, the function's return value is converted to a DolphinDB object when possible and is otherwise returned as a Python object. With convert=false, the return value is always a Python object.

Parameters

  • module: the Python module imported previously. For example, the return value of method importModule or getObj.
  • funcName: a STRING scalar indicating the name of the function to be obtained.
  • convert: A boolean indicating whether to convert the results of the function into DolphinDB data types automatically. The default is true.

Examples

np = py::importModule("numpy"); //import numpy
eye = py::getFunc(np, "eye"); //get function eye

np = py::importModule("numpy"); //import numpy
random = py::getObj(np, "random"); //get submodule random
randint = py::getFunc(random, "randint"); //get function randint
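
Because function objects returned by getFunc do not accept keyword arguments, one possible workaround is to put a thin wrapper that takes only positional arguments into a custom Python module and retrieve the wrapper instead. The sketch below is an assumption about how such a module could look, not something the plugin provides; the module name kwwrap.py and the function sorted_desc are hypothetical.

```python
# kwwrap.py (hypothetical module name, for illustration only)


def sorted_desc(values):
    """Wrapper with positional arguments only; fixes the keyword reverse=True."""
    return sorted(values, reverse=True)


if __name__ == "__main__":
    print(sorted_desc([3, 1, 2]))  # [3, 2, 1]
```

After copying such a file to a directory the plugin can import from, the wrapper would be obtained with importModule and getFunc like any other static function and called with a single positional argument.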

getInstanceFromObj

Syntax

getInstanceFromObj(obj, [args])

Details

The method constructs an instance from a Python object obtained previously. You can access the attributes and methods of the instance with "."; the result is returned as a DolphinDB object if it is convertible and as a Python object otherwise.

Parameters

  • obj: The Python object obtained previously. For example, the return value of method getObj.
  • args (optional): The arguments to be passed to the instance.

Examples

 sklearn = py::importModule("sklearn");
 py::cmd("from sklearn import linear_model");
 linearR = py::getObj(sklearn,"linear_model.LinearRegression")
 linearInst = py::getInstanceFromObj(linearR);

getInstance

Syntax

getInstance(module, objName, [args])

Details

The method creates an instance of an object in a Python module. You can access the attributes and methods of the instance with "."; the result is returned as a DolphinDB object if it is convertible and as a Python object otherwise.

Parameters

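  • module: The Python module imported previously. For example, the return value of method importModule or getObj.
  • objName: A STRING scalar indicating the name of the object (such as a class) from which the instance is created.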
  • args (optional): The arguments to be passed to the instance.

Examples

linear_model = py::importModule("sklearn.linear_model"); //import submodule linear_model
linearInst = py::getInstance(linear_model,"LinearRegression")

Note: The method getFunc obtains static methods from a module. To call an instance method, use getInstanceFromObj or getInstance to obtain the instance and access the method with ".".

reloadModule

Syntax

reloadModule(module)

Details

If a module is modified after being imported, execute reloadModule instead of importModule to use the modified module.

Parameters

  • module: the Python module imported previously. For example, the return value of method importModule.

Examples

model = py::importModule("fibo"); //fibo is the module implemented in the section Import Module and Call Static Methods

model = py::reloadModule(model);  //reload the module if fibo.py is modified 
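
A similar distinction exists in Python itself: importing an already imported module returns the cached module object, while reloading executes the source again. The sketch below illustrates this with importlib.reload on a hypothetical module (reload_demo, written to a temporary directory). Whether the plugin uses importlib.reload internally is an assumption, not something stated here.

```python
import importlib
import os
import sys
import tempfile

NAME = "reload_demo"  # hypothetical module name, for illustration only

with tempfile.TemporaryDirectory() as module_dir:
    path = os.path.join(module_dir, NAME + ".py")
    with open(path, "w") as f:
        f.write("VALUE = 1\n")

    sys.path.insert(0, module_dir)
    importlib.invalidate_caches()
    try:
        mod = importlib.import_module(NAME)
        first = mod.VALUE

        # Modify the source after the import. The new text has a different
        # length, so a stale bytecode cache cannot be taken for the new source.
        with open(path, "w") as f:
            f.write("VALUE = 22222\n")

        again = importlib.import_module(NAME)   # served from sys.modules
        cached = again.VALUE                     # still the old value
        reloaded = importlib.reload(mod).VALUE   # source is executed again
    finally:
        sys.path.remove(module_dir)
        sys.modules.pop(NAME, None)

print(first, cached, reloaded)
```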

Usage Examples

Load Plugin

loadPlugin("/path/to/plugin/PluginPy.txt");
use py;

Data Type Conversion to and from Python

x = 1 2 3 4;
y = 2 3 4 5;
d = dict(x, y);
pyDict = py::toPy(d);
Dict = py::fromPy(pyDict);
Dict;

Import NumPy and Call Static Methods

np = py::importModule("numpy"); //import numpy
eye = py::getFunc(np, "eye"); //get the eye function of numpy
re = eye(3); //create a 3x3 identity matrix using eye
re;

random = py::getObj(np, "random"); //get the random submodule of numpy
randint = py::getFunc(random, "randint"); //get the randint function of random
re = randint(0,1000,[2,3]); //execute randint
re;

Import sklearn and Call Instance Methods

//one way to get a LinearRegression instance
linear_model = py::importModule("sklearn.linear_model"); //import the linear_model submodule from sklearn
linearInst = py::getInstance(linear_model,"LinearRegression") 
//the other way
sklearn = py::importModule("sklearn"); //import sklearn
py::cmd("from sklearn import linear_model"); //import the linear_model submodule from sklearn
linearR = py::getObj(sklearn,"linear_model.LinearRegression")
linearInst = py::getInstanceFromObj(linearR);

X = [[0,0],[1,1],[2,2]];
Y = [0,1,2];
linearInst.fit(X, Y); //call the fit function
linearInst.coef_; //output:[0.5,0.5]
linearInst.intercept_; // output: 1.110223E-16 ~ 0
Test = [[3,4],[5,6],[7,8]];
re = linearInst.predict(Test); //call the predict function
re; //output: [3.5, 5.5, 7.5]

datasets = py::importModule("sklearn.datasets");
load_iris = py::getFunc(datasets, "load_iris"); //get the static function load_iris
iris = load_iris(); //call load_iris

datasets = py::importModule("sklearn.datasets");
decomposition = py::importModule("sklearn.decomposition");
PCA = py::getInstance(decomposition, "PCA");
py_pca=PCA.fit_transform(iris['data'].row(0:3)); //fit PCA on the first three rows of iris['data'] and transform them
py_pca.row(0);  //output:[0.334781147691283, -0.011991887788418, 2.926917846106032e-17]

Note: Use the row function to access the rows of a matrix in DolphinDB. As shown in the above example, iris['data'].row(0:3) retrieves the first three rows from iris['data']. To retrieve the first three columns, use iris['data'][0:3].

Import Module and Call Static Methods

In this example, we implement a Python module with two static methods: fib(n) prints the Fibonacci series from 0 up to n, and fib2(n) returns the Fibonacci series from 0 up to n.

We save the module as fibo.py and copy it to the directory where the DolphinDB server is located (or to a directory listed in sys.path).

def fib(n):    # write Fibonacci series up to n
    a, b = 0, 1
    while a < n:
        print(a, end=' ')
        a, b = b, a+b
    print()

def fib2(n):   # return Fibonacci series up to n
    result = []
    a, b = 0, 1
    while a < n:
        result.append(a)
        a, b = b, a+b
    return result

Then load the plugin and use the module after importing it:

loadPlugin("/path/to/plugin/PluginPy.txt"); //load the Py plugin

fibo = py::importModule("fibo");  //import the module fibo
fib = py::getFunc(fibo,"fib");  //get the fib function
fib(10);  //call fib function, print 0 1 1 2 3 5 8
fib2 = py::getFunc(fibo,"fib2"); //get the fib2 function in the module
re = fib2(10);  //call the fib2 function
re;   //output: 0 1 1 2 3 5 8

Convert DataFrame to Table with Index Retained

When calling a Python function that returns a DataFrame, set convert=false in getFunc if you want to keep the index as the first column of the converted table. Then use fromPy with addIndex=true to convert the result to a table.

Implement a function that returns a pandas.DataFrame. Save the script as demo.py and copy it to the directory where the DolphinDB server is located (or to a directory listed in sys.path).

import pandas as pd
import numpy as np
def createDF():
    index=pd.Index(['a','b','c'])
    df=pd.DataFrame(np.random.randint(1,10,(3,3)),index=index)
    return df

Load the plugin and import the module demo. Get the function with convert=false, then call fromPy with addIndex=true to keep the index of the DataFrame as the first column of the result.

loadPlugin("/path/to/plugin/PluginPy.txt"); //load the Py plugin

model = py::importModule("demo");
func1 = py::getFunc(model, "createDF", false)
tem = func1()
re =  py::fromPy(tem, true)

Data Type Mappings

From DolphinDB to Python

DolphinDB Form     Python Form
BOOL               bool
CHAR               int64
SHORT              int64
INT                int64
LONG               int64
DOUBLE             float64
FLOAT              float64
STRING             String
DATE               datetime64[D]
MONTH              datetime64[M]
TIME               datetime64[ms]
MINUTE             datetime64[m]
SECOND             datetime64[s]
DATETIME           datetime64[s]
TIMESTAMP          datetime64[ms]
NANOTIME           datetime64[ns]
NANOTIMESTAMP      datetime64[ns]
DATEHOUR           datetime64[s]
vector             NumPy.array
matrix             NumPy.array
set                Set
dictionary         Dictionary
table              pandas.DataFrame

  • DolphinDB CHAR types are converted into Python int64 type.
  • Vectors and matrices are converted to numpy.array, and time types are converted to the time types used in Python's pandas library. It is necessary to have both the numpy and pandas modules installed in your Python environment.
  • As the temporal type in pandas is datetime64, all DolphinDB temporal types are converted into datetime64 (see also https://github.com/pandas-dev/pandas/issues/6741#issuecomment-39026803). A MONTH value such as 2012.06M is converted into 2012-06-01 (the first day of the month). TIME, MINUTE, SECOND and NANOTIME values carry no date information, so 1970-01-01 is added during conversion. For example, 13:30m is converted into 1970-01-01 13:30:00.
  • NULLs of logical, temporal and numeric types are converted into NaN or NaT; NULLs of string types are converted into empty strings. If a vector contains NULL values, its data type may change. For example, in a Boolean vector that contains NULLs, each NULL is converted to NaN, so the vector's data type becomes float64 and the true and false values are converted to 1 and 0, respectively.
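
The temporal and NULL rules in this list can be illustrated with the Python standard library alone. This is a sketch of the arithmetic only: the plugin itself produces numpy datetime64 and float64 values, and the helper names below (minute_to_datetime, month_to_date, bool_vector_with_nulls) are hypothetical. It assumes that a MINUTE value is given as minutes since midnight and that a NULL is represented by None.

```python
import math
from datetime import date, datetime, timedelta

EPOCH = datetime(1970, 1, 1)


def minute_to_datetime(minutes_since_midnight):
    """MINUTE carries no date, so 1970-01-01 is used as the date part."""
    return EPOCH + timedelta(minutes=minutes_since_midnight)


def month_to_date(year, month):
    """MONTH maps to the first day of that month."""
    return date(year, month, 1)


def bool_vector_with_nulls(values):
    """A NULL (None here) forces floats: NaN for NULL, 1.0/0.0 for true/false."""
    return [math.nan if v is None else float(v) for v in values]


print(minute_to_datetime(13 * 60 + 30))             # 1970-01-01 13:30:00
print(month_to_date(2012, 6))                       # 2012-06-01
print(bool_vector_with_nulls([True, None, False]))  # [1.0, nan, 0.0]
```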

From Python to DolphinDB

Python Form        DolphinDB Form
bool               BOOL
int8               CHAR
int16              SHORT
int32              INT
int64              LONG
float32            FLOAT
float64            DOUBLE
String             STRING
datetime64[M]      MONTH
datetime64[D]      DATE
datetime64[m]      MINUTE
datetime64[s]      DATETIME
datetime64[h]      DATEHOUR
datetime64[ms]     TIMESTAMP
datetime64[us]     NANOTIMESTAMP
datetime64[ns]     NANOTIMESTAMP
Tuple              vector
List               vector
Dictionary         dictionary
Set                set
NumPy.array        vector (1d) / matrix (2d)
pandas.DataFrame   table

  • The numpy.array will be converted to DolphinDB vector (1d) or matrix (2d) based on its dimension.
  • Because datetime64 is the only temporal data type in pandas, all temporal columns of a DataFrame are converted into the NANOTIMESTAMP type.
  • When a pandas.DataFrame is converted to DolphinDB table, if the column name is not supported in DolphinDB, it will be adjusted based on the following rules:
    • Characters in a column name other than letters, digits, or underscores are converted to underscores.
    • If the first character is not a letter, "c" is added as the first character of the column name.
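
The two renaming rules can be sketched in Python as follows. This is an illustration under stated assumptions, not the plugin's implementation: it treats only ASCII letters as letters, applies the rules in the order listed, and prefixes an empty name with "c". The helper name adjust_column_name is hypothetical.

```python
import re


def adjust_column_name(name):
    """Apply the two renaming rules described above to one column name."""
    # Rule 1: anything other than a letter, digit or underscore becomes "_".
    adjusted = re.sub(r"[^A-Za-z0-9_]", "_", name)
    # Rule 2: a name that does not start with a letter gets a leading "c".
    if not adjusted[:1].isalpha():
        adjusted = "c" + adjusted
    return adjusted


for original in ["price", "unit price", "2020", "_id", "a-b"]:
    print(original, "->", adjust_column_name(original))
```

Under these assumptions, "unit price" becomes unit_price, "2020" becomes c2020, and "_id" becomes c_id.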