Explicit Type Conversion
When a pandas.DataFrame is uploaded to DolphinDB with the upload method, some DolphinDB data types cannot be mapped directly from Python. Types such as UUID, IPADDR, and SECOND have no exact Python equivalent.
Starting with DolphinDB Python API version 1.30.22.1, explicit type conversion is supported. You can set a __DolphinDB_Type__ attribute on the pandas.DataFrame to specify how each column should be converted. The __DolphinDB_Type__ attribute is a dictionary whose keys are column names and whose values are the DolphinDB data types to convert those columns to.
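The shape of this dictionary can be sketched without a DolphinDB connection. The example below is illustrative only: check_type_map is a hypothetical helper, not part of the DolphinDB API, and the integer codes are stand-ins copied from the typeInt values printed in the examples in this section (in real code, use the DT_* constants from dolphindb.settings). It checks that every key of the mapping names an existing column, so that a misspelled column name is caught before upload.

```python
# Stand-in type codes, copied from the typeInt column of the schema
# output in this section (assumption: in real code, use the DT_*
# constants from dolphindb.settings instead of these literals).
DT_INT = 4
DT_SYMBOL = 17
DT_BLOB = 32


def check_type_map(columns, type_map):
    """Return the keys of type_map that are not column names.

    Hypothetical helper for illustration; not part of the DolphinDB API.
    """
    known = set(columns)
    return [name for name in type_map if name not in known]


columns = ["cint", "csymbol", "cblob"]

# Keys are column names, values are the target DolphinDB types.
type_map = {
    "cint": DT_INT,
    "csymbol": DT_SYMBOL,
    "cblob": DT_BLOB,
}
unknown = check_type_map(columns, type_map)
print(unknown)  # [] because every key names a column

# A misspelled key is reported instead of passing silently.
typo_map = {"cint": DT_INT, "csymbl": DT_SYMBOL}
print(check_type_map(columns, typo_map))  # ['csymbl']
```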
Example (without explicit type conversion)
import dolphindb as ddb
import pandas as pd
import numpy as np
s = ddb.Session()
s.connect("localhost", 8848, "admin", "123456")
df = pd.DataFrame({
    'cint': [1, 2, 3],
    'csymbol': ["aaa", "bbb", "aaa"],
    'cblob': ["a1", "a2", "a3"],
})
s.upload({"df_wrong": df})
print(s.run("schema(df_wrong)")['colDefs'])
Output:
      name typeString  typeInt  extra comment
0     cint       LONG        5    NaN
1  csymbol     STRING       18    NaN
2    cblob     STRING       18    NaN
As explained in PROTOCOL_DDB, when df is uploaded without explicit type conversion, the "cint" column (dtype int64) is converted to LONG in DolphinDB, and the "csymbol" and "cblob" columns are both converted to STRING.
Example (with explicit type conversion)
Import dolphindb.settings, then set the __DolphinDB_Type__ attribute on the pandas.DataFrame to a dictionary whose keys are the column names and whose values are the target DolphinDB data types.
import dolphindb.settings as keys
df.__DolphinDB_Type__ = {
    'cint': keys.DT_INT,
    'csymbol': keys.DT_SYMBOL,
    'cblob': keys.DT_BLOB,
}
s.upload({"df_true": df})
print(s.run("schema(df_true)")['colDefs'])
Output:
      name typeString  typeInt  extra comment
0     cint        INT        4    NaN
1  csymbol     SYMBOL       17    NaN
2    cblob       BLOB       32    NaN
Now every column of the pandas.DataFrame is converted to its specified data type.
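To confirm that an upload produced the intended types, the schema can be compared with the expected type names. The following is a minimal sketch under stated assumptions: find_type_mismatches is a hypothetical helper, and the two schema dictionaries are copied by hand from the name and typeString columns of the outputs above rather than read from a live session.

```python
def find_type_mismatches(expected, actual):
    """Map column name -> (expected type, actual type) for each mismatch.

    Hypothetical helper for illustration; both arguments map column
    names to DolphinDB type names such as "INT" or "SYMBOL".
    """
    return {
        name: (want, actual.get(name))
        for name, want in expected.items()
        if actual.get(name) != want
    }


expected = {"cint": "INT", "csymbol": "SYMBOL", "cblob": "BLOB"}

# Hand-copied from the schema output of df_wrong (no explicit conversion).
schema_wrong = {"cint": "LONG", "csymbol": "STRING", "cblob": "STRING"}
# Hand-copied from the schema output of df_true (explicit conversion).
schema_true = {"cint": "INT", "csymbol": "SYMBOL", "cblob": "BLOB"}

# Every column of df_wrong differs from the intended type.
print(find_type_mismatches(expected, schema_wrong))
# df_true matches on all columns, so the result is empty.
print(find_type_mismatches(expected, schema_true))  # {}
```

With a live session, a dictionary of the same form could be built from the colDefs table, for example with dict(zip(col_defs['name'], col_defs['typeString'])), where col_defs stands for the DataFrame returned by s.run("schema(df_true)")['colDefs'].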
Starting with DolphinDB Python API version 1.30.22.4, explicit type conversion to Decimal32 and Decimal64 supports specifying a scale: give the column's value as a list of the form [type, scale]. For example:
from decimal import Decimal
df = pd.DataFrame({
    'decimal32': [Decimal("NaN"), Decimal("1.22")],
    'decimal64': [Decimal("1.33355"), Decimal("NaN")],
})
df.__DolphinDB_Type__ = {
    'decimal32': [keys.DT_DECIMAL32, 2],
    'decimal64': [keys.DT_DECIMAL64, 5],
}
s.upload({'df': df})
print(s.run("schema(df)")['colDefs'])
print('-' * 30)
print(s.run("df"))
Output:
        name    typeString  typeInt  extra comment
0  decimal32  DECIMAL32(2)       37      2
1  decimal64  DECIMAL64(5)       38      5
------------------------------
  decimal32 decimal64
0       NaN   1.33355
1      1.22       NaN
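When picking a scale, it can help to look at how many fractional digits the column's values actually carry. The sketch below uses only Python's decimal module. required_scale is a hypothetical helper, and the type codes are stand-ins copied from the typeInt column above (in real code, use keys.DT_DECIMAL32 and keys.DT_DECIMAL64). NaN entries are skipped because they carry no digits.

```python
from decimal import Decimal

# Stand-in type codes taken from the typeInt column of the output
# above (assumption: replace with the dolphindb.settings constants).
DT_DECIMAL32 = 37
DT_DECIMAL64 = 38


def required_scale(values):
    """Smallest scale that keeps every fractional digit of the finite values.

    Hypothetical helper for illustration; not part of the DolphinDB API.
    """
    scale = 0
    for value in values:
        if value.is_finite():
            # For finite Decimals the exponent is an int; its negation
            # is the number of digits after the decimal point.
            scale = max(scale, -value.as_tuple().exponent)
    return scale


data = {
    "decimal32": [Decimal("NaN"), Decimal("1.22")],
    "decimal64": [Decimal("1.33355"), Decimal("NaN")],
}

# Build the [type, scale] lists in the form used by __DolphinDB_Type__.
type_map = {
    "decimal32": [DT_DECIMAL32, required_scale(data["decimal32"])],
    "decimal64": [DT_DECIMAL64, required_scale(data["decimal64"])],
}
print(type_map)
# {'decimal32': [37, 2], 'decimal64': [38, 5]}
```

The computed scales match the values 2 and 5 used in the example above. The helper does not check that the values fit within the precision of the chosen decimal type, so that still has to be verified separately.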