Feather

Feather uses the Apache Arrow columnar memory format, which is organized for efficient analytic operations. The DolphinDB feather plugin imports and exports Feather files efficiently and converts data types automatically. It is built on the Feather read/write interface of the open-source Arrow library.

Installation (with installPlugin)

Required server version: DolphinDB 2.00.10 or higher

Supported OS: Windows x86-64 and Linux x86-64

Installation Steps:

(1) Use listRemotePlugins to check plugin information in the plugin repository.

Note: For plugins not included in the provided list, you can install through precompiled binaries or compile from source. These files can be accessed from our GitHub repository by switching to the appropriate version branch.

login("admin", "123456")
listRemotePlugins()

(2) Use installPlugin to install the plugin.

installPlugin("feather")

(3) Use loadPlugin to load the plugin before using the plugin methods.

loadPlugin("feather")

Method References

extractSchema

Syntax

extractSchema(filePath)

Details

Get the schema of a Feather file and return a table containing the following three columns:

  1. name: Column name
  2. type: Arrow data type
  3. DolphinDBType: DolphinDB data type

Note: If the value of a cell in column DolphinDBType is VOID, it indicates that the corresponding data type in Arrow cannot be converted.

Parameters

  • filePath: A STRING scalar indicating the Feather file path.

Examples

feather::extractSchema("path/to/data.feather");
feather::extractSchema("path/to/data.compressed.feather");

load

Syntax

load(filePath, [columns])

Details

Load a Feather file into a DolphinDB in-memory table. For data type conversion, see Data Type Mappings.

Note:

  • Because DolphinDB reserves the minimum value of each integer type to represent NULL, the minimum values of Arrow int8, int16, int32, and int64 cannot be imported into DolphinDB.
  • The infinities and NaNs (not a number) of floating-point numbers are converted to NULL values in DolphinDB.
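The two notes above can be illustrated with a short Python sketch. It is not part of the plugin: the helper names collides_with_null and becomes_null_float are hypothetical, and the sketch simply encodes the two rules as stated, which makes it possible to check source data for values that will not survive the import.

```python
import math

# Bit widths of the signed Arrow integer types named in the note above.
INT_BITS = {"int8": 8, "int16": 16, "int32": 32, "int64": 64}

def int_minimum(arrow_type: str) -> int:
    """Smallest value representable by a signed Arrow integer type."""
    return -(2 ** (INT_BITS[arrow_type] - 1))

def collides_with_null(value: int, arrow_type: str) -> bool:
    """True if the value is the type minimum, which the note says cannot be imported."""
    return value == int_minimum(arrow_type)

def becomes_null_float(value: float) -> bool:
    """True for infinities and NaN, which the note says are converted to NULL."""
    return math.isinf(value) or math.isnan(value)

# Hypothetical sample values for an int8 column and a double column.
print([v for v in [-128, -127, 0, 127] if collides_with_null(v, "int8")])
print([v for v in [1.5, float("inf"), float("nan")] if becomes_null_float(v)])
```

Running such a check before exporting data from another system helps locate rows whose values would not be preserved in DolphinDB.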

Parameters

  • filePath: A STRING scalar indicating the Feather file path.
  • columns (optional): A STRING vector indicating the names of the columns to load.

Examples

table = feather::load("path/to/data.feather");
table_part = feather::load("path/to/data.feather", ["col1_name", "col2_name"]);

save

Syntax

save(table, filePath, [compressMethod], [compressionLevel])

Details

Export a DolphinDB table to a Feather file. For data type conversion, see Data Type Mappings.

Parameters

  • table: The table to be exported.
  • filePath: A STRING scalar indicating the Feather file path.
  • compressMethod (optional): A STRING scalar specifying the compression method. It can be "uncompressed", "lz4", or "zstd" (case insensitive). The default is "lz4".
  • compressionLevel (optional): An integer specifying the compression level. It takes effect only when compressMethod is set to "zstd".

Examples

feather::save(table, "path/to/save/data.feather");
feather::save(table, "path/to/save/data.feather", "lz4");
feather::save(table, "path/to/save/data.feather", "zstd", 2);

Data Type Mappings

Import

The following table lists the data type mappings applied when a Feather file is imported to DolphinDB:

Arrow            DolphinDB
bool             BOOL
int8             CHAR
uint8            SHORT
int16            SHORT
uint16           INT
int32            INT
uint32           LONG
int64            LONG
uint64           LONG
float            FLOAT
double           DOUBLE
string           STRING
date32           DATE
date64           TIMESTAMP
timestamp(ms)    TIMESTAMP
timestamp(ns)    NANOTIMESTAMP
time32(s)        SECOND
time32(ms)       TIME
time64(ns)       NANOTIME

The following Arrow types are not supported for conversion: binary, fixed_size_binary, half_float, timestamp(us), time64(us), interval_months, interval_day_time, decimal128, decimal, decimal256, list, struct, sparse_union, dense_union, dictionary, map, extension, fixed_size_list, large_string, large_binary, large_list, interval_month_day_nano, max_id.
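As a rough aid, the import mappings can be written as a lookup table in Python. This is an illustrative sketch only, not the plugin's implementation: the names IMPORT_TYPE_MAP and dolphindb_type are hypothetical, and returning "VOID" for unmapped types mirrors what the DolphinDBType column of extractSchema is documented to report.

```python
# Arrow type -> DolphinDB type, copied from the import mappings in this section.
IMPORT_TYPE_MAP = {
    "bool": "BOOL",
    "int8": "CHAR",
    "uint8": "SHORT",
    "int16": "SHORT",
    "uint16": "INT",
    "int32": "INT",
    "uint32": "LONG",
    "int64": "LONG",
    "uint64": "LONG",
    "float": "FLOAT",
    "double": "DOUBLE",
    "string": "STRING",
    "date32": "DATE",
    "date64": "TIMESTAMP",
    "timestamp(ms)": "TIMESTAMP",
    "timestamp(ns)": "NANOTIMESTAMP",
    "time32(s)": "SECOND",
    "time32(ms)": "TIME",
    "time64(ns)": "NANOTIME",
}

def dolphindb_type(arrow_type: str) -> str:
    """Return the mapped DolphinDB type, or "VOID" if the Arrow type is not in the table."""
    return IMPORT_TYPE_MAP.get(arrow_type, "VOID")

# Hypothetical schema of a Feather file: column name -> Arrow type.
schema = {"id": "uint32", "ts": "timestamp(us)", "price": "double"}
unsupported = [name for name, t in schema.items() if dolphindb_type(t) == "VOID"]
print(unsupported)
```

A check like this can flag columns that need to be cast to a supported Arrow type before the file is loaded.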

Export

The following table lists the data type mappings applied when data is exported from DolphinDB to a Feather file:

DolphinDB        Arrow
BOOL             bool
CHAR             int8
SHORT            int16
INT              int32
LONG             int64
DATE             date32
TIME             time32(ms)
SECOND           time32(s)
TIMESTAMP        timestamp(ms)
NANOTIME         time64(ns)
NANOTIMESTAMP    timestamp(ns)
FLOAT            float
DOUBLE           double
STRING           string
SYMBOL           string

The following DolphinDB data types are not supported for conversion: MINUTE, MONTH, DATETIME, UUID, FUNCTIONDEF, HANDLE, CODE, DATASOURCE, RESOURCE, ANY, COMPRESS, ANY DICTIONARY, DATEHOUR, IPADDR, INT128, BLOB, COMPLEX, POINT, DURATION.
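Reading the export and import mappings together suggests which DolphinDB types keep their type across a save followed by a load. The sketch below composes the two mappings in Python. It is derived only from the mappings listed in this document, not from testing the plugin, and the names EXPORT_TYPE_MAP, IMPORT_BACK_MAP, and round_trip_changes are hypothetical.

```python
# DolphinDB type -> Arrow type, copied from the export mappings in this section.
EXPORT_TYPE_MAP = {
    "BOOL": "bool", "CHAR": "int8", "SHORT": "int16", "INT": "int32",
    "LONG": "int64", "DATE": "date32", "TIME": "time32(ms)",
    "SECOND": "time32(s)", "TIMESTAMP": "timestamp(ms)",
    "NANOTIME": "time64(ns)", "NANOTIMESTAMP": "timestamp(ns)",
    "FLOAT": "float", "DOUBLE": "double", "STRING": "string",
    "SYMBOL": "string",
}

# Arrow type -> DolphinDB type, restricted to the Arrow types that export produces.
IMPORT_BACK_MAP = {
    "bool": "BOOL", "int8": "CHAR", "int16": "SHORT", "int32": "INT",
    "int64": "LONG", "date32": "DATE", "time32(ms)": "TIME",
    "time32(s)": "SECOND", "timestamp(ms)": "TIMESTAMP",
    "time64(ns)": "NANOTIME", "timestamp(ns)": "NANOTIMESTAMP",
    "float": "FLOAT", "double": "DOUBLE", "string": "STRING",
}

def round_trip_changes() -> dict:
    """Return the DolphinDB types whose mapped type differs after export then import."""
    changes = {}
    for ddb_type, arrow_type in EXPORT_TYPE_MAP.items():
        restored = IMPORT_BACK_MAP[arrow_type]
        if restored != ddb_type:
            changes[ddb_type] = restored
    return changes

print(round_trip_changes())
```

According to the two mappings, SYMBOL is the only exportable type that comes back as a different type (STRING), because both STRING and SYMBOL are written as Arrow string.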

Note:

You may encounter some problems when reading Feather files using Python.

Scenario 1: Reading a Feather file that contains time64(ns) data with pyarrow.feather.read_feather() raises the error "Value XXXXXXXXXXXXX has non-zero nanoseconds". When the table is converted to a DataFrame, time64(ns) values are converted to datetime.time, which does not support nanosecond precision.

Solution: Read the file with pyarrow.feather.read_table() instead, which returns a pyarrow Table without converting it to a DataFrame.
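The cause of Scenario 1 can be reproduced with the Python standard library alone: datetime.time stores at most microseconds, so any nanosecond remainder has nowhere to go. The sketch below is only an illustration of that limit; split_nanotime is a hypothetical helper, not a pyarrow function.

```python
import datetime

def split_nanotime(ns_since_midnight: int):
    """Split nanoseconds since midnight into a datetime.time and the leftover nanoseconds."""
    micros, leftover_ns = divmod(ns_since_midnight, 1_000)
    seconds, micro = divmod(micros, 1_000_000)
    minutes, second = divmod(seconds, 60)
    hour, minute = divmod(minutes, 60)
    return datetime.time(hour, minute, second, micro), leftover_ns

# One second and one nanosecond after midnight.
t, leftover = split_nanotime(1_000_000_001)
print(t, leftover)

# The finest step datetime.time can represent is one microsecond.
print(datetime.time.resolution)
```

A non-zero leftover corresponds to the "non-zero nanoseconds" that the error message reports.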

Scenario 2: Reading a Feather file whose integer columns contain null values with pyarrow.feather.read_feather() converts those integer columns to floating-point types.

Solution: Read the Feather file into a pyarrow table, then convert it to a DataFrame with types_mapper specified so that integer columns keep an integer type.

import pandas as pd
import pyarrow as pa
from pyarrow import feather

pa_table = feather.read_table("path/to/feather_file")
df = pa_table.to_pandas(types_mapper={pa.int64(): pd.Int64Dtype()}.get)
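The float conversion in Scenario 2 matters for more than appearance: a 64-bit float cannot represent every 64-bit integer exactly, so large LONG values can change silently. The following standard-library sketch shows the effect on a single value; the variable names are illustrative.

```python
# 2**53 + 1 is the smallest positive integer that a 64-bit float cannot represent exactly.
original = 2**53 + 1

# Converting to float rounds the value to the nearest representable number.
as_float = float(original)
restored = int(as_float)

print(original, restored, restored == original)
```

Keeping the column as a nullable integer type, as in the solution above, avoids this loss.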