HBase

The DolphinDB hbase plugin can establish a connection to HBase via Thrift and load data from the HBase database.

Recommended versions:

  • HBase: version 1.2.0
  • Thrift: version 0.14.0

Installation (with installPlugin)

Required server version: DolphinDB 2.00.10.8 or higher

Supported OS: Linux

Installation Steps:

(1) Use listRemotePlugins to check plugin information in the plugin repository.

Note: For plugins not included in the provided list, you can install through precompiled binaries or compile from source. These files can be accessed from our GitHub repository by switching to the appropriate version branch.

login("admin", "123456")
listRemotePlugins()

(2) Use installPlugin to install the plugin.

installPlugin("hbase")

(3) Use loadPlugin to load the plugin before using the plugin methods.

loadPlugin("hbase")

Start Thrift server:

Run the following command to start the Thrift server on port 9090:

$HBASE_HOME/bin/hbase-daemon.sh start thrift -p 9090

You can stop the Thrift server with the following command:

$HBASE_HOME/bin/hbase-daemon.sh stop thrift

Method References

connect

Syntax

connect(host, port, [isFramed])

Details

Connect to HBase through the Thrift server and return an HBase connection handle.

Parameters

  • host: A STRING scalar indicating the server address to connect to.
  • port: An integer indicating the port number of the Thrift server.
  • isFramed (optional): A BOOLEAN scalar indicating the Thrift transport used to transmit data. If set to false (default), data is transmitted through TBufferedTransport; if set to true, data is transmitted through TFramedTransport.

Examples

conn = hbase::connect("192.168.1.114",9090)
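If the Thrift server is configured for framed transport, the client has to match it by setting isFramed to true. A minimal sketch, assuming the same placeholder address as above and a server that uses TFramedTransport:

```dolphindb
// Assumption: the Thrift server at this address uses TFramedTransport
conn = hbase::connect("192.168.1.114", 9090, true)
```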

Note: If the connection remains idle for a while (1 minute by default), HBase automatically closes it. Any later operation through this connection fails with the error "No more data to read". In that case, call hbase::connect again to reconnect.
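One way to handle a dropped connection is to catch the error and reconnect before retrying. The following is only a sketch: it retries once, catches every exception rather than just "No more data to read", and reuses the placeholder address from the example above.

```dolphindb
conn = hbase::connect("192.168.1.114", 9090)
// ... the connection stays idle longer than the timeout ...
try {
    tables = hbase::showTables(conn)
} catch(ex) {
    // ex holds the error information reported by the failed call
    print(ex)
    // Create a new handle and retry the operation once
    conn = hbase::connect("192.168.1.114", 9090)
    tables = hbase::showTables(conn)
}
```

Retrying only once keeps a genuine outage from turning into an endless loop.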

To change the timeout, set hbase.thrift.server.socket.read.timeout and hbase.thrift.connection.max-idletime in the HBase configuration file (typically hbase-site.xml).

The following configuration sets both timeouts to 1 day (86400000 milliseconds).

<property>
         <name>hbase.thrift.server.socket.read.timeout</name>
         <value>86400000</value>
         <description>Unit: milliseconds</description>
</property>
<property>
         <name>hbase.thrift.connection.max-idletime</name>
         <value>86400000</value>
</property>

showTables

Syntax

showTables(hbaseConnection)

Details

Return the names of all tables in the connected database.

Parameters

  • hbaseConnection: The handle returned by hbase::connect.

Examples

conn = hbase::connect("192.168.1.114", 9090)
hbase::showTables(conn)

deleteTable

Syntax

deleteTable(hbaseConnection, tableName)

Details

Delete the specified table(s) from the database.

Parameters

  • hbaseConnection: The handle returned by hbase::connect.
  • tableName: A STRING scalar/vector indicating the name(s) of the table(s) to be deleted.

Examples

conn = hbase::connect("192.168.1.114", 9090)
hbase::deleteTable(conn, "demo_table")
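Because tableName also accepts a vector, several tables can be deleted in one call, and the result of showTables can be used to guard the call. A sketch, assuming that showTables returns the names as a STRING vector and that the table names below are placeholders:

```dolphindb
conn = hbase::connect("192.168.1.114", 9090)
// Assumption: showTables returns a STRING vector of table names
tables = hbase::showTables(conn)
// Delete a single table only if it exists
if ("demo_table" in tables) {
    hbase::deleteTable(conn, "demo_table")
}
// Delete several tables in one call (placeholder names)
hbase::deleteTable(conn, ["demo_table1", "demo_table2"])
```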

getRow

Syntax

getRow(hbaseConnection, tableName, rowKey, [columnName])

Details

Return the record identified by rowKey.

Parameters

  • hbaseConnection: The handle returned by hbase::connect.
  • tableName: A STRING scalar indicating the name of the table to be read.
  • rowKey: A STRING scalar indicating the row key of the row to be read.
  • columnName (optional): A STRING scalar/vector indicating the name(s) of the column(s) to be read. If not specified, all columns are read.

Examples

conn = hbase::connect("192.168.1.114", 9090)
hbase::getRow(conn, "test", "row1")
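The optional columnName parameter restricts the read to selected columns. A short sketch, assuming the table "test" has the columns "cf:a" and "cf:b" (the same placeholder names used in the load example of this document):

```dolphindb
conn = hbase::connect("192.168.1.114", 9090)
// Read a single column of the row
hbase::getRow(conn, "test", "row1", "cf:a")
// Read several columns by passing a STRING vector
hbase::getRow(conn, "test", "row1", ["cf:a", "cf:b"])
```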

load

Syntax

load(hbaseConnection, tableName, [schema])

Details

Load the data of an HBase table into a DolphinDB in-memory table. The data types supported in schema are listed in "Data Type Mappings".

Parameters

  • hbaseConnection: The handle returned by hbase::connect.
  • tableName: A STRING scalar indicating the name of the table to be loaded.
  • schema (optional): A table containing the names of the columns to be imported and their data types. The column names specified in schema must match the HBase column names. If schema is not specified, the result table is created based on the first row, with every column of STRING type. Note that each row must have the same size.

Examples

conn = hbase::connect("192.168.1.114", 9090)
t = table(["cf:a", "cf:b", "cf:c", "cf:time"] as name, ["STRING", "INT", "FLOAT", "TIMESTAMP"] as type)
t1 = hbase::load(conn, "test", t)
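When schema is omitted, every column is loaded as STRING. The sketch below loads the same placeholder table without a schema and inspects the resulting column definitions with the built-in schema function; the table name "test" is carried over from the example above.

```dolphindb
conn = hbase::connect("192.168.1.114", 9090)
// No schema: columns are taken from the first row and typed as STRING
t2 = hbase::load(conn, "test")
// Inspect the column names and types of the loaded table
schema(t2).colDefs
```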

Data Type Mappings

The following table shows the data type mappings used when an HBase table is imported to DolphinDB. Data stored in HBase must conform to the formats listed below; otherwise NULL values are returned.

| Type | HBase | DolphinDB |
| --- | --- | --- |
| BOOL | true, 1, FALSE | true, true, false |
| CHAR | a | a |
| SHORT | 1 | 1 |
| INT | 21 | 21 |
| LONG | 112 | 112 |
| FLOAT | 1.2 | 1.2 |
| DOUBLE | 3.5 | 3.5 |
| SYMBOL | s0 | "s0" |
| STRING | name | "name" |
| DATE | 20210102, 2021.01.02 | 2021.01.02, 2021.01.02 |
| MONTH | 201206, 2012.12 | 2012.06M, 2012.12M |
| TIME | 052013140, 05:20:01.999 | 05:20:13.140, 05:20:01.999 |
| MINUTE | 1230, 13:30 | 12:30m, 13:30m |
| SECOND | 123010, 13:30:10 | 12:30:10, 13:30:10 |
| DATETIME | 20120613133010, 2012.06.13 13:30:10, 2012.06.13T13:30:10 | 2012.06.13T13:30:10, 2012.06.13T13:30:10, 2012.06.13T13:30:10 |
| TIMESTAMP | 20210218051701000, 2012.06.13 13:30:10.008, 2012.06.13T13:30:10.008 | 2021.02.18T05:17:01.000, 2012.06.13T13:30:10.008, 2012.06.13T13:30:10.008 |
| NANOTIME | 133010008007006, 13:30:10.008007006 | 13:30:10.008007006, 13:30:10.008007006 |
| NANOTIMESTAMP | 20120613133010008007006, 2012.06.13 13:30:10.008007006, 2012.06.13T13:30:10.008007006 | 2012.06.13T13:30:10.008007006, 2012.06.13T13:30:10.008007006, 2012.06.13T13:30:10.008007006 |