createDistributedInMemoryTable
Syntax
createDistributedInMemoryTable(tableName, colNames, colTypes,
globalPartitionType, globalPartitionScheme, globalPartitionColumn,
[localPartitionType], [localPartitionScheme], [localPartitionColumn])
Details
Create a distributed in-memory table and store it on different nodes based on the specified partitioning scheme. This function can only be executed on a data node or compute node in cluster mode.
A distributed in-memory table is stored on different nodes (data nodes or compute nodes) based on the global partitioning scheme. Each node can only hold one partition. Concurrent reads and writes are supported in a distributed in-memory table. Therefore, it is suitable for scenarios that require distributed computing on in-memory tables.
A distributed in-memory table can be partitioned on 2 different levels:
-
global (across nodes): a distributed in-memory table is partitioned and the partitions are stored on different nodes. The number of global partitions must be greater than or equal to 2, but no greater than the total number of data nodes and compute nodes.
-
local (within node, optional): data within each node can be partitioned again. If both global and local partitioning are adopted, the partitioning scheme of the distributed in-memory table is equivalent to a composite domain.
Note:
- To access the distributed in-memory table from other nodes, use function loadDistributedInMemoryTable first to load the table.
- To drop a distributed in-memory table, use command dropDistributedInMemoryTable.
- Transactions are not supported currently for distributed in-memory tables.
Arguments
tableName is a STRING scalar indicating the name of a distributed in-memory table.
colNames is a STRING vector of column names.
colTypes is a vector indicating data types of columns specified by colNames.
globalPartitionType is a required parameter indicating the partition type for a table in a cluster, which only supports RANGE, HASG, VALUE, and LIST.
globalPartitionScheme is a required parameter indicating the partitioning scheme that describes how the partitions are created in a cluster.
The partition type and partitioning scheme are shown as follows:
Partition Type | Partition Type Symbol | Partitioning Scheme |
---|---|---|
range domain | RANGE | A vector. Any two adjacent elements of the vector define the range for a partition. |
hash domain | HASH | A tuple. The first element is the data type of partitioning column. The second element is the number of partitions. |
value domain | VALUE | A vector. Each element of the vector defines a partition. |
list domain | LIST | A vector. Each element of the vector defines a partition. |
globalPartitionColumn is a required parameter. It is a STRING scalar indicating the partitioning column for a table in a cluster.
The above parameters are specified for a cluster. The distributed in-memory table is partitioned and the partitions are stored on each node based on the above partitioning schemes.
The data within each node can be partitioned again through the following parameters.
localPartitionType (optional) indicates the partition type within a node, which only supports RANGE, HASG, VALUE, and LIST.
localPartitionScheme (optional) indicates the partitioning scheme that describes how the partitions are created within a node.
localPartitionColumn (optional) is a STRING scalar indicating the partitioning column within a node.
Examples
Create a distributed in-memory table.
The cluster in this example has two data nodes: node1 and node2. Create a distributed in-memory table on node1. The number of partitions should be less than the total number of data nodes and compute nodes, thus HASH partitioning is recommended.
pt = createDistributedInMemoryTable(`dt, `time`id`value, `DATETIME`INT`LONG, HASH, [INT, 2],`id)
Load a distributed in-memory table on node2 and insert data into it.
time = take(2021.08.20 00:00:00..2021.08.30 00:00:00, 40);
id = 0..39;
value = rand(100, 40);
tmp = table(time, id, value);
pt = loadDistributedInMemoryTable(`dt)
pt.append!(tmp);
select * from pt;
Check whether a distributed in-memory table exists.
objs(true)
Delete a distributed in-memory table.
dropDistributedInMemoryTable(`dt)