Catalog

DolphinDB 3.0 has introduced the catalog feature exclusively for DFS databases to provide users with a more convenient and standardized database access experience, as well as facilitate integration with third-party software. Catalogs allow users to uniformly manage databases and tables of varying partitioning schemes, and access them using standard SQL syntax.

Concepts

Syntax

The syntax for referencing tables within a catalog is as follows:

<catalog>.<schema>.<table>[@<cluster_identifier>]

where

  • catalog is a STRING scalar indicating the catalog name.
  • schema is a STRING scalar indicating the schema name.
  • table is a STRING scalar indicating the table name.
  • cluster_identifier (optional) is a STRING scalar indicating the cluster name (configured by clusterName).

Note: In contrast to tables within a catalog that can be referenced directly with <catalog>.<schema>.<table>, tables within a database must first be loaded with loadTable before performing any subsequent operations.

DolphinDB's catalog consists of the following three hierarchies:

  • Catalog: The highest level of organization within the database management system (DBMS). A catalog holds one or more schemas, representing the complete set of schemas that a user can access.

  • Schema: The logical container for table objects. In DolphinDB, a schema corresponds to a database in the Database > Table structure. Note that this concept of schema should be distinguished from the table schema which refers to the structure of the table.

  • Table: The primary component within a schema. Tables organize information into rows and columns.



Note: When using SQL statements for table operations, tables within a catalog can be referenced by specifying catalog.schema.table. In contrast, tables within a database must first be loaded with loadTable before performing any subsequent operations.

Operations on Catalogs

Working with Catalogs

The name of a catalog or schema can contain letters, digits and underscores and must begin with a letter. The name is case-insensitive when referenced in statements or functions.

(1) Creating catalogs

To create a catalog, use function createCatalog. For example, create a catalog "trading":

createCatalog("trading")

To switch to a specific catalog, execute use statement and specify the catalog keyword. For example, switch to catalog "trading":

use catalog trading;

Once a catalog is used, subsequent operations would use the catalog by default unless a different catalog is explicitly specified.

(2) Creating schemas

To create a schema, use create database statement and specify catalog.schema instead of directory.

Syntax

create database catalog.schema partitioned by partitionType(partitionScheme),[partitionType(partitionScheme),partitionType(partitionScheme)],
[engine='OLAP'], [atomic='TRANS'], [chunkGranularity='TABLE']

For example, create a schema "stock" using VALUE partitions and OLAP storage engine.

create database stock partitioned by VALUE(`IBM`MSFT`GM`C`YHOO`GOOG), engine='OLAP'

Since section (1) already specifies a default catalog "trading", stock in the above script is equivalent to trading.stock. If no catalog is used, the creation would fail and an error of "The catalog doesn't exist" will be thrown.

To add an existing database to a catalog, use createSchema. For example, add schema "stock2" referencing database "dfs://db1" to the catalog "trading".

database(directory="dfs://db1", partitionType=RANGE, partitionScheme=0 5 10) //通过 database 函数创建的 db1
createSchema("trading", "dfs://db1", "stock2")

(3) Creating tables

To create a table in a schema, use create table statement and specify catalog.schema.tableName instead of dbPath.

Syntax

create table catalog.schema.tableName(
 schema[columnDescription]
)
[partitioned by partitionColumns],
[sortColumns],
[keepDuplicates=ALL],
[sortKeyMappingFunction]

For example:

create table stock.quote (
  id INT,
  date DATE[comment="time_col", compress="delta"],
  value DOUBLE,
 )
 partitioned by id

Since section (1) already specifies a default catalog "trading", stock.quote in the above script is equivalent to trading.stock.quote.

(4) Dropping catalogs

To drop a catalog, call function dropCatalog. For example:

dropCatalog("catalog1")

(5) Dropping schemas from catalogs

To drop a schema from a catalog, use drop statement.

Syntax

drop table [if exists] catalog.schema

where catalog.schema specifies the schema to be dropped. If a default catalog is already used, the catalog can be omitted.

(6) Dropping tables from schemas

To drop a table from a schema, use drop statement.

Syntax

drop table [if exists] catalog.schema.tableName

where catalog.schema.tableName specifies the table to be dropped. If a default catalog is already used, the catalog can be omitted.

SQL Examples

The following SQL statements can be used when working with catalogs.

Statement Type Description
create DDL Create schemas/tables
alter DDL Add columns to tables
drop DDL Drop schemas/tables
update DML Update tables
delete DML Delete records from tables
select DQL Access tables
Note: When using these SQL statements on catalogs/schemas, a default catalog must be used in the current session or a catalog must be specified in the statement, otherwise an error would be raised.

Use the following script to generate sample data using the above created catalog "trading":

create database stock partitioned by VALUE(1..6), engine='OLAP'
database(directory="dfs://db1", partitionType=RANGE, partitionScheme=0 5 10) 
createSchema("trading", "dfs://db1", "stock2") 

create table stock.quote (
     id INT,
     date DATE[comment="time_col", compress="delta"],
     value DOUBLE,
 )
partitioned by id
 
dbUrl = exec dbUrl from getSchemaByCatalog("trading") where schema = "stock"
id = 1 2 3 4 5 6 1 2 3 4 5 6
date = 2023.01.01..2023.01.12
value = 1..12
data = table( id, date, value)
loadTable(dbUrl[0], "quote").append!(data)
select * from stock.quote

Table trading.stock.quote:

id date value
1 2023.01.01 1
1 2023.01.07 7
2 2023.01.02 2
2 2023.01.08 8
3 2023.01.03 3
3 2023.01.09 9
4 2023.01.04 4
4 2023.01.10 10
5 2023.01.05 5
5 2023.01.11 11

Example 1. Select data from "stock.quote" with filtering conditions:

select * from stock.quote where id = 1
id date value
1 2023.01.01 1
1 2023.01.07 7

Example 2. Update data of "stock.quote":

update stock.quote set value = -1.0 where id = 2
select * from stock.quote where id = 2
id date value
2 2023.01.02 -1
2 2023.01.08 -1

Example 3. Delete data from "stock.quote":

delete from stock.quote where id = 3
select * from stock.quote where id = 3 // returns an empty table

Note: If a script contains a variable with the same name as schema, the query may fail to access the schema.

For example, if a variable stock is defined and shares the same name as the schema, stock in the script will be parsed as the variable name and an error will be raised.

stock=1
select * from stock.quote where id = 1; 
//getMember method not supported

Such variables can be dropped with undef.

undef(`stock)

Management of Catalogs

(1) Checking default catalog

To view the default catalog used in the current session, call getCurrentCatalog.

use catalog trading;
getCurrentCatalog() 
// output: "trading"

use catalog trading2;
getCurrentCatalog() 
// output: "trading2"

(2) Renaming catalogs/schemas

To rename a catalog, call function renameCatalog.

renameCatalog("trading", "trading2") 
getAllCatalogs() 
// output: ["trading2"]

To rename a schema, call function renameSchema.

renameSchema("trading", "stock", "stock1") 
exec schema from getSchemaByCatalog("trading") 
// output: ["stock1"]

(3) Checking schema info

To get info on schemas within a catalog, call function getSchemaByCatalog. A schema created with create database has a unique identifier of dbUrl in the format "dfs://<name_timestamp>" which is unaltered even after renamed.

getSchemaByCatalog("trading")
schema dbUrl
stock dfs://trading_stock_1712077295373

(4) Checking logs on catalog operations

To view info of operations on catalogs and schemas, you can check the ACL Audit log on the controller. The log message is displayed in the following format:

ACL Audit: function createSchema [catalog=trading,dbUrl=dfs://db1,schema=stock2], called by user [xxx]

User Access Control

Catalog-Level Access Control

The following access privileges with prefix "CATALOG_" can be set for databases/tables within a specific catalog.

Privilege Description
CATALOG_MANAGE MANAGE permission for operations including adding schemas, drop and rename catalogs
CATALOG_READ READ permission for all tables within a catalog
CATALOG_WRITE WRITE permission for write, update, and insert operations on all tables within a catalog
CATALOG_INSERT INSERT permission for all tables within a catalog
CATALOG_UPDATE UPDATE permission for all tables within a catalog
CATALOG_DELETE DELETE permission for all tables within a catalog

The parameter objs of function grant/deny/revoke should be the catalog name. For example, grant user Adam CATALOG_READ access on catalog "trading".

grant("Adam", CATALOG_READ, "trading")

Schema-Level Access Control

The following access privileges with prefix "SCHEMA_" can be set for schemas.

Privilege Description
SCHEMA_MANAGE MANAGE permission for schemas
SCHEMAOBJ_CREATE CREATE permission for all add DDL operations on tables within a schema
SCHEMAOBJ_DELETE DELETE permission for all delete DDL operations on tables within a schema
SCHEMA_READ READ permission for all tables within a schema
SCHEMA_WRITE WRITE permission for write, update, and insert operations on all tables within a schema
SCHEMA_INSERT INSERT permission for all tables within a schema
SCHEMA_UPDATE UPDATE permission for all tables within a schema
SCHEMA_DELETE DELETE permission for all tables within a schema

The parameter objs of function grant/deny/revoke should be in the format of catalog.schema. For example, grant user Adam SCHEMA_READ access on schema "trading.stock".

grant("Adam", SCHEMA_READ, "trading.stock")
Note:

When managing permissions across multiple clusters, remember two key requirements:

  • Append @<cluster_identifier> to usernames to specify their cluster membership;
  • Append @<cluster_identifier> to objs to indicate which cluster's resources are being accessed.

For example, to grant user 1 from cluster 2 access to the trading.stock.quote table in cluster 1, you can execute the following script on the MoM node.

grant("user1@cluster2", TABLE_READ, "trading.stock.quote@cluster1")

Appendix: Operation Reference

This section provides a comprehensive summary of SQL statements and functions related to catalog management in DolphinDB, serving as a quick reference for users.

Function/ Statement Description
use Switch the current catalog.
create Create a schema or a table.
alter Add a column to an existing table.
drop Delete a schema or a table.
select Access data in a table.
update Update records in a table.
delete Delete records in a table.
setDefaultCatalog Set the default catalog for the current session.
existsCatalog Check if a specified catalog exists.
createCatalog Create a new catalog.
createSchema Add a specified database as a schema to a specified catalog.
dropDatabase Delete a database and referenced schema.
dropCatalog Delete a specified catalog.
dropSchema Delete a specified schema within a catalog.
getCurrentCatalog View the current catalog for the session.
getAllCatalogs Retrieve all catalogs.
getSchemaByCatalog Retrieve all schemas within a specified catalog.
renameCatalog Rename a catalog.
renameSchema Rename a schema.
getUserAccess Query the permissions assigned to a specific user or the combined effective permissions.
getGroupAccess Query the permissions of a group.