Catalog
DolphinDB 3.0 has introduced the catalog feature exclusively for DFS databases to provide users with a more convenient and standardized database access experience, as well as facilitate integration with third-party software. Catalogs allow users to uniformly manage databases and tables of varying partitioning schemes, and access them using standard SQL syntax.
Concepts
Syntax
The syntax for referencing tables within a catalog is as follows:
<catalog>.<schema>.<table>[@<cluster_identifier>]
where
- catalog is a STRING scalar indicating the catalog name.
- schema is a STRING scalar indicating the schema name.
- table is a STRING scalar indicating the table name.
- cluster_identifier (optional) is a STRING scalar indicating the cluster name (configured by clusterName).
Note: In contrast to tables within a catalog that can be referenced directly
with <catalog>.<schema>.<table>
, tables within a database
must first be loaded with loadTable
before performing any
subsequent operations.
DolphinDB's catalog consists of the following three hierarchies:
-
Catalog: The highest level of organization within the database management system (DBMS). A catalog holds one or more schemas, representing the complete set of schemas that a user can access.
-
Schema: The logical container for table objects. In DolphinDB, a schema corresponds to a database in the Database > Table structure. Note that this concept of schema should be distinguished from the table schema which refers to the structure of the table.
-
Table: The primary component within a schema. Tables organize information into rows and columns.
Note: When using SQL statements for table operations, tables within a catalog can be
referenced by specifying catalog.schema.table
. In contrast, tables
within a database must first be loaded with loadTable
before
performing any subsequent operations.
Operations on Catalogs
Working with Catalogs
The name of a catalog or schema can contain letters, digits and underscores and must begin with a letter. The name is case-insensitive when referenced in statements or functions.
(1) Creating catalogs
To create a catalog, use function createCatalog
. For example,
create a catalog "trading":
createCatalog("trading")
To switch to a specific catalog, execute use
statement and
specify the catalog
keyword. For example, switch to catalog
"trading":
use catalog trading;
Once a catalog is used, subsequent operations would use the catalog by default unless a different catalog is explicitly specified.
(2) Creating schemas
To create a schema, use create database
statement and specify
catalog.schema instead of directory.
Syntax
create database catalog.schema partitioned by partitionType(partitionScheme),[partitionType(partitionScheme),partitionType(partitionScheme)],
[engine='OLAP'], [atomic='TRANS'], [chunkGranularity='TABLE']
For example, create a schema "stock" using VALUE partitions and OLAP storage engine.
create database stock partitioned by VALUE(`IBM`MSFT`GM`C`YHOO`GOOG), engine='OLAP'
Since section (1) already specifies a default catalog "trading",
stock
in the above script is equivalent to
trading.stock
. If no catalog is used, the creation would
fail and an error of "The catalog doesn't exist" will be thrown.
To add an existing database to a catalog, use createSchema
. For
example, add schema "stock2" referencing database "dfs://db1" to the catalog
"trading".
database(directory="dfs://db1", partitionType=RANGE, partitionScheme=0 5 10) //通过 database 函数创建的 db1
createSchema("trading", "dfs://db1", "stock2")
(3) Creating tables
To create a table in a schema, use create table
statement and
specify catalog.schema.tableName instead of dbPath.
Syntax
create table catalog.schema.tableName(
schema[columnDescription]
)
[partitioned by partitionColumns],
[sortColumns],
[keepDuplicates=ALL],
[sortKeyMappingFunction]
For example:
create table stock.quote (
id INT,
date DATE[comment="time_col", compress="delta"],
value DOUBLE,
)
partitioned by id
Since section (1) already specifies a default catalog "trading",
stock.quote
in the above script is equivalent to
trading.stock.quote
.
(4) Dropping catalogs
To drop a catalog, call function dropCatalog
. For example:
dropCatalog("catalog1")
(5) Dropping schemas from catalogs
To drop a schema from a catalog, use drop
statement.
Syntax
drop table [if exists] catalog.schema
where catalog.schema specifies the schema to be dropped. If a default catalog is already used, the catalog can be omitted.
(6) Dropping tables from schemas
To drop a table from a schema, use drop
statement.
Syntax
drop table [if exists] catalog.schema.tableName
where catalog.schema.tableName specifies the table to be dropped. If a default catalog is already used, the catalog can be omitted.
SQL Examples
The following SQL statements can be used when working with catalogs.
Statement | Type | Description |
---|---|---|
create | DDL | Create schemas/tables |
alter | DDL | Add columns to tables |
drop | DDL | Drop schemas/tables |
update | DML | Update tables |
delete | DML | Delete records from tables |
select | DQL | Access tables |
Use the following script to generate sample data using the above created catalog "trading":
create database stock partitioned by VALUE(1..6), engine='OLAP'
database(directory="dfs://db1", partitionType=RANGE, partitionScheme=0 5 10)
createSchema("trading", "dfs://db1", "stock2")
create table stock.quote (
id INT,
date DATE[comment="time_col", compress="delta"],
value DOUBLE,
)
partitioned by id
dbUrl = exec dbUrl from getSchemaByCatalog("trading") where schema = "stock"
id = 1 2 3 4 5 6 1 2 3 4 5 6
date = 2023.01.01..2023.01.12
value = 1..12
data = table( id, date, value)
loadTable(dbUrl[0], "quote").append!(data)
select * from stock.quote
Table trading.stock.quote:
id | date | value |
---|---|---|
1 | 2023.01.01 | 1 |
1 | 2023.01.07 | 7 |
2 | 2023.01.02 | 2 |
2 | 2023.01.08 | 8 |
3 | 2023.01.03 | 3 |
3 | 2023.01.09 | 9 |
4 | 2023.01.04 | 4 |
4 | 2023.01.10 | 10 |
5 | 2023.01.05 | 5 |
5 | 2023.01.11 | 11 |
Example 1. Select data from "stock.quote" with filtering conditions:
select * from stock.quote where id = 1
id | date | value |
---|---|---|
1 | 2023.01.01 | 1 |
1 | 2023.01.07 | 7 |
Example 2. Update data of "stock.quote":
update stock.quote set value = -1.0 where id = 2
select * from stock.quote where id = 2
id | date | value |
---|---|---|
2 | 2023.01.02 | -1 |
2 | 2023.01.08 | -1 |
Example 3. Delete data from "stock.quote":
delete from stock.quote where id = 3
select * from stock.quote where id = 3 // returns an empty table
Note: If a script contains a variable with the same name as schema, the query may fail to access the schema.
For example, if a variable stock is defined and shares the same name as the schema, stock in the script will be parsed as the variable name and an error will be raised.
stock=1
select * from stock.quote where id = 1;
//getMember method not supported
Such variables can be dropped with undef
.
undef(`stock)
Management of Catalogs
(1) Checking default catalog
To view the default catalog used in the current session, call
getCurrentCatalog
.
use catalog trading;
getCurrentCatalog()
// output: "trading"
use catalog trading2;
getCurrentCatalog()
// output: "trading2"
(2) Renaming catalogs/schemas
To rename a catalog, call function renameCatalog
.
renameCatalog("trading", "trading2")
getAllCatalogs()
// output: ["trading2"]
To rename a schema, call function renameSchema
.
renameSchema("trading", "stock", "stock1")
exec schema from getSchemaByCatalog("trading")
// output: ["stock1"]
(3) Checking schema info
To get info on schemas within a catalog, call function
getSchemaByCatalog
. A schema created with create
database
has a unique identifier of dbUrl in the format
"dfs://<name_timestamp>" which is unaltered even after renamed.
getSchemaByCatalog("trading")
schema | dbUrl |
---|---|
stock | dfs://trading_stock_1712077295373 |
(4) Checking logs on catalog operations
To view info of operations on catalogs and schemas, you can check the ACL Audit log on the controller. The log message is displayed in the following format:
ACL Audit: function createSchema [catalog=trading,dbUrl=dfs://db1,schema=stock2], called by user [xxx]
User Access Control
Catalog-Level Access Control
The following access privileges with prefix "CATALOG_" can be set for databases/tables within a specific catalog.
Privilege | Description |
---|---|
CATALOG_MANAGE | MANAGE permission for operations including adding schemas, drop and rename catalogs |
CATALOG_READ | READ permission for all tables within a catalog |
CATALOG_WRITE | WRITE permission for write, update, and insert operations on all tables within a catalog |
CATALOG_INSERT | INSERT permission for all tables within a catalog |
CATALOG_UPDATE | UPDATE permission for all tables within a catalog |
CATALOG_DELETE | DELETE permission for all tables within a catalog |
The parameter objs of function
grant
/deny
/revoke
should
be the catalog name. For example, grant user Adam CATALOG_READ access on catalog
"trading".
grant("Adam", CATALOG_READ, "trading")
Schema-Level Access Control
The following access privileges with prefix "SCHEMA_" can be set for schemas.
Privilege | Description |
---|---|
SCHEMA_MANAGE | MANAGE permission for schemas |
SCHEMAOBJ_CREATE | CREATE permission for all add DDL operations on tables within a schema |
SCHEMAOBJ_DELETE | DELETE permission for all delete DDL operations on tables within a schema |
SCHEMA_READ | READ permission for all tables within a schema |
SCHEMA_WRITE | WRITE permission for write, update, and insert operations on all tables within a schema |
SCHEMA_INSERT | INSERT permission for all tables within a schema |
SCHEMA_UPDATE | UPDATE permission for all tables within a schema |
SCHEMA_DELETE | DELETE permission for all tables within a schema |
The parameter objs of function
grant
/deny
/revoke
should
be in the format of catalog.schema
. For example, grant user
Adam SCHEMA_READ access on schema "trading.stock".
grant("Adam", SCHEMA_READ, "trading.stock")
When managing permissions across multiple clusters, remember two key requirements:
- Append
@<cluster_identifier>
to usernames to specify their cluster membership; - Append
@<cluster_identifier>
to objs to indicate which cluster's resources are being accessed.
For example, to grant user 1 from cluster 2 access to the trading.stock.quote table in cluster 1, you can execute the following script on the MoM node.
grant("user1@cluster2", TABLE_READ, "trading.stock.quote@cluster1")
Appendix: Operation Reference
This section provides a comprehensive summary of SQL statements and functions related to catalog management in DolphinDB, serving as a quick reference for users.
Function/ Statement | Description |
---|---|
use |
Switch the current catalog. |
create |
Create a schema or a table. |
alter |
Add a column to an existing table. |
drop |
Delete a schema or a table. |
select |
Access data in a table. |
update |
Update records in a table. |
delete |
Delete records in a table. |
setDefaultCatalog |
Set the default catalog for the current session. |
existsCatalog |
Check if a specified catalog exists. |
createCatalog |
Create a new catalog. |
createSchema |
Add a specified database as a schema to a specified catalog. |
dropDatabase |
Delete a database and referenced schema. |
dropCatalog |
Delete a specified catalog. |
dropSchema |
Delete a specified schema within a catalog. |
getCurrentCatalog |
View the current catalog for the session. |
getAllCatalogs |
Retrieve all catalogs. |
getSchemaByCatalog |
Retrieve all schemas within a specified catalog. |
renameCatalog |
Rename a catalog. |
renameSchema |
Rename a schema. |
getUserAccess |
Query the permissions assigned to a specific user or the combined effective permissions. |
getGroupAccess |
Query the permissions of a group. |