# enableTablePersistence {#enabletablepersistence}

## Syntax {#syntax}

`enableTablePersistence(table, [asynWrite=true], [compress=true], [cacheSize], [retentionMinutes=1440], [flushMode=0], [cachePurgeTimeColumn], [cachePurgeInterval], [cacheRetentionTime], [preCache])`

## Arguments {#arguments}

**table** is an empty shared stream table.

**asynWrite** \(optional\) is a Boolean value indicating whether persistence is enabled in asynchronous mode. The default value is true, meaning asynchronous persistence is enabled. In this case, once data is written into memory, the write is deemed complete. The data stored in memory is then persisted to disk by another thread.

**compress** \(optional\) is a Boolean value indicating whether to save a table to disk in compression mode. The default value is true.

**cacheSize** \(optional\) is an integer used to determine the maximum number of records to retain in memory. If set to 0 or not specified, all records will be retained. Any positive integer smaller than 1000 will automatically be adjusted to 1000.

**retentionMinutes** \(optional\) is an integer indicating for how long \(in minutes\) a log file larger than 1GB will be kept after last update. The default value is 1440, which means the log file is kept for 1440 minutes, i.e., 1 day.

**flushMode** \(optional\) is an integer indicating whether to enable synchronous disk flush. It can be 0 or 1. The persistence process first writes data from memory to the page cache, then flushes the cached data to disk. If *flushMode* is 0 \(default\), asynchronous disk flushing is enabled. In this case, once data is written from memory to the page cache, the flush is deemed complete and the next batch of data can be written to the table. If *flushMode* is set to 1, the current batch of data must be flushed to disk before the next batch can be written.

**cachePurgeTimeColumn** \(optional\) is a STRING scalar indicating the time column in the stream table.

**cachePurgeInterval** \(optional\) is a DURATION scalar indicating the interval to trigger cache purge.

**cacheRetentionTime** \(optional\) is a DURATION scalar indicating the retention time of cached data.

**preCache** \(optional\) is an integer indicating the number of records to load into memory from the persisted stream table at server startup. If it is not specified, all records are loaded into memory.

Note: Since version 3.00.2/2.00.14, *cacheRetentionTime* must be smaller than *cachePurgeInterval*.

## Details {#details}

This command enables a shared stream table to be persisted to disk.

For this command to work, the configuration parameter *persistenceDir* must be specified in the configuration file \(*dolphindb.cfg* in standalone mode and *cluster.cfg* in cluster mode\). For details of this configuration parameter, see [Reference](../../Database/Configuration/reference.md). The persistence location of the table is *&lt;PERSISTENCE\_DIR&gt;/&lt;TABLE\_NAME&gt;*. The directory contains 2 types of files: data files \(named like *data0.log*, *data1.log*...\) and an index file *index.log*. The data that has been persisted to disk will be loaded into memory after the system is restarted.

The parameter *asynWrite* informs the system whether table persistence is in asynchronous mode. With asynchronous mode, new data are pushed to a queue and persistence workers \(threads\) will write the data to disk later. With synchronous mode, the table append operation keeps running until new data are persisted to the disk. The default value is true \(asynchronous mode\). In general, asynchronous mode achieves higher throughput.

With asynchronous mode, table persistence is conducted by a single persistence worker \(thread\), and the persistence worker may handle multiple tables. If there is only one table to be persisted, an increase in the number of persistence workers doesn't improve performance.
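The difference between the two modes can be sketched with a small Python model. This is only an illustration under stated assumptions, not DolphinDB's implementation: the class `AsyncPersister`, its methods, and the in-memory `disk` list that stands in for the log files are all hypothetical names introduced here.

```python
import queue
import threading


class AsyncPersister:
    """Toy model of asynchronous persistence (hypothetical, for illustration).

    append() only enqueues the batch and returns; a single worker thread
    later moves queued batches to `disk`, a list standing in for log files.
    """

    def __init__(self):
        self.pending = queue.Queue()   # batches written to memory, not yet persisted
        self.disk = []                 # stand-in for data persisted to disk
        self.worker = threading.Thread(target=self._drain, daemon=True)
        self.worker.start()

    def append(self, batch):
        # The write is considered complete as soon as the batch is queued.
        self.pending.put(batch)

    def _drain(self):
        # Single persistence worker: handles batches in arrival order.
        while True:
            batch = self.pending.get()
            if batch is None:          # sentinel: stop the worker
                break
            self.disk.append(batch)

    def close(self):
        # Ask the worker to stop after it has handled everything queued so far.
        self.pending.put(None)
        self.worker.join()


persister = AsyncPersister()
persister.append([1, 2])
persister.append([3])
persister.close()
```

In this model, any batch still sitting in `pending` when the process dies is lost, which mirrors why asynchronous mode trades durability for throughput. A synchronous variant would append to `disk` inside `append()` before returning.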

Stream tables keep all data in memory by default. To prevent excessive memory usage, you can clear cached data using either of the following methods:

-   **Cache purge by size**: Set *cacheSize* to specify a threshold for the number of records retained. Older records exceeding the threshold will be removed. The threshold is determined as follows:
    -   If the number of records appended in one batch does not exceed *cacheSize*, the threshold is 2.5 \* *cacheSize*.
    -   If the number of records appended in one batch exceeds *cacheSize*, the threshold is 1.2 \* \(appended records + *cacheSize*\).
-   **Cache purge by time**: Set *cachePurgeTimeColumn*, *cachePurgeInterval*, and *cacheRetentionTime*. The system cleans up data based on the *cachePurgeTimeColumn*. Each time a new record arrives, the system computes the time difference between the new record and the oldest record kept in memory. If the difference exceeds *cachePurgeInterval*, the system retains only the data with timestamps within *cacheRetentionTime* of the new record.
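The size-based threshold rules above can be worked through with a short Python sketch. The function names `effective_cache_size` and `purge_threshold` are hypothetical helpers introduced only for this illustration; they restate the documented arithmetic and are not part of DolphinDB.

```python
def effective_cache_size(cache_size):
    """Apply the documented adjustment: positive values below 1000 become 1000.

    Returns None when cache_size is 0 or None, meaning all records are retained.
    """
    if not cache_size:
        return None
    return max(cache_size, 1000)


def purge_threshold(cache_size, batch_size):
    """Record-count threshold above which older records are removed.

    Restates the two documented cases; returns None when purging by size
    is disabled (cache_size is 0 or not specified).
    """
    size = effective_cache_size(cache_size)
    if size is None:
        return None
    if batch_size <= size:
        return 2.5 * size
    return 1.2 * (batch_size + size)


small_batch = purge_threshold(cache_size=1000, batch_size=500)    # 2.5 * 1000
large_batch = purge_threshold(cache_size=1000, batch_size=4000)   # 1.2 * (4000 + 1000)
```

With *cacheSize* set to 1000, a batch of 500 records gives a threshold of 2500 records, while a batch of 4000 records gives a threshold of about 6000 records.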

**Note:**

-   It is recommended to invoke command [fflush](../f/fflush.md) to write data in the page cache to disk before you terminate a DolphinDB process \(with `kill -15`\) and restart it.

-   If asynchronous mode is enabled for data persistence or flush, data loss may occur due to server crash.


## Examples {#examples}

Example 1:

```
colName=["time","x"]
colType=["timestamp","int"]
t = streamTable(100:0, colName, colType);
share t as st

enableTablePersistence(table=st, cacheSize=1200000)
```

```
for(s in 0:200){
    n=10000
    time=2019.01.01T00:00:00.000+s*n+1..n
    x=rand(10.0, n)
    insert into st values(time, x)
}
```

```
getPersistenceMeta(st);

/* output
persistenceDir->/data/ssd/DolphinDBDemo/persistence3/st
retentionMinutes->1440
hashValue->0
asynWrite->true
diskOffset->0
sizeInMemory->800000
compress->1
memoryOffset->1200000
totalSize->2000000
sizeOnDisk->2000000
*/
```

Note that in this example, the stream table is shared before persistence is enabled with the command `enableTablePersistence`. These 2 operations can be combined in a single call with the command [enableTableShareAndPersistence](enableTableShareAndPersistence.md).

Example 2: Illustrate how to use *cachePurgeTimeColumn*, *cachePurgeInterval*, and *cacheRetentionTime*.

``` {#codeblock_st5_ndt_xbc}
colName=["time","x"]
colType=["timestamp","int"]
t1 = streamTable(100:0, colName, colType);
share t1 as st1

enableTablePersistence(table=st1,cachePurgeTimeColumn = `time, cachePurgeInterval = duration("7H"),cacheRetentionTime = duration("2H"))
go;

time=2019.01.01T00:00:00.000
for(s in 0:6000){
  time = temporalAdd(time,1,"m");
  x=rand(10.0, 1)
  insert into st1 values(time, x)
}

getPersistenceMeta(st1);

// Check the stream table metadata. The table contains 6000 records in total, and only 300 records are retained in memory after purging the cache.

/* output:
lastLogSeqNum->-1
sizeInMemory->300
totalSize->6000
asynWrite->true
compress->true
raftGroup->-1
memoryOffset->5700
retentionMinutes->1440
sizeOnDisk->5973
persistenceDir->/home/ffliu/jjxu/DolphinDB_Linux64_V3.0/server/persistence/st1
hashValue->0
diskOffset->0
*/
```
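To see where the 300 retained records in Example 2 come from, the time-based purge can be modeled in a few lines of Python. This is a simplified sketch, not DolphinDB's implementation: the function `simulate_time_purge` is a hypothetical name, timestamps are represented as whole minutes, and the inclusive comparisons \(`>=`\) are an assumption chosen because they reproduce the counts in the output above.

```python
def simulate_time_purge(n_records, purge_interval, retention):
    """Model cache purge by time with one record arriving per minute.

    Timestamps are minutes 1..n_records. After each arrival, if the gap
    between the newest and the oldest in-memory record reaches
    purge_interval, only records within `retention` minutes of the newest
    record are kept. Boundary handling (>=) is an assumption.
    """
    memory = []  # timestamps currently held in memory, oldest first
    for t in range(1, n_records + 1):
        memory.append(t)
        if t - memory[0] >= purge_interval:
            memory = [ts for ts in memory if ts >= t - retention]
    return memory


# Example 2 settings: 6000 one-minute records, 7H purge interval, 2H retention.
kept = simulate_time_purge(n_records=6000, purge_interval=7 * 60, retention=2 * 60)
size_in_memory = len(kept)      # 300
memory_offset = kept[0] - 1     # 5700 records purged before the first kept one
```

Under these assumptions the last purge happens when the 5821st record arrives, leaving records 5701 through 6000 in memory, which matches `sizeInMemory->300` and `memoryOffset->5700`.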

Related commands: [disableTablePersistence](../d/disableTablePersistence.md), [clearTablePersistence](../c/clearTablePersistence.md)

