Empyrical Module for Risk and Performance Metrics

The Empyrical library is an open-source Python library developed by Quantopian, specifically designed for calculating commonly used financial risk and performance attribution metrics. It includes a variety of tools for strategy backtesting analysis and can be used to compute metrics such as annualized return, maximum drawdown, alpha, beta, Calmar ratio, Omega ratio, and Sharpe ratio. To facilitate the calculation of these metrics in DolphinDB, we have implemented the metric functions from the Empyrical library using DolphinDB scripts and encapsulated them in the DolphinDB Empyrical module.

1. Naming and Parameter Conventions

All function names in the Empyrical module follow the convention of using the specific function's purpose as its name, such as simpleReturns, cumReturns, cumMaxDrawdown, calmarRatio, sharpeRatio, rollSharpeRatio, alpha, etc.

All fields involved in this tutorial are as follows:


Parameter / Field	Description
prices	Closing prices
returns	Daily simple (non-cumulative) returns
factorReturns	Benchmark returns / factor returns for comparison
factorLoadings	Factor loadings

Note:

In the perfAttrib function, factorReturns refers to the factor returns of different factors and is input as a multi-column table.
The returns parameter also accepts a benchmark return vector as the input.
Since the beta function conflicts with a built-in function in DolphinDB, the corresponding function in Empyrical is named covarBeta (as Empyrical calculates beta using covariance / variance, whereas DolphinDB’s beta refers to the least squares estimate of Y regressed on X).
In DolphinDB, the maxDrawdown function calculates the maximum difference from a peak. In the Empyrical library, the function is defined as the cumulative maximum relative drawdown. Due to the naming conflict, the function that calculates maximum drawdown in Empyrical is named cumMaxDrawdown.
The function valueAtRisk has the same meaning in both DolphinDB and Empyrical. In the Empyrical library, we use the historical simulation method to compute Value at Risk. Due to the naming conflict, the corresponding function in Empyrical is named hisValueAtRisk.
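
The note on maxDrawdown contrasts an absolute difference from a peak with a relative drawdown of the cumulative return curve. The following is a minimal Python sketch of the two quantities, assuming the relative version follows the usual definition (the worst fractional decline of cumulative wealth from its running peak). Both function names are hypothetical and neither is taken from the module's source.

```python
def cum_max_relative_drawdown(returns):
    """Worst decline of the cumulative wealth curve, as a fraction of its running peak."""
    value = 1.0   # cumulative wealth, starting from 1
    peak = 1.0    # running maximum of the wealth curve
    worst = 0.0   # most negative relative drawdown seen so far
    for r in returns:
        value *= 1.0 + r
        peak = max(peak, value)
        worst = min(worst, (value - peak) / peak)
    return worst


def max_peak_difference(values):
    """Largest absolute drop of a value series below its running peak."""
    peak = values[0]
    worst = 0.0
    for v in values:
        peak = max(peak, v)
        worst = max(worst, peak - v)
    return worst


returns = [0.10, -0.20, 0.05, -0.10]
print(cum_max_relative_drawdown(returns))                    # relative: a fraction of the peak
print(max_peak_difference([100.0, 110.0, 88.0, 92.4, 83.16]))  # absolute: in price units
```

The relative version takes returns as input and is unit-free, while the absolute version depends on the scale of the value series, which is why the two results are not interchangeable.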

2. Usage Examples

This chapter introduces the usage of the Empyrical module, including environment setup, data preparation, and function calls.

2.1 Environment Setup

Place the attached Empyrical.dos file in the [home]/modules directory. The [home] directory is set by the system configuration parameter home, which can be viewed using the getHomeDir() function.

For details on module usage, see Tutorial > Modules.

2.2 Direct Function Call

Call the sharpeRatio function on a vector:

use Empyrical
ret = 0.072 0.0697 0.08 0.74 1.49 0.9 0.26 0.9 0.35 0.63
x = sharpeRatio(ret);
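
For readers who want to see what the call computes, here is a minimal Python sketch of an annualized Sharpe ratio. It assumes the convention of the Python Empyrical library (mean excess return divided by the sample standard deviation, scaled by the square root of 252 for daily data). The function name and the handling of a constant series are assumptions made for illustration.

```python
import math
import statistics


def sharpe_ratio(returns, risk_free=0.0, annualization=252):
    """Annualized Sharpe ratio of a sequence of periodic (e.g., daily) returns."""
    excess = [r - risk_free for r in returns]
    if len(excess) < 2:
        return math.nan                    # a sample std needs at least two points
    spread = statistics.stdev(excess)      # sample standard deviation (ddof=1)
    if spread == 0:
        return math.nan                    # treated as undefined here (an assumption)
    return statistics.mean(excess) / spread * math.sqrt(annualization)


ret = [0.072, 0.0697, 0.08, 0.74, 1.49, 0.9, 0.26, 0.9, 0.35, 0.63]
x = sharpe_ratio(ret)
print(x)
```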

2.3 Grouped Calculation in SQL Statements

To perform grouped calculations, call the function in a SQL statement.

For example, the table t contains data for 2 stocks:

close = 7.2 6.97 7.08 6.74 6.49 5.9 6.26 5.9 5.35 5.63 3.81 3.935 4.04 3.74 3.7 3.33 3.64 3.31 2.69 2.72
date = (2023.03.02 + 0..4 join 7..11).take(20)
symbol = take(`F,10) join take(`GPRO,10)
t = table(symbol, date, close)

Calculate the return for each stock:

update t set ret = simpleReturns(close) context by symbol
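
The grouped calculation can be pictured with a short Python sketch: prices are split by symbol and each group's simple return is the current close divided by the previous close, minus one. Placing an empty value in the first position of each group is an assumption made here so that the result has the same length as the input column. The helper names are hypothetical.

```python
from collections import defaultdict


def simple_returns(prices):
    """Period-over-period simple returns; the first element has no predecessor."""
    if not prices:
        return []
    out = [None]                               # assumed placeholder for the first row
    for prev, cur in zip(prices, prices[1:]):
        out.append(cur / prev - 1.0)
    return out


close = [7.2, 6.97, 7.08, 6.74, 6.49, 5.9, 6.26, 5.9, 5.35, 5.63,
         3.81, 3.935, 4.04, 3.74, 3.7, 3.33, 3.64, 3.31, 2.69, 2.72]
symbol = ["F"] * 10 + ["GPRO"] * 10

# Group prices by symbol, preserving row order within each group.
groups = defaultdict(list)
for s, p in zip(symbol, close):
    groups[s].append(p)

ret_by_symbol = {s: simple_returns(p) for s, p in groups.items()}
print(ret_by_symbol["F"][:3])
```

Because each group is processed separately, the first GPRO return is not computed from the last F price.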

2.4 Functions With Multiple Returns

Functions such as alphaBeta return multiple columns of results. For example:

use Empyrical

ret = 0.072 0.0697 0.08 0.74 1.49 0.9 0.26 0.9 0.35 0.63
factorret = 0.702 0.97 0.708 1.74 0.49 0.09 1.26 0.59 1.35 0.063
alpha, beta = alphaBeta(ret, factorret);

Use in a SQL statement:

use Empyrical

ret = 0.072 0.0697 0.08 0.74 1.49 0.9 0.26 0.9 0.35 0.63 0.702 0.97 0.708 1.74 0.49 0.09 1.26 0.59 1.35 0.063 
factorret = 0.702 0.97 0.708 1.74 0.49 0.09 1.26 0.59 1.35 0.063 0.072 0.0697 0.08 0.74 1.49 0.9 0.26 0.9 0.35 0.63
symbol = take(`F,10) join take(`GPRO,10)
date = (2022.03.02 + 0..4 join 7..11).take(20)
t = table(symbol as sym, date as dt, ret, factorret) 
select sym,alphaBeta(ret, factorret) as `alpha`beta from t context by sym

/*
sym  alpha                beta              
---- -------------------- ------------------
F    2.936093190579921E62 -0.276883617899559
F    2.936093190579921E62 -0.276883617899559
...   
*/
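
The very large alpha in the output follows from annualizing a large daily value. The sketch below is a plain Python illustration, assuming the Python Empyrical convention: beta is covariance over variance, the per-period alpha is the mean of returns minus beta times factor returns, and annualization compounds it over 252 periods. The function name is hypothetical. Under these assumptions the sketch should give roughly 2.94E62 and -0.277 for the F group.

```python
import statistics


def alpha_beta(returns, factor_returns, risk_free=0.0, annualization=252):
    """Annualized alpha and covariance/variance beta (illustrative sketch)."""
    adj = [r - risk_free for r in returns]
    adj_f = [f - risk_free for f in factor_returns]
    mean_r = statistics.mean(adj)
    mean_f = statistics.mean(adj_f)
    # The normalization of covariance and variance cancels in the ratio.
    cov = sum((r - mean_r) * (f - mean_f) for r, f in zip(adj, adj_f))
    var = sum((f - mean_f) ** 2 for f in adj_f)
    beta = cov / var
    per_period_alpha = statistics.mean(r - beta * f for r, f in zip(adj, adj_f))
    alpha = (1.0 + per_period_alpha) ** annualization - 1.0  # compound to one year
    return alpha, beta


ret = [0.072, 0.0697, 0.08, 0.74, 1.49, 0.9, 0.26, 0.9, 0.35, 0.63]
factorret = [0.702, 0.97, 0.708, 1.74, 0.49, 0.09, 1.26, 0.59, 1.35, 0.063]
alpha, beta = alpha_beta(ret, factorret)
print(alpha, beta)
```

A per-period alpha near 0.77 compounded over 252 periods explains the magnitude: the sample values are far larger than realistic daily returns.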

3. Function Performance Evaluation

This section first uses the rollSharpeRatio function as an example to compare the performance of a direct function call in DolphinDB and in Python. It then compares the performance of all functions in grouped SQL calculations using real daily stock return data.

3.1 Direct Function Call

use Empyrical

ret = 0.072 0.0697 0.08 0.74 1.49 0.9 0.26 0.9 0.35 0.63
ret = take(ret, 10000000)
timer x = rollSharpeRatio(ret, windows=10)

Applying the rollSharpeRatio function from the Empyrical module directly on a vector of length 10,000,000 took 300.135 ms.

The corresponding Python code is as follows:

import numpy as np
import empyrical as em
import time
ret = np.array([0.072, 0.0697, 0.08, 0.74, 1.49, 0.9, 0.26, 0.9, 0.35, 0.63])
ret = np.tile(ret, 1000000)  # 10 values x 1,000,000 repeats = 10,000,000 elements, matching the DolphinDB test
start_time = time.time()
x = em.roll_sharpe_ratio(ret, 10)
print("--- %s seconds ---" % (time.time() - start_time))

The roll_sharpe_ratio function from the Python Empyrical library took 25,965.46 ms, about 86 times as long as the rollSharpeRatio function in the DolphinDB Empyrical module.
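
To make the benchmarked computation concrete, here is a minimal Python sketch of a rolling Sharpe ratio. It is a straightforward, unoptimized loop with a hypothetical function name, assuming the same Sharpe convention as the Python library (sample standard deviation, 252 periods per year) and a window of at least two observations. It is not the implementation timed above.

```python
import math
import statistics


def roll_sharpe_ratio(returns, window=10, risk_free=0.0, annualization=252):
    """Sharpe ratio of each complete trailing window (len(returns) - window + 1 values)."""
    out = []
    for end in range(window, len(returns) + 1):
        excess = [r - risk_free for r in returns[end - window:end]]
        spread = statistics.stdev(excess)  # sample standard deviation of this window
        if spread:
            out.append(statistics.mean(excess) / spread * math.sqrt(annualization))
        else:
            out.append(math.nan)           # constant window: ratio left undefined here
    return out


base = [0.072, 0.0697, 0.08, 0.74, 1.49, 0.9, 0.26, 0.9, 0.35, 0.63]
ret = base * 3                             # same repeating pattern as the benchmark, much shorter
rolled = roll_sharpe_ratio(ret, window=10)
print(len(rolled), rolled[0])
```

Because the test vector repeats a 10-value pattern and the window length is also 10, every window contains the same ten values, so all rolling results coincide. The benchmark therefore measures throughput rather than numerical variety.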

3.2 Grouped Calculation in SQL Statements

The performance test data consists of daily return data for all stocks in the market from January 3, 2019, to July 1, 2022, along with simulated factor data and portfolio weights (for testing the computeExposure and perfAttrib functions), totaling 3,452,106 records. Sample data from the test files are provided in the appendix. After extracting the files, place them under the [home] directory.

The computation logic involves grouping by stock code to calculate each metric. To evaluate the function performance, both the DolphinDB and Python test codes are executed in single-threaded mode.

The test results are shown in the following table:


No.	Function	DolphinDB Time	Python Time	Python/DolphinDB
1	simpleReturns	190.7 ms	5,871.1 ms	30.78
2	aggregateReturns	238.5 ms	11,167.8 ms	46.83
3	annualReturns	66.8 ms	889.1 ms	13.30
4	cumReturns	84.7 ms	2,758.7 ms	32.56
5	cumReturnsFinal	82.4 ms	835.4 ms	10.13
6	alpha	131.2 ms	6,910.7 ms	52.67
7	rollAlpha	548.1 ms	5,143.4 ms	9.38
8	covarBeta	84.3 ms	5,422.9 ms	64.32
9	rollCovarBeta	195.9 ms	4,306.8 ms	21.98
10	alphaBeta	112.5 ms	6,929.8 ms	61.59
11	rollAlphaBeta	547.7 ms	8,676.7 ms	15.84
12	upAlphaBeta	156.1 ms	5,653.1 ms	36.21
13	downAlphaBeta	148.6 ms	5,722.9 ms	38.52
14	betaFragilityHeuristic	262.7 ms	12,906.5 ms	49.14
15	annualVolatility	64.5 ms	776.5 ms	12.03
16	rollAnnualVolatility	136.9 ms	2,869.9 ms	20.96
17	downsideRisk	137.3 ms	679.0 ms	4.94
18	hisValueAtRisk	152.9 ms	710.3 ms	4.64
19	conditionalValueAtRisk	155.8 ms	306.2 ms	1.97
20	gpdRiskEstimates	7.3 s	26.4 s	3.61
21	tailRatio	235.4 ms	1,224.2 ms	5.20
22	capture	103.7 ms	2,343.0 ms	22.59
23	upCapture	165.7 ms	4,284.7 ms	25.86
24	downCapture	165.6 ms	4,254.3 ms	25.69
25	upDownCapture	232.8 ms	7,380.0 ms	31.70
26	rollUpCapture	30.1 s	2,204.8 s	73.33
27	rollDownCapture	30.4 s	2,195.3 s	72.32
28	rollUpDownCapture	53.7 s	4,146.4 s	77.20
29	omegaRatio	111.6 ms	3,946.0 ms	35.37
30	cumMaxDrawdown	99.4 ms	500.3 ms	5.03
31	rollCumMaxDrawdown	1.4 s	3.1 s	2.11
32	calmarRatio	118.3 ms	1,338.4 ms	11.32
33	sharpeRatio	100.6 ms	974.7 ms	9.68
34	rollSharpeRatio	160.7 ms	3,277.3 ms	20.39
35	excessSharpe	103.6 ms	2,828.9 ms	27.31
36	sortinoRatio	180.5 ms	988.8 ms	5.47
37	rollSortinoRatio	287.0 ms	3,160.3 ms	11.01
38	stabilityOfTimeseries	365.5 ms	1,707.9 ms	4.67
39	computeExposure	183.6 ms	256.5 ms	1.39
40	perfAttrib	251.9 ms	3,484.8 ms	13.83

The test results show that the DolphinDB Empyrical module outperforms the Python Empyrical library for every function tested. The Python/DolphinDB time ratio ranges from 1.39 (computeExposure) to 77.20 (rollUpDownCapture), with an arithmetic mean of about 25 across the 40 functions.

Python pandas test code:

import empyrical as em
# returns: pandas DataFrame with SecurityID and ret columns
grouped = returns.groupby('SecurityID')
cum_returns = grouped['ret'].apply(lambda x: em.cum_returns(x))

DolphinDB test code:

cumReturns = 
    select SecurityID, DateTime, cumReturns(ret) as cumReturns
    from returns
    context by SecurityID
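
Both test snippets compute the same quantity per stock. As a rough illustration, the Python sketch below assumes the Python Empyrical definition of cumulative returns: the running product of (1 + return), minus 1 when the starting value is 0, or scaled by the starting value otherwise. It applies that definition to each group. The names and sample rows are made up for the example.

```python
from collections import defaultdict


def cum_returns(returns, starting_value=0.0):
    """Cumulative simple returns from a sequence of periodic returns."""
    out = []
    growth = 1.0
    for r in returns:
        growth *= 1.0 + r
        out.append(growth - 1.0 if starting_value == 0 else growth * starting_value)
    return out


# Hypothetical rows of (SecurityID, ret), already ordered by date within each stock.
rows = [("A", 0.10), ("B", 0.05), ("A", -0.10), ("B", 0.05), ("A", 0.20)]

by_security = defaultdict(list)
for security_id, ret in rows:
    by_security[security_id].append(ret)

cum_by_security = {sid: cum_returns(rets) for sid, rets in by_security.items()}
print(cum_by_security)
```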

4. Consistency Verification

Based on the test data and code used in the performance comparison for grouped calculation, we also verify whether the calculation results of functions in the DolphinDB Empyrical module are consistent with those in the Python Empyrical library.

4.1 Handling of Null Values

If the input vector in Python Empyrical contains null values, the nulls are passed into the function and included in the calculation. The DolphinDB Empyrical module adopts the same strategy. The results produced by DolphinDB Empyrical are closely aligned with those of the Python Empyrical library, with only minor differences due to floating-point precision.

Note: For rolling functions with a window of size k, DolphinDB returns a result of the same length as the input and keeps null values at the first (k-1) positions for strict window alignment. The Python Empyrical library returns only the values for complete windows, so its output is (k-1) elements shorter, as the following example shows.

DolphinDB Example

ret = NULL NULL 0.08 0.74 1.49 0.9 0.26 0.9 0.35 0.63 0.702 0.97 0.708 1.74 0.49 0.09 1.26 0.59 1.35 0.063
rollAnnualVolatility(ret, windows=10)

/*
[,,,,,,,,,7.107626186006128,6.650903096572676,6.445987651244765,5.4412961691126505,
7.3080574710383885,6.603612950499143,7.349681897878303,7.448075187590415,
7.475411961892134,7.649430305584855,8.584637604465316]
*/

Python Example

import numpy as np
import empyrical as em
ret = np.array([np.nan, np.nan, 0.08, 0.74, 1.49, 0.9, 0.26,
0.9, 0.35, 0.63, 0.702, 0.97, 0.708, 1.74, 0.49, 0.09, 1.26, 0.59, 1.35, 0.063])
print(em.roll_annual_volatility(ret,10))

/*
[7.10762619 6.6509031  6.44598765 5.44129617 7.30805747 6.60361295
 7.3496819  7.44807519 7.47541196 7.64943031 8.5846376 ]
 */
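
The two outputs above differ only in length and alignment. The sketch below is a plain Python illustration with hypothetical names. It assumes that missing values are ignored within each window and that volatility is the sample standard deviation scaled by the square root of 252. It shows how left-padding the window results with (k-1) missing values yields the aligned layout.

```python
import math


def nan_sample_std(values):
    """Sample standard deviation (ddof=1) that ignores NaN entries."""
    clean = [v for v in values if not math.isnan(v)]
    if len(clean) < 2:
        return math.nan
    mean = sum(clean) / len(clean)
    return math.sqrt(sum((v - mean) ** 2 for v in clean) / (len(clean) - 1))


def roll_annual_volatility(returns, window=10, annualization=252, pad=False):
    """Annualized volatility per complete window; pad=True aligns output with the input."""
    out = [nan_sample_std(returns[end - window:end]) * math.sqrt(annualization)
           for end in range(window, len(returns) + 1)]
    if pad:
        out = [math.nan] * (window - 1) + out  # leading gaps for strict alignment
    return out


nan = math.nan
ret = [nan, nan, 0.08, 0.74, 1.49, 0.9, 0.26, 0.9, 0.35, 0.63,
       0.702, 0.97, 0.708, 1.74, 0.49, 0.09, 1.26, 0.59, 1.35, 0.063]

unpadded = roll_annual_volatility(ret, window=10)           # 11 values
aligned = roll_annual_volatility(ret, window=10, pad=True)  # 20 values, 9 leading NaN
print(len(unpadded), len(aligned))
```

With the padded form, the value at position i always describes the window ending at row i, which is convenient when the result is stored as a column next to the input.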

4.2 Optimized Output Format

The following functions return results in a different output format from Python: alphaBeta, upAlphaBeta, downAlphaBeta, gpdRiskEstimates.

In Python Empyrical, the results of alphaBeta, upAlphaBeta, and downAlphaBeta combine alpha and beta into a two-dimensional array, requiring further post-processing.

In DolphinDB, alpha and beta can be directly separated into two columns using SQL query statements.

select SecurityID, alphaBeta(ret, factorRet) as `alpha`beta
from return1
group by SecurityID

For the gpdRiskEstimates function, Python outputs all the estimates in a single column, whereas DolphinDB displays them in separate columns.

5. Real-Time Stream Processing

Most functions in the Empyrical module can be used in the reactive state engine as metrics for real-time incremental calculation. For example:

// clean environment
def cleanEnvironment(){
    try{ unsubscribeTable(tableName="inputTable",actionName="calculateEmpyrical") } 
    catch(ex){ print(ex) }
    try{ dropStreamEngine("EmpyricalReactiveSateEngine") } catch(ex){ print(ex) }
    try{ dropStreamTable(`inputTable) } catch(ex){ print(ex) }
    try{ dropStreamTable(`outputTable) } catch(ex){ print(ex) }
    undef all
}
cleanEnvironment()
go

// load module
use Empyrical

// load data
schema = table(`DateTime`SecurityID`ret`factor2`factor1`factor3`position as name, 
`DATE`SYMBOL`DOUBLE`DOUBLE`DOUBLE`DOUBLE`DOUBLE as type)
data=loadText(<YOUR_DIR>+"factors_data.csv" ,schema=schema)
// define stream table
share streamTable(1:0, `DateTime`SecurityID`ret`factor2`factor1`factor3`position, 
`DATE`SYMBOL`DOUBLE`DOUBLE`DOUBLE`DOUBLE`DOUBLE) as inputTable
share streamTable(1:0,`SecurityID`DateTime`sortinoRatio`annualVolatility`sharpeRatio,
`SYMBOL`DATE`DOUBLE`DOUBLE`DOUBLE) as outputTable

// register stream computing engine
reactiveStateMetrics=<[
    DateTime,
    Empyrical::rollSortinoRatio(ret) as `sortinoRatio, 
    Empyrical::rollAnnualVolatility(ret) as `annualVolatility, 
    Empyrical::rollSharpeRatio(ret) as `sharpeRatio
]>
createReactiveStateEngine(name="EmpyricalReactiveSateEngine", 
metrics=reactiveStateMetrics, dummyTable=inputTable, outputTable=outputTable,
keyColumn=`SecurityID, keepOrder=true)
subscribeTable(tableName="inputTable", actionName="calculateEmpyrical", 
offset=-1, handler=getStreamEngine("EmpyricalReactiveSateEngine"), 
msgAsTable=true, reconnect=true)
// replay data
// replay by the DateTime column defined in the schema above
submitJob("replay","replay",replay{data,inputTable,`DateTime,`DateTime,1000,true})
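
The idea behind keyed, incremental calculation can be sketched in a few lines of Python. This is only a conceptual illustration with hypothetical class and method names, not a description of how the reactive state engine is implemented. Each key keeps its own bounded buffer of recent returns, and every incoming record produces one output for that key.

```python
import math
import statistics
from collections import defaultdict, deque


class RollingSharpeState:
    """Keeps the last `window` returns per key and emits a rolling Sharpe ratio."""

    def __init__(self, window=10, annualization=252):
        self.window = window
        self.annualization = annualization
        # One bounded buffer per key; old values drop out automatically.
        self.buffers = defaultdict(lambda: deque(maxlen=window))

    def on_record(self, security_id, ret):
        """Consume one record and return the metric, or None until the window is full."""
        buf = self.buffers[security_id]
        buf.append(ret)
        if len(buf) < self.window:
            return None
        spread = statistics.stdev(buf)
        if spread == 0:
            return None
        return statistics.mean(buf) / spread * math.sqrt(self.annualization)


state = RollingSharpeState(window=3)
stream = [("A", 0.01), ("B", 0.03), ("A", 0.02), ("A", 0.03), ("B", 0.01), ("A", 0.00)]
for security_id, ret in stream:
    print(security_id, state.on_record(security_id, ret))
```

Records for different keys can arrive interleaved without affecting each other, which mirrors the role of the key column in the engine definition.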

6. Empyrical Function Reference

6.1 Returns


Function	Syntax	Description
simpleReturns	simpleReturns(prices)	Simple returns
aggregateReturns	aggregateReturns(returns, date, convertTo='yearly')	Aggregate returns by week, month, or year
annualReturns	annualReturns(returns, period="daily", annualization=NULL)	Compound annual growth rate (CAGR)
cumReturns	cumReturns(returns, startingValue=0)	Cumulative simple returns
cumReturnsFinal	cumReturnsFinal(returns, startingValue=0)	Total simple returns
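
As an illustration of how the total return and the compound annual growth rate relate, here is a minimal Python sketch. It assumes the Python Empyrical convention (252 periods per year for daily data, and the ending value of one unit of capital raised to the power of one over the number of years). Function names are hypothetical, and returns are assumed to be greater than -1.

```python
import math


def cum_returns_final(returns, starting_value=0.0):
    """Total return over the whole period (or ending value if starting_value != 0)."""
    growth = math.prod(1.0 + r for r in returns)
    return growth - 1.0 if starting_value == 0 else growth * starting_value


def annual_return(returns, annualization=252):
    """Compound annual growth rate implied by a sequence of periodic returns."""
    if not returns:
        return math.nan
    num_years = len(returns) / annualization
    ending_value = cum_returns_final(returns, starting_value=1.0)
    return ending_value ** (1.0 / num_years) - 1.0


daily = [0.001] * 504                      # two years of constant daily returns
print(cum_returns_final(daily), annual_return(daily))
```

With a constant daily return, the annual figure does not depend on how many years are supplied, which is a quick sanity check for the exponent.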

6.2 Alpha & Beta


Function	Syntax	Description
alpha	alpha(returns, factorReturns, riskFree=0.0, period="daily", annualization=NULL, beta=NULL)	Annualized alpha
rollAlpha	rollAlpha(returns, factorReturns, riskFree=0.0, period="daily", annualization=NULL, windows=10)	Rolling window alpha
covarBeta	covarBeta(returns, factorReturns, riskFree=0.0)	Beta value
rollCovarBeta	rollCovarBeta(returns, factorReturns, riskFree=0.0, windows=10)	Rolling window beta
alphaBeta	alphaBeta(returns, factorReturns, riskFree=0.0, period="daily", annualization=NULL)	Annualized alpha and beta
rollAlphaBeta	rollAlphaBeta(returns, factorReturns, riskFree=0.0, period="daily", annualization=NULL, windows=10)	Rolling window alpha and beta
upAlphaBeta	upAlphaBeta(returns, factorReturns, riskFree=0.0, period="daily", annualization=NULL)	Alpha and beta during periods of positive benchmark returns
downAlphaBeta	downAlphaBeta(returns, factorReturns, riskFree=0.0, period="daily", annualization=NULL)	Alpha and beta during periods of negative benchmark returns
betaFragilityHeuristic	betaFragilityHeuristic(returns, factorReturns)	Estimate fragility when beta declines

6.3 Risk Management


Function	Syntax	Description
annualVolatility	annualVolatility(returns, period="daily", annualization=NULL, alpha=2.0)	Annualized volatility
rollAnnualVolatility	rollAnnualVolatility(returns, period="daily", annualization=NULL, windows=10, alpha=2.0)	Rolling window annualized volatility
downsideRisk	downsideRisk(returns, requiredReturn=0.0, period="daily", annualization=NULL)	Downside deviation below a threshold
hisValueAtRisk	hisValueAtRisk(returns, cutoff=0.05)	Value at risk (VaR), computed by historical simulation
conditionalValueAtRisk	conditionalValueAtRisk(returns, cutoff=0.05)	Conditional value at risk (CVaR)
gpdRiskEstimates	gpdRiskEstimates(returns, varP=0.01)	Estimate VaR and ES using generalized pareto distribution (GPD)
tailRatio	tailRatio(returns)	Ratio between the right tail (95%) and the left tail (5%)
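
The two tail measures in this table can be illustrated with a short Python sketch. It assumes that historical simulation amounts to taking the empirical quantile at the cutoff with linear interpolation between order statistics, and that CVaR averages the worst observations up to the cutoff index, as the Python Empyrical library does. Function names are hypothetical.

```python
import math


def value_at_risk(returns, cutoff=0.05):
    """Empirical quantile of returns at `cutoff`, linearly interpolated."""
    if not returns:
        return math.nan
    ordered = sorted(returns)
    pos = (len(ordered) - 1) * cutoff      # fractional index of the quantile
    lo, hi = math.floor(pos), math.ceil(pos)
    return ordered[lo] + (ordered[hi] - ordered[lo]) * (pos - lo)


def conditional_value_at_risk(returns, cutoff=0.05):
    """Mean of the worst returns, up to and including the cutoff index."""
    if not returns:
        return math.nan
    ordered = sorted(returns)
    cutoff_index = int((len(ordered) - 1) * cutoff)
    tail = ordered[:cutoff_index + 1]
    return sum(tail) / len(tail)


sample = [-0.05, -0.03, -0.01, 0.0, 0.01, 0.02, 0.03, 0.04, 0.05, 0.06, 0.07]
print(value_at_risk(sample, 0.2), conditional_value_at_risk(sample, 0.2))
```

CVaR is never above VaR for the same cutoff, because it averages only observations at or below the quantile.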

6.4 Capture Ratio


Function	Syntax	Description
capture	capture(returns, factorReturns, period="daily", annualization=NULL)	Capture ratio
upCapture	upCapture(returns, factorReturns, period="daily", annualization=NULL)	Up capture ratio during periods of positive benchmark returns
downCapture	downCapture(returns, factorReturns, period="daily", annualization=NULL)	Down capture ratio during periods of negative benchmark returns
upDownCapture	upDownCapture(returns, factorReturns, period="daily", annualization=NULL)	Ratio of up capture to down capture
rollUpCapture	rollUpCapture(returns, factorReturns, windows=10, period="daily", annualization=NULL)	Rolling up capture ratio
rollDownCapture	rollDownCapture(returns, factorReturns, windows=10, period="daily", annualization=NULL)	Rolling down capture ratio
rollUpDownCapture	rollUpDownCapture(returns, factorReturns, windows=10, period="daily", annualization=NULL)	Ratio of rolling up capture to rolling down capture
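
The capture family can be summarized in a small Python sketch. It assumes the Python Empyrical definitions: the capture ratio is the annualized return of the strategy divided by that of the benchmark, and the up and down variants first keep only the periods in which the benchmark return is positive or negative. All names are hypothetical, and returns are assumed to be greater than -1.

```python
import math


def annual_return(returns, annualization=252):
    """Compound annual growth rate implied by periodic returns."""
    if not returns:
        return math.nan
    growth = math.prod(1.0 + r for r in returns)
    return growth ** (annualization / len(returns)) - 1.0


def capture(returns, factor_returns, annualization=252):
    """Annualized strategy return divided by annualized benchmark return."""
    benchmark = annual_return(factor_returns, annualization)
    if math.isnan(benchmark) or benchmark == 0:
        return math.nan
    return annual_return(returns, annualization) / benchmark


def _filtered_capture(returns, factor_returns, keep, annualization):
    # Keep only the periods whose benchmark return satisfies `keep`.
    pairs = [(r, f) for r, f in zip(returns, factor_returns) if keep(f)]
    return capture([r for r, _ in pairs], [f for _, f in pairs], annualization)


def up_capture(returns, factor_returns, annualization=252):
    return _filtered_capture(returns, factor_returns, lambda f: f > 0, annualization)


def down_capture(returns, factor_returns, annualization=252):
    return _filtered_capture(returns, factor_returns, lambda f: f < 0, annualization)


def up_down_capture(returns, factor_returns, annualization=252):
    return (up_capture(returns, factor_returns, annualization)
            / down_capture(returns, factor_returns, annualization))


factor = [0.01, -0.02, 0.03, -0.01]
strategy = [0.02, -0.01, 0.06, -0.03]
print(up_capture(strategy, factor), down_capture(strategy, factor))
```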

6.5 Risk-Adjusted Returns


Function	Syntax	Description
omegaRatio	omegaRatio(returns, period="daily", riskFree=0.0, requiredReturn=0.0, annualization=NULL)	Omega ratio
cumMaxDrawdown	cumMaxDrawdown(returns)	Maximum drawdown
rollCumMaxDrawdown	rollCumMaxDrawdown(returns, windows=10)	Rolling window maximum drawdown
calmarRatio	calmarRatio(returns, period="daily", annualization=NULL)	Calmar ratio of the strategy, i.e., drawdown ratio
sharpeRatio	sharpeRatio(returns, riskFree=0.0, period="daily", annualization=NULL)	Sharpe ratio
rollSharpeRatio	rollSharpeRatio(returns, riskFree=0.0, period="daily", annualization=NULL, windows=10)	Rolling Sharpe ratio
excessSharpe	excessSharpe(returns, factorReturns)	Excess Sharpe ratio
sortinoRatio	sortinoRatio(returns, requiredReturn=0.0, period="daily", annualization=NULL)	Sortino ratio
rollSortinoRatio	rollSortinoRatio(returns, requiredReturn=0.0, period="daily", annualization=NULL, windows=10)	Rolling Sortino ratio
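
As one worked example from this table, the Omega ratio can be sketched in Python. The sketch assumes the Python Empyrical definition: the required annual return is converted to a per-period threshold, and the ratio is the sum of returns above the threshold divided by the absolute sum of returns below it. The function name is hypothetical.

```python
import math


def omega_ratio(returns, risk_free=0.0, required_return=0.0, annualization=252):
    """Sum of gains over sum of losses, both measured relative to a threshold."""
    if annualization == 1:
        threshold = required_return
    elif required_return <= -1:
        return math.nan
    else:
        # Convert the annual required return to a per-period threshold.
        threshold = (1.0 + required_return) ** (1.0 / annualization) - 1.0
    excess = [r - risk_free - threshold for r in returns]
    gains = sum(e for e in excess if e > 0)
    losses = -sum(e for e in excess if e < 0)
    return gains / losses if losses > 0 else math.nan


print(omega_ratio([0.02, -0.01, 0.03, -0.02]))
```

A value above 1 means that the gains above the threshold outweigh the shortfalls below it.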

6.6 Return Stability and Performance Attribution


Function	Syntax	Description
stabilityOfTimeseries	stabilityOfTimeseries(returns)	R² of the linear fit to cumulative log returns
computeExposure	computeExposure(factorLoadings, factorColumns=`factor1`factor2`factor3, positionName=`position`position`position, dtName=`DateTime)	Daily risk factor exposures
perfAttrib	perfAttrib(returns, factorReturns, factorLoadings, factorColumns=`factor1`factor2`factor3, positionName=`position`position`position, dtName=`DateTime)	Performance attribution (attributing investment returns to a set of risk factors)
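
The description of stabilityOfTimeseries can be made concrete with a short Python sketch: cumulative log returns are regressed on the time index and the squared correlation is reported. This is a plain illustration under that reading of the description, with a hypothetical function name.

```python
import math


def stability_of_timeseries(returns):
    """R-squared of a straight-line fit to cumulative log returns."""
    cum_log = []
    total = 0.0
    for r in returns:
        total += math.log1p(r)             # log return of this period
        cum_log.append(total)
    n = len(cum_log)
    if n < 2:
        return math.nan
    mean_x = (n - 1) / 2.0                 # mean of the index 0..n-1
    mean_y = sum(cum_log) / n
    sxy = sum((i - mean_x) * (y - mean_y) for i, y in enumerate(cum_log))
    sxx = sum((i - mean_x) ** 2 for i in range(n))
    syy = sum((y - mean_y) ** 2 for y in cum_log)
    if syy == 0:
        return math.nan                    # flat curve: the fit quality is undefined
    r_value = sxy / math.sqrt(sxx * syy)
    return r_value * r_value


print(stability_of_timeseries([0.01] * 50))
print(stability_of_timeseries([0.05, -0.04, 0.03, -0.06, 0.02, 0.01]))
```

A steadily compounding series scores close to 1, while an erratic one scores lower.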

Appendix

Empyrical.dos

samples.zip