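General-purpose computation function library
The header file OperatorImp.h declares efficient, easy-to-use functions for data analysis. They cover numerical computation, data type conversion, cumulative-window functions, sliding-window functions, row-wise functions, and several other categories, which makes Swordfish applicable to a wide range of data processing and analysis tasks.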
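To use these functions, include the header files Swordfish.h and OperatorImp.h with the #include directive. Call DolphinDBLib::initializeRuntime() to initialize the runtime before the first use, and call DolphinDBLib::finalizeRuntime() to shut the runtime down when you are done.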
#include "Swordfish.h"
#include "OperatorImp.h"
int main()
{
    DolphinDBLib::initializeRuntime();
    // Your code goes here
    DolphinDBLib::finalizeRuntime();
    return 0;
}
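Because finalizeRuntime() has to run on every exit path, one option is to wrap the initialize/finalize pair in a small RAII guard. The sketch below is an illustration only and is not part of Swordfish: RuntimeGuard is a hypothetical helper, and it receives the two calls as callables so that it compiles without the library.

```cpp
#include <functional>
#include <utility>

// Hypothetical helper (not part of Swordfish): runs `init` on construction
// and `finalize` on destruction, so cleanup happens on every exit path.
class RuntimeGuard {
public:
    RuntimeGuard(const std::function<void()>& init, std::function<void()> finalize)
        : finalize_(std::move(finalize)) {
        init();
    }
    ~RuntimeGuard() {
        if (finalize_) finalize_();
    }
    // Non-copyable: the runtime should be finalized exactly once.
    RuntimeGuard(const RuntimeGuard&) = delete;
    RuntimeGuard& operator=(const RuntimeGuard&) = delete;

private:
    std::function<void()> finalize_;
};
```

In main(), a line such as `RuntimeGuard guard([] { DolphinDBLib::initializeRuntime(); }, [] { DolphinDBLib::finalizeRuntime(); });` could then stand in for the two explicit calls.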
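Unary functions
Take OperatorImp::log as an example and compute the natural logarithm of 5. In the call, pass void as the second argument via Expression::void_.
// Define the input value
ConstantSP a = new Int(5);
// Call log to compute the natural logarithm
ConstantSP result = OperatorImp::log(a, Expression::void_);
std::cout << result->getString() << std::endl;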
Numerical operations
Take OperatorImp::ratio as an example and compute the ratio of 12 to 5.
// Define the two operands
ConstantSP a = new Int(5);
ConstantSP b = new Int(12);
// Call ratio to compute the ratio of b to a
ConstantSP result = OperatorImp::ratio(b, a);
std::cout << result->getString() << std::endl;
Data type conversion
Take OperatorImp::asDecimal64 as an example and convert a value to the DECIMAL64 type.
// Define the value to convert and the number of decimal places to keep
ConstantSP a = new Double(8.6767676);
ConstantSP scale = new Int(6);
// Call asDecimal64 to convert the data type
ConstantSP result = OperatorImp::asDecimal64(a, scale);
std::cout << result->getString() << std::endl;
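To make the role of the scale argument concrete, here is a conceptual sketch in standard C++ of a fixed-scale decimal: the value is held as an integer count of 10^-scale units. FixedDecimal, toFixedDecimal, and toString are hypothetical names, and rounding to nearest is an assumption; the library's actual storage and rounding rule may differ.

```cpp
#include <cmath>
#include <cstddef>
#include <cstdint>
#include <string>

// Conceptual model only: the value is stored as value * 10^scale.
struct FixedDecimal {
    std::int64_t raw;
    int scale;
};

// Assumption: round to nearest when dropping extra digits.
FixedDecimal toFixedDecimal(double value, int scale) {
    return {std::llround(value * std::pow(10.0, scale)), scale};
}

std::string toString(const FixedDecimal& d) {
    const bool negative = d.raw < 0;
    std::string digits = std::to_string(negative ? -d.raw : d.raw);
    if (d.scale > 0) {
        // Left-pad so at least one digit precedes the decimal point.
        if (digits.size() <= static_cast<std::size_t>(d.scale))
            digits.insert(0, static_cast<std::size_t>(d.scale) + 1 - digits.size(), '0');
        digits.insert(digits.size() - static_cast<std::size_t>(d.scale), ".");
    }
    return negative ? "-" + digits : digits;
}
```

Under these assumptions, 8.6767676 with a scale of 6 is stored as the integer 8676768 and printed with six decimal places.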
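Data processing
Take OperatorImp::rand as an example and generate 10 random integers below 100.
// Define the upper bound and the number of values to generate
ConstantSP X = new Int(100);
ConstantSP count = new Int(10);
// Call rand to generate a random vector
ConstantSP v = OperatorImp::rand(X, count);
std::cout << v->getString() << std::endl;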
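For comparison, the same kind of draw can be sketched with the C++ standard library alone. This assumes that rand with a scalar X draws uniformly from [0, X); randomBelow is a hypothetical name, and std::mt19937 is only a stand-in for whatever generator Swordfish uses.

```cpp
#include <random>
#include <vector>

// Draw `count` integers uniformly from [0, upper). Requires upper > 0.
// Assumption: this mirrors the scalar-X case of rand.
std::vector<int> randomBelow(int upper, int count,
                             unsigned seed = std::random_device{}()) {
    std::mt19937 gen(seed);
    std::uniform_int_distribution<int> dist(0, upper - 1);  // bounds are inclusive
    std::vector<int> out;
    out.reserve(static_cast<std::size_t>(count));
    for (int i = 0; i < count; ++i) out.push_back(dist(gen));
    return out;
}
```

Passing a fixed seed, for example `randomBelow(100, 10, 42)`, makes the sequence reproducible for tests.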
Take OperatorImp::diag as an example. It can build a diagonal matrix from a vector v, or return the main-diagonal elements of a square matrix m.
// Build a diagonal matrix from vector v
VectorSP v = Util::createVector(DT_INT, 0, 10);
std::vector<int> newData = {2, 4, 6, 8, 10};
v->appendInt(newData.data(), newData.size());
ConstantSP result1 = OperatorImp::diag(v, Expression::void_);
std::cout << result1->getString() << std::endl;
// Return the main-diagonal elements of square matrix m
int *rawData = new int[9]{1, 2, 3, 4, 5, 6, 7, 8, 9};
VectorSP m = Util::createMatrix(DT_INT, 3, 3, 9, 0, rawData);
ConstantSP result2 = OperatorImp::diag(m, Expression::void_);
std::cout << result2->getString() << std::endl;
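The two directions of diag can be sketched without the library as follows. The flat column-major storage and the names diagFromVector and diagOfMatrix are assumptions made for this illustration, not the Swordfish implementation.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// An n x n matrix stored column by column in one flat vector:
// element (row, col) lives at index col * n + row.
std::vector<int> diagFromVector(const std::vector<int>& v) {
    const std::size_t n = v.size();
    std::vector<int> m(n * n, 0);
    for (std::size_t i = 0; i < n; ++i) m[i * n + i] = v[i];
    return m;
}

std::vector<int> diagOfMatrix(const std::vector<int>& m, std::size_t n) {
    assert(m.size() == n * n);  // the matrix must be square
    std::vector<int> d(n);
    for (std::size_t i = 0; i < n; ++i) d[i] = m[i * n + i];
    return d;
}
```

For the 3 x 3 data 1 through 9 used above, the diagonal elements sit at flat indices 0, 4, and 8, so the sketch returns 1, 5, 9 regardless of whether the data is read by rows or by columns.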
Take OperatorImp::deltas as an example and compute the differences between adjacent elements of vector v.
// Define the input vector v
VectorSP v = Util::createVector(DT_INT, 0, 10);
std::vector<int> newData = {7, 2, 5, 8, 9};
v->appendInt(newData.data(), newData.size());
// Call deltas to compute the differences between adjacent elements
ConstantSP result = OperatorImp::deltas(v, Expression::void_);
std::cout << result->getString() << std::endl;
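As a reference for what an adjacent-difference computation produces, here is a minimal standard C++ sketch (C++17). It assumes that the first position, which has no predecessor, is NULL; an empty std::optional stands in for NULL here.

```cpp
#include <cstddef>
#include <optional>
#include <vector>

// out[i] = v[i] - v[i - 1]; out[0] is left empty (stand-in for NULL).
std::vector<std::optional<int>> deltas(const std::vector<int>& v) {
    std::vector<std::optional<int>> out(v.size());
    for (std::size_t i = 1; i < v.size(); ++i) out[i] = v[i] - v[i - 1];
    return out;
}
```

For the input 7, 2, 5, 8, 9 this yields an empty first slot followed by -5, 3, 3, 1.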
Cumulative-window functions
Take OperatorImp::cumsum as an example and compute the cumulative sum of the elements of vector v1.
// Define the input vector v1
VectorSP v1 = Util::createVector(DT_INT, 0, 10);
std::vector<int> newData = {1, 2, 3, 4, 5, 6, 7, 8, 9, 10};
v1->appendInt(newData.data(), newData.size());
// Call cumsum to compute the result
ConstantSP result = OperatorImp::cumsum(v1, Expression::void_);
std::cout << result->getString() << std::endl;
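A cumulative window grows by one element at each step, so the running sum can be carried forward in a single pass. The following library-free sketch shows the computation; the widening to long long is a choice made here, not a statement about the type cumsum returns.

```cpp
#include <vector>

// out[i] = v[0] + v[1] + ... + v[i]
std::vector<long long> cumsum(const std::vector<int>& v) {
    std::vector<long long> out;
    out.reserve(v.size());
    long long running = 0;  // wider than int to delay overflow
    for (int x : v) {
        running += x;
        out.push_back(running);
    }
    return out;
}
```

For the input 1 through 10 the sketch returns 1, 3, 6, 10, 15, 21, 28, 36, 45, 55.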
Sliding-window functions
Take OperatorImp::mmax as an example and compute the maximum within a sliding window of length 4.
// Build the arguments for the call
VectorSP X = Util::createVector(DT_INT, 0, 10);
std::vector<int> newData = {1, 2, 3, 4, 5, 6, 7};
X->appendInt(newData.data(), newData.size());
ConstantSP window = new Int(4);
std::vector<ConstantSP> parameter = {X, window};
// Call mmax to compute the result
SessionSP session = DolphinDBLib::createSession();
ConstantSP result = OperatorImp::mmax(session->getHeap().get(), parameter);
std::cout << result->getString() << std::endl;
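The sliding-window maximum can be illustrated with a short standard C++ sketch (C++17). It assumes that positions with fewer than `window` preceding elements are NULL, represented by an empty std::optional; movingMax is a hypothetical name, and the straightforward O(n * window) scan is chosen for clarity rather than speed.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <optional>
#include <vector>

// out[i] = max(x[i - window + 1 .. i]); empty until a full window exists.
std::vector<std::optional<int>> movingMax(const std::vector<int>& x,
                                          std::size_t window) {
    assert(window > 0);
    std::vector<std::optional<int>> out(x.size());
    for (std::size_t i = window - 1; i < x.size(); ++i) {
        const auto first = x.begin() + static_cast<std::ptrdiff_t>(i + 1 - window);
        const auto last = x.begin() + static_cast<std::ptrdiff_t>(i + 1);
        out[i] = *std::max_element(first, last);
    }
    return out;
}
```

With the input 1 through 7 and a window of 4, the first three slots stay empty and the remaining ones hold 4, 5, 6, 7.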
Row-wise functions
Take OperatorImp::rowImin as an example and return, for each row of matrix m, the index of its smallest element.
// Define the matrix m
double *rawData = new double[12]{4.5, 2.6, 1.5, 3.2, 1.5, 4.8, 5.9, 1.7, 4.9, 2.0, 6.2, 5.5};
VectorSP m = Util::createMatrix(DT_DOUBLE, 3, 4, 12, 0, rawData);
std::vector<ConstantSP> parameter = {m};
SessionSP session = DolphinDBLib::createSession();
// Call rowImin to compute the result
ConstantSP result = OperatorImp::rowImin(session->getHeap().get(), parameter);
std::cout << result->getString() << std::endl;
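A per-row argmin can be sketched in standard C++ as below. Two things are assumptions of this sketch rather than facts taken from the example: the data is read column by column as a matrix of 4 rows and 3 columns, and ties resolve to the first (lowest) index. rowArgMin is a hypothetical name.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Column-major matrix: element (r, c) is data[c * rows + r].
// Returns, for each row, the column index of its smallest element.
std::vector<std::size_t> rowArgMin(const std::vector<double>& data,
                                   std::size_t rows, std::size_t cols) {
    assert(data.size() == rows * cols);
    std::vector<std::size_t> out(rows, 0);
    for (std::size_t r = 0; r < rows; ++r) {
        for (std::size_t c = 1; c < cols; ++c) {
            // Strict < keeps the first minimum when values tie.
            if (data[c * rows + r] < data[out[r] * rows + r]) out[r] = c;
        }
    }
    return out;
}
```

Under the stated layout assumption, the twelve values above form the rows (4.5, 1.5, 4.9), (2.6, 4.8, 2.0), (1.5, 5.9, 6.2), and (3.2, 1.7, 5.5), and the sketch returns the indices 1, 2, 0, 1.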
Vector functions
Take OperatorImp::move as an example and compute the result of shifting vector v to the right by 3 positions.
// Define the vector and the number of positions to shift
VectorSP v = Util::createVector(DT_INT, 0, 10);
std::vector<int> newData = {3, 9, 5, 1, 4, 9};
v->appendInt(newData.data(), newData.size());
ConstantSP step = new Int(3);
// Call move to compute the result
ConstantSP result = OperatorImp::move(v, step);
std::cout << result->getString() << std::endl;
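A right shift of this kind can be sketched in standard C++ (C++17) as follows, assuming that the vacated leading positions become NULL (an empty std::optional here) and that elements pushed past the end are dropped. moveRight is a hypothetical name and only covers non-negative steps.

```cpp
#include <cstddef>
#include <optional>
#include <vector>

// out[i] = v[i - steps] for i >= steps; earlier slots stay empty.
std::vector<std::optional<int>> moveRight(const std::vector<int>& v,
                                          std::size_t steps) {
    std::vector<std::optional<int>> out(v.size());
    for (std::size_t i = steps; i < v.size(); ++i) out[i] = v[i - steps];
    return out;
}
```

For the input 3, 9, 5, 1, 4, 9 and a step of 3, the first three slots are empty and the last three hold 3, 9, 5.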
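Aggregate functions
Take OperatorImp::avg as an example and compute the averages of matrix m. For a matrix input, avg is calculated column by column.
// Define the matrix
int *rawData = new int[9]{1, 2, 3, 4, 5, 6, 7, 8, 9};
VectorSP m = Util::createMatrix(DT_INT, 3, 3, 9, 0, rawData);
// Call avg to compute the result
ConstantSP result = OperatorImp::avg(m, Expression::void_);
std::cout << result->getString() << std::endl;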
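Assuming that avg treats a matrix column by column and that the data is stored in column-major order, the computation corresponds to the following library-free sketch. columnMeans is a hypothetical name.

```cpp
#include <cassert>
#include <cstddef>
#include <numeric>
#include <vector>

// Column-major matrix: column c occupies data[c * rows .. c * rows + rows - 1].
std::vector<double> columnMeans(const std::vector<int>& data,
                                std::size_t rows, std::size_t cols) {
    assert(rows > 0 && data.size() == rows * cols);
    std::vector<double> out(cols);
    for (std::size_t c = 0; c < cols; ++c) {
        const auto first = data.begin() + static_cast<std::ptrdiff_t>(c * rows);
        const auto last = first + static_cast<std::ptrdiff_t>(rows);
        out[c] = std::accumulate(first, last, 0.0) / static_cast<double>(rows);
    }
    return out;
}
```

Under these assumptions the 3 x 3 data 1 through 9 has the columns (1, 2, 3), (4, 5, 6), and (7, 8, 9), whose means are 2, 5, and 8.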
Take OperatorImp::beta as an example and compute the least-squares estimate of the regression coefficient of y on x.
// Define the vectors x and y
VectorSP x = Util::createVector(DT_INT, 0, 10);
std::vector<int> newData1 = {1, 3, 5, 7, 11, 16, 23};
x->appendInt(newData1.data(), newData1.size());
VectorSP y = Util::createVector(DT_DOUBLE, 0, 10);
std::vector<double> newData2 = {0.1, 4.2, 5.6, 8.8, 22.1, 35.6, 77.2};
y->appendDouble(newData2.data(), newData2.size());
// Call beta to compute the result
ConstantSP result = OperatorImp::beta(y, x);
std::cout << result->getString() << std::endl;
Time functions
Take OperatorImp::date as an example and convert a value of another temporal type to a date.
// Define a timestamp
ConstantSP timestamp = new Timestamp(2024, 8, 19, 11, 13, 29, 326);
std::cout << timestamp->getString() << std::endl;
// Call date to convert the timestamp to a date
ConstantSP date = OperatorImp::date(timestamp, Expression::void_);
std::cout << date->getString() << std::endl;
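Conceptually, converting a timestamp to a date discards the time-of-day part. The sketch below assumes a timestamp counted in milliseconds since 1970-01-01 and a date counted in days since the same epoch; both representations, and the name daysSinceEpoch, are assumptions made for illustration.

```cpp
#include <cstdint>

constexpr std::int64_t kMillisPerDay = 24LL * 60 * 60 * 1000;

// Floor division, so timestamps before 1970 land on the correct day.
constexpr std::int64_t daysSinceEpoch(std::int64_t millis) {
    return millis >= 0 ? millis / kMillisPerDay
                       : -((-millis + kMillisPerDay - 1) / kMillisPerDay);
}
```

Under these assumptions, 2024-08-19 11:13:29.326 corresponds to 1724066009326 milliseconds, and the sketch maps it to day 19954, which is 2024-08-19.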
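String functions
Take OperatorImp::like as an example and check whether string x matches pattern y. Here the pattern %DE% matches any string that contains DE.
// Define the string and the pattern to match
ConstantSP x = new String("ABCDEFG");
ConstantSP y = new String("%DE%");
// Call like to get the match result
ConstantSP result = OperatorImp::like(x, y);
std::cout << result->getString() << std::endl;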
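The role of the % wildcard can be illustrated with a small matcher written in standard C++. It is a sketch under two assumptions: matching is case-sensitive, and % (any run of characters, including none) is the only wildcard handled. likeMatch is a hypothetical name and says nothing about how the library implements like.

```cpp
#include <cstddef>
#include <string>

// Returns true when `text` matches `pattern`, where '%' matches any run of
// characters. Uses the classic single-backtrack wildcard algorithm.
bool likeMatch(const std::string& text, const std::string& pattern) {
    std::size_t t = 0, p = 0;
    std::size_t starP = std::string::npos;  // position of the last '%' seen
    std::size_t starT = 0;                  // text position when it was seen
    while (t < text.size()) {
        if (p < pattern.size() && pattern[p] == '%') {
            starP = p++;
            starT = t;
        } else if (p < pattern.size() && pattern[p] == text[t]) {
            ++p;
            ++t;
        } else if (starP != std::string::npos) {
            // Backtrack: let the last '%' absorb one more character.
            p = starP + 1;
            t = ++starT;
        } else {
            return false;
        }
    }
    // Only trailing '%' may remain unmatched in the pattern.
    while (p < pattern.size() && pattern[p] == '%') ++p;
    return p == pattern.size();
}
```

With this matcher, "ABCDEFG" matches "%DE%" and "ABC%", but not "ABC" or "%XY%".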
Higher-order functions
Take OperatorImp::eachFuncCall as an example and apply a user-defined function to each pair of corresponding elements of vectors a and b. The complete code is as follows:
#include <iostream>
#include <string>
#include <vector>
#include "Swordfish.h"
#include "OperatorImp.h"
// Define the function myAdd: returns 0 when a is even, otherwise a + b
ConstantSP myAdd(const ConstantSP& a, const ConstantSP& b) {
    return (a->getInt() % 2 == 0) ? new Int(0) : new Int(a->getInt() + b->getInt());
}
int main()
{
    DolphinDBLib::initializeRuntime();
    // Build the arguments for the call
    SessionSP session = DolphinDBLib::createSession();
    VectorSP a = Util::createVector(DT_INT, 0, 5);
    std::vector<int> newData1 = {1, 2, 3, 4, 5};
    a->appendInt(newData1.data(), newData1.size());
    VectorSP b = Util::createVector(DT_INT, 0, 5);
    std::vector<int> newData2 = {1, 2, 3, 4, 5};
    b->appendInt(newData2.data(), newData2.size());
    std::string funcName = "myFunc";
    ConstantSP myFunc1 = Util::createOperatorFunction(funcName, myAdd, 2, 2, true);
    std::vector<ConstantSP> parameter = {myFunc1, a, b};
    // Call eachFuncCall to compute the result
    ConstantSP result = OperatorImp::eachFuncCall(session->getHeap().get(), parameter);
    std::cout << result->getString() << std::endl;
    DolphinDBLib::finalizeRuntime();
    return 0;
}
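The element-wise application in this example corresponds to a plain pairwise transform. The following standard C++ sketch mirrors it without the library; eachPair and myAddPlain are hypothetical names, and myAddPlain restates the rule of myAdd on plain integers.

```cpp
#include <algorithm>
#include <cassert>
#include <functional>
#include <vector>

// Apply f to each pair (a[i], b[i]) and collect the results.
std::vector<int> eachPair(const std::function<int(int, int)>& f,
                          const std::vector<int>& a,
                          const std::vector<int>& b) {
    assert(a.size() == b.size());
    std::vector<int> out(a.size());
    std::transform(a.begin(), a.end(), b.begin(), out.begin(), f);
    return out;
}

// Same rule as myAdd: 0 when a is even, otherwise a + b.
int myAddPlain(int a, int b) { return (a % 2 == 0) ? 0 : a + b; }
```

Applied to a = b = 1, 2, 3, 4, 5, the sketch returns 2, 0, 6, 0, 10.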
API reference
- Math: abs, acos, acosh, add, asin, asinh, atan, atanh, cbrt, clip, clip!, cos, cosh, cholesky, derivative, diag, div, det, eig, exp, exp2, expm1, gram, gramSchmidt, integral, inverse, intersection, iterate, log, log10, log1p, log2, lu, mod, mul, neg, pow, ratio, reciprocal, repmat, sin, sinh, sqrt, square, sub, symmetricDifference, svd, tan, tanh, tril, triu, schur, signbit, signum, qr
- Statistics: atImax, atImin, avg, contextSum, contextSum2, count, covar, covarMatrix, crossStat, cumnunique, demean, dot, ewmCov, ewmMean, ewmStd, ewmVar, gaussianKde, gaussianKdePredict, imax, imin, kurtosis, mad, max, maxIgnoreNull, med, mean, min, minIgnoreNull, mode, mmed, nunique, percentChange, percentile, percentileRank, prod, quantile, quantileSeries, sem, skew, std, stdp, summary, sum, sum2, sum3, sum4, stat, var, varp, wavg, wc, wcovar, wsum
- Correlation: acf, autocorr, corr, corrMatrix, distance, ewmCorr, euclidean, kendall, mutualInfo, rowEuclidean, rowTanimoto, spearmanr, tanimoto
- Sequence analysis: isMonotonicIncreasing/isMonotonic, isMonotonicDecreasing, isPeak, isValley, zigzag
- Machine learning: adaBoostClassifier, adaBoostRegressor, beta, elasticNet, elasticNetCV, gaussianNB, glm, gmm, kmeans, knn, lasso, lassoBasic, lassoCV, logisticRegression, mmse, msl, multinomialNB, ols, olsEx, piecewiseLinFit, poly1d, polyFit, polynomial, predict, pwlfPredict, randomForestClassifier, randomForestRegressor, residual, ridge, ridgeBasic, vectorAR, wls
- Distributions and hypothesis testing: adfuller, anova, cdfBeta, cdfBinomial, cdfChiSquare, cdfExp, cdfF, cdfGamma, cdfKolmogorov, cdfLogistic, cdfNormal, cdfPoisson, cdfStudent, cdfUniform, cdfWeibull, cdfZipf, chiSquareTest, coint, esd, fTest, invBeta, invBinomial, invChiSquare, invExp, invF, invGamma, invLogistic, invNormal, invStudent, invPoisson, invUniform, invWeibull, ksTest, mannWhitneyUTest, manova, norm/normal, rand, randBeta, randBinomial, randChiSquare, randDiscrete, randExp, randF, randGamma, randLogistic, randMultivariateNormal, randNormal, randPoisson, randStudent, randUniform, randWeibull, seasonalEsd, shapiroTest, tTest, zTest
- Data processing: all, any, asis, asof, bucketCount, coeven, cols, copy, contextCount, countNanInf, cumPositiveStreak, deltas, dictUpdate!, distinct, dynamicGroupCumcount, dynamicGroupCumsum, hashBucket, iif, imaxLast, iminLast, isDuplicated, keys, linearTimeTrend, lowerBound, lowRange, mask, maxPositiveStreak, mimaxLast, miminLast, mmaxPositiveStreak, pca, ratios, resample, rowImaxLast, rowIminLast, rows, sessionWindow, shape, size, stl, sumbars, talibNull, tmove, topRange, valueChanged, values, winsorize!, winsorize, zscore
- Interpolation: cubicSpline, cubicSplinePredict, dividedDifference, kroghInterpolate, loess, neville, spline, splrep, splev
- Optimization: brute, brentq, fmin, fminBFGS, fminLBFGSB, fminNCG, fminSLSQP, linprog, osqp, qclp, quadprog, scs, solve, socp
- Financial analysis: CVaR, bondAccrInt, bondConvexity, bondDirtyPrice, bondDuration, nss, nsspredict, trueRange, VaR