olsEx
Syntax
olsEx(ds, Y, X, [intercept=true], [mode=0])
Details
Return the result of an ordinary-least-squares regression of Y on X. Y and X are columns in a partitioned table.
Note that NULL values in X and Y are treated as 0 in calculations.
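The NULL handling described above can be pictured with a small Python sketch. It is not DolphinDB code and not the engine's implementation. The helper names `fill_nulls` and `simple_ols` are hypothetical, and `None` stands in for NULL. The point is that a row containing a missing value is not dropped; the missing entry takes part in the fit as 0.

```python
def fill_nulls(values):
    """Replace missing entries (None) with 0.0, mirroring the documented NULL handling."""
    return [0.0 if v is None else float(v) for v in values]


def simple_ols(y, x):
    """Least-squares fit of y = a + b*x for one regressor; returns (a, b)."""
    n = len(y)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    b = sxy / sxx
    a = mean_y - b * mean_x
    return a, b


y_raw = [1.0, None, 3.0, 4.0]   # None plays the role of NULL in Y
x_raw = [1.0, 2.0, None, 4.0]   # None plays the role of NULL in X

# All four rows are used; the missing entries enter the fit as 0.
a, b = simple_ols(fill_nulls(y_raw), fill_nulls(x_raw))
```

Because zeros can pull the fitted line noticeably, it may be worth filtering NULL rows out in the query passed to sqlDS when that is not the intended behavior.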
Arguments
ds is a set of data sources stored in a tuple. It is usually generated by the function sqlDS.
Y is a string indicating the column name of the dependent variable in the table represented by ds.
X is a string scalar/vector indicating the column name(s) of the independent variable(s) in the table represented by ds.
intercept is a boolean variable indicating whether the regression includes the intercept. If it is true, the system automatically adds a column of 1's to X to generate the intercept. The default value is true.
mode is an integer indicating the contents of the output. The default value is 0. It can be:
- 0: a vector of the coefficient estimates
- 1: a table with coefficient estimates, standard errors, t-statistics, and p-values
- 2: a dictionary with all statistics
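To make the effect of the intercept argument concrete, here is a minimal Python sketch of least squares via the normal equations. It is an illustration under stated assumptions, not DolphinDB's implementation. The helpers `solve` and `ols_coefficients` are hypothetical names. With `intercept=True` a column of 1's is prepended to the regressors, as the argument description says; with `intercept=False` the fit is forced through the origin.

```python
def solve(a, b):
    """Solve the linear system a x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                f = m[r][col] / m[col][col]
                m[r] = [v - f * p for v, p in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def ols_coefficients(y, columns, intercept=True):
    """Return least-squares coefficients; the intercept (if any) comes first."""
    cols = ([[1.0] * len(y)] if intercept else []) + [list(c) for c in columns]
    # Normal equations: (X'X) beta = X'y, built column by column.
    xtx = [[sum(p * q for p, q in zip(ci, cj)) for cj in cols] for ci in cols]
    xty = [sum(p * q for p, q in zip(ci, y)) for ci in cols]
    return solve(xtx, xty)


y = [3.0, 5.0, 7.0, 9.0]   # constructed so that y = 1 + 2*x exactly
x = [1.0, 2.0, 3.0, 4.0]

with_intercept = ols_coefficients(y, [x], intercept=True)      # [intercept, slope]
without_intercept = ols_coefficients(y, [x], intercept=False)  # [slope] only
```

On this toy data the fit with an intercept recovers the generating line, while the fit without one returns a single, different slope because the line must pass through the origin.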
Examples
n=10000
ID=rand(100, n)
dates=2017.08.07..2017.08.11
date=rand(dates, n)
vol=rand(1..10 join int(), n)
price=rand(100,n)
t=table(ID, date, vol,price)
saveText(t, "D:/DolphinDB/Data/t.txt");
if(existsDatabase("dfs://rangedb")){
dropDatabase("dfs://rangedb")
}
db = database(directory="dfs://rangedb", partitionType=RANGE, partitionScheme=0 51 101)
USPrices=loadTextEx(dbHandle=db,tableName=`USPrices, partitionColumns=`ID, filename="D:/DolphinDB/Data/t.txt");
ds=sqlDS(<select vol as VS, price as SBA from USPrices where vol>5>)
rs=olsEx(ds, `VS, `SBA, true, 2)
rs;
// output
RegressionStat->
item         statistics
------------ ----------
R2           0.000848
AdjustedR2   0.000628
StdError     1.404645
Observations 4535

ANOVA->
Breakdown  DF   SS          MS       F        Significance
---------- ---- ----------- -------- -------- ------------
Regression 1    7.592565    7.592565 3.848178 0.049861
Residual   4533 8943.739298 1.973029
Total      4534 8951.331863

Coefficient->
factor    beta     stdError tstat      pvalue
--------- -------- -------- ---------- --------
intercept 7.953084 0.04185  190.039423 0
SBA       0.001422 0.000725 1.961677   0.049861
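The statistics in the mode=2 output above are linked by the usual OLS identities. The Python sketch below recomputes several of them from the printed sums of squares and degrees of freedom. It assumes the standard textbook definitions, which this page does not spell out; the printed figures agree with them to the displayed precision. Variable names are chosen for illustration only.

```python
import math

# Values copied from the ANOVA and Coefficient tables above.
ss_regression, ss_residual, ss_total = 7.592565, 8943.739298, 8951.331863
df_regression, df_residual = 1, 4533
beta_sba, se_sba = 0.001422, 0.000725

ms_regression = ss_regression / df_regression   # MS = SS / DF
ms_residual = ss_residual / df_residual

r2 = ss_regression / ss_total                   # share of variation explained
df_total = df_regression + df_residual
adjusted_r2 = 1 - (1 - r2) * df_total / df_residual

f_stat = ms_regression / ms_residual            # F = MS(regression) / MS(residual)
std_error = math.sqrt(ms_residual)              # residual standard error
t_stat = beta_sba / se_sba                      # only approximate: inputs are rounded

print(round(r2, 6), round(adjusted_r2, 6), round(f_stat, 4), round(std_error, 5))
```

With a single regressor, the squared t-statistic of the slope equals the F statistic, which is why SBA's p-value (0.049861) matches the Significance value in the ANOVA table.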