This is the sixth part of OBIEE 12c Advanced Analytic:
Linear Regression is a basic option of Advanced Analytics. I once wrote about achieving the regression line in OBIEE using the Oracle DB capabilities at "OBIEE and Linear Regression with Oracle DB".
Oracle documentation describes the REGR function: The REGR function fits a linear model and returns the fitted values or model. You can use this function to fit a linear curve on two measures.
Lets start with the simple example from the online documentation:
Analysis with columns Product Type, Brand Revenue and Discount Amount from Sample Sales. With a regression column. The simplified version is: REGR("Revenue", ("Discount Amount"), ("Product Type", "Brand"), 'fitted', '')
The full formula: REGR("Base Facts"."Revenue", ("Base Facts"."Discount Amount"), ("Products"."Product Type", "Products"."Brand"), 'fitted', '')
So we have a "fitted" regression of Revenue vs. Discount Amount, where each point is combination of Product Type and Brand.
Lets see the syntax documentation and do few more things:
fitted - returns the points on regression line (y=ax+b)
intercept - the intercept point with the zero on x axis (b from y=ax+b)
modelDescription - the Model in JSON format.
From my personal testing, All the options in the documentation (algorithm, targetNames, showModelDescription) seems irrelevant at the moment. Mostly because they have only 1 relevant default value.
It seems that adding the intercept or modelDescription output to the analysis hides the measures.
In case you are interested, the REGR("Base Facts"."Revenue", ("Base Facts"."Discount Amount"), ("Products"."Product Type", "Products"."Brand"), 'modelDescription', '') function returned the following JSON:
"{
"call": {
"": "lm",
"formula": "formula",
"data": "dat"
},
"terms": [
"~",
"target",
"arg1"
],
"residuals": {
"1": 41031,
"2": 88411,
"3": -1.3425e+05,
"4": 81339,
"5": -76497,
"6": -8371.3,
"7": 15573,
"8": -17995,
"9": -20080,
"10": -61277,
"11": 92117
},
"coefficients": [ {
"Estimate": 10364,
"Std. Error": 54600,
"t value": 0.18981,
"Pr(>|t|)": 0.85367
},
{
"Estimate": 29.39,
"Std. Error": 3.2705,
"t value": 8.9865,
"Pr(>|t|)": 8.6434e-06
} ],
"aliased": {
"(Intercept)": false,
"arg1": false
},
"sigma": 76935,
"df": [ 2, 9, 2 ],
"r.squared": 0.89973,
"adj.r.squared": 0.88859,
"fstatistic": {
"value": 80.757,
"numdf": 1,
"dendf": 9
},
"cov.unscaled": [ {
"(Intercept)": 0.50366,
"arg1": -2.7311e-05
},
{
"(Intercept)": -2.7311e-05,
"arg1": 1.8071e-09
} ]
}"
Usually we can add extra columns that are not in the REGR function to the analysis.
For example, similar to previous function, where I replaced Brand with Product and added the Department (from another dimension) to analysis.
REGR("Base Facts"."Revenue", ("Base Facts"."Discount Amount"), ("Products"."Product", "Products"."Product Type"), 'fitted', '')
The result is fine:
- OBIEE 12c Advanced Analytics Functions part 1. Introduction & Trendline
- OBIEE 12c Advanced Analytic part 2: BIN and WIDTH_BUCKET
- OBIEE 12c Advanced Analytic part 3: Forecast
- OBIEE 12c Advanced Analytic part 4: Cluster
- OBIEE 12c Advanced Analytic part 5: Outlier
- OBIEE 12c Advanced Analytic part 6: Regression (this one)
- OBIEE 12c Advanced Analytic part 7: EVALUATE_SCRIPT
Linear Regression is a basic option of Advanced Analytics. I once wrote about achieving the regression line in OBIEE using the Oracle DB capabilities at "OBIEE and Linear Regression with Oracle DB".
Oracle documentation describes the REGR function: The REGR function fits a linear model and returns the fitted values or model. You can use this function to fit a linear curve on two measures.
Lets start with the simple example from the online documentation:
Analysis with columns Product Type, Brand Revenue and Discount Amount from Sample Sales. With a regression column. The simplified version is: REGR("Revenue", ("Discount Amount"), ("Product Type", "Brand"), 'fitted', '')
The full formula: REGR("Base Facts"."Revenue", ("Base Facts"."Discount Amount"), ("Products"."Product Type", "Products"."Brand"), 'fitted', '')
So we have a "fitted" regression of Revenue vs. Discount Amount, where each point is combination of Product Type and Brand.
Lets see the syntax documentation and do few more things:
Syntax
REGR(y_axis_measure_expr, (x_axis_expr), (category_expr1, ..., category_exprN), output_column_name, options, [runtime_binded_options])Where:
y_axis_measure_expr
indicates the measure for which the regression model is to be computed.x_axis_measure_expr
indicates the measure used to determine the regression model for the y_axis_measure_expr
.category_expr1, ..., category_exprN
indicates the dimension/dimension attributes used to determine the category for which the regression model for the y_axis_measure_expr
is computed. One or up to five dimensions/dimension attributes can be provided as category columns.output_column_name
indicates the output column name for regression. Valid values are 'fitted', 'intercept', or 'modelDescription'.output_column_name options
explanation:fitted - returns the points on regression line (y=ax+b)
intercept - the intercept point with the zero on x axis (b from y=ax+b)
modelDescription - the Model in JSON format.
options
indicates a string list of name/value pairs separated by a semi-colon
(;). The value can include %1 ... %N, which can be specified using runtime_binded_options
.runtime_binded_options
indicates an optional comma-separated (,) list of runtime binded columns or literal expressions.From my personal testing, All the options in the documentation (algorithm, targetNames, showModelDescription) seems irrelevant at the moment. Mostly because they have only 1 relevant default value.
It seems that adding the intercept or modelDescription output to the analysis hides the measures.
In case you are interested, the REGR("Base Facts"."Revenue", ("Base Facts"."Discount Amount"), ("Products"."Product Type", "Products"."Brand"), 'modelDescription', '') function returned the following JSON:
"{
"call": {
"": "lm",
"formula": "formula",
"data": "dat"
},
"terms": [
"~",
"target",
"arg1"
],
"residuals": {
"1": 41031,
"2": 88411,
"3": -1.3425e+05,
"4": 81339,
"5": -76497,
"6": -8371.3,
"7": 15573,
"8": -17995,
"9": -20080,
"10": -61277,
"11": 92117
},
"coefficients": [ {
"Estimate": 10364,
"Std. Error": 54600,
"t value": 0.18981,
"Pr(>|t|)": 0.85367
},
{
"Estimate": 29.39,
"Std. Error": 3.2705,
"t value": 8.9865,
"Pr(>|t|)": 8.6434e-06
} ],
"aliased": {
"(Intercept)": false,
"arg1": false
},
"sigma": 76935,
"df": [ 2, 9, 2 ],
"r.squared": 0.89973,
"adj.r.squared": 0.88859,
"fstatistic": {
"value": 80.757,
"numdf": 1,
"dendf": 9
},
"cov.unscaled": [ {
"(Intercept)": 0.50366,
"arg1": -2.7311e-05
},
{
"(Intercept)": -2.7311e-05,
"arg1": 1.8071e-09
} ]
}"
Usually we can add extra columns that are not in the REGR function to the analysis.
For example, similar to previous function, where I replaced Brand with Product and added the Department (from another dimension) to analysis.
REGR("Base Facts"."Revenue", ("Base Facts"."Discount Amount"), ("Products"."Product", "Products"."Product Type"), 'fitted', '')
The result is fine:
No comments:
Post a Comment