Glossary of Statistical Terms: Definition Regression analysis

Regression analysis is an analytical method for estimating a regression function, in the simplest case a straight line. The regression describes the linear directed dependence of a dependent variable on one or more independent variables. The coefficient of determination (R²) expresses how well the regression function represents the relationship between the independent and dependent variables. R² takes values between 0 and 1; R² = 1 means that all observed data points lie exactly on the regression function.
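As a rough illustration of how R² behaves, the following minimal sketch fits a straight line by ordinary least squares and computes R² as one minus the ratio of the residual to the total sum of squares. The function names and the data values are hypothetical and chosen only for demonstration.

```python
def fit_line(x, y):
    """Least-squares intercept and slope for a single independent variable."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    s_xx = sum((xi - mean_x) ** 2 for xi in x)
    s_xy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = s_xy / s_xx
    intercept = mean_y - slope * mean_x
    return intercept, slope


def r_squared(x, y):
    """Coefficient of determination: 1 - SS_residual / SS_total."""
    intercept, slope = fit_line(x, y)
    mean_y = sum(y) / len(y)
    ss_res = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    return 1 - ss_res / ss_tot


# Hypothetical data: one series lies exactly on a line, the other scatters around it.
x = [1, 2, 3, 4, 5]
y_exact = [3, 5, 7, 9, 11]            # y = 1 + 2x, so every point is on the line
y_noisy = [2.9, 5.4, 6.8, 9.5, 10.6]  # same trend with small deviations

r2_exact = r_squared(x, y_exact)
r2_noisy = r_squared(x, y_noisy)
print(r2_exact, r2_noisy)
```

With the exact series R² equals 1 up to rounding, while the scattered series gives a value slightly below 1.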

Calculating a regression function does not by itself mean that the estimated dependence is significant. Significance means that the dependence calculated for the sample can be transferred to the population. Whether this transfer is admissible, in other words whether the regression as a whole is significant, is determined with an F-test. If the model contains several independent variables, the significance of each individual independent variable is determined with a t-test.
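The two test statistics can be sketched for the simplest case of one independent variable, where the F-test of the whole regression and the t-test of the single slope coincide (F = t²). The data are hypothetical, and the critical value 5.32 is the 5% value of the F distribution with 1 and 8 degrees of freedom as found in standard tables; with several independent variables the calculation needs matrix algebra and is not shown here.

```python
import math

# Hypothetical sample of n = 10 observations.
x = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
y = [2.1, 3.9, 6.2, 7.8, 10.1, 12.2, 13.8, 16.1, 18.0, 19.9]

n = len(x)
mean_x = sum(x) / n
mean_y = sum(y) / n
s_xx = sum((xi - mean_x) ** 2 for xi in x)
s_xy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))

slope = s_xy / s_xx
intercept = mean_y - slope * mean_x
fitted = [intercept + slope * xi for xi in x]

ss_reg = sum((fi - mean_y) ** 2 for fi in fitted)       # explained variation
ss_res = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))  # unexplained variation
mse = ss_res / (n - 2)                                   # residual variance estimate

f_stat = (ss_reg / 1) / mse              # F-test of the regression, df = (1, n - 2)
t_stat = slope / math.sqrt(mse / s_xx)   # t-test of the slope, df = n - 2

F_CRITICAL_5PCT = 5.32  # tabulated value for df = (1, 8); assumption: 5% level
regression_is_significant = f_stat > F_CRITICAL_5PCT
print(f_stat, t_stat, regression_is_significant)
```

If the F statistic exceeds the critical value, the regression is considered significant at the chosen level for this sample size.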

A meaningful regression also depends on the model being adequately specified. If a model leaves out one or more relevant independent variables, it may still produce a mathematically correct result, but the actual interdependencies will not be revealed.

An example:

'Annual mean ice cream sales' are estimated from the independent variable 'daily average temperature'. The calculated relationship, represented by the regression function, is significant, i.e. it holds for the population. In this case, ice cream sales increase by about 12% with every additional degree Fahrenheit.

However, only the average daily temperature was included in the calculation, not the price per scoop, so a crucial factor in ice cream sales is missing. In our example, the price per scoop also influences sales: it rises during the summer months and drops sharply when fall arrives. The previous regression is therefore distorted. If we included scoop prices in the model correctly, we would find that, at a constant ice cream price, each degree Fahrenheit leads to an increase in sales of 16% rather than 12%. The price increase during the summer months partly offsets this effect, which is why the model without prices shows the smaller figure.
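The distortion caused by a missing variable can be reproduced with a small sketch. It uses invented numbers in absolute units rather than the percentages of the example: sales are constructed to rise by exactly 4 units per degree and fall by 50 units per unit of price, and prices tend to be higher on warmer days. All values and names are hypothetical.

```python
def co_variation(a, b):
    """Sum of products of deviations from the means (centered cross product)."""
    mean_a = sum(a) / len(a)
    mean_b = sum(b) / len(b)
    return sum((ai - mean_a) * (bi - mean_b) for ai, bi in zip(a, b))


# Hypothetical data: the price per scoop tends to rise with temperature.
temps = [50, 55, 60, 65, 70, 75, 80, 85]
prices = [1.0, 1.1, 1.0, 1.2, 1.3, 1.2, 1.4, 1.5]
# Constructed relationship: +4 per degree, -50 per unit of price, no noise.
sales = [100 + 4 * t - 50 * p for t, p in zip(temps, prices)]

s_tt = co_variation(temps, temps)
s_pp = co_variation(prices, prices)
s_tp = co_variation(temps, prices)
s_ty = co_variation(temps, sales)
s_py = co_variation(prices, sales)

# Model without price: the temperature slope absorbs part of the price effect.
slope_temp_only = s_ty / s_tt

# Model with both variables: closed-form least squares for two predictors.
det = s_tt * s_pp - s_tp ** 2
slope_temp_full = (s_pp * s_ty - s_tp * s_py) / det
slope_price_full = (s_tt * s_py - s_tp * s_ty) / det

print(slope_temp_only, slope_temp_full, slope_price_full)
```

The model that includes price recovers the constructed temperature effect of 4, while the model without price reports a smaller slope, because higher prices on warm days hold sales down.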

Please note that the definitions in our statistics encyclopedia are simplified explanations of terms. Our goal is to make the definitions accessible for a broad audience; thus it is possible that some definitions do not adhere entirely to scientific standards.
