Glossary of Statistical Terms: Definition Regression

Regression describes a relationship between two or more variables. A regression analysis assumes a directed linear dependence, i.e. there is one dependent variable and at least one independent variable. It must be possible to decide on logical grounds which variable is dependent and which is independent.

With the help of regression analysis we can calculate a regression function. In the simplest case, with one independent variable, this function describes the dependence between the two variables as a straight line. The calculated regression line allows us to make predictions for the dependent variable by inserting a value for the independent variable. The reverse inference is not allowed, though: the same line should not be used to predict the independent variable from the dependent one.
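As a minimal sketch of how such a regression line can be computed, the following Python snippet fits a straight line by ordinary least squares and uses it for a prediction. The data values and the function names fit_line and predict are made up for illustration and are not taken from this entry. The snippet also fits the line in the opposite direction, to illustrate why reverse inference is problematic: regressing x on y does not, in general, yield the algebraic inverse of the original line.

```python
def fit_line(xs, ys):
    """Least-squares line ys ~ intercept + slope * xs (one independent variable)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of cross-deviations and sum of squared deviations of x.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def predict(slope, intercept, x):
    """Insert a value of the independent variable into the regression function."""
    return intercept + slope * x


# Made-up illustrative data: x is the independent, y the dependent variable.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.0, 4.0, 5.0, 4.0, 5.0]

slope, intercept = fit_line(xs, ys)
y_hat = predict(slope, intercept, 6.0)

# Fitting in the opposite direction (x on y) gives a different line than
# algebraically inverting the first one.
reverse_slope, reverse_intercept = fit_line(ys, xs)
inverted_slope = 1.0 / slope

print(slope, intercept, y_hat)
print(reverse_slope, inverted_slope)
```

For this toy data the fitted slope is about 0.6, while the slope of the reverse fit is about 1.0 rather than 1/0.6. The two lines coincide only when all points lie exactly on a straight line.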

Regression analyses are often performed for variables that are correlated, i.e. for which a statistical dependence has already been established.
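The strength of such a linear correlation is commonly summarized by Pearson's correlation coefficient r. The sketch below computes it for made-up data; the function name pearson_r and the data values are illustrative assumptions, not part of this entry.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


# Made-up illustrative data.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.0, 4.0, 5.0, 4.0, 5.0]

r = pearson_r(xs, ys)
print(round(r, 3))
```

A value of r close to +1 or -1 indicates a strong linear relationship, and its sign matches the sign of the slope of the regression line.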

An example:

We have determined that there is a positive correlation between the two cardinal features age and wealth: when one variable rises, the other tends to rise as well. On a logical basis we can determine that 'age' is the independent variable, since people tend to accumulate more assets as they grow older. The conclusion that more wealth leads to higher age would be illogical.

Now, with the help of a regression analysis, we can determine the slope of the regression line. In this case: for each additional year of age, the assets of an American rise by $2,500 on average. This value is the regression coefficient.
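To make the meaning of the regression coefficient concrete, the following sketch plugs a few ages into a regression function with the slope of $2,500 per year from the example. The intercept is a placeholder assumption, since this entry does not state one; it cancels out when two predictions are compared. The name predicted_assets is likewise only illustrative.

```python
SLOPE_PER_YEAR = 2500.0  # regression coefficient from the example, in dollars per year
INTERCEPT = 0.0          # hypothetical placeholder; the example does not give an intercept


def predicted_assets(age):
    """Predicted average assets in dollars for a given age in years."""
    return INTERCEPT + SLOPE_PER_YEAR * age


# The difference between two predictions depends only on the slope.
one_year_diff = predicted_assets(41) - predicted_assets(40)
ten_year_diff = predicted_assets(50) - predicted_assets(40)
print(one_year_diff, ten_year_diff)
```

The predicted difference is $2,500 for one year and $25,000 for ten years, regardless of the intercept chosen.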

Important: This statement does not describe a causal relationship, i.e. it does not imply that higher age causes higher assets. It simply states that a linear dependence between the two variables can be observed.

Please note that the definitions in our statistics encyclopedia are simplified explanations of terms. Our goal is to make the definitions accessible for a broad audience; thus it is possible that some definitions do not adhere entirely to scientific standards.
