Multiple Linear Regression#
Multiple linear regression models the relationship between a continuous dependent variable (criterion) and multiple independent variables (predictors). It allows examining the influence of several predictors simultaneously and making predictions.
When to Use#
Use multiple regression when:
- You want to examine the influence of multiple predictors on a dependent variable
- The dependent variable is metric (continuous)
- You want to make predictions based on multiple independent variables
- You want to determine the relative contribution of individual predictors
Assumptions#
- Linearity: Linear relationship between predictors and criterion
- Normal distribution of residuals (Q-Q plot, Shapiro-Wilk test)
- Homoscedasticity: Constant variance of residuals (Breusch-Pagan test)
- No multicollinearity: Predictors are not too highly correlated (VIF < 10)
- Independence of residuals (Durbin-Watson test)
- No influential outliers (Cook's Distance)
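Several of these assumption checks can be computed by hand. The sketch below, on synthetic data (all variable names and values are hypothetical), computes the variance inflation factor (VIF) for a pair of predictors and the Durbin-Watson statistic for the residuals:

```python
import numpy as np

rng = np.random.default_rng(42)
n = 200

# Synthetic predictors (hypothetical data for illustration)
x1 = rng.normal(10, 3, n)            # e.g. work experience
x2 = 0.5 * x1 + rng.normal(0, 2, n)  # mildly correlated with x1
X = np.column_stack([np.ones(n), x1, x2])
y = 5 + 2 * x1 + 1 * x2 + rng.normal(0, 1, n)

# OLS fit via least squares
beta, *_ = np.linalg.lstsq(X, y, rcond=None)
residuals = y - X @ beta

# VIF: 1 / (1 - R^2) from regressing each predictor on the others;
# with only two predictors this reduces to 1 / (1 - r^2)
r = np.corrcoef(x1, x2)[0, 1]
vif = 1 / (1 - r**2)

# Durbin-Watson statistic: values near 2 suggest independent residuals
dw = np.sum(np.diff(residuals)**2) / np.sum(residuals**2)

print(f"VIF = {vif:.2f} (rule of thumb: < 10)")
print(f"Durbin-Watson = {dw:.2f} (close to 2 is good)")
```

In practice, dedicated statistics libraries provide these diagnostics, along with the Shapiro-Wilk, Breusch-Pagan, and Cook's Distance checks listed above.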
Formula#
The regression model is:

$$y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \dots + \beta_k x_k + \varepsilon$$

where:
- $y$ is the dependent variable
- $\beta_0$ is the intercept
- $\beta_1, \dots, \beta_k$ are the regression coefficients of the predictors
- $\varepsilon$ is the error term (normally distributed with mean 0)

The coefficients are estimated using Ordinary Least Squares (OLS):

$$\hat{\beta} = (X^\top X)^{-1} X^\top y$$
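As a sketch, the OLS estimate can be computed directly from the normal equations with NumPy. The data and coefficients here are synthetic and chosen for illustration; for real data, `np.linalg.lstsq` or a statistics library is numerically preferable to explicitly solving the normal equations:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 100

# Design matrix with an intercept column and two predictors
x1 = rng.normal(size=n)
x2 = rng.normal(size=n)
X = np.column_stack([np.ones(n), x1, x2])

# True coefficients (chosen for the example): intercept 1.0, slopes 2.0 and -0.5
y = X @ np.array([1.0, 2.0, -0.5])  # noiseless, so OLS recovers them exactly

# Normal equations: beta_hat solves (X'X) beta = X'y
beta_hat = np.linalg.solve(X.T @ X, X.T @ y)
print(beta_hat)  # ≈ [1.0, 2.0, -0.5]
```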
Example#
Practical Example: Salary Prediction
A personnel consultant wants to predict the salary of employees. The predictors used are:
- X₁: Work experience (in years)
- X₂: Education level (years of education)
- X₃: Weekly working hours
The model yields: Salary = 15,000 + 2,500 * Experience + 1,800 * Education + 300 * Hours
Interpretation: For each additional year of work experience, salary increases by an average of 2,500 EUR, holding all other variables constant (ceteris paribus).
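The fitted equation from the example can be applied directly to new cases. The employee values below are hypothetical:

```python
def predict_salary(experience_years: float, education_years: float,
                   weekly_hours: float) -> float:
    """Predicted salary (EUR) from the fitted model in the example:
    Salary = 15,000 + 2,500 * Experience + 1,800 * Education + 300 * Hours"""
    return 15_000 + 2_500 * experience_years + 1_800 * education_years + 300 * weekly_hours

# Hypothetical employee: 5 years experience, 16 years education, 40-hour week
salary = predict_salary(5, 16, 40)
print(salary)  # 68300
```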
Effect Size#
The coefficient of determination R² and the adjusted R² serve as measures of effect size:

| Effect Size | R² (Cohen) |
|---|---|
| Small | 0.02 |
| Medium | 0.13 |
| Large | 0.26 |

Additionally, Cohen's f² provides the effect size:

$$f^2 = \frac{R^2}{1 - R^2}$$
Important: A high RΒ² does not automatically imply a causal model. Standardized coefficients (beta weights) allow comparison of the relative importance of predictors.
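The effect-size measures above can be computed from the residuals of a fitted model. A minimal sketch on synthetic data (sample size, number of predictors, and coefficients are all chosen for illustration):

```python
import numpy as np

rng = np.random.default_rng(1)
n, k = 150, 3  # n observations, k predictors

X = np.column_stack([np.ones(n), rng.normal(size=(n, k))])
y = X @ np.array([1.0, 0.8, -0.3, 0.5]) + rng.normal(0, 1, n)

beta, *_ = np.linalg.lstsq(X, y, rcond=None)
residuals = y - X @ beta

ss_res = np.sum(residuals**2)          # residual sum of squares
ss_tot = np.sum((y - y.mean())**2)     # total sum of squares

r2 = 1 - ss_res / ss_tot
adj_r2 = 1 - (1 - r2) * (n - 1) / (n - k - 1)  # penalizes extra predictors
f2 = r2 / (1 - r2)                              # Cohen's f²

print(f"R² = {r2:.3f}, adjusted R² = {adj_r2:.3f}, f² = {f2:.3f}")
```

Note that the adjusted R² is always at most R², since it penalizes the number of predictors.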
Further Reading#
- Cohen, J., Cohen, P., West, S. G. & Aiken, L. S. (2003). Applied Multiple Regression/Correlation Analysis for the Behavioral Sciences (3rd ed.). Lawrence Erlbaum Associates.
- Field, A. (2018). Discovering Statistics Using IBM SPSS Statistics (5th ed.). SAGE.