PickMyTest

Multiple Linear Regression

Models the relationship between a continuous dependent variable and multiple independent variables

Multiple linear regression models the relationship between a continuous dependent variable (criterion) and multiple independent variables (predictors). It allows examining the influence of several predictors simultaneously and making predictions.

When to Use

Use multiple regression when the dependent variable is metric (continuous) and you want to:

  • Examine the influence of multiple predictors on the dependent variable
  • Make predictions based on multiple independent variables
  • Determine the relative contribution of individual predictors

Assumptions

  • Linearity: Linear relationship between predictors and criterion
  • Normal distribution of residuals (Q-Q plot, Shapiro-Wilk test)
  • Homoscedasticity: Constant variance of residuals (Breusch-Pagan test)
  • No multicollinearity: Predictors are not too highly correlated (rule of thumb: VIF < 10)
  • Independence of residuals (Durbin-Watson test)
  • No influential outliers (Cook's Distance)
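
The multicollinearity check can be illustrated with a small sketch. For predictor j, the variance inflation factor is VIF_j = 1 / (1 - R_j²), where R_j² comes from regressing that predictor on all other predictors. With exactly two predictors, R_j² reduces to the squared Pearson correlation between them, which is what the sketch below uses. The data and function names are hypothetical and serve only as an illustration.

```python
import math

def pearson_r(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)

def vif_two_predictors(x1, x2):
    """VIF shared by both predictors in a two-predictor model: 1 / (1 - r^2)."""
    r = pearson_r(x1, x2)
    return 1.0 / (1.0 - r ** 2)

# Hypothetical data: years of work experience and years of education
experience = [1, 3, 5, 7, 9, 11]
education = [12, 16, 13, 18, 14, 17]

vif = vif_two_predictors(experience, education)
print(f"VIF = {vif:.2f}")  # compare against the rule-of-thumb threshold of 10
```

With more than two predictors, each R_j² would have to come from a separate auxiliary regression, which statistical packages typically compute for you.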

Formula

The regression model is:

$$Y = \beta_0 + \beta_1 X_1 + \beta_2 X_2 + \dots + \beta_k X_k + \varepsilon$$

where:

  • $Y$ is the dependent variable
  • $X_1, X_2, \dots, X_k$ are the independent variables (predictors)
  • $\beta_0$ is the intercept
  • $\beta_1, \beta_2, \dots, \beta_k$ are the regression coefficients of the predictors
  • $\varepsilon$ is the error term (normally distributed with mean 0)

The coefficients are estimated using Ordinary Least Squares (OLS):

$$\hat{\boldsymbol{\beta}} = (\mathbf{X}^T \mathbf{X})^{-1} \mathbf{X}^T \mathbf{Y}$$
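
The OLS estimator can be sketched in plain Python. This is a minimal illustration, not a production implementation: it builds the design matrix X with a leading column of ones for the intercept, forms XᵀX and XᵀY, and solves the resulting linear system by Gaussian elimination instead of computing the inverse explicitly. All function names and the data are hypothetical; the outcome values were generated without noise from Y = 2 + 3·X₁ − 1·X₂ so that the recovered coefficients are easy to check.

```python
def transpose(m):
    """Return the transpose of a matrix given as a list of rows."""
    return [list(row) for row in zip(*m)]

def matmul(a, b):
    """Matrix product of a (p x q) and b (q x r)."""
    b_t = transpose(b)
    return [[sum(x * y for x, y in zip(row, col)) for col in b_t] for row in a]

def solve(a, b):
    """Solve a @ x = b by Gaussian elimination with partial pivoting."""
    n = len(a)
    aug = [list(a[i]) + [b[i]] for i in range(n)]  # augmented matrix [a | b]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("X^T X is singular (perfect multicollinearity?)")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):  # back substitution
        s = sum(aug[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (aug[i][n] - s) / aug[i][i]
    return x

def ols(predictors, y):
    """Estimate [b0, b1, ..., bk] from rows of predictor values and outcomes y."""
    X = [[1.0] + list(row) for row in predictors]  # leading 1 for the intercept
    Xt = transpose(X)
    XtX = matmul(Xt, X)
    Xty = [sum(a * b for a, b in zip(col, y)) for col in Xt]
    return solve(XtX, Xty)

# Hypothetical data generated exactly from Y = 2 + 3*X1 - 1*X2 (no error term)
predictors = [(1, 2), (2, 1), (3, 4), (4, 3), (5, 6)]
y = [3, 7, 7, 11, 11]

coefficients = ols(predictors, y)
print([round(b, 6) for b in coefficients])  # approximately intercept 2, slopes 3 and -1
```

Solving the normal equations directly mirrors the formula, but it can be numerically fragile when predictors are nearly collinear. Statistical software usually relies on more stable decompositions for that reason.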

Example

Practical Example: Salary Prediction

A personnel consultant wants to predict the salary of employees. The predictors used are:

  • X₁: Work experience (in years)
  • X₂: Education level (years of education)
  • X₃: Weekly working hours

The model yields: Salary = 15,000 + 2,500 * Experience + 1,800 * Education + 300 * Hours

Interpretation: For each additional year of work experience, salary increases by an average of 2,500 EUR, holding all other variables constant (ceteris paribus).
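
The fitted equation and its ceteris paribus interpretation can be checked with a few lines of code. The coefficients are taken from the example model; the employee profile (5 years of experience, 16 years of education, 40 weekly hours) and the function name are hypothetical.

```python
def predict_salary(experience_years, education_years, weekly_hours):
    """Predicted salary in EUR using the coefficients of the example model."""
    return (15_000
            + 2_500 * experience_years
            + 1_800 * education_years
            + 300 * weekly_hours)

# Hypothetical employee: 5 years of experience, 16 years of education, 40 hours per week
base = predict_salary(5, 16, 40)
one_more_year = predict_salary(6, 16, 40)  # same profile, one extra year of experience

print(base)                  # 68300
print(one_more_year - base)  # 2500
```

The difference of 2,500 EUR between the two predictions is exactly the experience coefficient, because education and hours were held constant.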

Effect Size

The coefficient of determination R² and the adjusted R² serve as measures of effect size:

$$R^2 = 1 - \frac{SS_{\text{Residuals}}}{SS_{\text{Total}}}$$

$$R^2_{\text{adj}} = 1 - \frac{(1 - R^2)(n - 1)}{n - k - 1}$$

where n is the sample size and k is the number of predictors.

Effect Size    R² (Cohen)
Small          0.02
Medium         0.13
Large          0.26
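
Both measures can be computed directly from observed and predicted values, as the following sketch shows. The function names and the data (n = 6 observations, k = 2 predictors assumed) are hypothetical; the predicted values merely stand in for the output of a fitted model.

```python
def r_squared(observed, predicted):
    """R^2 = 1 - SS_residuals / SS_total."""
    mean_y = sum(observed) / len(observed)
    ss_res = sum((y - p) ** 2 for y, p in zip(observed, predicted))
    ss_tot = sum((y - mean_y) ** 2 for y in observed)
    return 1 - ss_res / ss_tot

def adjusted_r_squared(r2, n, k):
    """Adjusted R^2 for n observations and k predictors."""
    return 1 - (1 - r2) * (n - 1) / (n - k - 1)

# Hypothetical observed values and model predictions
observed = [10, 12, 15, 19, 22, 24]
predicted = [11, 12, 14, 19, 23, 23]

r2 = r_squared(observed, predicted)
r2_adj = adjusted_r_squared(r2, n=len(observed), k=2)

print(f"R^2 = {r2:.4f}, adjusted R^2 = {r2_adj:.4f}")
```

The adjusted value is lower than R² here because the correction penalizes each additional predictor relative to the sample size.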

Additionally, Cohen's f² expresses the effect size as the ratio of explained to unexplained variance:

$$f^2 = \frac{R^2}{1 - R^2}$$
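
As a quick check, the conversion can be applied to the R² benchmarks from the table above. The function name is hypothetical; the results agree with the commonly cited f² benchmarks of 0.02, 0.15, and 0.35 after rounding.

```python
def cohens_f2(r2):
    """Cohen's f^2 = R^2 / (1 - R^2); undefined for R^2 = 1."""
    if not 0 <= r2 < 1:
        raise ValueError("R^2 must lie in the interval [0, 1)")
    return r2 / (1 - r2)

for label, r2 in [("Small", 0.02), ("Medium", 0.13), ("Large", 0.26)]:
    print(f"{label}: R^2 = {r2:.2f} -> f^2 = {cohens_f2(r2):.2f}")
# Small: R^2 = 0.02 -> f^2 = 0.02
# Medium: R^2 = 0.13 -> f^2 = 0.15
# Large: R^2 = 0.26 -> f^2 = 0.35
```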

Important: A high R² does not automatically imply a causal model. Standardized coefficients (beta weights) allow comparison of the relative importance of predictors.

Further Reading

  • Cohen, J., Cohen, P., West, S. G. & Aiken, L. S. (2003). Applied Multiple Regression/Correlation Analysis for the Behavioral Sciences (3rd ed.). Lawrence Erlbaum Associates.
  • Field, A. (2018). Discovering Statistics Using IBM SPSS Statistics (5th ed.). SAGE.