# Inference for Multiple Linear Regression

If you see an error in the article, please comment or drop me an email.

```
#Load the data
cognitive <- read.csv("http://bit.ly/dasi_cognitive")
```

Let us start with the full model, which includes all four explanatory variables:

```
#Fit the full model and show the summary
cog_full <- lm(kid_score ~ mom_hs + mom_iq + mom_work + mom_age, data = cognitive)
cog_full_summary <- summary(cog_full)
print(cog_full_summary)
```

```
##
## Call:
## lm(formula = kid_score ~ mom_hs + mom_iq + mom_work + mom_age,
##     data = cognitive)
##
## Residuals:
##     Min      1Q  Median      3Q     Max
## -54.045 -12.918   1.992  11.563  49.267
##
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)
## (Intercept) 19.59241    9.21906   2.125   0.0341 *
## mom_hsyes    5.09482    2.31450   2.201   0.0282 *
## mom_iq       0.56147    0.06064   9.259   <2e-16 ***
## mom_workyes  2.53718    2.35067   1.079   0.2810
## mom_age      0.21802    0.33074   0.659   0.5101
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 18.14 on 429 degrees of freedom
## Multiple R-squared:  0.2171, Adjusted R-squared:  0.2098
## F-statistic: 29.74 on 4 and 429 DF,  p-value: < 2.2e-16
```

## Hypothesis testing for models

Null hypothesis: `beta_1 = beta_2 = ... = beta_k = 0`

Alternative hypothesis: at least one `beta_i != 0`

**How to interpret the result of the F-Test**

If the p-value of the F-test is below the significance level (e.g. .05), the model as a whole is significant. However, this does not mean that the model fits the data well. It only means that at least one of the `beta`s is non-zero.

If the p-value exceeds the significance level, the data do not provide convincing evidence that the variables, taken together, yield a useful model. Certain individual variables, even among those included in the model, might still be good predictors of `y`.
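As a rough check, the p-value of the overall F-test can be recomputed from the F-statistic and its two degrees of freedom as printed in the summary output above. The variable names below are illustrative and the numbers are copied by hand from that output, so this is a sketch rather than a substitute for `summary()`:

```r
# Values copied from the last line of the summary output above
f_stat <- 29.74   # F-statistic
df_num <- 4       # numerator DF: number of predictors k
df_den <- 429     # denominator DF: n - k - 1

# Upper-tail probability of the F distribution
p_value <- pf(f_stat, df_num, df_den, lower.tail = FALSE)

# Compare against the chosen significance level
alpha <- 0.05
reject_null <- p_value < alpha

print(p_value)
print(reject_null)
```

With a fitted model at hand, the same three numbers are stored in `summary(cog_full)$fstatistic`, so they do not have to be typed in manually.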

## Hypothesis testing for slopes

Null hypothesis: `beta_1 = 0`, when all other variables are included in the model

Alternative hypothesis: `beta_1 != 0`, when all other variables are included in the model
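To make the slope test concrete, the t value and p-value for `mom_hs` can be recomputed from the estimate, the standard error, and the residual degrees of freedom shown in the summary output. This is a minimal sketch with hand-copied numbers and illustrative variable names. It should agree with the `t value` and `Pr(>|t|)` columns up to rounding:

```r
# Values copied from the mom_hsyes row of the coefficient table above
estimate  <- 5.09482
std_error <- 2.31450
df_resid  <- 429      # residual degrees of freedom, n - k - 1

# Test statistic under the null hypothesis that the slope is 0
t_value <- (estimate - 0) / std_error

# Two-sided p-value from the t distribution
p_value <- 2 * pt(abs(t_value), df_resid, lower.tail = FALSE)

print(round(t_value, 3))
print(round(p_value, 4))
```

The same calculation applies to any other row of the coefficient table. Only `estimate` and `std_error` change, while `df_resid` stays the same for every slope in the model.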

## The difference in DFs between SLR and MLR

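Note that the degrees of freedom (DF) are calculated differently in multiple linear regression (MLR) than in simple linear regression (SLR). With `n` observations and `k` predictors:

MLR: `df = n - k - 1`

SLR: `df = n - 1 - 1 = n - 2`

SLR is simply the special case `k = 1`. In the full model above, `k = 4`, which is why the summary reports 429 degrees of freedom.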
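The two formulas can be checked against what `lm()` reports. The sketch below uses R's built-in `mtcars` data purely for illustration (it is not the data set used in this article), and the choice of predictors is arbitrary:

```r
# Illustration on R's built-in mtcars data (not the article's data set)
n <- nrow(mtcars)

# SLR: one predictor, so df = n - 1 - 1
fit_slr <- lm(mpg ~ wt, data = mtcars)

# MLR: k = 3 numeric predictors, so df = n - k - 1
fit_mlr <- lm(mpg ~ wt + hp + qsec, data = mtcars)
k <- 3

# df.residual() returns the residual degrees of freedom of a fitted model
print(df.residual(fit_slr) == n - 1 - 1)
print(df.residual(fit_mlr) == n - k - 1)
```

Applied to the full model above, rearranging `429 = n - 4 - 1` gives `n = 434` observations.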