Purpose: to work out Exercise 6.4 from Dobson's book.

"It is well known that the concentration of cholesterol in blood serum increases with age, but it is less clear whether cholesterol level is also associated with body weight. Table 6.17 shows for thirty women serum cholesterol (millimoles per liter), age (years) and body mass index (weight divided by height squared, where weight was measured in kilograms and height in meters). Use multiple regression to test whether serum cholesterol is associated with body mass index when age is already included in the model."

> folder <- "C:/Cauldron/garage/R/soulcraft/Volatility/Learn/Dobson-GLM/"
> file.input <- paste(folder, "chol.csv", sep = "")
> data <- read.csv(file.input, header = TRUE, stringsAsFactors = FALSE)
> summary(aov(data$chol ~ data$age + data$bmi))
            Df  Sum Sq Mean Sq F value    Pr(>F)
data$age     1 18.0666 18.0666 18.3585 0.0002076 ***
data$bmi     1  5.0655  5.0655  5.1474 0.0314875 *
Residuals   27 26.5706  0.9841
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
> summary(aov(data$chol ~ data$age))
            Df Sum Sq Mean Sq F value    Pr(>F)
data$age     1 18.067  18.067   15.99 0.0004216 ***
Residuals   28 31.636   1.130
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
> pf((31.636 - 26.5706)/(26.5706/27), 1, 27, lower.tail = FALSE)
[1] 0.03148935
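
The `pf()` call above is the usual partial F test for two nested models, written out here as a check on the arithmetic. The symbols RSS_0 and RSS_1 (residual sums of squares of the age-only model and the age + bmi model) and df_0, df_1 (their residual degrees of freedom) are notation introduced for this sketch, not taken from the book.

```latex
F = \frac{(\mathrm{RSS}_0 - \mathrm{RSS}_1)/(df_0 - df_1)}{\mathrm{RSS}_1/df_1}
  = \frac{(31.636 - 26.5706)/(28 - 27)}{26.5706/27}
  \approx 5.147,
\qquad F \sim F_{1,\,27} \ \text{under}\ H_0:\ \beta_{\mathrm{bmi}} = 0
```

This agrees, up to rounding, with the F value of 5.1474 reported for `data$bmi` in the first `aov` table.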

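So BMI is needed: the F test for adding bmi to a model that already contains age gives p = 0.031, which is significant at the 5% level.
OK, let me check with plain simple regression.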

> summary(lm(chol ~ age, data))
Call:
lm(formula = chol ~ age, data = data)
Residuals:
     Min       1Q   Median       3Q      Max
-2.29944 -0.67361  0.02992  0.40873  2.39393

Coefficients:
            Estimate Std. Error t value Pr(>|t|)
(Intercept)  3.29561    0.70480   4.676 6.72e-05 ***
age          0.05344    0.01336   3.999 0.000422 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 1.063 on 28 degrees of freedom
Multiple R-squared: 0.3635, Adjusted R-squared: 0.3408
F-statistic: 15.99 on 1 and 28 DF, p-value: 0.0004216

> summary(lm(chol ~ age + bmi, data))
Call:
lm(formula = chol ~ age + bmi, data = data)
Residuals:
     Min       1Q   Median       3Q      Max
-1.76187 -0.73530 -0.02050  0.37716  2.37169

Coefficients:
            Estimate Std. Error t value Pr(>|t|)
(Intercept) -0.73983    1.89641  -0.390  0.69951
age          0.04097    0.01363   3.006  0.00567 **
bmi          0.20137    0.08876   2.269  0.03149 *
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 0.992 on 27 degrees of freedom
Multiple R-squared: 0.4654, Adjusted R-squared: 0.4258
F-statistic: 11.75 on 2 and 27 DF, p-value: 0.0002130

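If I run the multiple regression, it also shows that bmi is significant: the t test for the bmi coefficient gives p = 0.03149, the same p-value as the F test for adding bmi to the age-only model!!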
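
The agreement is expected: when a single term is added to a linear model, the partial F statistic is the square of that term's t statistic. A minimal sketch of the check, using only the numbers printed in the output above (the variable names are made up for this illustration, and the inputs are rounded, so the match is only approximate):

```r
# Numbers copied from the printed output above (rounded as shown there)
rss_age  <- 31.636    # residual sum of squares, chol ~ age (28 df)
rss_full <- 26.5706   # residual sum of squares, chol ~ age + bmi (27 df)
df_full  <- 27        # residual degrees of freedom of the larger model

# Partial F statistic for adding bmi to a model that already contains age
f_bmi <- (rss_age - rss_full) / (rss_full / df_full)

# t statistic for bmi, recomputed from its estimate and standard error
t_bmi <- 0.20137 / 0.08876

# p-values from the F distribution and from the two-sided t test
p_from_f <- pf(f_bmi, 1, df_full, lower.tail = FALSE)
p_from_t <- 2 * pt(abs(t_bmi), df_full, lower.tail = FALSE)

print(c(F = f_bmi, t_squared = t_bmi^2))
print(c(p_from_F = p_from_f, p_from_t = p_from_t))
```

Both pairs should agree to about three significant figures. With the data frame loaded, `anova()` on the two `lm` fits should give the same comparison directly.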