Hobson GLM - Chapter 7-2
.Purpose To work out the example in Chap 7 section 7.8
Senility and WAIS
> folder <- "C:/Cauldron/garage/R/soulcraft/Volatility/Learn/Dobson-GLM/" > file.input <- paste(folder, "Table 7.8 Senility and WAIS.csv", + sep = "") > data <- read.csv(file.input, header = T, stringsAsFactors = F) |
> total <- as.data.frame(tapply(data$s, data$x, length))
> present <- as.data.frame(tapply(data$s, data$x, sum))
> temp <- cbind(as.numeric(rownames(total)), present[, 1], total[,
+ 1])
> temp <- data.frame(temp)
> colnames(temp) <- c("s", "y", "total")
> temp$ny <- temp$total - temp$y |
Using R
> fit <- glm(formula = cbind(temp$y, temp$ny) ~ temp$s, family = binomial(link = "logit"))
> summary(fit)
Call:
glm(formula = cbind(temp$y, temp$ny) ~ temp$s, family = binomial(link = "logit"))
Deviance Residuals:
Min 1Q Median 3Q Max
-0.9064 -0.6965 -0.2538 0.1719 1.7771
Coefficients:
Estimate Std. Error z value Pr(>|z|)
(Intercept) 2.4040 1.1918 2.017 0.04369 *
temp$s -0.3235 0.1140 -2.838 0.00453 **
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
(Dispersion parameter for binomial family taken to be 1)
Null deviance: 20.208 on 16 degrees of freedom
Residual deviance: 9.419 on 15 degrees of freedom
AIC: 27.792
Number of Fisher Scoring iterations: 5 |
> betas <- coef(fit)
> t1 <- exp(betas[1] + betas[2] * temp$s)
> t2 <- 1 + exp(betas[1] + betas[2] * temp$s)
> probs <- t1/t2
> temp$probs <- probs
> temp
s y total ny probs
4 4 1 2 1 0.75211453
5 5 1 1 0 0.68705597
6 6 1 2 1 0.61369267
7 7 2 3 1 0.53477641
8 8 2 2 0 0.45407982
9 9 2 6 4 0.37572578
10 10 1 6 5 0.30337860
11 11 1 6 5 0.23961509
12 12 0 2 2 0.18568111
13 13 1 6 5 0.14162581
14 14 2 7 5 0.10665418
15 15 0 3 3 0.07951811
16 16 0 4 4 0.05883161
17 17 0 1 1 0.04327366
18 18 0 1 1 0.03169146
19 19 0 1 1 0.02313427
20 20 0 1 1 0.01684746 |
Pearson Chi Square Statistic
> t1 <- temp$y - temp$total * temp$probs > t2 <- sqrt(temp$total * temp$probs * (1 - temp$probs)) > chisq.stat <- sum((t1/t2)^2) > print(chisq.stat) [1] 8.083029 |
Goodness of fit
> qchisq(0.95, 15) [1] 24.99579 |
Obviously model fits well
> loglikf <- function(idata, probs) {
+ t1 <- sum(idata$y * log(probs))
+ t2 <- sum(idata$ny * log(1 - probs))
+ t3 <- sum(log(choose(idata$total, idata$y)))
+ loglik <- t1 + t2 + t3
+ return(loglik)
+ }
> loglikf(temp, probs)
[1] -11.89593
> loglikf(temp, probs = rep(sum(temp$y)/sum(temp$total), 17))
[1] -17.2904 |
I am not getting the same values for loglike for null model and alternate model.