Hypothesis testing - Multiple Regression
Purpose
Simulate a multiple regression and test a hypothesis about its coefficients.
> set.seed(1977)
> n <- 15000
> beta.actual <- matrix(c(2, 3, 4, 5), ncol = 1)
> beta.sample <- cbind(rnorm(n, beta.actual[1]), rnorm(n, beta.actual[2]),
+ rnorm(n, beta.actual[3]), rnorm(n, beta.actual[4]))
> error <- rnorm(n)
> x <- cbind(rep(1, n), runif(n), runif(n), runif(n))
> y <- x[, 1] * beta.sample[, 1] + x[, 2] * beta.sample[, 2] +
+ x[, 3] * beta.sample[, 3] + x[, 4] * beta.sample[, 4] + error
> summary(lm(y ~ x + 0))
Call:
lm(formula = y ~ x + 0)

Residuals:
      Min        1Q    Median        3Q       Max
-6.877817 -1.153599 -0.001764  1.160492  6.736600

Coefficients:
   Estimate Std. Error t value Pr(>|t|)
x1  1.95455    0.04478   43.65   <2e-16 ***
x2  3.02972    0.04889   61.97   <2e-16 ***
x3  3.98549    0.04898   81.37   <2e-16 ***
x4  5.09754    0.04900  104.03   <2e-16 ***
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 1.737 on 14996 degrees of freedom
Multiple R-squared: 0.9578, Adjusted R-squared: 0.9578
F-statistic: 8.515e+04 on 4 and 14996 DF, p-value: < 2.2e-16
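Test the hypothesis b2 = 0. There are at least three ways to do it. The examples below use the savings data from the faraway package, where the hypothesis is that the coefficient of pop15 is 0.
The first way compares the restricted and unrestricted models: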
> library(faraway)
> g <- lm(sr ~ pop15 + pop75 + dpi + ddpi, savings)
> RSS0 <- sum(resid(g)^2)
> g.res <- lm(sr ~ pop75 + dpi + ddpi, savings)
> RSS1 <- sum(resid(g.res)^2)
> # one restriction (pop15 = 0); the unrestricted model has 45 residual df
> fstat <- ((RSS1 - RSS0)/1)/(RSS0/45)
> round(fstat, 3)
[1] 10.167
> round(qf(0.95, 1, 45), 2)
[1] 4.06
Since the F statistic exceeds the 95 percent critical value, reject the hypothesis that the coefficient of pop15 is 0.
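The same restricted versus unrestricted comparison can be applied to the simulated data from the top of these notes. The following is only a sketch: it repeats the simulation with the same seed, drops the second column of x to impose b2 = 0, and forms the F statistic with one restriction. The names full, restricted, q, df0 and crit are introduced here for illustration and are not part of the original code.

```r
# Re-create the simulated data exactly as at the top of the notes.
set.seed(1977)
n <- 15000
beta.actual <- matrix(c(2, 3, 4, 5), ncol = 1)
beta.sample <- cbind(rnorm(n, beta.actual[1]), rnorm(n, beta.actual[2]),
                     rnorm(n, beta.actual[3]), rnorm(n, beta.actual[4]))
error <- rnorm(n)
x <- cbind(rep(1, n), runif(n), runif(n), runif(n))
y <- x[, 1] * beta.sample[, 1] + x[, 2] * beta.sample[, 2] +
  x[, 3] * beta.sample[, 3] + x[, 4] * beta.sample[, 4] + error

# Unrestricted model: all four columns of x (column 1 is the intercept).
full <- lm(y ~ x + 0)
# Restricted model: impose b2 = 0 by leaving out column 2.
restricted <- lm(y ~ x[, -2] + 0)

RSS0 <- sum(resid(full)^2)        # unrestricted residual sum of squares
RSS1 <- sum(resid(restricted)^2)  # restricted residual sum of squares

q <- 1                            # number of restrictions being tested
df0 <- df.residual(full)          # n minus number of coefficients in full model

fstat <- ((RSS1 - RSS0) / q) / (RSS0 / df0)
crit <- qf(0.95, q, df0)          # 95 percent critical value of F(q, df0)

fstat > crit                      # TRUE means reject b2 = 0
```

With a single restriction the F statistic equals the square of the t value that summary() reports for x2, so it should be far above the critical value here.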
The second way is anova:
> g1 <- lm(sr ~ pop15 + pop75 + dpi + ddpi, savings)
> g2 <- lm(sr ~ pop75 + dpi + ddpi, savings)
> anova(g1, g2)
Analysis of Variance Table

Model 1: sr ~ pop15 + pop75 + dpi + ddpi
Model 2: sr ~ pop75 + dpi + ddpi
  Res.Df    RSS Df Sum of Sq      F   Pr(>F)
1     45 650.71
2     46 797.72 -1   -147.01 10.167 0.002603 **
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
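The null hypothesis is that the coefficient of pop15 is 0, which is clearly rejected by the F test in the anova table (p-value 0.002603).
The third way is the more involved procedure of computing the variance of Rb - r directly and building the F statistic from it. It produces the same F statistic as the restricted and unrestricted comparison.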
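The third procedure can be sketched as follows. For a hypothesis written as Rb = r with q rows, the statistic is (Rb - r)' [R V R']^(-1) (Rb - r) / q, where V is the estimated covariance matrix of b returned by vcov(). This is a minimal sketch and not a definitive implementation: the names R, r, q, d, wald.f and rss.f are introduced here for illustration, and the code uses the LifeCycleSavings data that ships with base R, on the assumption that it holds the same variables as the savings data from faraway.

```r
# Assumption: LifeCycleSavings (base R datasets) stands in for faraway's savings.
g <- lm(sr ~ pop15 + pop75 + dpi + ddpi, data = LifeCycleSavings)

b <- coef(g)   # unrestricted estimates: intercept, pop15, pop75, dpi, ddpi
V <- vcov(g)   # estimated covariance matrix of b

# One restriction: the coefficient of pop15 (second element of b) equals 0.
R <- matrix(c(0, 1, 0, 0, 0), nrow = 1)
r <- 0
q <- nrow(R)   # number of restrictions

# Wald-type F statistic built from the variance of Rb - r.
d <- R %*% b - r
wald.f <- as.numeric(t(d) %*% solve(R %*% V %*% t(R)) %*% d) / q
p.value <- pf(wald.f, q, df.residual(g), lower.tail = FALSE)

# Cross-check against the restricted/unrestricted residual sums of squares.
g.res <- update(g, . ~ . - pop15)
rss.f <- ((deviance(g.res) - deviance(g)) / q) / (deviance(g) / df.residual(g))

all.equal(wald.f, rss.f)   # the two routes should agree
```

Writing the hypothesis as a matrix R makes it easy to test several restrictions at once: add one row to R (and one entry to r) per restriction, and q updates automatically.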