T-Chisquare-F
Purpose
I want to tell the story of three friends: the t distribution, the normal distribution, and the chi-square distribution.
The normal is easy to visualize:
> set.seed(1977)
> n <- 10000
> x <- rnorm(n)
> par(mfrow = c(1, 1))
> hist(x, col = "grey", prob = T, ylim = c(0, 0.5), xlim = c(-3,
+ 3), main = "Normal Distribution", breaks = seq(-9, 9, 0.05),
+ xlab = "")
> par(new = T)
> plot(density(x), ylim = c(0, 0.5), col = "red", lty = 1, lwd = 3,
+ xlim = c(-3, 3), main = "")

How does the t density look for varying degrees of freedom (dof)?
> par(new = F)
> cols <- rainbow(90)
> for (k in 2:90) {
+ K <- k
+ x <- rt(n, K)
+ plot(density(x), ylim = c(0, 1), col = cols[k], lty = 1,
+ lwd = 3, xlim = c(-3, 3), xlab = "", main = "")
+ par(new = T)
+ }
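The simulated curves can be cross-checked against the exact densities. Here is a minimal sketch, assuming base R's `dt` and `dnorm` evaluated on a grid. The grid and the particular degrees of freedom are illustrative choices of mine, not from the original run. It shows the largest gap between the t density and the standard normal density shrinking as the degrees of freedom grow.

```r
# Exact densities on a grid; no simulation needed.
grid <- seq(-3, 3, by = 0.01)
dofs <- c(2, 5, 10, 30, 90)   # illustrative degrees of freedom

# Largest absolute gap between the t(df) density and the N(0,1) density.
max_gap <- sapply(dofs, function(df) max(abs(dt(grid, df) - dnorm(grid))))
names(max_gap) <- dofs
print(round(max_gap, 4))

# Overlay the exact curves: small df gives a lower peak and heavier tails.
cols <- rainbow(length(dofs))
plot(grid, dnorm(grid), type = "l", lwd = 3, xlab = "", ylab = "density")
for (i in seq_along(dofs)) lines(grid, dt(grid, dofs[i]), col = cols[i])
legend("topright", legend = dofs, fill = cols)
```

Because the curves are drawn with `lines` on one set of axes, they share a common y scale and can be compared directly.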

Chi-square visualization
> set.seed(1977)
> par(new = F)
> n <- 10000
> xrange <- c(0, 20)
> k <- 1:5
> cols <- rainbow(5)
> par(mfrow = c(1, 1))
> for (k in 1:5) {
+ x <- rchisq(n, k)
+ hist(x, prob = T, ylim = c(0, 0.5), xlim = xrange, main = "Chi Square Distribution",
+ xlab = "", col = cols[k])
+ par(new = T)
+ }
> legend("topright", legend = 1:5, fill = cols)

How does the chi-square density look for varying dof?
> plot.new()
> par(new = F)
> k <- 1:8
> cols <- rainbow(8)
> for (k in 1:8) {
+ x <- rchisq(n, k)
+ plot(density(x), ylim = c(0, 1), col = cols[k], lty = 1,
+ lwd = 3, xlim = c(0, 30), xlab = "", main = "")
+ par(new = T)
+ }
> legend("topright", legend = 1:8, fill = cols)
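It helps to recall where the chi-square comes from: a chi-square variable with k degrees of freedom is the sum of k squared independent standard normals. That is a standard fact rather than something shown above, so here is a rough simulation sketch, with `k = 4` and the sample size as illustrative choices, comparing such sums against `rchisq` draws.

```r
set.seed(1977)
n <- 1e5
k <- 4   # illustrative choice of degrees of freedom

# Each row holds k independent N(0,1) draws; sum their squares row by row.
sum_sq <- rowSums(matrix(rnorm(n * k), nrow = n, ncol = k)^2)
direct <- rchisq(n, k)

# Both should have mean close to k and variance close to 2k.
moments <- rbind(sum_of_squares = c(mean = mean(sum_sq), var = var(sum_sq)),
                 rchisq         = c(mean = mean(direct), var = var(direct)))
print(round(moments, 2))
```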

How does the standardized chi-square density look for varying dof?
> par(new = F)
> k <- 1:90
> cols <- rainbow(90)
> for (k in 1:90) {
+ K <- k + 10
+ x <- rchisq(n, K)
+ x.t <- (x - K)/sqrt(2 * K)
+ plot(density(x.t), ylim = c(0, 1), col = cols[k], lty = 1,
+ lwd = 3, xlim = c(-3, 3), xlab = "", main = "")
+ par(new = T)
+ }
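The standardization above subtracts the mean K and divides by the standard deviation sqrt(2K) of a chi-square with K degrees of freedom. One way to quantify how quickly the standardized curves become symmetric is the skewness, which for a chi-square is sqrt(8/K). The sketch below compares sample skewness with that value; the chosen K values and sample size are my own assumptions.

```r
set.seed(1977)
n <- 1e5
dofs <- c(11, 25, 50, 100)   # illustrative values of K

# Sample skewness of the standardized draws (x - K) / sqrt(2K).
sample_skew <- sapply(dofs, function(K) {
  x.t <- (rchisq(n, K) - K) / sqrt(2 * K)
  mean((x.t - mean(x.t))^3) / sd(x.t)^3
})

# Theoretical skewness of a chi-square with K degrees of freedom.
theory_skew <- sqrt(8 / dofs)
print(round(rbind(dofs, sample_skew, theory_skew), 3))
```

The skewness falls towards 0 as K grows, which is why the later curves in the plot look increasingly like a standard normal.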

Now for the relation between the three distributions
If x1 is N(0,1), x2 is Chisq(k), and the two are independent, then x1 / sqrt(x2/k) follows a t distribution with k degrees of freedom
> par(new = F)
> n <- 1e+05
> k <- 3
> x <- rnorm(n)
> y <- rchisq(n, k)
> y.hat <- y/k
> z <- x/sqrt(y.hat)
> plot(density(z, from = -4, to = 4), xlim = c(-4, 4), ylim = c(0,
+ 0.45), col = "blue", lwd = 2, main = "", xlab = "")
> lines(density(x), col = "red", lwd = 2)
> legend("topleft", legend = c("z", "normal"), fill = c("blue",
+ "red"))

Clearly z is not normal: its peak is lower and its tails are heavier. Compare it with a t distribution instead
> par(new = F)
> z1 <- rt(n, k)
> plot(density(z, from = -4, to = 4), xlim = c(-4, 4), col = "blue",
+ lwd = 2, main = "", xlab = "")
> lines(density(z1, from = -4, to = 4), col = "red", lwd = 2)
> legend("topleft", legend = c("z", "t"), fill = c("blue", "red"))

- Thus the random variable x1 / sqrt(x2/k) follows a t distribution with k degrees of freedom (exactly, not just in the limit)
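Overlaid density plots are suggestive, but a numerical comparison is harder to fool. Below is a small sketch, assuming the same construction of z with k = 3, that compares empirical quantiles of z with exact `qt` and `qnorm` quantiles and reports the Kolmogorov-Smirnov distance to each candidate. The probability grid is an illustrative choice.

```r
set.seed(1977)
n <- 1e5
k <- 3

# Build z as in the text: N(0,1) over sqrt(independent Chisq(k) / k).
z <- rnorm(n) / sqrt(rchisq(n, k) / k)

# Empirical quantiles of z next to exact t(k) and N(0,1) quantiles.
probs <- seq(0.1, 0.9, by = 0.1)
cmp <- cbind(z = quantile(z, probs), t = qt(probs, k), normal = qnorm(probs))
print(round(cmp, 3))

# Kolmogorov-Smirnov distance of the sample to each distribution.
ks_t <- unname(ks.test(z, "pt", df = k)$statistic)
ks_norm <- unname(ks.test(z, "pnorm")$statistic)
print(c(t = ks_t, normal = ks_norm))
```

The distance to t(k) should be on the order of sampling noise, while the distance to the normal stays clearly away from zero.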
Why … I know that the ratio of two independent chi-square variables, each divided by its degrees of freedom, is F, and in that sense the above t statistic is nothing but the square root of an F(1,k) statistic. But what the hell is the connection between z and t?
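The t and F link can be checked without any simulation: if T is t(k), then T squared is F(1, k), so the squared two-sided t critical value must equal the upper F(1, k) critical value. A short sketch using base R's `qt`, `qf`, `pt` and `pf`, with `alpha` and the test points `cc` as illustrative choices:

```r
k <- 3
alpha <- 0.05   # illustrative significance level

# Squared two-sided t critical value vs. upper F(1, k) critical value.
t_crit_sq <- qt(1 - alpha / 2, df = k)^2
f_crit <- qf(1 - alpha, df1 = 1, df2 = k)
print(c(t_crit_sq = t_crit_sq, f_crit = f_crit))

# The same identity for CDFs: P(T^2 <= c) = P(|T| <= sqrt(c)) = P(F <= c).
cc <- c(0.5, 1, 2, 5, 10)
from_t <- 2 * pt(sqrt(cc), df = k) - 1
from_F <- pf(cc, df1 = 1, df2 = k)
print(cbind(c = cc, from_t = from_t, from_F = from_F))
```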
I feel like a complete idiot, because I have never looked into the assumptions of the t-test. I am 33 years old, and in my whole life on this planet I have never once looked into this.
The assumptions underlying a t-test are as follows.
Most t-test statistics have the form T = Z/s, where Z and s are functions of the data.
- Z follows a standard normal distribution under the null hypothesis
- p * s^2 follows a chi-square distribution with p degrees of freedom under the null hypothesis, where p is a positive constant
- Z and s are independent.
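To make the three assumptions concrete, take the one-sample t-test on normal data. For a sample of size m from N(mu, sigma^2), Z = sqrt(m) * (xbar - mu) / sigma, s = S / sigma where S is the sample standard deviation, and p = m - 1. The sketch below checks each assumption roughly by simulation; the values of `m`, `mu`, `sigma` and `reps` are hypothetical, picked only for illustration.

```r
set.seed(1977)
m <- 5; mu <- 10; sigma <- 2   # hypothetical sample size and parameters
p <- m - 1                     # degrees of freedom
reps <- 20000

# One simulated sample per row.
samples <- matrix(rnorm(reps * m, mean = mu, sd = sigma), nrow = reps)
xbar <- rowMeans(samples)
S <- apply(samples, 1, sd)

Z <- sqrt(m) * (xbar - mu) / sigma   # standard normal under the null
s <- S / sigma                       # p * s^2 should be Chisq(p)
T.stat <- Z / s                      # equals sqrt(m) * (xbar - mu) / S

print(c(mean_Z = mean(Z), sd_Z = sd(Z)))                     # about 0 and 1
print(c(mean_ps2 = mean(p * s^2), var_ps2 = var(p * s^2)))   # about p and 2p
print(cor(Z^2, s^2))                 # near 0, consistent with independence
ks_T <- unname(ks.test(T.stat, "pt", df = p)$statistic)
print(ks_T)                          # small if T.stat is t(p)
```

A near-zero correlation does not prove independence; it is only a quick sanity check. For normal data the independence of the sample mean and sample variance is a known result.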
So the assumptions actually create the connection between the t statistic and the F statistic: Z squared is chi-square with 1 degree of freedom, so T squared is F(1, p)
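Written out step by step, the connection is a two-line derivation. It uses the notation T, Z, s and p from the list of assumptions, plus two standard facts: the square of a standard normal is chi-square with 1 degree of freedom, and F is the ratio of two independent chi-squares, each divided by its degrees of freedom.

```latex
\begin{align*}
T &= \frac{Z}{s}, \qquad Z \sim N(0,1), \quad p\,s^2 \sim \chi^2_p, \quad
     Z \text{ and } s \text{ independent}, \\
T^2 &= \frac{Z^2}{s^2} = \frac{Z^2 / 1}{(p\,s^2) / p}
     \sim \frac{\chi^2_1 / 1}{\chi^2_p / p} = F(1, p).
\end{align*}
```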