1. For both the heights and heights3 data sets:
(1) Find estimates for the mean and variance.
(2) Find 95% confidence intervals for the means.
(3) Find 99% confidence intervals for the variances.
(4) Test the hypothesis that each mean is equal to 170 cm.
(5) Test the hypothesis that each variance is equal to 10.
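Base R's t.test covers the mean, but there is no built-in one-sample routine for a variance, so a small helper is useful. The following is a minimal sketch, assuming the data are Normal so that (n-1)s^2/sigma^2 follows a chi-square distribution with n-1 degrees of freedom. The function name var.inference and the simulated vector x are hypothetical stand-ins, not the heights data.

```r
# Hypothetical helper: chi-square inference for the variance of a Normal sample.
# x is any numeric vector; sigma2.0 is the hypothesised variance.
var.inference <- function(x, sigma2.0 = 10, conf.level = 0.99) {
  n <- length(x)
  s2 <- var(x)                      # unbiased variance estimate
  alpha <- 1 - conf.level
  # Confidence interval: invert (n-1) s^2 / sigma^2 ~ chi-square(n-1)
  ci <- (n - 1) * s2 / qchisq(c(1 - alpha / 2, alpha / 2), df = n - 1)
  # Two-sided test of H0: sigma^2 = sigma2.0
  stat <- (n - 1) * s2 / sigma2.0
  p.lower <- pchisq(stat, df = n - 1)
  p.value <- 2 * min(p.lower, 1 - p.lower)
  list(estimate = s2, conf.int = ci, statistic = stat, p.value = p.value)
}

set.seed(1)
x <- rnorm(50, mean = 170, sd = sqrt(10))  # simulated stand-in data (assumption)
res <- var.inference(x)

mean(x)                        # estimate of the mean
res$estimate                   # estimate of the variance
t.test(x, mu = 170)$conf.int   # 95% confidence interval for the mean
t.test(x, mu = 170)$p.value    # test that the mean equals 170
res$conf.int                   # 99% confidence interval for the variance
res$p.value                    # test that the variance equals 10
```

Replacing x with the actual data vector should give the quantities asked for in parts (1) to (5).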
2. Test for the equality of the variances of the heights and heights3 data sets.
3. Construct a 95% confidence interval for the difference between the means of the
heights and heights3 data sets.
4. Write an R function that performs a two-sided t-test for the equality of two Normal
means. By simulating data sets and applying your t-test function, show the following:
(1) When the null hypothesis holds, the p-value behaves as a U(0,1) random variable.
(2) The test has more power to detect a difference when the means are further apart.
(3) The test has more power to detect a difference when the data sets are larger.
(4) Violations of the underlying assumptions affect the correct performance of the
test. For example, what happens if the variances are not equal? What happens
if the data are not Normal?
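One possible way to organise such a simulation is sketched below. It assumes a pooled-variance test and an arbitrary choice of N(170, 10) populations. The names my.ttest and power.at are hypothetical, and the sample sizes and numbers of replications are illustrative only.

```r
# Hypothetical name: pooled-variance two-sided t-test, returning the p-value.
my.ttest <- function(x, y) {
  n <- length(x)
  m <- length(y)
  sp2 <- ((n - 1) * var(x) + (m - 1) * var(y)) / (n + m - 2)
  tstat <- (mean(x) - mean(y)) / sqrt(sp2 * (1 / n + 1 / m))
  2 * pt(-abs(tstat), df = n + m - 2)
}

set.seed(42)

# (1) Under the null hypothesis the p-values should look like U(0,1) draws.
p.null <- replicate(5000, my.ttest(rnorm(20, 170, sqrt(10)),
                                   rnorm(20, 170, sqrt(10))))
hist(p.null, breaks = 20)    # should be roughly flat
ks.test(p.null, "punif")     # formal check against U(0,1)
mean(p.null < 0.05)          # rejection rate, should be near 0.05

# (2), (3) Estimated power at the 5% level for a mean shift delta and sample size n.
power.at <- function(delta, n, nsim = 2000) {
  mean(replicate(nsim, my.ttest(rnorm(n, 170, sqrt(10)),
                                rnorm(n, 170 + delta, sqrt(10))) < 0.05))
}
sapply(c(0, 1, 2, 4), power.at, n = 20)    # varying the difference in means
sapply(c(5, 20, 80), power.at, delta = 2)  # varying the sample size
```

For part (4), the same loop can be reused with the rnorm calls swapped for samples with unequal standard deviations or for a non-Normal generator such as rexp.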
5. The following R function performs a t-test that ignores the extra uncertainty from
estimating the variance: it refers the test statistic to a Normal distribution instead
of a t distribution. In days of yore, that is, in the days of books of statistical
tables, this was how a t-test was done for, say, n > 30. By simulating data sets of
various sizes, explore the performance of this old-fashioned test.
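ttest.old <- function(x, y) {
  n <- length(x)
  m <- length(y)
  # pooled estimate of the common variance
  sp <- ((n - 1) * var(x) + (m - 1) * var(y)) / (n + m - 2)
  # absolute value of the two-sample test statistic
  tstat <- abs(mean(x) - mean(y)) / sqrt(sp * (1/n + 1/m))
  # two-sided p-value from the Normal distribution rather than the t
  pnorm(-tstat) + pnorm(tstat, lower.tail = FALSE)
}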
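A starting point for the exploration might look like the sketch below. It estimates how often the old test rejects a true null hypothesis at the nominal 5% level, assuming equal sample sizes and standard Normal data. The helper size.old is a hypothetical name, and ttest.old is repeated only so that the block runs on its own.

```r
# The old-fashioned test from the exercise, repeated so this block is self-contained.
ttest.old <- function(x, y) {
  n <- length(x)
  m <- length(y)
  sp <- ((n - 1) * var(x) + (m - 1) * var(y)) / (n + m - 2)
  tstat <- abs(mean(x) - mean(y)) / sqrt(sp * (1 / n + 1 / m))
  pnorm(-tstat) + pnorm(tstat, lower.tail = FALSE)
}

# Hypothetical helper: simulated rejection rate under the null hypothesis
# with two samples of size n each.
size.old <- function(n, nsim = 5000, alpha = 0.05) {
  mean(replicate(nsim, ttest.old(rnorm(n), rnorm(n)) < alpha))
}

set.seed(7)
ns <- c(5, 10, 30, 100)
simulated <- sapply(ns, size.old)

# For Normal data with equal variances the statistic has a t distribution with
# 2n - 2 degrees of freedom, so the true rejection rate can be computed exactly.
theoretical <- 2 * pt(qnorm(0.025), df = 2 * ns - 2)

round(cbind(n = ns, simulated, theoretical), 3)
```

The gap between the rejection rate and 0.05 indicates how much the Normal approximation costs at each sample size.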