===================================================
Introductory Biostatistics & Biostatistics using R
** BI217: Problem Set 1
===================================================

1. Given a vector x = c(1,7,2,4,5,3), print the following results:
   (1) rank(x); (2) order(x); (3) sort(x); (4) unique(x)

2. If x = c(-4.7, 3.5, 0.7, 3.3, -1.4), print the following results:
   (1) round(x); (2) ceiling(x); (3) floor(x); (4) quantile(x)

3. Explain what each of the following pieces of R code does.

   (1) normalization

       m <- 10; n <- 5
       x <- matrix(rnorm(m*n), m, n)
       colmeans <- apply(x, 2, mean)
       colsds <- apply(x, 2, sd)
       x <- sweep(x, 2, colmeans, "-")
       x <- sweep(x, 2, colsds, "/")

   (2) merging

       days <- c("Sun", "Mon", "Tue", "Wed", "Thu", "Fri", "Sat")
       days1 <- sample(days)
       days2 <- sample(days)
       x1 <- rnorm(7)
       names(x1) <- days1
       x2 <- runif(7)
       names(x2) <- days2
       m1 <- match(days, days1)
       m2 <- match(days, days2)
       x <- cbind(x1[m1], x2[m2])
       rownames(x) <- days
       x

   (3)

       x <- rnorm(1000)
       y <- rnorm(1000)
       cols <- ifelse(y < -x^2 | y > x^2, "lightblue", "darkgray")
       pchs <- ifelse(y < -x^2 | y > x^2, "+", "x")
       plot(x, y, type="n")
       points(x, y, col=cols, pch=pchs)
       curve(-x^2, lty=2, add=TRUE, lwd=2)
       curve(x^2, lty=2, add=TRUE, lwd=2)

4. On the standard normal curve, label the 95% CI of the mean:
   (1) two-tail;   Hint: qnorm(0.025); qnorm(0.975)
   (2) lower-tail; Hint: qnorm(0.95)
   (3) upper-tail; Hint: qnorm(0.05)

5. There are some values you should remember:
   (1) qnorm(0.975), qnorm(0.025)
   (2) qt(0.95, df) for the degrees of freedom df in use (qt() requires a df argument)

6. The file "STAT3.bed" contains coordinates for ChIP-seq data mapped to the
   mouse genome, which correspond to possible targets of the transcription
   factor STAT3.
   (1) Import the data into R using the command read.table();
   (2) Count the number of occurrences for each chromosome with the table() command;
   (3) Which chromosome(s) have the maximum number of occurrences?
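
Hint for Problem 6. The read.table() / table() workflow can be tried out on a small made-up example before touching the real file. The sketch below is only an illustration: the five intervals, the three-column layout, and the column names chrom/start/end are assumptions, not the contents of "STAT3.bed". It counts the rows per value of the first column (the chromosome in BED format) and reports which value occurs most often.

```r
# Hypothetical BED-like content standing in for "STAT3.bed";
# the real file's columns and values are not shown in this problem set.
bed_text <- "chr1\t100\t200
chr2\t150\t250
chr1\t300\t400
chrX\t500\t600
chr1\t700\t800"

# With the real file this would be something like: bed <- read.table("STAT3.bed")
bed <- read.table(text = bed_text, header = FALSE, sep = "\t",
                  stringsAsFactors = FALSE)
colnames(bed) <- c("chrom", "start", "end")   # assumed column names

# Number of intervals per value of the first column
counts <- table(bed$chrom)

# Name(s) with the largest count; comparing against max() keeps ties,
# whereas which.max() would return only the first one
top <- names(counts)[counts == max(counts)]

print(counts)
print(top)
```

On the real data, only the read.table() call should need to change; check the file for a header line or extra columns first.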