Inclass exercises, Day VIII

Follow this link for day 8 materials and notes

Solutions for these exercises are available here

Random sampling

Draw some random samples from our trusty iris dataset using one of the following functions: the base R function sample(), the dplyr function sample_n(), or the dplyr function sample_frac().

Randomly sample 10 setosa petal lengths without replacement
Randomly sample 10 setosa petal lengths with replacement
Randomly sample 10 rows from the iris dataframe without replacement
Randomly sample 10 rows from the iris dataframe with replacement
Randomly sample 25% of rows from the iris dataframe without replacement

Permutation test

The following data are numbers of virions produced (burst sizes) for viruses that have been treated with a mutagen, and for control viruses. Samples between mutagen and control are fully independent. Use a permutation test, with the t statistic, to determine whether the mutagen has an effect on viral burst size.

Mutagen burst sizes: 15, 29, 58, 103, 1048
Control burst sizes: 7, 29, 254, 921, 5611

Bootstrap

Use the bootstrap approach to estimate the median and its associated 95% confidence interval for the difference in burst sizes between mutagen and control viruses. Again, remember that these groups are not paired.

Multiple testing

The built-in R dataset InsectSprays gives the counts of insects in agricultural experimental units treated with different insecticides. Perform all combinations of independent two-sample tests (use sample size per grouping to determine if you should use t-tests or Mann Whitney U tests) to ask which insecticides, if any, tend to lead to different numbers of insects. Use the Bonferroni correction.

Hint: to more easily find how many are significant, broom::tidy() and filter() are useful!