For this assignment you will be recreating plots shown in these instructions, made from the sparrows dataset, which we saw during in-class exercises using ggplot2. The sparrows dataset is part of the ds4b.materials package, so by loading that library, you are loading that dataset.
Obtain the homework from your RStudio Cloud class project by running the following code in the R Console:
library(ds4b.materials) # Load the class library
launch_homework(4) # Launch Homework 4
Answer each question with appropriate code and comments in the given question’s R chunk. Chunks are named plot1 for plot 1, plot2 for plot 2, etc.
You must set an RMarkdown theme and code syntax highlighting scheme of your choosing in the YAML front matter. These links will help you:
Make sure your Rmd knits without errors before submitting. If it does not produce an HTML output, this means it does not knit. DO NOT SKIP THIS STEP! Ensuring code runs without errors is MORE IMPORTANT than writing code in the first place.
As always, you are encouraged to work together and use the class Slack to help each other out, but you must submit YOUR OWN CODE.
Explore the dataset interactively in the Console before you begin plotting, and even then, keep looking at it!! You cannot work with data that you aren’t looking at, and nobody ever expects you to. You might want to run View(sparrows) in the Console (not in the RMarkdown!!) to keep a full view of the dataset open at all times.
For each question, you should write code to recreate the plot. Save the plot to a variable and THEN reveal the plot by typing out the variable. The purpose of this instruction is to make you practice saving plots to variables! For example (plots are small, for demo only):
## YES!!
# Make and save plot
example_plot <- ggplot(iris) +
  aes(x = Sepal.Length) +
  geom_histogram()
# reveal plot
example_plot
## YES!!
# OR, assign *forwards* if you prefer that style:
# Make and save plot
ggplot(iris) +
  aes(x = Sepal.Length) +
  geom_histogram() -> example_plot
# reveal plot
example_plot
## NO!! THIS DOESN'T SAVE TO A VARIABLE!! NO!!
ggplot(iris) +
  aes(x = Sepal.Length) +
  geom_histogram()
Your code must be spaced out onto separate lines as we have learned in class. There will be deductions for plotting code that is all on one line. The purpose of this instruction is to force you to build this organizational habit as early as possible. For example…
## YES!!
ggplot(iris) +
  aes(x = Sepal.Length) +
  geom_histogram() -> example_plot
## NO!!
ggplot(iris) + aes(x = Sepal.Length) + geom_histogram() -> example_plot
When recreating the plots, you DO NOT NEED to specifically recreate:
size=2
but yours do not, that’s OK! No deductions! You can’t really eyeball this stuff.
When recreating the plots, you DO NEED to make sure you recreate:
theme_classic()
theme_minimal()
theme_bw()
theme_gray() (This is the default theme. If the plot uses this theme, you do NOT need to code it - it’s the default!)
theme_linedraw() or theme_light(). Those do NOT APPEAR HERE!
Ensure your code contains comments!! Specifically, any line of code that you do not understand immediately, like muscle memory, should have a comment. This helps you develop the skill of commenting and understand what your own code does.
Don’t make this harder than it has to be. Every type of plot you have to make on this homework was introduced in lecture and/or the ggplot2 interactive exercises. Unless the question explicitly teaches you new code, you have seen the code you need to use before.
You can place aes() wherever you want, as long as the plot works! You can include aesthetics on their own, within the ggplot() call, or within the relevant geom function. There is lots of flexibility in how you code aesthetic mappings, so use this opportunity to explore your coding style preference.
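As a minimal sketch of that flexibility, the three placements below should all draw the same histogram. It uses the built-in iris data from the earlier examples rather than sparrows, and the names plot_a, plot_b, and plot_c are only illustrative.

```r
library(ggplot2)

# Style 1: aesthetics added as their own component
plot_a <- ggplot(iris) +
  aes(x = Sepal.Length) +
  geom_histogram()

# Style 2: aesthetics inside the ggplot() call
plot_b <- ggplot(iris, aes(x = Sepal.Length)) +
  geom_histogram()

# Style 3: aesthetics inside the relevant geom
plot_c <- ggplot(iris) +
  geom_histogram(aes(x = Sepal.Length))
```

One practical difference: aesthetics given inside a geom apply to that layer only, while the other two styles apply to every layer in the plot.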
The setup R chunk in the RMarkdown template pre-specifies a default figure size of 6 inches wide and 4 inches tall. Don’t interfere with this setting!! The plots you need to recreate were also made to be 6 inches wide and 4 inches tall.
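For orientation only, a default figure size like this is commonly set with knitr chunk options. The sketch below is an assumption about what such a setup chunk might contain, not a copy of the actual template, so leave the template's own code as it is.

```r
# Assumed illustration of a setup chunk; the real template may differ
knitr::opts_chunk$set(
  fig.width = 6,  # default figure width, in inches
  fig.height = 4  # default figure height, in inches
)
```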
Hints:
- See how “Plot 1” is written twice? The first big/bold one is the question header for the homework, and the second one is part of the plot. All plots in this homework are structured like this.
- Indeed that is a trendline you see! For this trendline only, uniquely on the homework, make sure you match its specific color. You don’t have to match other colors exactly. (The color is "black".)
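As a hedged illustration of setting a trendline color, the sketch below adds one to a scatterplot of the built-in iris data. The choice of geom_smooth() with method = "lm" is an assumption made for the demo; use whichever trendline code we covered in class.

```r
library(ggplot2)

trend_plot <- ggplot(iris) +
  aes(x = Sepal.Length, y = Petal.Length) +
  geom_point() +
  # color is set outside aes() because it is a fixed value, not a mapping
  geom_smooth(method = "lm", color = "black")

# reveal plot
trend_plot
```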
Hint: You should use the argument bins with your geom to exactly match this plot. Count the number of bins and use that number!
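A small sketch of the bins argument, again on iris. The value 12 is arbitrary and chosen only for the demo; it is not the number of bins in the homework plot.

```r
library(ggplot2)

bins_plot <- ggplot(iris) +
  aes(x = Sepal.Length) +
  geom_histogram(bins = 12)  # ask for 12 bins instead of the default 30

# reveal plot
bins_plot
```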
Hints:
- Notice that both distributions are visible even though they are overlapping. Make sure yours also features transparency!
- This plot places the legend at the bottom of the plot. This is actually a modification to the plot’s theme. Because ggplot plots are just added components on top of one another, you will need to change the legend position AFTER you set its theme (think about why this MUST BE THE CASE, and experiment with it!). The legend can be re-positioned by adding on the following code: theme(legend.position = "bottom"). To learn more (but not to submit for this question!), see what happens if you use “top”, “left”, or “right” instead of “bottom”.
- Make sure your order of variables agrees with this plot!! You will need to use the function fct_relevel(), which was introduced in the ggplot2 exercises. Learn more by asking the introverse for help: get_help("fct_relevel")!!
- This plot features both a fill and color!
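The hints above can be sketched together on iris, under a few assumptions: density curves stand in for whatever geom the plot uses, theme_bw() is an arbitrary theme choice, and moving "virginica" to the front is just an example of reordering with fct_relevel() from the forcats package.

```r
library(ggplot2)
library(forcats)

# Reorder the factor levels so "virginica" comes first (illustrative choice)
iris_reordered <- iris
iris_reordered$Species <- fct_relevel(iris_reordered$Species, "virginica")

density_plot <- ggplot(iris_reordered) +
  aes(x = Sepal.Length, fill = Species, color = Species) +
  geom_density(alpha = 0.5) +         # alpha below 1 keeps overlapping shapes visible
  theme_bw() +                        # set the complete theme first...
  theme(legend.position = "bottom")   # ...then adjust it, so the change is not overwritten

# reveal plot
density_plot
```

Swapping the last two lines is a quick way to see why the order matters: a complete theme added last resets the legend position.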
Hints:
- See the black point with small lines in each distribution? This represents the mean and standard error (SE) of each distribution. In this case, the standard errors are very small, so the lines are a bit tricky to see, but they are there! These points are conveniently, amazingly, and automatically added with the plot component stat_summary(). This is a special ggplot2 function which can easily add a summary statistic onto a plot. No arguments are necessary because by default, it plots mean and SE.
- It will reveal a warning message: “No summary function supplied, defaulting to mean_se(),” which means you used it CORRECTLY!!!!!
- You will see the black point is larger than the other points. You should also make sure your plot has these relative sizes, but the exact numbers can differ from mine.
- You will want to use the argument width with your geom. Your value does not need to exactly match that used to make the plot, but it should definitely differ from the default.
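Here is a rough sketch of these hints on iris. It assumes the underlying points come from geom_jitter(), and the width and size values are arbitrary demo numbers, so pick your own.

```r
library(ggplot2)

summary_plot <- ggplot(iris) +
  aes(x = Species, y = Sepal.Length) +
  geom_jitter(width = 0.2) +  # horizontal spread, narrower than the default
  stat_summary(size = 1)      # no summary function given, so it uses mean_se()

# reveal plot (this prints the "No summary function supplied" message)
summary_plot
```

The size given to stat_summary() enlarges the summary point relative to the jittered points.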
Hint: These points have both a color and fill!
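One way to give points both a color and a fill is to use one of the point shapes numbered 21 through 25, which have a separate outline and interior. The sketch below assumes shape 21 and arbitrary demo colors on iris.

```r
library(ggplot2)

point_plot <- ggplot(iris) +
  aes(x = Sepal.Length, y = Petal.Length) +
  # shape 21 is a circle whose outline uses color and whose interior uses fill
  geom_point(shape = 21, color = "black", fill = "orange")

# reveal plot
point_plot
```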
Hint: Carefully consider the aesthetic mappings (aes()), and you will figure this out. It’s all about the mappings!! Think… What is on the X? What is on the Y? Is there a color or fill (hint: there’s a fill!)? Is that color/fill “just a fill” or mapped to a variable (hint: it’s mapped!)? Amazingly, this information alone is enough to get you almost all the way to the finish line, other than theme and labels. The grammar of graphics is amazing!!
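To illustrate the difference between a mapped fill and “just a fill”, here is a sketch on iris. The boxplot geom and the variables are assumptions chosen for the demo, not the answer to this question.

```r
library(ggplot2)

# Fill mapped to a variable: it goes inside aes() and produces a legend
mapped_plot <- ggplot(iris) +
  aes(x = Species, y = Sepal.Length, fill = Species) +
  geom_boxplot()

# "Just a fill": a fixed value outside aes(), with no legend
constant_plot <- ggplot(iris) +
  aes(x = Species, y = Sepal.Length) +
  geom_boxplot(fill = "orange")
```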
Hint: Compare this plot to the last one, and don’t overthink!! The code is almost exactly identical.
Hint: Count the bins!