Obtain the homework template from your RStudio Cloud class project by running the following code in the R Console:
library(ds4b.materials) # Load the class library
launch_homework(8) # Launch Homework 8
You must set an RMarkdown theme and code syntax highlighting scheme of your choosing in the YAML front matter. These links will help you:
Make sure your Rmd knits without errors before submitting. If it does not produce an HTML output, this means it does not knit. DO NOT SKIP THIS STEP! Ensuring code runs without errors is MORE IMPORTANT than writing code in the first place.
As always, you are encouraged to work together and use the class Slack to help each other out, but you must submit YOUR OWN CODE.
In the top chunk named read_data, first…
Read the coffee_ratings dataset directly from the URL where it lives on the internet (not from the file). This is done for you.
Keep only the rows where total_cup_points != 0, and save the data to clean_coffee_ratings.
You should use this tibble, clean_coffee_ratings, for the whole homework. Feel free to change the variable name if you like, as long as it remains informative and is not the same as the raw data, coffee_ratings.
You will be conducting exploratory data analysis on the coffee ratings dataset by answering five questions about the coffee data using a combination of wrangling and visualization. Three of these questions are asked for you, and you ask the other two questions. Each question needs at least some wrangling with one or more dplyr and/or tidyr functions, its own plot (although question 3 has two plots!), and a brief answer in 1-3 sentences. Please follow the given template to organize your code (as you did in Homework #5). You can style your plots however you want as long as you ensure professional labeling (no underscores!) of all axes and legend titles.
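As a rough illustration of the filtering step described for the read_data chunk, here is a minimal sketch. It assumes a small made-up tibble standing in for the raw data, since the real URL and the full set of columns come from the homework template; only the names coffee_ratings, total_cup_points, and clean_coffee_ratings are taken from the instructions.

```r
library(dplyr)

# Toy stand-in for the raw data (assumption: the real coffee_ratings
# is read from a URL in the template and has many more columns).
coffee_ratings <- tibble(
  species          = c("Arabica", "Arabica", "Robusta"),
  total_cup_points = c(84.5, 0, 81.2)
)

# Keep only rows whose total_cup_points is not zero.
clean_coffee_ratings <- coffee_ratings %>%
  filter(total_cup_points != 0)
```

Saving the result under a new name leaves the raw coffee_ratings untouched, so the chunk can be re-run without re-reading the data.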
Importantly, for every written answer you give there MUST BE corresponding code. For the first three questions templated for you, you must conduct calculations as described.
Finally, YOU MUST SPELL CHECK. Seriously.
Question 1
processing_method and sd_uniformity.
NA processing methods
dplyr verbs.
Question 2
country_of_origin and number_robusta
ggwaffle with this code (copy/paste into Console): remotes::install_github("liamgilbey/ggwaffle").
dplyr verbs.
Question 3
NA) colors. This tibble should have two columns named color and mean_moisture.
NA colors, and the second should show the literal mean values of each moisture distribution (hint: barplot!). You can make both figures in the same chunk, or create an additional (named!) chunk as needed.
dplyr verbs.
geom_col()! Think about why this geom!).
moisture across (non-NA) categories of color. It has nothing to do with the means that were necessary to calculate for making the first plot.
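The summary tibble and bar plot mentioned for Question 3 could be sketched roughly as follows. This is only an illustration on a made-up tibble (toy_coffee and its values are hypothetical, as is the name mean_moisture_by_color); the column names color, moisture, and mean_moisture and the use of geom_col() come from the instructions, and the axis labels are one possible choice.

```r
library(dplyr)
library(ggplot2)

# Hypothetical toy data standing in for clean_coffee_ratings.
toy_coffee <- tibble(
  color    = c("Green", "Green", "Blue-Green", NA, "Blue-Green"),
  moisture = c(0.10, 0.12, 0.11, 0.09, 0.13)
)

# Drop rows with a missing color, then compute one mean per color.
mean_moisture_by_color <- toy_coffee %>%
  filter(!is.na(color)) %>%
  group_by(color) %>%
  summarize(mean_moisture = mean(moisture))

# geom_col() draws bar heights from values already in the data,
# which suits a tibble of precomputed means.
mean_plot <- ggplot(mean_moisture_by_color, aes(x = color, y = mean_moisture)) +
  geom_col() +
  labs(x = "Bean color", y = "Mean moisture")
```

geom_bar() would count rows by default, whereas geom_col() plots the supplied y values directly, which is why it fits a table of means.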