R
Function | Description |
---|---|
as.character() | Coerce (“change”) a value to character type |
as.factor() | Coerce (“change”) a value to factor type |
as.numeric() | Coerce (“change”) a value to numeric type |
class() | Determine a value’s type |
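
As a minimal sketch of how these coercion functions fit together (the vector `x` below is made up for illustration), a character vector holding numbers can be converted so that arithmetic works:

```r
# A character vector that happens to hold numbers (illustrative values)
x <- c("1.5", "2", "10")
class(x)                       # "character"

# Coerce to numeric so arithmetic works
x_num <- as.numeric(x)
class(x_num)                   # "numeric"
sum(x_num)                     # 13.5

# Coerce to factor, then back to character
x_fct <- as.factor(x)
class(x_fct)                   # "factor"
x_chr <- as.character(x_fct)   # same strings as x
```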
Function | Description |
---|---|
library() | Load a library/package into an R session |
install.packages() | Install a library/package for the first time |
Function | Description |
---|---|
getwd() | Ask R what your R session’s working directory is |
setwd() | Change the current R session’s working directory |
dir.exists() | Ask if a given directory (folder) exists on your computer |
file.exists() | Ask if a given file exists on your computer |
file.path() | Construct a path to a given file or directory (folder) |
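
The sketch below shows one way these functions might be combined. The `data` folder and `my_file.csv` file are hypothetical names used only for illustration, so `file.exists()` will likely return `FALSE` on your computer:

```r
# Ask R where the session is currently working
wd <- getwd()

# Build a path without typing separators by hand
# ("data" and "my_file.csv" are hypothetical names)
data_path <- file.path(wd, "data", "my_file.csv")

dir.exists(wd)          # normally TRUE for the working directory
file.exists(data_path)  # TRUE only if that file really exists
```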
Function | Description |
---|---|
log() | Calculate the natural log (or other specified base) of a number or array of numbers |
sqrt() | Calculate the square root of a number or array of numbers |
abs() | Calculate the absolute value of a number or array of numbers |
round() | Round a number or array of numbers to a specified number of decimal places |
ceiling() | Round a number or array of numbers up to the next highest integer |
floor() | Round a number or array of numbers down to the next lowest integer |
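
A short sketch of these functions applied to single numbers and to arrays (the values in `x` are arbitrary):

```r
x <- c(1, 4, 9.7, -2.25)

log(100, base = 10)   # 2
sqrt(c(1, 4, 9))      # 1 2 3
abs(x)                # 1.00 4.00 9.70 2.25
round(9.746, 1)       # 9.7
ceiling(x)            # 1 4 10 -2
floor(x)              # 1 4 9 -3
```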
Function | Description |
---|---|
c() | Create new arrays |
length() | Determine the length of an array (not of a single string) |
nchar() | Determine the number of characters in a string |
levels() | Determine the levels (i.e., ordered categories) of a factor variable |
unique() | Remove duplicates from an array, i.e. keep only unique values |
paste() and paste0() | Combine several strings into one |
ifelse() | Assign a value based on whether a statement is TRUE or FALSE |
cat() and print() | Explicitly print values to the console |
all() | Asks, “do ALL values in an array meet a logical condition?” |
any() | Asks, “do ANY (at least one) values in an array meet a logical condition?” |
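
The following sketch runs several of these functions on two small arrays, `colors` and `sizes`, which are invented for illustration:

```r
colors <- c("red", "blue", "red", "green")
sizes  <- c(3, 12, 7)

length(colors)                      # 4
nchar("green")                      # 5
unique(colors)                      # "red" "blue" "green"
levels(as.factor(colors))           # "blue" "green" "red"
paste("a", "b")                     # "a b"  (separated by a space)
paste0("a", "b")                    # "ab"   (no separator)
ifelse(sizes > 5, "big", "small")   # "small" "big" "big"
all(sizes > 5)                      # FALSE
any(sizes > 5)                      # TRUE
```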
Function | Description |
---|---|
data.frame() | Create a new data frame from scratch |
nrow() | Determine the number of rows in a tibble (data frame) |
ncol() | Determine the number of columns in a tibble (data frame) |
names() | Determine the column names of a tibble (data frame) |
head() | See the first 6 rows of a tibble (data frame) |
tail() | See the last 6 rows of a tibble (data frame) |
summary() | See a summary of the columns in a tibble (data frame) |
str() | See the structure of a tibble (data frame) |
Function | Description |
---|---|
mean() | Calculate the average of an array of numbers |
median() | Calculate the median of an array of numbers |
max() | Calculate the maximum value of an array of numbers |
min() | Calculate the minimum value of an array of numbers |
sd() | Calculate the standard deviation of an array of numbers |
sum() | Calculate the sum of an array of numbers |
summary() | Calculate several summary statistics for an array of numbers |
table() | Count the occurrences of each value in an array of any type |
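
A brief sketch of the summary functions on a small numeric array (the values are arbitrary), plus `table()` on a character array:

```r
x <- c(2, 4, 4, 6, 9)

mean(x)     # 5
median(x)   # 4
max(x)      # 9
min(x)      # 2
sum(x)      # 25
sd(x)       # square root of 7, about 2.65
summary(x)  # minimum, quartiles, median, mean, and maximum together

# Count how often each value occurs
counts <- table(c("a", "b", "a"))
counts[["a"]]   # 2
```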
ggplot2
Function | Description |
---|---|
aes() | Provide ggplot2 with aesthetic mappings from dataset columns |
ggplot() | Tell ggplot2 which dataset to plot and establish a ggplot2 canvas |
ggsave() | Save a plot made with ggplot2 to a file |
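
As a rough sketch of how these three functions work together, the example below plots R’s built-in `mtcars` dataset and saves the result. The output file name is hypothetical, and the plot is written to a temporary directory so nothing in your project is overwritten:

```r
library(ggplot2)

# ggplot() sets up the canvas; aes() maps columns to the x and y axes
p <- ggplot(mtcars, aes(x = wt, y = mpg)) +
  geom_point()

# Save the plot to a file (hypothetical name, temporary location)
out_file <- file.path(tempdir(), "mtcars_scatter.png")
ggsave(out_file, plot = p, width = 5, height = 4)
```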
Function | Description |
---|---|
geom_boxplot() | Create boxplots |
geom_density() | Create density plots |
geom_histogram() | Create histograms |
geom_jitter() | Create jitter (strip) plots |
geom_point() | Create point shapes, often (but not always) to create scatterplots |
geom_smooth() | Add a trendline to a plot, often (but not always) to a scatterplot |
geom_col() | Create bar plots whose height corresponds to literal values in the data |
geom_bar() | Create bar plots whose height corresponds to counted number of observations of a categorical variable |
geom_text() and geom_label() | Create labels in plots |
geom_segment() | Add line segments to plots |
Function or tutorial | Description |
---|---|
Axes tutorial | Customizing x- and y- axes |
xlim() | Change x-axis limits |
ylim() | Change y-axis limits |
labs() | Customize plot labels, including axes, titles, and legends |
facet_wrap() | Create a faceted (paneled) plot across 1 variable |
facet_grid() | Create a faceted (paneled) grid of plots across 2 variables |
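
The sketch below layers axis limits, labels, and facets onto a scatterplot of the built-in `mtcars` dataset. The limits and label text are illustrative choices, picked so that no points fall outside the axes:

```r
library(ggplot2)

p <- ggplot(mtcars, aes(x = wt, y = mpg)) +
  geom_point() +
  xlim(1, 6) +                      # x-axis runs from 1 to 6
  ylim(10, 35) +                    # y-axis runs from 10 to 35
  labs(
    x = "Weight (1000 lbs)",
    y = "Miles per gallon",
    title = "Fuel economy by weight"
  ) +
  facet_wrap(vars(cyl))             # one panel per number of cylinders

# facet_grid(rows = vars(am), cols = vars(cyl)) would instead lay out
# panels across two variables
```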
Function or tutorial | Description |
---|---|
Color and fill scales tutorial | Customizing color and fill scales |
scale_color_manual() | Specify custom color scale mappings for discrete data |
scale_fill_manual() | Specify custom fill scale mappings for discrete data |
scale_color_gradient() | Specify custom color gradient scale mappings for continuous data |
scale_fill_gradient() | Specify custom fill gradient scale mappings for continuous data |
scale_color_gradient2() | Specify custom two-way color gradient scale mappings for continuous data |
scale_fill_gradient2() | Specify custom two-way fill gradient scale mappings for continuous data |
scale_color_brewer() | Specify colorbrewer color scale mappings for discrete data |
scale_fill_brewer() | Specify colorbrewer fill scale mappings for discrete data |
scale_color_distiller() | Specify colorbrewer color scale mappings for continuous data |
scale_fill_distiller() | Specify colorbrewer fill scale mappings for continuous data |
scale_color_viridis_d() | Specify viridis color scale mappings for discrete data |
scale_fill_viridis_d() | Specify viridis fill scale mappings for discrete data |
scale_color_viridis_c() | Specify viridis color scale mappings for continuous data |
scale_fill_viridis_c() | Specify viridis fill scale mappings for continuous data |
Function | Description |
---|---|
scale_shape_manual() | Specify custom shape mappings |
scale_size_manual() | Specify custom size mappings |
scale_alpha_manual() | Specify custom alpha (transparency) mappings |
scale_linetype_manual() | Specify custom linetype mappings |
Function | Description |
---|---|
theme() and associated tutorial | Customize themes |
theme_set() | Set the default theme |
theme_gray() | Specify the built-in “gray” (default) theme |
theme_grey() | Specify the built-in “grey” (default) theme |
theme_bw() | Specify the built-in “bw” theme |
theme_linedraw() | Specify the built-in “linedraw” theme |
theme_light() | Specify the built-in “light” theme |
theme_dark() | Specify the built-in “dark” theme |
theme_minimal() | Specify the built-in “minimal” theme |
theme_classic() | Specify the built-in “classic” theme |
theme_void() | Specify the built-in “void” theme |
dplyr
Function | Description |
---|---|
arrange() | Arrange tibble (data frame) rows |
distinct() | Remove duplicate rows from a tibble (data frame) |
mutate() | Create new or modify existing columns in a tibble (data frame) |
filter() | Keep only certain rows from a tibble (data frame) based on a TRUE or FALSE condition |
select() | Keep, remove, or reorder columns in a tibble (data frame) |
rename() | Rename existing columns in a tibble (data frame) |
glimpse() | See an overview of tibble (data frame) contents |
pull() | Extract a column from a tibble (data frame) into its own array |
slice() | Keep only certain rows from a tibble (data frame) based on index |
group_by() | Establish a grouping in a tibble (data frame) |
ungroup() | Undo any groupings in a tibble (data frame) |
summarize() | Summarize values in a tibble (data frame) to produce a smaller summarized tibble |
tally() | Count the number of rows in each grouping of a tibble (data frame) |
count() | Simultaneously group and count the number of rows in each grouping of a tibble (data frame) |
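
Many of these functions are typically chained together with the pipe (`%>%`). The sketch below uses a tiny made-up tibble, `penguins_toy`, whose values are invented for illustration only:

```r
library(dplyr)

# Toy data invented for this example
penguins_toy <- tibble(
  species = c("Adelie", "Adelie", "Gentoo", "Gentoo", "Chinstrap"),
  mass_g  = c(3700, 3800, 5000, 5200, 3600)
)

heavy_summary <- penguins_toy %>%
  filter(mass_g > 3650) %>%                  # keep rows meeting a condition
  mutate(mass_kg = mass_g / 1000) %>%        # add a new column
  group_by(species) %>%                      # establish groups
  summarize(
    mean_mass_kg = mean(mass_kg),            # one row per group
    n_birds = n()
  ) %>%
  arrange(desc(mean_mass_kg))                # largest mean first

# count() groups and counts in a single step
species_counts <- count(penguins_toy, species)

# pull() extracts one column as a plain array
masses <- pull(penguins_toy, mass_g)
```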
Function | Description |
---|---|
bind_cols() | Combine arrays or tibbles (data frames) by columns |
bind_rows() | Combine arrays or tibbles (data frames) by rows |
left_join() | Merge two relational tibbles (data frames), keeping all rows from the left one |
right_join() | Merge two relational tibbles (data frames), keeping all rows from the right one |
full_join() | Merge two relational tibbles (data frames), keeping all rows from both |
inner_join() | Merge two relational tibbles (data frames), keeping only rows they have in common |
anti_join() | Keep only the rows of the left tibble (data frame) that have no match in the right one |
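
To make the differences between the joins concrete, here is a small sketch with two made-up tibbles, `band` and `instrument`, that share a `name` column. Only some names appear in both:

```r
library(dplyr)

# Toy data invented for this example
band <- tibble(
  name = c("Mick", "John", "Paul"),
  band = c("Stones", "Beatles", "Beatles")
)
instrument <- tibble(
  name  = c("John", "Paul", "Keith"),
  plays = c("guitar", "bass", "guitar")
)

left  <- left_join(band, instrument, by = "name")   # 3 rows; Mick's plays is NA
right <- right_join(band, instrument, by = "name")  # 3 rows; Keith's band is NA
full  <- full_join(band, instrument, by = "name")   # 4 rows; everyone appears
inner <- inner_join(band, instrument, by = "name")  # 2 rows; John and Paul
anti  <- anti_join(band, instrument, by = "name")   # 1 row; Mick only

# bind_rows() stacks tibbles on top of each other
stacked <- bind_rows(band, band)                    # 6 rows
```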
Function | Description |
---|---|
if_else() | Assign a value based on whether a statement is TRUE or FALSE, ensuring same type |
case_when() | Assign a value based on whether each of a series of conditions is TRUE or FALSE |
n() | Returns the number of rows in a tibble group |
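
A minimal sketch of these helpers inside `mutate()` and `summarize()`, using a made-up `scores` tibble. The grade cutoffs are arbitrary choices for illustration:

```r
library(dplyr)

# Toy data invented for this example
scores <- tibble(
  student = c("a", "b", "c", "d"),
  score   = c(95, 72, 58, 81)
)

graded <- scores %>%
  mutate(
    # Two outcomes: both must be the same type (here, character)
    passed = if_else(score >= 60, "pass", "fail"),
    # Several outcomes: conditions are checked in order, first match wins
    grade = case_when(
      score >= 90 ~ "A",
      score >= 80 ~ "B",
      score >= 60 ~ "C",
      TRUE        ~ "F"
    )
  )

# n() counts the rows in each group
per_outcome <- graded %>%
  group_by(passed) %>%
  summarize(n_students = n())
```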
tidyselect helpers
Function | Description |
---|---|
everything() | Select all columns, except those already explicitly stated |
contains() | Select all columns that contain a given string argument |
starts_with() | Select all columns that start with a given string, i.e. prefix |
ends_with() | Select all columns that end with a given string, i.e. suffix |
last_col() | Select the last column |
matches() | Select all columns that match a regular expression (a special type of pattern-matching string) |
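
These helpers are used inside `select()` (they are available once dplyr is loaded). The sketch below applies each one to a made-up tibble whose column names were chosen to show the differences:

```r
library(dplyr)

# Toy data invented for this example
df <- tibble(
  id      = 1:2,
  len_a   = c(1.1, 2.2),
  len_b   = c(3.3, 4.4),
  width_a = c(5, 6),
  note    = c("x", "y")
)

names(select(df, starts_with("len")))         # "len_a" "len_b"
names(select(df, ends_with("_a")))            # "len_a" "width_a"
names(select(df, contains("en")))             # "len_a" "len_b"
names(select(df, matches("^(len|width)_")))   # "len_a" "len_b" "width_a"
names(select(df, last_col()))                 # "note"

# Move one column to the front, then keep everything else in order
names(select(df, note, everything()))         # "note" "id" "len_a" "len_b" "width_a"
```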
forcats
Function | Description |
---|---|
fct_relevel() | Reorder levels in a factor to a custom order |
fct_infreq() | Reorder levels in a factor based on frequency |
fct_rev() | Reverse the order of levels in a factor |
fct_reorder() | Reorder levels in a factor based on values of another variable |
fct_lump_n() | Combine all but the n most frequent levels of a factor into one |
fct_lump_min() | Combine levels of a factor that occur fewer than a minimum number of times into one |
fct_lump_prop() | Combine levels of a factor that make up no more than a given proportion of observations into one |
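
As a small sketch of how these functions change level order, the example below uses a made-up factor `f` and a made-up numeric array `x` of the same length:

```r
library(forcats)

# Toy data invented for this example: "b" occurs 3 times, "c" twice, "a" once
f <- factor(c("b", "a", "c", "b", "b", "c"))
x <- c(3, 1, 2, 3, 3, 2)

levels(f)                      # "a" "b" "c"  (alphabetical by default)
levels(fct_relevel(f, "c"))    # "c" "a" "b"  (move "c" to the front)
levels(fct_infreq(f))          # "b" "c" "a"  (most frequent first)
levels(fct_rev(f))             # "c" "b" "a"
levels(fct_reorder(f, x))      # "a" "c" "b"  (by median of x within each level)
levels(fct_lump_n(f, n = 1))   # "b" "Other"  (keep only the most frequent level)
```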
readr
Function | Description |
---|---|
read_csv() | Read a CSV file into R |
read_tsv() | Read a TSV file into R |
read_csv2() | Read a CSV2 (semicolon-delimited) file into R |
read_delim() | Read a delimited file of any kind into R |
write_csv() | Write (save) a tibble (data frame) to a CSV file |
write_tsv() | Write (save) a tibble (data frame) to a TSV file |
write_csv2() | Write (save) a tibble (data frame) to a CSV2 (semicolon-delimited) file |
write_delim() | Write (save) a tibble (data frame) to a delimited file |
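
The sketch below writes a small made-up data frame to a CSV file and reads it back. The file name is hypothetical and the file is written to a temporary directory. The `show_col_types = FALSE` argument only silences the column-type message:

```r
library(readr)

# Toy data invented for this example
df <- data.frame(id = 1:3, label = c("x", "y", "z"))

# Write to a temporary location (hypothetical file name)
csv_file <- file.path(tempdir(), "example.csv")
write_csv(df, csv_file)

# Read the same file back in as a tibble
df_back <- read_csv(csv_file, show_col_types = FALSE)

# read_delim() reads the same file when told the delimiter explicitly
df_delim <- read_delim(csv_file, delim = ",", show_col_types = FALSE)
```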
tibble
Function | Description |
---|---|
as_tibble() | Coerce a variable (often a data frame) into a tibble |
tibble() | Create a new tibble from scratch, specified column by column |
tribble() | Create a new tibble from scratch, specified row by row |
tidyr
Function | Description |
---|---|
pivot_longer() | Make a tibble (data frame) longer |
pivot_wider() | Make a tibble (data frame) wider |
drop_na() | Remove rows containing NA values from a tibble (data frame) |
unite() | Unite (combine) two or more columns in a tibble (data frame) into a single column |
separate() | Separate a single column in a tibble (data frame) into two or more new columns |
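
A minimal sketch of reshaping with these functions, using a made-up `wide` data frame with one column per year. The column names passed to `names_to`, `values_to`, and `into` are illustrative choices:

```r
library(tidyr)

# Toy data invented for this example
wide <- data.frame(
  id    = c(1, 2),
  y2020 = c(10, NA),
  y2021 = c(12, 15)
)

# Longer: one row per id-year combination (4 rows)
long <- pivot_longer(
  wide,
  cols = c(y2020, y2021),
  names_to = "year",
  values_to = "value"
)

# Drop the row whose value is NA (3 rows remain)
long_complete <- drop_na(long)

# Wider: back to one column per year
back <- pivot_wider(long, names_from = year, values_from = value)

# Combine id and year into one column, then split them apart again
united    <- unite(long, "id_year", id, year, sep = "_")
separated <- separate(united, id_year, into = c("id", "year"), sep = "_")
```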
stringr
Function | Description |
---|---|
str_count() | Count the number of occurrences of a substring or regular expression in a string |
str_replace() | Replace the first occurrence of a substring or regular expression in a string with a given replacement string |
str_replace_all() | Replace all occurrences of a substring or regular expression in a string with a given replacement string |
str_detect() | Determine whether it is TRUE or FALSE that a given string contains a substring |
str_starts() | Determine whether it is TRUE or FALSE that a given string starts with a given substring |
str_ends() | Determine whether it is TRUE or FALSE that a given string ends with a given substring |
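
A quick sketch of each function applied to a single example string:

```r
library(stringr)

s <- "banana bread"

str_count(s, "a")              # 4
str_count(s, "[aeiou]")        # 5  (a regular expression matching any vowel)
str_replace(s, "a", "A")       # "bAnana bread"
str_replace_all(s, "a", "A")   # "bAnAnA breAd"
str_detect(s, "nan")           # TRUE
str_starts(s, "ban")           # TRUE
str_ends(s, "bread")           # TRUE
```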
glue
Function | Description |
---|---|
glue() | Quickly combine strings and variables into a single string |
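
As a brief sketch, `glue()` fills in anything written inside curly braces with the value of that variable. The variable names and values below are made up for illustration:

```r
library(glue)

name    <- "Ada"
n_files <- 3

# Variables inside {} are replaced by their values
msg <- glue("Hello {name}, you have {n_files} new files.")
msg

# The same string built with paste0(), for comparison
msg_paste <- paste0("Hello ", name, ", you have ", n_files, " new files.")
```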