dplyr::filter()
get_help() docs
The filter() function is part of the {dplyr} package, which is part of the {tidyverse}.
We use this function to subset rows from tibbles (data frames). A row is kept only if the given logical statement (code that evaluates to TRUE or FALSE) is TRUE for that row. Use this function when you want to keep only certain rows, that is, certain observations.
To use this function, you need to either first load the {dplyr} library, or always use the function with dplyr::filter() notation.
# Load the library
library(dplyr)
# Or, load the full tidyverse:
library(tidyverse)
# Or, use :: notation
dplyr::filter()
tibble %>%
  filter(logical statement)
tibble %>%
  filter(logical statement,
         another logical statement,
         and more logical statements that you want to all be TRUE)
The examples below use the carnivores dataset. Learn more about this dataset with get_help("carnivores").
# Show the carnivores dataset
carnivores
## # A tibble: 9 × 4
##   name              genus        awake brainwt
##   <chr>             <fct>        <dbl>   <dbl>
## 1 Arctic fox        Vulpes        11.5  0.0445
## 2 Cheetah           Acinonyx      11.9 NA
## 3 Dog               Canis         13.9  0.07
## 4 Gray seal         Haliochoerus  17.8  0.325
## 5 Jaguar            Panthera      13.6  0.157
## 6 Lion              Panthera      10.5 NA
## 7 Northern fur seal Callorhinus   15.3 NA
## 8 Red fox           Vulpes        14.2  0.0504
## 9 Tiger             Panthera       8.2 NA
# Subset carnivores to keep only rows that are the genus 'Panthera'
carnivores %>%
  filter(genus == "Panthera")
## # A tibble: 3 × 4
##   name   genus    awake brainwt
##   <chr>  <fct>    <dbl>   <dbl>
## 1 Jaguar Panthera  13.6  0.157
## 2 Lion   Panthera  10.5 NA
## 3 Tiger  Panthera   8.2 NA
# Subset carnivores to keep only rows of carnivores awake for more than 13 hours
carnivores %>%
  filter(awake > 13)
## # A tibble: 5 × 4
##   name              genus        awake brainwt
##   <chr>             <fct>        <dbl>   <dbl>
## 1 Dog               Canis         13.9  0.07
## 2 Gray seal         Haliochoerus  17.8  0.325
## 3 Jaguar            Panthera      13.6  0.157
## 4 Northern fur seal Callorhinus   15.3 NA
## 5 Red fox           Vulpes        14.2  0.0504
# Subset carnivores to keep only rows where the genus is 'Panthera' and they are awake for _less than_ 13 hours
carnivores %>%
  filter(genus == "Panthera", awake < 13)
## # A tibble: 2 × 4
##   name  genus    awake brainwt
##   <chr> <fct>    <dbl>   <dbl>
## 1 Lion  Panthera  10.5      NA
## 2 Tiger Panthera   8.2      NA
# Or, use & as "and"
carnivores %>%
  filter(genus == "Panthera" & awake < 13)
## # A tibble: 2 × 4
##   name  genus    awake brainwt
##   <chr> <fct>    <dbl>   <dbl>
## 1 Lion  Panthera  10.5      NA
## 2 Tiger Panthera   8.2      NA
# Subset rows to keep only where genus is 'Panthera' **or** brainwt < 0.05
# Use | (vertical bar on \ key) as "or"
carnivores %>%
  filter(genus == "Panthera" | brainwt < 0.05)
## # A tibble: 4 × 4
##   name       genus    awake brainwt
##   <chr>      <fct>    <dbl>   <dbl>
## 1 Arctic fox Vulpes    11.5  0.0445
## 2 Jaguar     Panthera  13.6  0.157
## 3 Lion       Panthera  10.5 NA
## 4 Tiger      Panthera   8.2 NA
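The output above leaves out Cheetah and Northern fur seal: they are not 'Panthera', and their brainwt is NA, so the logical statement evaluates to NA rather than TRUE. filter() keeps a row only when the statement is TRUE, so NA rows are dropped. The sketch below illustrates this. The carnivores_demo tibble is a hypothetical stand-in rebuilt from the printed output above, not the packaged carnivores dataset, and the result names are made up for illustration.

```r
library(dplyr)

# Hypothetical stand-in for the carnivores dataset, rebuilt from the printed output
carnivores_demo <- tibble(
  name    = c("Arctic fox", "Cheetah", "Dog", "Gray seal", "Jaguar",
              "Lion", "Northern fur seal", "Red fox", "Tiger"),
  genus   = factor(c("Vulpes", "Acinonyx", "Canis", "Haliochoerus", "Panthera",
                     "Panthera", "Callorhinus", "Vulpes", "Panthera")),
  awake   = c(11.5, 11.9, 13.9, 17.8, 13.6, 10.5, 15.3, 14.2, 8.2),
  brainwt = c(0.0445, NA, 0.07, 0.325, 0.157, NA, NA, 0.0504, NA)
)

# brainwt < 0.05 is NA when brainwt is missing, so those rows are dropped
small_brains <- carnivores_demo %>%
  filter(brainwt < 0.05)

# Combine with is.na() using | to also keep rows with a missing brainwt
small_or_missing <- carnivores_demo %>%
  filter(brainwt < 0.05 | is.na(brainwt))
```

With this stand-in data, small_brains holds only Arctic fox, while small_or_missing also keeps the four animals with a missing brainwt.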
# Subset carnivores to keep only rows where the genus is _either_ 'Panthera' or 'Canis'
# In other words, it should be TRUE that the genus is %in% the vector c("Panthera", "Canis").
# Remember: We do _not_ use `==` when asking if a value is in a vector.
carnivores %>%
  filter(genus %in% c("Panthera", "Canis"))
## # A tibble: 4 × 4
##   name   genus    awake brainwt
##   <chr>  <fct>    <dbl>   <dbl>
## 1 Dog    Canis     13.9  0.07
## 2 Jaguar Panthera  13.6  0.157
## 3 Lion   Panthera  10.5 NA
## 4 Tiger  Panthera   8.2 NA
# Or, use | (vertical bar on \ key) as "or"
carnivores %>%
  filter(genus == "Panthera" | genus == "Canis")
## # A tibble: 4 × 4
##   name   genus    awake brainwt
##   <chr>  <fct>    <dbl>   <dbl>
## 1 Dog    Canis     13.9  0.07
## 2 Jaguar Panthera  13.6  0.157
## 3 Lion   Panthera  10.5 NA
## 4 Tiger  Panthera   8.2 NA
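To see why `==` is the wrong tool for membership tests, the sketch below compares it with %in%. It again uses a hypothetical stand-in tibble (carnivores_demo) built from the name and genus columns printed above. The result names are made up for illustration.

```r
library(dplyr)

# Hypothetical stand-in for two columns of the carnivores dataset
carnivores_demo <- tibble(
  name  = c("Arctic fox", "Cheetah", "Dog", "Gray seal", "Jaguar",
            "Lion", "Northern fur seal", "Red fox", "Tiger"),
  genus = factor(c("Vulpes", "Acinonyx", "Canis", "Haliochoerus", "Panthera",
                   "Panthera", "Callorhinus", "Vulpes", "Panthera"))
)

# %in% checks each genus against every value in the vector
with_in <- carnivores_demo %>%
  filter(genus %in% c("Panthera", "Canis"))

# == compares position by position, recycling the two values down the column:
# row 1 vs "Panthera", row 2 vs "Canis", row 3 vs "Panthera", and so on.
# R also warns here because 9 rows is not a multiple of 2 values.
with_eq <- carnivores_demo %>%
  filter(genus == c("Panthera", "Canis"))
```

With this stand-in data, with_in has the expected four rows, while with_eq keeps only the rows whose genus happens to line up with the recycled value at that position, so Dog and Lion are silently lost.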