forcats::fct_infreq()
   get_help() docs


Description

The fct_infreq() function is part of the {forcats} package, which is part of the {tidyverse}.

We use the fct_infreq() function to quickly reorder the categories (levels) of a factor variable based on how many times each level appears. The most frequently appearing level will be first, and the least frequently appearing level will be last.

Changing the order of factor levels is commonly done to control the order in which a factor variable's categories appear along an axis when plotting with the {ggplot2} library.

To use this function, you need to either first load the {forcats} library, or always use the function with forcats::fct_infreq() notation.

# Load the library
library(forcats)
# Or, load the full tidyverse:
library(tidyverse)

# Or, use :: notation
forcats::fct_infreq()

Conceptual Usage

fct_infreq(factor variable to change order of)
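
As a minimal sketch of this usage, the code below applies fct_infreq() to a small made-up factor. The objects f and f_by_freq are hypothetical and exist only for illustration; they are not part of any dataset used in these docs.

```r
library(forcats)

# Hypothetical toy factor: "c" appears 3 times, "b" twice, and "a" once
f <- factor(c("b", "b", "a", "c", "c", "c"))

# By default, factor() sorts levels alphabetically
levels(f)
## [1] "a" "b" "c"

# After fct_infreq(), levels run from most frequent to least frequent
f_by_freq <- fct_infreq(f)
levels(f_by_freq)
## [1] "c" "b" "a"
```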

Examples

The examples below use a modified version of the msleep dataset called msleep_fctvore. Learn more about this dataset with get_help("msleep").

In this modified dataset, the vore column has been coerced into a factor type (instead of character), and all NA values have been removed from that column. (Notice below, the vore column is annotated <fct> since it’s a factor).

# Show the modified msleep dataset, msleep_fctvore, with head()
head(msleep_fctvore)
## # A tibble: 6 × 11
##   name  genus vore  order conservation sleep_total sleep_rem sleep_cycle awake  brainwt  bodywt
##   <chr> <chr> <fct> <chr> <chr>              <dbl>     <dbl>       <dbl> <dbl>    <dbl>   <dbl>
## 1 Owl … Aotus omni  Prim… <NA>                17         1.8      NA       7    0.0155    0.48 
## 2 Moun… Aplo… herbi Rode… nt                  14.4       2.4      NA       9.6 NA         1.35 
## 3 Grea… Blar… omni  Sori… lc                  14.9       2.3       0.133   9.1  0.00029   0.019
## 4 Cow   Bos   herbi Arti… domesticated         4         0.7       0.667  20    0.423   600    
## 5 Thre… Brad… herbi Pilo… <NA>                14.4       2.2       0.767   9.6 NA         3.85 
## 6 Nort… Call… carni Carn… vu                   8.7       1.4       0.383  15.3 NA        20.5

# To help guide you through the examples, use the base R function table() to count the number of animals in each vore category
# This shows us which categories are more/less frequent
table(msleep_fctvore$vore)
## 
##   carni   herbi insecti    omni 
##      10      24       4      18


# Use dplyr::mutate() to reorder the `vore` levels from most to least frequent
msleep_fctvore %>%
  mutate(vore = fct_infreq(vore)) -> msleep_fctvore_ex1

# Show new levels to confirm they are updated
levels(msleep_fctvore_ex1$vore)
## [1] "herbi"   "omni"    "carni"   "insecti"
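
It can help to confirm that reordering levels does not alter the data itself. The sketch below does not use msleep_fctvore; instead it builds a stand-in factor, here called vore (an assumption made for illustration), with the same category counts that table() reported above.

```r
library(forcats)

# Stand-in factor with the same level counts as table(msleep_fctvore$vore)
vore <- factor(rep(c("carni", "herbi", "insecti", "omni"),
                   times = c(10, 24, 4, 18)))
vore_by_freq <- fct_infreq(vore)

# The underlying values are unchanged; only the order of the levels differs
identical(as.character(vore), as.character(vore_by_freq))
## [1] TRUE

# Counts per category are the same, now listed from most to least frequent
table(vore_by_freq)
## vore_by_freq
##   herbi    omni   carni insecti 
##      24      18      10       4
```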


# First, make a barplot of vore that uses the default vore levels
# This way you can compare it with the next plot, which changes the levels with `fct_infreq()`
ggplot(msleep_fctvore) +
  aes(x = vore) + 
  geom_bar() 

# Without re-writing the column, change the levels for _plotting purposes only_
# Provide fct_infreq(VARIABLE) to ggplot2::aes() to order the categories in your plot
# The default x-axis label then becomes "fct_infreq(vore)", so it is best practice to clean up with `labs()`
ggplot(msleep_fctvore) +
  aes(x = fct_infreq(vore)) + 
  geom_bar() + 
  labs(x = "vore")


# The same approach works with other geoms, such as a boxplot of `awake` across `vore`
# The boxes are ordered by how often each vore level appears, not by the `awake` values
# As before, clean up the x-axis label with `labs()`
ggplot(msleep_fctvore) +
  aes(x = fct_infreq(vore), 
      y = awake) + 
  geom_boxplot() + 
  labs(x = "vore")