dplyr::distinct()
   get_help() docs


Description

The distinct() function is part of the {dplyr} package, which is part of the {tidyverse}.

We use this function to remove duplicate rows from tibbles (data frames). Only distinct rows are retained. No arguments are needed when using this function; by default, all columns are compared to decide whether rows are duplicates.

To use this function, either load the {dplyr} package first, or always call the function with dplyr::distinct() notation.

# Load the library
library(dplyr)
# Or, load the full tidyverse:
library(tidyverse)

# Or, use :: notation
dplyr::distinct()

Conceptual Usage

tibble %>% 
  distinct()
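
As a minimal sketch of the pattern above, the pipe form should give the same result as passing the tibble directly as the first argument. The tibble `scores` and the names `piped`, `direct`, and `n_removed` are made-up placeholders for illustration, not part of this help page's datasets.

```r
library(dplyr)

# Hypothetical placeholder data with one repeated row
scores <- tibble::tibble(
  id    = c(1, 1, 2),
  score = c(10, 10, 20)
)

# Pipe form, as in the conceptual usage above
piped <- scores %>%
  distinct()

# Direct call form; the tibble is passed as the first argument
direct <- distinct(scores)

# Number of duplicate rows that were dropped
n_removed <- nrow(scores) - nrow(piped)
```

Comparing row counts before and after is a quick way to see whether a tibble contained any duplicate rows at all.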

Examples

Some examples below use the carnivores dataset. Learn more about this dataset with get_help("carnivores").

# Show the carnivores dataset
carnivores
## # A tibble: 9 × 4
##   name              genus        awake brainwt
##   <chr>             <fct>        <dbl>   <dbl>
## 1 Arctic fox        Vulpes        11.5  0.0445
## 2 Cheetah           Acinonyx      11.9 NA     
## 3 Dog               Canis         13.9  0.07  
## 4 Gray seal         Haliochoerus  17.8  0.325 
## 5 Jaguar            Panthera      13.6  0.157 
## 6 Lion              Panthera      10.5 NA     
## 7 Northern fur seal Callorhinus   15.3 NA     
## 8 Red fox           Vulpes        14.2  0.0504
## 9 Tiger             Panthera       8.2 NA


# Keep only distinct (unique) rows in carnivores
# All rows are already distinct, so all rows remain
carnivores %>% 
  distinct()
## # A tibble: 9 × 4
##   name              genus        awake brainwt
##   <chr>             <fct>        <dbl>   <dbl>
## 1 Arctic fox        Vulpes        11.5  0.0445
## 2 Cheetah           Acinonyx      11.9 NA     
## 3 Dog               Canis         13.9  0.07  
## 4 Gray seal         Haliochoerus  17.8  0.325 
## 5 Jaguar            Panthera      13.6  0.157 
## 6 Lion              Panthera      10.5 NA     
## 7 Northern fur seal Callorhinus   15.3 NA     
## 8 Red fox           Vulpes        14.2  0.0504
## 9 Tiger             Panthera       8.2 NA


# This example creates a tibble with duplicate rows for further demonstration:

# Make a tibble with 3 rows using `tibble::tribble()` 
new_tibble <- tibble::tribble(
  ~col1, ~col2,
  1, 5, 
  1, 5, 
  2, 6, 
)

# Show new_tibble
new_tibble
## # A tibble: 3 × 2
##    col1  col2
##   <dbl> <dbl>
## 1     1     5
## 2     1     5
## 3     2     6

# Keep only unique rows from new_tibble
new_tibble %>%
  distinct()
## # A tibble: 2 × 2
##    col1  col2
##   <dbl> <dbl>
## 1     1     5
## 2     2     6
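
Although no arguments are needed, distinct() also accepts column names, in which case uniqueness is judged on those columns only. The sketch below assumes a small made-up tibble, `pets`, which is not one of the datasets used above. `.keep_all = TRUE` is a dplyr argument that keeps the remaining columns, taken from the first row found for each unique value.

```r
library(dplyr)

# Hypothetical example data (not from the carnivores dataset)
pets <- tibble::tribble(
  ~species, ~name,
  "dog",    "Rex",
  "dog",    "Fido",
  "cat",    "Tom"
)

# Keep one row per unique value of species; only species is returned
by_species <- pets %>%
  distinct(species)

# Keep one row per unique species, retaining all other columns
# (the first row found for each species is the one kept)
first_per_species <- pets %>%
  distinct(species, .keep_all = TRUE)
```

Here `by_species` should have two rows and one column, while `first_per_species` should keep both columns and drop the second "dog" row.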