tidyr::separate()
The separate() function is part of the {tidyr} package, which is part of the {tidyverse}.
We use separate() to split one column of a tibble (data frame) into two or more columns. It is commonly used to tidy a dataset when several variables are stored in a single column. By default, separate() splits wherever it finds a run of non-alphanumeric characters (such as a space, hyphen, or period), which is often what you want; you can override this with the sep argument. The newly created columns are character type by default, which can be changed as shown in the Examples below.
To use this function, you need to either first load the {tidyr} library, or always use the function with tidyr::separate() notation.
# Load the library
library(tidyr)
# Or, load the full tidyverse:
library(tidyverse)
# Or, use :: notation
tidyr::separate()
tibble %>%
  separate(column to separate,
           into = c("first new column", "second new column"))

tibble %>%
  separate(column to separate,
           into = c("first new column", "second new column"),
           sep = "character to separate columns on in case `separate()` guesses poorly")
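As a concrete sketch of the second form, the snippet below sets `sep` explicitly. The `more_biologists` tibble is made up for illustration and is not one of the original examples. It includes a hyphenated first name, which the default rule would also split at the hyphen.

```r
library(tidyr)
library(tibble)

# Hypothetical data: one first name contains a hyphen
more_biologists <- tibble(
  full_name = c("Mary-Claire King", "Barbara McClintock")
)

# Split only on the space, so "Mary-Claire" stays in one piece
split_names <- more_biologists %>%
  separate(full_name, into = c("first_name", "last_name"), sep = " ")

split_names
```

Note that `sep` is interpreted as a regular expression, so a literal period would be written as `"\\."`.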
The first two examples use this dataset, which contains some names of notable biologists:
biologists
## # A tibble: 3 × 1
##   full_name
##   <chr>
## 1 Rosalind Franklin
## 2 Lynn Margulis
## 3 Barbara McClintock
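The text does not show how `biologists` was created. One way to recreate it so the examples can be run, assuming it is an ordinary tibble with a single character column, is sketched here:

```r
library(tibble)

# Assumed construction; the original page only shows the printed tibble
biologists <- tibble(
  full_name = c("Rosalind Franklin", "Lynn Margulis", "Barbara McClintock")
)
```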
# Separate `full_name` into `first_name` and `last_name`
biologists %>%
  separate(full_name, into = c("first_name", "last_name"))
## # A tibble: 3 × 2
##   first_name last_name
##   <chr>      <chr>
## 1 Rosalind   Franklin
## 2 Lynn       Margulis
## 3 Barbara    McClintock
# Separate `full_name` into `first_name` and `last_name`, and KEEP `full_name`
biologists %>%
  separate(full_name, into = c("first_name", "last_name"),
           remove = FALSE)
## # A tibble: 3 × 3
##   full_name          first_name last_name
##   <chr>              <chr>      <chr>
## 1 Rosalind Franklin  Rosalind   Franklin
## 2 Lynn Margulis      Lynn       Margulis
## 3 Barbara McClintock Barbara    McClintock
The next two examples use this dataset, which contains some (made-up) prices of foods:
food_prices
## # A tibble: 3 × 2
##   food        price
##   <chr>       <dbl>
## 1 banana       1.15
## 2 pomegranate  3.85
## 3 avocado      2.2
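As with the first dataset, the construction of `food_prices` is not shown. A minimal sketch that reproduces the printed tibble, assuming a character column and a numeric (double) column, could be:

```r
library(tibble)

# Assumed construction; the original page only shows the printed tibble
food_prices <- tibble(
  food  = c("banana", "pomegranate", "avocado"),
  price = c(1.15, 3.85, 2.2)
)
```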
# Separate `price` into `dollars` and `cents`
# (the split is applied to the text "2.2", so avocado's cents come out as "2", not "20")
food_prices %>%
  separate(price, into = c("dollars", "cents"))
## # A tibble: 3 × 3
##   food        dollars cents
##   <chr>       <chr>   <chr>
## 1 banana      1       15
## 2 pomegranate 3       85
## 3 avocado     2       2
# Separate `price` into `dollars` and `cents`,
# AND ensure new columns are properly made into numerics
food_prices %>%
  separate(price, into = c("dollars", "cents"),
           convert = TRUE)
## # A tibble: 3 × 3
##   food        dollars cents
##   <chr>         <int> <int>
## 1 banana            1    15
## 2 pomegranate       3    85
## 3 avocado           2     2
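In the output above, the avocado price of 2.2 yields a `cents` value of 2 rather than 20, because the split operates on the text "2.2". One possible workaround, offered here as a sketch and not as part of the original examples, is to format the prices with two decimals before separating. The name `padded_prices` is made up for illustration.

```r
library(tidyr)
library(tibble)

# Same made-up prices as in the examples above (construction assumed)
food_prices <- tibble(
  food  = c("banana", "pomegranate", "avocado"),
  price = c(1.15, 3.85, 2.2)
)

# Format each price with exactly two decimals so that 2.2 becomes "2.20"
food_prices$price <- sprintf("%.2f", food_prices$price)

# Split on the literal period and convert the pieces to integers
padded_prices <- food_prices %>%
  separate(price, into = c("dollars", "cents"), sep = "\\.", convert = TRUE)

padded_prices
```

With this change, the `cents` column reads 15, 85, and 20.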