tidyr::pivot_longer()
get_help() docs
The pivot_longer() function is part of the {tidyr} package, which is part of the {tidyverse}.
We use this function to convert a wide tibble (data frame) into a long tibble (data frame).
When using pivot_longer(), you must specify the columns to be pivoted, which may require the use of some {tidyselect} helper functions. To get help with {tidyselect} from the {introverse}, use show_topics("tidyselect") to see available docs.
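As a rough illustration of how column selection works, the sketch below pivots the same columns in three different ways. The tibble scores and its column names are hypothetical and invented for this illustration; starts_with(), the : range operator, and ! negation are standard {tidyselect} syntax.

```r
library(tidyr)

# Hypothetical toy data, invented for this illustration
scores <- tibble::tibble(
  id     = 1:2,
  test_a = c(90, 85),
  test_b = c(70, 95),
  test_c = c(80, 60)
)

# Three selections that pick the same three columns in this tibble
by_prefix <- pivot_longer(scores, starts_with("test_"))  # by shared prefix
by_range  <- pivot_longer(scores, test_a:test_c)         # by position range
by_negate <- pivot_longer(scores, !id)                   # everything except id

# When names_to and values_to are omitted, the new columns
# are called "name" and "value"
names(by_prefix)
```

Which form is most convenient depends on the data: a prefix helper is robust to column reordering, while negation is handy when only a few identifier columns should stay put.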
To use this function, you need to either load the {tidyr} package first, or always call the function with tidyr::pivot_longer() notation.
# Load the library
library(tidyr)
# Or, load the full tidyverse:
library(tidyverse)
# Or, use :: notation
tidyr::pivot_longer()
tibble %>%
  pivot_longer(columns to pivot from wide to long,
               names_to = "name of new column that will contain the old column names",
               values_to = "name of new column that will contain the values from the old columns")
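To make the template above concrete, here is a minimal runnable sketch. The tibble grades_wide and its columns are hypothetical and exist only for this illustration; the call follows the same three-part pattern of columns, names_to, and values_to.

```r
library(tidyr)

# Hypothetical wide data: one row per student, one column per term
grades_wide <- tibble::tibble(
  student = c("Ana", "Ben"),
  term1   = c(3.5, 2.9),
  term2   = c(3.7, 3.1)
)

grades_long <- grades_wide %>%
  pivot_longer(c(term1, term2),      # columns to pivot from wide to long
               names_to  = "term",   # receives "term1" and "term2"
               values_to = "gpa")    # receives the values from those columns

grades_long
```

Two rows with two pivoted columns each yield a long tibble with four rows, one per student and term combination.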
This example uses the wide dataset billboard, which comes from the {tidyr} package and contains song rankings for the Billboard top 100 songs in the year 2000.
# Show billboard dataset
billboard
## # A tibble: 317 × 79
## artist track date.entered wk1 wk2 wk3 wk4 wk5 wk6 wk7 wk8 wk9 wk10 wk11
## <chr> <chr> <date> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
## 1 2 Pac Baby… 2000-02-26 87 82 72 77 87 94 99 NA NA NA NA
## 2 2Ge+her The … 2000-09-02 91 87 92 NA NA NA NA NA NA NA NA
## 3 3 Door… Kryp… 2000-04-08 81 70 68 67 66 57 54 53 51 51 51
## 4 3 Door… Loser 2000-10-21 76 76 72 69 67 65 55 59 62 61 61
## 5 504 Bo… Wobb… 2000-04-15 57 34 25 17 17 31 36 49 53 57 64
## 6 98^0 Give… 2000-08-19 51 39 34 26 26 19 2 2 3 6 7
## 7 A*Teens Danc… 2000-07-08 97 97 96 95 100 NA NA NA NA NA NA
## 8 Aaliyah I Do… 2000-01-29 84 62 51 41 38 35 35 38 38 36 37
## 9 Aaliyah Try … 2000-03-18 59 53 38 28 21 18 16 14 12 10 9
## 10 Adams,… Open… 2000-08-26 76 76 74 69 68 67 61 58 57 59 66
## # … with 307 more rows, and 65 more variables: wk12 <dbl>, wk13 <dbl>, wk14 <dbl>, wk15 <dbl>,
## # wk16 <dbl>, wk17 <dbl>, wk18 <dbl>, wk19 <dbl>, wk20 <dbl>, wk21 <dbl>, wk22 <dbl>,
## # wk23 <dbl>, wk24 <dbl>, wk25 <dbl>, wk26 <dbl>, wk27 <dbl>, wk28 <dbl>, wk29 <dbl>,
## # wk30 <dbl>, wk31 <dbl>, wk32 <dbl>, wk33 <dbl>, wk34 <dbl>, wk35 <dbl>, wk36 <dbl>,
## # wk37 <dbl>, wk38 <dbl>, wk39 <dbl>, wk40 <dbl>, wk41 <dbl>, wk42 <dbl>, wk43 <dbl>,
## # wk44 <dbl>, wk45 <dbl>, wk46 <dbl>, wk47 <dbl>, wk48 <dbl>, wk49 <dbl>, wk50 <dbl>,
## # wk51 <dbl>, wk52 <dbl>, wk53 <dbl>, wk54 <dbl>, wk55 <dbl>, wk56 <dbl>, wk57 <dbl>, …
# Use pivot_longer() and the tidyselect helper starts_with to pivot all columns that start with "wk"
billboard %>%
  pivot_longer(starts_with("wk"),
               names_to = "week",
               values_to = "song_ranking")
## # A tibble: 24,092 × 5
## artist track date.entered week song_ranking
## <chr> <chr> <date> <chr> <dbl>
## 1 2 Pac Baby Don't Cry (Keep... 2000-02-26 wk1 87
## 2 2 Pac Baby Don't Cry (Keep... 2000-02-26 wk2 82
## 3 2 Pac Baby Don't Cry (Keep... 2000-02-26 wk3 72
## 4 2 Pac Baby Don't Cry (Keep... 2000-02-26 wk4 77
## 5 2 Pac Baby Don't Cry (Keep... 2000-02-26 wk5 87
## 6 2 Pac Baby Don't Cry (Keep... 2000-02-26 wk6 94
## 7 2 Pac Baby Don't Cry (Keep... 2000-02-26 wk7 99
## 8 2 Pac Baby Don't Cry (Keep... 2000-02-26 wk8 NA
## 9 2 Pac Baby Don't Cry (Keep... 2000-02-26 wk9 NA
## 10 2 Pac Baby Don't Cry (Keep... 2000-02-26 wk10 NA
## # … with 24,082 more rows
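The output above stores week as text ("wk1", "wk2", ...) and keeps a row for every week in which a song was not ranked. One possible refinement, sketched here with optional pivot_longer() arguments, is to strip the "wk" prefix, convert the remainder to an integer, and drop the NA rows. The name billboard_long is made up for this illustration, and names_transform assumes a reasonably recent {tidyr} (1.1.0 or later).

```r
library(tidyr)

billboard_long <- billboard %>%
  pivot_longer(starts_with("wk"),
               names_to        = "week",
               names_prefix    = "wk",                     # remove the leading "wk"
               names_transform = list(week = as.integer),  # "1" becomes 1L
               values_to       = "song_ranking",
               values_drop_na  = TRUE)                     # omit weeks with no ranking

billboard_long
```

An integer week column sorts numerically (2 before 10), which the character version would not.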
This example uses the dataset tb_cases_wide, which is a wide tibble showing how many cases of tuberculosis were recorded in each country in the given year.
tb_cases_wide
## # A tibble: 3 × 3
## country `1999` `2000`
## * <chr> <int> <int>
## 1 Afghanistan 745 2666
## 2 Brazil 37737 80488
## 3 China 212258 213766
# Pivot tb_cases longer.
# Use backticks `` or quotes "" to refer to columns whose names start with a number
tb_cases_wide %>%
  pivot_longer(`1999`:`2000`, # we want to pivot columns 1999 and 2000
               names_to = "year", # this column will contain 1999 and 2000
               values_to = "number_of_tb_cases") # will contain information previously in columns 1999 and 2000
## # A tibble: 6 × 3
## country year number_of_tb_cases
## <chr> <chr> <int>
## 1 Afghanistan 1999 745
## 2 Afghanistan 2000 2666
## 3 Brazil 1999 37737
## 4 Brazil 2000 80488
## 5 China 1999 212258
## 6 China 2000 213766
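When the column names are already held in a character vector, backticks can be avoided by passing the vector through the {tidyselect} helper all_of(). The sketch below rebuilds tb_cases_wide by hand from the values printed above so that it runs on its own; the variable names year_cols, with_all_of, and with_backticks are hypothetical.

```r
library(tidyr)

# Rebuilt by hand from the values shown above so the example is self-contained
tb_cases_wide <- tibble::tibble(
  country = c("Afghanistan", "Brazil", "China"),
  `1999`  = c(745L, 37737L, 212258L),
  `2000`  = c(2666L, 80488L, 213766L)
)

# Column names stored as ordinary strings
year_cols <- c("1999", "2000")

with_all_of <- tb_cases_wide %>%
  pivot_longer(all_of(year_cols),
               names_to  = "year",
               values_to = "number_of_tb_cases")

# Same selection written with backticks, for comparison
with_backticks <- tb_cases_wide %>%
  pivot_longer(`1999`:`2000`,
               names_to  = "year",
               values_to = "number_of_tb_cases")

identical(with_all_of, with_backticks)
```

This form can be convenient when the set of columns to pivot is computed elsewhere in a script rather than typed out.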