The separate() function from the tidyr package splits a single column of a data frame or table into multiple columns. When one column packs several pieces of information into a single value, separate() lets us pull each piece out into its own column, as shown below.
The tidy data library tidyr is required for the separate() function.
separate(
data,
col,
into,
sep = "[^[:alnum:]]+",
remove = TRUE,
convert = FALSE
)
It takes multiple parameters, as listed below:
data: The data frame to operate on.
col: The column to be separated.
into: The names of the new columns that will hold the separated data.
sep: The separator between values, interpreted as a regular expression. Default = "[^[:alnum:]]+", which matches any run of non-alphanumeric characters.
remove: If set to TRUE, the input column is removed from the output data frame. Default = TRUE.
convert: If set to TRUE, the new columns are converted to a more suitable data type where possible. Default = FALSE.
In the example below, we have a data frame containing customer name, age, and contact number with area code. Our goal is to use the separate() function to split the Contact column into two sub-columns, Area Code and Phone.
# Load library
library(tidyr)

# Create data frame
df <- data.frame(
  Customer = c('Allen', 'Tolinton', 'Brusher', 'Dominique'),
  Age = c(23, 25, 34, 29),
  Contact = c('209-71953650312', '408-5182774863', '18-9564277497', '11-8946428747')
)

# col = Contact: the column we want to separate
# The single column will be divided
# into --> Area Code and Phone
separate(df, col = Contact, into = c('Area Code', 'Phone'), sep = '-')
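The remove and convert parameters described above are easier to follow with a small variation of the same example. The sketch below is illustrative only: it reuses two of the customers, and the column names AreaCode and Phone and the variable names out and out_default are assumptions chosen for this sketch, not part of the original example.

```r
library(tidyr)

# Two rows from the example data (shortened for illustration)
df <- data.frame(
  Customer = c("Allen", "Tolinton"),
  Age = c(23, 25),
  Contact = c("209-71953650312", "408-5182774863")
)

# remove = FALSE keeps the original Contact column in the result.
# convert = TRUE converts the new columns from character to a
# numeric type where the values allow it.
out <- separate(
  df,
  col = Contact,
  into = c("AreaCode", "Phone"),
  sep = "-",
  remove = FALSE,
  convert = TRUE
)
str(out)

# Omitting sep falls back to the default regular expression, which
# also splits on "-" because "-" is a non-alphanumeric character.
out_default <- separate(df, col = Contact, into = c("AreaCode", "Phone"))
str(out_default)
```

Note that converting phone-like values to numbers drops any leading zeros, so convert = TRUE is best kept for columns that are genuinely numeric.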