How to read Excel files in R

Technique 1: Use `read_excel()` from `readx1`

The read_excel() function is used to read an Excel file, and it is only accessible after you import the readxl library in the R program.

Syntax

read_excel(
  path,
  sheet = 1,
  col_names = TRUE,
  col_types = NULL,
  na = "",
  skip = 0
)

Parameters

path: The directory of the Excel file with extension.
sheet: Default= 1. If we have more than 1 sheet in a single workbook, we can assign sheet #, i.e., 1, 2, 3, 4…
col_names: Default= TRUE.
1. TRUE: Read the first row as a file header.
2. FALSE: The file does not have a header.
3. Character Vector: Character vector that contains header names.
col_types: Default= NULL.
1. NULL: Interpret the type of column from the spreadsheet, i.e., Date, Currency, etc.
2. Vector: A vector that contains "blank", "numeric", "date", or "text".
na: Default= "". Keeps empty cells as is, but we can specify na to fill empty cells.
skip: Default= 0. Number of rows to skip from start before reading a file.

Return value

This method returns Excel file data as a data frame.

Technique 2: Use `read.xlsx()` from xlsx

The read.xlsx() function is imported from the xlsx library, which is used to read an Excel file in R language.

Syntax

read.xlsx(
  path,
  sheet,
  startRow = 1,
  colNames = TRUE,
  rowNames = FALSE,
  detectDates = FALSE,
  skipEmptyRows = TRUE,
  skipEmptyCols = TRUE,
  rows = NULL,
  cols = NULL,
  check.names = FALSE,
  sep.names = ".",
  namedRegion = NULL,
  na.strings = "NA",
  fillMergedCells = FALSE
)

Parameters

path: Directory of file including extension.
sheet: Index or name of the sheet to read data.
startRow*: Default= 1`. The first row to begin and check for data.
colNames: Default= TRUE. If True, the first row will be selected as column names.
rowNames: Default= FALSE. If True, the first column will be selected as row names.
detectDates: Default= FALSE. If True, this method will try to recognize Dates and perform conversions if needed.
skipEmptyRows: Default= TRUE. If True, empty rows will be skipped during loading. After the first row containing data, other empty rows will be returned as NAs.
skipEmptyCols: Default= TRUE. If True, empty columns will be skipped during loading.
rows: Default= NULL.
- NULL: If rows are set to NULL, then all rows are read.
- Vector: If rows are set to a vector of numeric values, then specified rows will be read.
cols: Default= NULL.
- NULL: If cols is set to NULL, then all columns are read.
- Vector: If cols is set to vector of numeric values, then specified columns will be read.
check.names: Default= FALSE. If True, then it will check if each variable is syntactically valid or not.
sep.names: Default= ".". Separates each variable in column names.
namedRegion: Default= NULL.
- NULL: Values of rows, cols, and startRow parameters will be considered within a workbook.
- Not NULL: Values of rows, cols, and startRow parameters are ignored.
na.strings: Default= "NA". Returns blank cells as NAs.
fillMergedCells: Default= FALSE. If True, then the value in a merged cell (Parent) is given to all cells within the merge (Child).