In Julia, reading CSV files is commonly done with the CSV package. This package provides high-performance CSV parsing and writing functionality.
CSV package
To use the CSV package, we first install it by running the following commands in the Julia Read-Eval-Print Loop (REPL):
using Pkg
Pkg.add("CSV")
Note: We have already installed the CSV package on our platform.
read function
Once the package is installed, we can read a CSV file using the CSV.read function, which takes the file path as a string and a sink type (here, DataFrame) and returns a DataFrame object:
using CSV, DataFrames
data = CSV.read("abc.csv", DataFrame)
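As a self-contained illustration of the call above, the following sketch reads CSV text from an in-memory IOBuffer instead of a file on disk, so it runs without abc.csv. The sample rows and column names are made up for the example.

```julia
using CSV, DataFrames

# Hypothetical CSV content standing in for a file such as "abc.csv"
csv_text = """
name,age,city
Alice,30,Paris
Bob,25,Berlin
"""

# CSV.read accepts an IO source as well as a file path
df = CSV.read(IOBuffer(csv_text), DataFrame)

println(names(df))  # column names taken from the first row
println(size(df))   # (number of rows, number of columns)
```

Here the first line of the text becomes the column names, and the remaining two lines become the rows of the DataFrame.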
By default, CSV.read assumes that the first row of the CSV file contains the header names. If our file doesn't have a header row, we can set the header argument to false:
data = CSV.read("abc.csv", DataFrame, header=false)
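To see the effect of header=false, here is a minimal sketch using made-up, headerless data held in an IOBuffer. When there is no header row, CSV.jl generates column names of the form Column1, Column2, and so on.

```julia
using CSV, DataFrames

# Hypothetical headerless data: every line is a data row
csv_text = """
1,2,3
4,5,6
"""

df = CSV.read(IOBuffer(csv_text), DataFrame; header=false)

println(names(df))  # auto-generated column names
println(size(df))   # both lines are kept as data rows
```

Without header=false, the first line (1,2,3) would be consumed as column names and only one data row would remain.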
We can also specify the delimiter character using the delim argument. For example, to read a tab-separated file, we can set delim='\t' as follows:
data = CSV.read("abc.csv", DataFrame, header=false, delim='\t')
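The delim argument can be tried out in the same way. This sketch assumes a small, made-up tab-separated text with a header row.

```julia
using CSV, DataFrames

# Hypothetical tab-separated content ("\t" separates the fields)
tsv_text = "name\tage\nAlice\t30\nBob\t25\n"

df = CSV.read(IOBuffer(tsv_text), DataFrame; delim='\t')

println(names(df))
println(df.age)
```

Setting delim explicitly avoids relying on automatic delimiter detection when the separator is known in advance.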
To read only specific columns of the CSV file, we can use the select argument of the CSV.read function. For example, to read only the first and third columns of a file, we can do the following:
data = CSV.read("abc.csv", DataFrame, header=false, select=[1, 3])
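The select argument also accepts column names when the file has a header. The sketch below, on made-up data, selects the same two columns first by position and then by name.

```julia
using CSV, DataFrames

# Hypothetical data with three named columns
csv_text = """
a,b,c
1,2,3
4,5,6
"""

# Select by position: first and third columns
df_idx = CSV.read(IOBuffer(csv_text), DataFrame; select=[1, 3])

# Select by name: the same two columns
df_name = CSV.read(IOBuffer(csv_text), DataFrame; select=[:a, :c])

println(names(df_idx))
println(names(df_name))
```

Selecting by name is usually more robust than selecting by position if the column order of the file may change.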
If our CSV file has missing values represented by a specific string (e.g., “NA”), we can set the missingstring argument to the corresponding value. For example, to treat “NA” as a missing value, we can do the following:
data = CSV.read("abc.csv", DataFrame, missingstring="NA")
This converts every field equal to NA in the CSV file to Julia's missing value.
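A short sketch of how the resulting missing values behave, using made-up scores in which one entry is recorded as NA:

```julia
using CSV, DataFrames

# Hypothetical data in which "NA" marks a missing score
csv_text = """
id,score
1,90
2,NA
3,75
"""

df = CSV.read(IOBuffer(csv_text), DataFrame; missingstring="NA")

# The NA field is parsed as `missing`; the rest of the column stays numeric
println(count(ismissing, df.score))

# skipmissing lets us aggregate over the values that are present
println(sum(skipmissing(df.score)))
```

If missingstring were not set here, the NA field would not be recognized as missing, and the score column would be read as text rather than as numbers.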
Matrix function
Once we have read the data as a DataFrame, we can convert it to an array using the Matrix function:
array_data = Matrix(data)
This gives us a two-dimensional array with the same dimensions as the original DataFrame.
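The conversion can be checked on a small example. This sketch assumes made-up, all-integer data, so the resulting matrix has an integer element type.

```julia
using CSV, DataFrames

# Hypothetical all-integer data
csv_text = """
a,b,c
1,2,3
4,5,6
"""

df = CSV.read(IOBuffer(csv_text), DataFrame)
m = Matrix(df)

println(size(m) == size(df))  # the shape is preserved
println(m[:, 1:2])            # ordinary array slicing now works
```

Note that the column names are not carried over into the matrix. When the columns have different types, the element type of the matrix is a common type for all columns, which may be as general as Any.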
Some of the commands explained above have been executed in the code widget below:
using CSV, DataFrames

x = " ------------------------------------";

# Reading a CSV file
data = CSV.read("abc.csv", DataFrame)
println(data)
println(x)

# Reading a CSV file,
# setting the header argument to false
data = CSV.read("abc.csv", DataFrame, header=false)
println(data)
println(x)

# Reading a CSV file using the delim argument
data = CSV.read("abc.csv", DataFrame, header=false, delim='\t')
println(data)
println(x)

# Reading a CSV file using the select argument
data = CSV.read("abc.csv", DataFrame, header=false, select=[1, 3])
println(data)
println(x)

# Reading a CSV file using the missingstring argument
data = CSV.read("abc.csv", DataFrame, header=false, missingstring="NA")
println(data)
println(x)

# Convert the dataframe into an array
# using the Matrix function
array_data = Matrix(data)
println(array_data[:,1:3])
println(x)