When working in Julia, especially in data science, we often need to load data into a form that our program can process, analyze, and model.
In this shot, we’ll learn how to work with different file types in Julia.
Since much data comes as CSV, let’s look at a few ways to load data from a CSV file into Julia.
Queryverse
Queryverse is a versatile meta-package that bundles several data packages. It is useful for loading, manipulating, reshaping, and querying many kinds of data in Julia.
We use Pkg to install the Queryverse package.
We enter a right square bracket ] at the Julia prompt. This opens the Pkg mode. Next, we enter the commands below:
julia>]
(@v1.07) pkg>
(@v1.07) pkg> add Queryverse
We load our data into a DataFrame:
using Queryverse, DataFrames
df = DataFrame(load("mydata.csv"))
We can use the pipe operator as well:
using Queryverse, DataFrames
df = load("mydata.csv") |> DataFrame
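As a brief aside, the pipe operator passes the value on its left to the function on its right, so x |> f is another way of writing f(x). The minimal sketch below uses only built-in functions; the variable names are made up for illustration.

```julia
numbers = [1, 2, 3]

# These two lines compute the same thing
total_call = sum(numbers)
total_pipe = numbers |> sum

# Pipes chain from left to right: double each element, then sum the result
doubled_total = numbers |> (xs -> 2 .* xs) |> sum
```

This is why load("mydata.csv") |> DataFrame and DataFrame(load("mydata.csv")) give the same result.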
In Jupyter notebooks, the same code applies, except that we first need to import Pkg and install the package:
import Pkg
Pkg.add("Queryverse")
using Queryverse, DataFrames
df = DataFrame(load("mydata.csv"))
CSVFiles and DataFrames
Another option is to use the CSVFiles and DataFrames packages in Julia.
The CSVFiles package provides the load and save functions for CSV files.
To install it from the Pkg mode, we enter the commands below:
julia> ]
(@v1.07) pkg> add CSVFiles
---download messages---
(@v1.07) pkg>
In a Jupyter notebook, we install it with Pkg instead:
import Pkg
Pkg.add("CSVFiles")
To use the packages and load a CSV file, we enter the commands below:
using CSVFiles, DataFrames
df = DataFrame(load("mydata.csv"))
Queryverse
Queryverse supports many file types, not just CSV. These include Excel files, Feather files, statistical file formats, and more.
To load an SPSS file, we enter the commands below:
using Queryverse
df = DataFrame(load("mydata.sav"))
Alternatively, we use the pipe operator:
using Queryverse
df = load("mydata.sav") |> DataFrame
This loads the SPSS file into a DataFrame.
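Other formats follow the same pattern. The sketch below assumes two hypothetical files, mydata.xlsx (with a sheet named Sheet1) and mydata.dta; neither appears elsewhere in this shot, so the isfile checks simply skip a file that is missing. For Excel files, load also needs the name of the sheet to read.

```julia
using Queryverse

# Hypothetical file names and sheet name; replace them with real ones
excel_path = "mydata.xlsx"
stata_path = "mydata.dta"

# Excel: pass the sheet name as the second argument
if isfile(excel_path)
    df_excel = load(excel_path, "Sheet1") |> DataFrame
end

# Stata: loads the same way as the SPSS file
if isfile(stata_path)
    df_stata = load(stata_path) |> DataFrame
end
```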
We can also use Queryverse to save the data we create while working on our code.
using Queryverse
df = DataFrame(name=["Peter", "Emma"], salary=[1000,2000], department=["IT", "Data"])
df |> save("mydata.csv")
This saves df to our local machine as the CSV file mydata.csv.
We can also use CSVFiles to save in CSV format.
using CSVFiles, DataFrames
df = DataFrame(name=["Peter", "Emma"], salary=[1000,2000], department=["IT", "Data"])
df |> save("output.csv")
The load and save functions accept a number of arguments when loading or saving a CSV file. For example, when saving a file, we can set the delimiter with delim and use header to specify whether the file includes a header row.
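As a minimal sketch of these arguments, the example below writes a semicolon-delimited file and reads it back. It assumes a recent CSVFiles version in which delim is a keyword argument of both load and save, and header is a keyword argument of save. The file name output_semicolon.csv and the temporary directory are made up for illustration.

```julia
using CSVFiles, DataFrames

df = DataFrame(name=["Peter", "Emma"], salary=[1000, 2000])

# Write into a temporary directory so nothing in the working directory is overwritten
path = joinpath(mktempdir(), "output_semicolon.csv")

# delim sets the field separator; header=true writes the column names as the first row
save(path, df; delim=';', header=true)

# Read the file back, telling load which delimiter to expect (assumed keyword)
df_roundtrip = load(path; delim=';') |> DataFrame
```

If the delimiter passed to load does not match the one used when saving, each row is likely to be read as a single column, so it helps to keep the two in sync.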