The df.write.csv() method writes a DataFrame in CSV format. Options for the write operation can be set with the df.write.option() method.
df.write.option("option_name", "option_value").csv(file_path)
file_path: The path where the CSV output is to be created. Spark creates a directory at this path and writes the data into it as one or more part files.

import pyspark, os
from pyspark.sql import SparkSession

# Create (or reuse) a SparkSession with the application name 'answer'
spark = SparkSession.builder.appName('answer').getOrCreate()

# Sample rows; None values become nulls in the DataFrame
data = [("James","Educative","Engg","USA"),
    ("Michael","Google",None,"Asia"),
    ("Robert",None,"Marketing","Russia"),
    ("Maria","Netflix","Finance","Ukraine"),
    (None, None, None, None)
  ]

columns = ["emp name","company","department","country"]
df = spark.createDataFrame(data = data, schema = columns)

csv_file_path = "data.csv"

# Write the DataFrame as CSV with a header row and a comma delimiter
df.write.option("header", True).option("delimiter",",").csv(csv_file_path)
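When several options are needed, the option() calls can be chained in a loop. The sketch below is only an illustration: the helper name write_csv_with_options and its parameters are hypothetical, and the use of mode("overwrite") is an addition that is not part of the example above. It relies on the DataFrameWriter methods mode(), option(), and csv().

```python
from typing import Any, Mapping


def write_csv_with_options(df: Any, path: str,
                           options: Mapping[str, Any],
                           mode: str = "overwrite") -> None:
    """Hypothetical helper: apply each option, then write `df` as CSV to `path`."""
    # mode("overwrite") replaces an existing output directory; Spark's
    # default mode raises an error if the path already exists.
    writer = df.write.mode(mode)
    for name, value in options.items():
        # Same chaining as df.write.option("option_name", "option_value")
        writer = writer.option(name, value)
    writer.csv(path)


# Example call, assuming `df` is a DataFrame such as the one built earlier:
# write_csv_with_options(df, "data.csv", {"header": True, "delimiter": ","})
```

Passing the options as a mapping keeps the write call in one place, which may be convenient when the same settings are reused for several DataFrames.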
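Follow the steps below to inspect the generated CSV file:
1. Run the ls command to view the data.csv directory.
2. Run the cd data.csv command to move into that directory.
3. Run the ls command to view the generated .csv file.
4. Print the contents of the file with the cat command, using the cat *.csv syntax. The * wildcard stands in for the name of the file with a .csv extension. We may also copy and paste the exact filename here.

In the code above, the pyspark and os modules and the SparkSession class are imported. A SparkSession is created with the application name answer. The sample rows and column names are then used to create a DataFrame. Finally, the DataFrame is written out by calling the write.csv()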
function on the DataFrame object.
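The ls and cat inspection steps described earlier can also be approximated from Python. The following is a minimal sketch using only the standard library. The helper name read_spark_csv_dir is hypothetical, and it assumes the output directory holds part files ending in .csv, as matched by cat *.csv.

```python
import csv
import glob
import os
from typing import List


def read_spark_csv_dir(output_dir: str) -> List[List[str]]:
    """Hypothetical helper: collect rows from every .csv part file in a directory."""
    rows: List[List[str]] = []
    # Rough equivalent of `cat *.csv`: visit every file ending in .csv
    for part_file in sorted(glob.glob(os.path.join(output_dir, "*.csv"))):
        with open(part_file, newline="") as handle:
            rows.extend(csv.reader(handle))
    return rows


# Example call, assuming the earlier write created the data.csv directory:
# for row in read_spark_csv_dir("data.csv"):
#     print(row)
```

If the data was written with the header option and Spark produced more than one part file, each file should begin with its own header row, so the header may appear several times in the combined result.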