MapType is a PySpark data type similar to a dictionary in Python or a HashMap in Java. It is used to store key-value pairs. The keys and the values each have an associated data type. The keys in a MapType are not allowed to be None or NULL.
MapType(keyType, valueType, valueContainsNull=True)
keyType: This is the data type of the keys.
valueType: This is the data type of the values.
valueContainsNull: This is a boolean value indicating whether the values can be NULL or None. The default value is True, which indicates that the values can be NULL.

    from pyspark.sql import SparkSession
    from pyspark.sql.types import StructField, StructType, StringType, MapType

    # Create (or reuse) a Spark session named 'answers'
    spark = SparkSession.builder.appName('answers').getOrCreate()

    # Schema: a string column plus a map column with string keys and string values
    dfSchema = StructType([
        StructField('Emp Name', StringType(), True),
        StructField('Details', MapType(StringType(), StringType()), True)
    ])

    # Each row pairs a name with a dictionary for the Details column
    data = [
        ('John Wick', {'country': 'usa', 'profession': 'Don'}),
        ('Yash', {'country': 'india', 'profession': 'Artist'}),
        ('Novak Djokovic', {'country': 'serbia', 'profession': 'tennis player'}),
        ('Sundar Picchai', {'country': 'usa', 'profession': 'CEO'}),
        ('Kobe Bryant', {'country': 'usa', 'profession': 'Basket ball player'})
    ]

    df = spark.createDataFrame(data=data, schema=dfSchema)
    df.show(truncate=False)
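In the DataFrame example:
SparkSession and relevant data types are imported.
A SparkSession with the application name answers is created.
The schema is defined; its Details column is a MapType.
The data is defined, with a dictionary supplied for the Details column of each row.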