MapType is a PySpark data type similar to a dictionary in Python or a HashMap in Java.
It's used to store key-value pairs. The keys and the values each have a data type associated with them. The keys in a MapType are not allowed to be None or NULL.
MapType(keyType, valueType, valueContainsNull=True)
keyType: This is the data type of the keys.
valueType: This is the data type of the values.
valueContainsNull: This is a boolean value indicating whether the values can be NULL or None. The default value is True, which indicates that the values can be NULL.

from pyspark.sql import SparkSession
from pyspark.sql.types import StructField, StructType, StringType, MapType

spark = SparkSession.builder.appName('answers').getOrCreate()

dfSchema = StructType([
    StructField('Emp Name', StringType(), True),
    StructField('Details', MapType(StringType(), StringType()), True)
])

data = [
    ('John Wick', {'country': 'usa', 'profession': 'Don'}),
    ('Yash', {'country': 'india', 'profession': 'Artist'}),
    ('Novak Djokovic', {'country': 'serbia', 'profession': 'tennis player'}),
    ('Sundar Picchai', {'country': 'usa', 'profession': 'CEO'}),
    ('Kobe Bryant', {'country': 'usa', 'profession': 'Basket ball player'})
]

df = spark.createDataFrame(data=data, schema=dfSchema)
df.show(truncate=False)
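SparkSession and the relevant data types are imported.
A SparkSession with the application name answers is created.
The schema is defined; its Details column is a MapType with string keys and string values.
The DataFrame is created from the data and the schema, and it is displayed without truncation so that the Details column is shown in full.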