Memory Management in Spark 3

Spark configuration

Spark lets you tune an application to your needs. One place to do this is Spark Properties, which control most application parameters and can be set through a SparkConf object. Memory management is one group of these properties.

Memory management

Since version 1.6, Spark has used the Unified Memory Manager, which lets Storage Memory and Execution Memory co-exist and borrow each other’s free space. Spark executors run on the JVM, and the memory they manage falls into two types:

  1. On-Heap Memory
  2. Off-Heap Memory
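
The sharing between Storage Memory and Execution Memory can be pictured with a toy model. The sketch below is not Spark's implementation: the class and method names are hypothetical, sizes are plain integers in MB, and eviction is modelled as simply dropping cached data. It assumes the behaviour described in Spark's tuning guide, where execution may evict storage down to a protected threshold while storage may not evict execution.

```python
class UnifiedPool:
    """Toy model of one region shared by storage and execution (sizes in MB)."""

    def __init__(self, total_mb, protected_storage_mb):
        self.total = total_mb
        # Cached data up to this amount is never evicted by execution.
        self.protected_storage = protected_storage_mb
        self.storage_used = 0
        self.execution_used = 0

    @property
    def free(self):
        return self.total - self.storage_used - self.execution_used

    def acquire_storage(self, mb):
        """Storage may take any free space but cannot evict execution."""
        granted = min(mb, self.free)
        self.storage_used += granted
        return granted

    def acquire_execution(self, mb):
        """Execution uses free space first, then evicts storage above the protected amount."""
        evictable = max(0, self.storage_used - self.protected_storage)
        shortfall = max(0, mb - self.free)
        self.storage_used -= min(evictable, shortfall)
        granted = min(mb, self.free)
        self.execution_used += granted
        return granted


pool = UnifiedPool(total_mb=1000, protected_storage_mb=500)
cached = pool.acquire_storage(800)     # storage borrows beyond its protected 500 MB
granted = pool.acquire_execution(400)  # execution reclaims part of the borrowed space
print(cached, granted, pool.storage_used, pool.execution_used)  # 800 400 600 400
```

In this run storage first grows into free space, then gives back 200 MB when execution needs it, and a later storage request finds nothing free because execution memory is never evicted.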

On-Heap Memory

On-Heap Memory has four components, as shown in the On-Heap Memory Model figure:

  1. Storage Memory
  2. Execution Memory
  3. User Memory
  4. Reserved Memory
On-Heap Memory Model

Storage Memory

Storage Memory holds Spark's cached data, broadcast variables, and unroll data (the space used while a block is being unrolled into memory).

Execution Memory

Execution Memory holds the temporary objects created while Spark tasks run, for example during shuffles, joins, sorts, and aggregations.

User Memory

User Memory holds the data structures your own code creates, including data needed for RDD conversion operations (e.g., RDD dependency information).

Reserved Memory

Reserved Memory is set aside for the system and stores Spark’s internal objects. Its size is hardcoded (300 MB).
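
To make the four on-heap regions concrete, here is a rough sizing sketch. It assumes the defaults given in the Spark documentation, which this article does not state: 300 MB of Reserved Memory, spark.memory.fraction = 0.6, and spark.memory.storageFraction = 0.5. The function name is hypothetical, and the boundary between storage and execution is only a starting split, since the two can borrow from each other.

```python
RESERVED_MB = 300  # Reserved Memory, fixed by Spark


def on_heap_regions(heap_mb, memory_fraction=0.6, storage_fraction=0.5):
    """Estimate the initial size of each on-heap region, in MB."""
    usable = heap_mb - RESERVED_MB
    if usable <= 0:
        raise ValueError("heap must be larger than the reserved memory")
    unified = usable * memory_fraction    # storage + execution together
    storage = unified * storage_fraction  # initial storage share
    return {
        "reserved": RESERVED_MB,
        "user": usable - unified,
        "storage": storage,
        "execution": unified - storage,
    }


# Example with a hypothetical 4300 MB executor heap.
regions = on_heap_regions(4300)
for name, size in regions.items():
    print(f"{name}: {round(size)} MB")
```

With the assumed defaults, a 4300 MB heap leaves 4000 MB after the reservation, of which roughly 1600 MB is User Memory and about 1200 MB each starts out as Storage Memory and Execution Memory.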

Off-Heap Memory

Off-Heap Memory has two components, as shown in the Off-Heap Memory Model figure:

  • Storage Memory
  • Execution Memory

Both serve the same purposes described above. Off-heap memory is disabled by default. To use it, enable it with the spark.memory.offHeap.enabled parameter and set its size with the spark.memory.offHeap.size parameter, which must be a positive value when off-heap memory is enabled.
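
One way to pass these two parameters is as --conf flags on the spark-submit command line. The sketch below only builds that command as text. The helper names, the 2g size, and the app.py script name are hypothetical choices for illustration.

```python
def off_heap_conf(size):
    """Return the two properties that turn on off-heap memory."""
    return {
        "spark.memory.offHeap.enabled": "true",
        "spark.memory.offHeap.size": size,  # must be positive when enabled
    }


def to_submit_args(props):
    """Render properties as spark-submit '--conf key=value' arguments."""
    args = []
    for key, value in props.items():
        args += ["--conf", f"{key}={value}"]
    return args


args = to_submit_args(off_heap_conf("2g"))  # 2g is an illustrative size
command = " ".join(["spark-submit", *args, "app.py"])  # app.py is a placeholder
print(command)
```

The same key and value pairs could instead be set on a SparkConf object before the application starts.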

Off-Heap Memory Model

The Spark configuration documentation lists the other properties that can be used to configure Spark.

HowDev By Educative. Copyright ©2025 Educative, Inc. All rights reserved