What are the different levels in computer memory?

Memory in a modern computer system is divided into four levels. The first is physical storage, which varies between devices; a disk drive is one form of physical storage. Next comes primary memory, which consists of RAM: volatile storage that holds data temporarily. The cache and the registers are separate modules inside the CPU. The cache reduces access time by storing frequently used data, while registers hold the data the processor is currently working on, such as instructions and addresses.

The diagram below shows these four levels of memory. The arrows represent the flow of data between them: data can be transferred only between adjacent levels.

Four levels of memory

The processor works only with data held in its registers. Therefore, data is transferred along the following path: disk drive → RAM → cache → registers.

Data transfer path

Similarly, data is written to the disk drive through the opposite path.

Difference between memory levels

The memory levels differ from each other by the following parameters:

  1. Access speed is how much data can be read from or written to the medium per unit of time. It is measured in bytes per second (B/s), commonly expressed as KB/s, MB/s, or GB/s.

  2. Capacity is the maximum amount of data that a medium can store. Its units are bytes.

  3. Cost is the price of a medium relative to its capacity. It is measured in dollars or cents per byte or bit.

  4. Access time is the delay between the moment data is requested and the moment it becomes available to the processor. It is measured in CPU clock cycles.

Disk drives have the greatest capacity (several TB), whereas registers have the smallest (up to about 1,000 bytes).

The cache has the fastest access speed (roughly 100 to 700 GB/s), whereas disk drives are the slowest (around 2,000 Mbps, or 250 MB/s).

Registers have the smallest access time (one clock cycle), and disk drives the greatest (up to 10,000,000 cycles).
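The figures quoted in this section can be summarized in a small sketch. The cache and RAM capacities below are illustrative assumptions (the text gives no exact values for them); treat everything as orders of magnitude rather than measurements:

```python
# Per-level figures from this section; real values vary widely between machines.
levels = [
    # (name, capacity in bytes, access time in CPU clock cycles)
    ("registers", 1_000,           1),            # up to ~1,000 bytes
    ("cache",     8 * 1024**2,     100),          # a few MB is typical (assumption)
    ("RAM",       16 * 1024**3,    1_000),        # 16 GB (assumption)
    ("disk",      4 * 1024**4,     10_000_000),   # several TB
]

for name, capacity, cycles in levels:
    print(f"{name:9s} capacity={capacity:>16,d} B  access={cycles:>10,d} cycles")
```

Walking down the hierarchy, capacity grows at every level while access time gets steadily worse.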

Reading data from disk to register directly

Data cannot be read from the disk into the registers directly, even though a modern disk's raw transfer speed is high. The reason is that access speed is not the critical parameter here; processor idle time is.

Processor idle time is how long the processor sits idle, waiting for the requested data to become available.

Idle time is measured in clock cycles. The clock signal synchronizes all operations of the processor, and executing a single program instruction takes one or several clock cycles.

Suppose the processor read program instructions directly from the hard disk. Executing even a simple algorithm would then take weeks, with the processor idle most of that time, waiting for read operations to complete. The hierarchical organization of memory speeds up access to the data the processor needs.
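A back-of-the-envelope check of the "weeks" claim. The clock rate and instruction count below are assumed round numbers, not measurements; only the cycles-per-disk-access figure comes from this section:

```python
CLOCK_HZ = 3_000_000_000             # assumed 3 GHz CPU clock
CYCLES_PER_DISK_ACCESS = 10_000_000  # upper bound quoted in this section
INSTRUCTIONS = 1_000_000_000         # a modest program: one billion instructions (assumption)

# If every instruction had to be fetched from disk, the processor would
# spend almost all of these cycles idle, waiting for the disk.
total_cycles = INSTRUCTIONS * CYCLES_PER_DISK_ACCESS
seconds = total_cycles / CLOCK_HZ
days = seconds / 86_400
print(f"about {days:.0f} days of mostly idle waiting")  # on the order of weeks
```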

Memory levels in a program

Consider a simple program that reads a file from the disk drive and displays its contents on the screen. The data moves through the memory levels as follows:

  • Data is first read from the disk into the RAM.

  • Data is then loaded from the RAM into the CPU cache. The caching mechanism guesses which data the CPU will need next.

  • Next, the processor reads the needed data from the cache to registers.

  • The CPU then calls an API function of the system library.

  • The CPU passes the data to be printed to that function.

  • The system library invokes the video card driver, which displays the data on the screen.
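The steps above correspond to a very short program. In Python all of the memory traffic hides behind two calls; the sample file is created here only so the sketch is self-contained:

```python
import os
import tempfile

# Create a small sample file so the sketch can run anywhere.
fd, path = tempfile.mkstemp(text=True)
with os.fdopen(fd, "w") as f:
    f.write("hello, memory hierarchy\n")

with open(path) as f:        # disk -> RAM (the OS buffers the file in RAM)
    contents = f.read()      # RAM -> cache -> registers as bytes are consumed
print(contents, end="")      # the CPU passes the data to a library call,
                             # which eventually reaches the video driver
os.remove(path)
```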

What happens when the processor calls a function and passes data to it? If the data is in the cache, the CPU waits about 2 to 100 cycles for it. If the data is in the RAM, the wait grows by an order of magnitude (up to 1,000 cycles).

What if the file is too large to fit in the RAM entirely? The CPU can still access the part that is not in RAM, but doing so increases its idle time by another four orders of magnitude (up to 10,000,000 clock cycles).
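The jumps described above can be checked directly against the quoted cycle counts (illustrative figures from this section, not measurements):

```python
import math

# Approximate wait times, in CPU clock cycles, quoted in this section.
wait_cycles = {"cache": 100, "RAM": 1_000, "disk": 10_000_000}

# RAM is one order of magnitude slower than cache; disk is four more than RAM.
print(math.log10(wait_cycles["RAM"] / wait_cycles["cache"]))  # prints 1.0
print(math.log10(wait_cycles["disk"] / wait_cycles["RAM"]))   # prints 4.0
```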

The cache helps, but it cannot hold all of the data. Accessing data that is not in the cache causes a cache miss, which is expensive.
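Cache misses can be provoked from user code by touching memory in an unpredictable order, so the prefetching mechanism cannot guess what comes next. A rough sketch: the timings vary by machine, and Python's interpreter overhead dampens the effect compared with C, so the gap is usually visible but modest:

```python
import random
import time
from array import array

N = 1 << 22                          # 4M 8-byte ints (~32 MB), larger than typical caches
data = array("q", range(N))

seq_idx = list(range(N))
rand_idx = seq_idx[:]
random.shuffle(rand_idx)             # same indices, cache-hostile order

def total(indices):
    """Sum the elements of `data` at the given indices."""
    s = 0
    for i in indices:
        s += data[i]
    return s

t0 = time.perf_counter(); s1 = total(seq_idx);  t_seq  = time.perf_counter() - t0
t0 = time.perf_counter(); s2 = total(rand_idx); t_rand = time.perf_counter() - t0
print(f"sequential {t_seq:.3f}s vs random {t_rand:.3f}s")
```

The two traversals do identical work on identical data; any difference in time comes from the access pattern, i.e., from cache misses.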

How to solve the problem

To come up with a solution to the issues discussed above, the following principle is used:

Memory devices with shorter access times are placed closer to the processor.

  • The internal memory of the CPU (registers and cache) is inside its chip.

  • The RAM is located on the motherboard next to the CPU.

  • A high-frequency data bus carries data between the CPU and the RAM.

  • The disk drive is connected to the motherboard via a relatively slow data bus such as SATA.

  • A system controller called the northbridge transfers data from the RAM into the CPU cache. In modern systems, the northbridge is integrated into the processor chip.

  • The southbridge controller transfers data from the hard drive into the RAM.

