Four types of failures that can occur in distributed systems

A distributed system is a collection of independent computers that appears to its users as a single coherent system. Such a system can fail in several ways; the four basic types of failure are illustrated in the figure below.

Figure: Types of failure

1. System failure

A system failure is usually caused by a software or hardware fault. It is reasonable to assume that a system failure always loses the contents of primary memory, while secondary storage remains intact. When a system failure occurs, the processor stops executing, and the system may freeze or reboot.
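The assumption above (primary memory is lost, secondary storage survives) can be sketched with a toy node. This is only an illustration: the `Node` class, its `put` and `crash_and_reboot` methods, and the JSON file standing in for a disk are all hypothetical names chosen for this example.

```python
import json
import os
import tempfile


class Node:
    """Toy node: a dict stands in for primary memory, a file for secondary storage."""

    def __init__(self, disk_path):
        self.disk_path = disk_path
        self.memory = {}  # volatile state, lost on a system failure

    def put(self, key, value, persist=False):
        self.memory[key] = value
        if persist:
            self._flush_to_disk(key, value)

    def _read_disk(self):
        if not os.path.exists(self.disk_path):
            return {}
        with open(self.disk_path) as f:
            return json.load(f)

    def _flush_to_disk(self, key, value):
        data = self._read_disk()
        data[key] = value
        with open(self.disk_path, "w") as f:
            json.dump(data, f)
            f.flush()
            os.fsync(f.fileno())  # ask the OS to push the bytes to the device

    def crash_and_reboot(self):
        self.memory = {}                       # primary memory contents are gone
        self.memory.update(self._read_disk())  # recovery reloads persisted state only


def demo():
    with tempfile.TemporaryDirectory() as tmp:
        node = Node(os.path.join(tmp, "disk.json"))
        node.put("balance", 100, persist=True)  # written to secondary storage
        node.put("session", "abc")              # kept in primary memory only
        node.crash_and_reboot()
        return dict(node.memory)


print(demo())
```

After the simulated crash, only the value that was explicitly persisted is still available; the in-memory `session` entry is gone.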

2. Communication medium failure

A communication medium failure is mainly caused by the failure of a communication link or by the shifting of nodes. In a typical scenario, a site in the network tries to communicate with another operational site in the same network but cannot establish a connection.
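The scenario above, where both sites are operational but cannot reach each other, can be modeled as a reachability check over the links that are still up. This is a minimal sketch under assumed names: `reachable`, the site labels, and the link list are hypothetical and not part of any real networking API.

```python
from collections import deque


def reachable(links, failed_links, src, dst):
    """Breadth-first search from src to dst using only links that are still up."""
    up = {frozenset(l) for l in links} - {frozenset(l) for l in failed_links}
    seen, queue = {src}, deque([src])
    while queue:
        node = queue.popleft()
        if node == dst:
            return True
        for link in up:
            if node in link:
                for other in link - {node}:  # the site at the other end of the link
                    if other not in seen:
                        seen.add(other)
                        queue.append(other)
    return False


# Three operational sites connected in a line: A - B - C
links = [("A", "B"), ("B", "C")]

print(reachable(links, [], "A", "C"))            # all links up
print(reachable(links, [("B", "C")], "A", "C"))  # link B-C has failed
```

In the second call, no site has crashed; the only fault is the link between B and C, which is what distinguishes a communication medium failure from a system failure.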

3. Secondary storage failure

A secondary storage failure occurs when the information stored on a secondary storage device becomes inaccessible. Common causes include the following:

  • Head crash
  • Dirt on the medium
  • Parity error
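A parity error, one of the causes listed above, can be illustrated with a single even-parity bit stored alongside a block of data. This is a simplified sketch: the function names are hypothetical, and real storage devices generally rely on stronger error-detecting and error-correcting codes than a single parity bit.

```python
def even_parity_bit(data: bytes) -> int:
    """Return 1 if the data contains an odd number of 1 bits, else 0."""
    ones = sum(bin(byte).count("1") for byte in data)
    return ones % 2


def store(data: bytes):
    """Pretend to write a block: keep the data together with its parity bit."""
    return data, even_parity_bit(data)


def check(data: bytes, parity: int) -> bool:
    """Return True if the data still matches the parity bit recorded at write time."""
    return even_parity_bit(data) == parity


block, parity = store(b"distributed")

# Simulate one bit flipping on the medium (for example, because of dirt).
corrupted = bytes([block[0] ^ 0b00000100]) + block[1:]

print(check(block, parity))      # intact block passes the check
print(check(corrupted, parity))  # single-bit corruption is detected
```

A single parity bit detects any odd number of flipped bits but misses an even number, which is one reason it is only a first line of defense.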

4. Method failure

A method failure most often halts the distributed system. It can also cause the system to execute incorrectly or to stop executing altogether. During a method failure, the system may enter a deadlock state or commit protection violations.
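The deadlock state mentioned above is commonly reasoned about with a wait-for graph: a cycle in the graph means that a set of processes are all waiting on each other. The sketch below is one simple way to check for such a cycle; the `has_deadlock` function and the process names are hypothetical, and a real distributed detector would also have to gather the graph from several nodes.

```python
def has_deadlock(wait_for):
    """Return True if the wait-for graph (process -> processes it waits on) has a cycle."""
    WHITE, GREY, BLACK = 0, 1, 2  # unvisited, on the current path, fully explored
    colour = {}

    def visit(process):
        colour[process] = GREY
        for other in wait_for.get(process, ()):
            state = colour.get(other, WHITE)
            if state == GREY:  # reached a process already on the current path
                return True
            if state == WHITE and visit(other):
                return True
        colour[process] = BLACK
        return False

    return any(colour.get(p, WHITE) == WHITE and visit(p) for p in list(wait_for))


# P1 waits for P2 and P2 waits for P1: neither can proceed.
print(has_deadlock({"P1": ["P2"], "P2": ["P1"]}))

# P1 waits for P2, but P2 is not waiting on anything.
print(has_deadlock({"P1": ["P2"], "P2": []}))
```

The depth-first search marks processes on the current path; reaching one of them again means the chain of waits loops back on itself.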


Copyright ©2025 Educative, Inc. All rights reserved