What is deadlock detection in distributed systems?

Distributed deadlocks can develop in distributed systems when distributed transactions or concurrency control are used. It is found via a distributed approach such as edge chasing or constructing a global wait-for graphIt is a directed graph used for deadlock detection in operating systems and relational database systems. (WFG) from local wait-for graphs at a deadlock detector. Phantom deadlocksIt is a deadlock that occurs in a distributed DBMS due to communication delays between different processes and leads to unnecessary process abortions. are discovered in a distributed system but do not occur owing to internal system delays.

As a distributed system is so large, deadlock cannot be prevented or avoided. As a result, only deadlock detection is feasible. The following are necessary for distributed system deadlock detection techniques:

Progress: The technique should be able to detect all of the system's deadlocks.
Safety: The procedure should not identify phantom or fake deadlocks.

Methods to detect deadlock

In distributed systems, there are three methods to detect deadlocks:

Centralized approach: One resource is responsible for detecting deadlock. This technique is beneficial because it is simple and quick to implement. Still, the disadvantages include a high burden at one node, single-point failureThe entire system is dependent on one node, and if that node fails, the whole system crashes, and reduced reliability.
Distributed approach: It involves several nodes working together to detect deadlocks. There is no one point of failure since the burden is evenly distributed across all nodes. The discovery of deadlocks becomes faster as well.
Hierarchical method: It is the most advantageous method. In a distributed system, it integrates both centralized and distributed techniques for deadlock detection. In this technique, some nodes or clusters of nodes are in charge of deadlock detection, which is managed by a single node.

Algorithms for deadlock detection

The following Knapps' classification algorithms can be used to request resources in distributed systems in a variety of ways:

Path pushing algorithm

The main idea is to generate a global WFG (wait-for graph) for each distributed system site:

When an algorithm class conducts a deadlock calculation, it broadcasts its local WFG to all surrounding areas.
When it receives information from its neighbors, it does the exact computation and returns the resulting local WFGs.
This process repeats until all locations have received their local WFGs.

Edge changing algorithms

This is a straightforward approach for detecting the existence of cycles in a distributed network. Messages are classified into two types:

Request
Reply

A probe message is distinct from a recommendation or a reply message. Probe messages are limited in length and size. A request message is sent from one site to another to access a requested resource, and the responding site sends a reply message.

If the graph contains a cycle, the first site will get the matching reply message from the second site through another channel (the second site is not directly connected to the first site). This is due to the presence of a cycle.

Diffusion computations-based algorithms

The detection problem can be handled using echo algorithms, which work by "echoing" or "echoing back" a message received from another process. The basic concept is to launch an echo query and then wait for the response.

Global state detection-based algorithms

The following steps are used by deadlock detection methods to detect the stalemate:

If one process has been waiting for another, it is stated that the first has a claim on the second.
A cycle is formed when numerous processes have claims on one other. As a result, they are mutually blocked.
This method waits until it detects a deadlock. At that point, it takes a snapshot of all processes.

Deadlock prevention strategies

Handling deadlock becomes increasingly complex in distributed systems since neither site has complete knowledge of the system's current state and each inter-site connection involves a finite and unpredictable delay.

To decide if the system is safe or dangerous, the operating system employs the deadlock avoidance approach.
The process must notify the operating system of the maximum resources available, and the process may request that its execution be completed.
Deadlocks are avoided by developing a process that acquires all of the necessary resources at the same time before beginning execution, moreover, also by preempting a process that already possesses the resource.
To detect deadlock, the presence of cyclical waitIt is a method of synchronization in which a process waits and constantly checks for a condition to be satisfied before proceeding with its execution. initiates an analysis of the status of process resource interactions.

Free AI Mock Interviews

Coding Interview

Coding PatternsFree Interview

Gain insights and practical experience with coding patterns through targeted MCQs and coding problems, designed to match and challenge your expertise level.

System Design

YouTubeFree Interview

Learn to design a video streaming platform like YouTube by tackling functional and non-functional requirements, core components, and high-level to detailed design challenges.

Free Resources