A data race occurs in a concurrent program when two or more threads (or processes sharing memory) access the same variable at the same time, at least one of them writes to it, and nothing synchronizes the accesses. The result is unpredictable because the individual operations can interleave differently on every run. In C++, a data race is undefined behavior. For example, in a banking application, a data race on an account balance could drop a deposit and leave the balance incorrect.
Let's examine a C++ program with a data race. In the following program, three threads (`th1`, `th2`, and `th3`) each add 20 to the global variable `num` without any synchronization.
```cpp
#include <iostream>
#include <chrono>
#include <thread>

using namespace std;

int num = 0;

void increment() {
    // Add a delay to increase the likelihood of a data race
    this_thread::sleep_for(chrono::milliseconds(200));
    num = num + 20;
}

int main() {
    thread th1(increment);
    thread th2(increment);
    thread th3(increment);

    th1.join();
    th2.join();
    th3.join();

    cout << "Final value: " << num << endl;
    return 0;
}
```
Here's the explanation of the above code.
Line 7: We define a global variable `num` initialized to 0.
Line 9: We define an `increment()` function that adds 20 to `num`.
Lines 16–18: We create three threads (`th1`, `th2`, and `th3`) that execute the `increment()` function.
Lines 20–22: We call `join()` on each thread so that the main thread waits for `th1`, `th2`, and `th3` to finish.
Line 24: We print the final value of `num`.
Each thread that modifies `num` does two main things:
1. Read the value of `num` and add 20 to it.
2. Write the incremented value back to `num`.
The data race shows up when one thread has computed the incremented value but, before it can write that value back, another thread reads the old value and increments it. For example, if `th1` reads `num` as 0 and computes 20, but `th2` runs before `th1` writes 20 back, `th2` also reads 0 and computes 20. Both threads then write 20, so one increment is lost and the final value is lower than expected.
Run the above code multiple times. You may observe different outputs due to the data race. Here are some possible outputs from different runs:
Run 1: Final value: 60
Run 2: Final value: 40
Run 3: Final value: 20
These varying outputs occur because the threads are interleaved differently each time the program runs.
A straightforward way to prevent a data race is to use a mutex (mutual exclusion lock). A thread acquires the mutex before it accesses the shared data and releases it afterward. While one thread holds the mutex, any other thread that tries to acquire it must wait until it is released. This ensures that no thread can interrupt another thread's read-modify-write sequence and cause a data race.
We can modify our program above to include the use of a mutex:
```cpp
#include <iostream>
#include <vector>
#include <thread>
#include <mutex>

using namespace std;

int num = 0;
mutex num_mutex;

void increment() {
    lock_guard<mutex> lock(num_mutex);
    num = num + 20;
}

int main() {
    thread th1(increment);
    thread th2(increment);
    thread th3(increment);

    th1.join();
    th2.join();
    th3.join();

    cout << "Final value: " << num << endl;
    return 0;
}
```
Here's the explanation of the above code.
Line 4: We include the `<mutex>` header, which provides `mutex` and `lock_guard`.
Line 9: We define a mutex, `num_mutex`, which we use to protect `num`.
Line 12: We use `lock_guard` to lock the mutex before modifying `num`. The mutex is released automatically when `lock` goes out of scope at the end of the function, so `num` is only ever modified by one thread at a time.
Run the modified code with the mutex multiple times. You will observe consistent output:
Run 1: Final value: 60
Run 2: Final value: 60
Run 3: Final value: 60
By using a mutex, we ensure that only one thread at a time can read, increment, and write back the value of `num`, which prevents the data race and produces the correct output.