How to use OpenMP for parallel code execution?

OpenMP is an API for C, C++, and Fortran that lets us run parts of a serial program in parallel on multiple threads by adding compiler directives to the code.

Developers can use OpenMP to specify which structured block of code should run in parallel and how many threads should execute it.

Note: OpenMP requires compiler support, so it may not be available in every compiler. Compilers that do support it usually need it enabled with a flag, such as -fopenmp in GCC.
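
One way to check for that support in code is the _OPENMP macro, which the OpenMP specification says a conforming compiler defines (as a yyyymm date) when OpenMP is enabled. The sketch below is only an illustration; the helper names openmp_version and report_openmp are hypothetical and not part of OpenMP.

```c
#include <stdio.h>

/* Hypothetical helper (not part of OpenMP): returns the value of the
   _OPENMP macro, or 0 when the program was built without OpenMP enabled. */
long openmp_version(void) {
#ifdef _OPENMP
    return _OPENMP;   /* a yyyymm date identifying the supported specification */
#else
    return 0;         /* OpenMP is not enabled in this build */
#endif
}

/* Hypothetical helper: print a one-line report about OpenMP availability. */
void report_openmp(void) {
    long version = openmp_version();
    if (version > 0) {
        printf("OpenMP is enabled (version date %ld)\n", version);
    } else {
        printf("OpenMP is not enabled in this build\n");
    }
}
```

Calling report_openmp() from main shows whether the current build has OpenMP enabled.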

How to use it?

OpenMP uses special instructions called pragmas, which are compiler directives written in preprocessor syntax (#pragma omp ...). The keywords that follow #pragma omp select which OpenMP construct is used and with which options.

The pragma omp parallel directive, which makes multiple threads execute a block of code, is written as #pragma omp parallel followed by the block in {}.

The flow of such a program is as follows. A single master thread runs the program serially. When it reaches a pragma omp parallel directive, a team of the specified number of threads is created, and every thread in the team executes the code inside the {}. At the end of the block the threads join, and the master thread continues serially.
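
This flow can be observed with a small sketch. The function name count_region_threads is hypothetical; it counts how many threads execute the parallel block. The atomic directive used here is not covered in this article: it makes the update of the shared counter safe when several threads perform it at once. The runtime may also provide fewer threads than requested.

```c
/* Hypothetical helper: returns how many threads executed the parallel block.
   Assumes requested is a positive number. */
int count_region_threads(int requested) {
    int entered = 0;   /* shared by all threads in the team */

    /* Ask for `requested` threads; each one runs the block once. */
    #pragma omp parallel num_threads(requested)
    {
        /* Protect the shared counter so concurrent updates are not lost. */
        #pragma omp atomic
        entered += 1;
    }
    /* The threads have joined here; only the master thread continues. */
    return entered;
}
```

For example, count_region_threads(4) typically returns 4 when OpenMP is enabled. It returns 1 when the program is compiled without OpenMP, because the pragmas are then ignored and the block runs once.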

Using OpenMP to run a function

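The simple C program below uses the pragma omp parallel directive to calculate the sum of the first 59 natural numbers. Each of the six threads adds up its own block of 10 consecutive numbers, starting from 0, and the partial sums are combined into a total of 1770.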

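To run the code below, first compile it with OpenMP enabled. With GCC, for example, the command is gcc -fopenmp loop.c -o loop (assuming the file is saved as loop.c). Then open the terminal and enter the command ./loop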

#include <stdio.h>
#include <unistd.h>
#include <stdlib.h>
#include <omp.h>
// define the number of threads
#define NUM_THREADS 6
// function that is to be run using OpenMP
int thread_function(void) {
    // get the current thread and the total number of threads
    int my_number = omp_get_thread_num();
    int totalThreads = omp_get_num_threads();
    // calculate the starting number
    int startAt = my_number*10;
    int sum = 0;
    // find the sum of the next 10 numbers from the start
    for (int i=startAt; i <= (startAt+9); i++){
        // add the sum to the threads local sum variable
        sum += i;
    }
    // print a closing statement at the end of execution
    printf("\n\tBye from %d\n", my_number);
    // return the local sum
    return sum;
}
int main() {
    long totalSum = 0;  // variable to store the total sum
    #pragma omp parallel num_threads(NUM_THREADS) reduction(+:totalSum)
    {
        long localSum =0;   // variable to store the local sum of a thread
        localSum = thread_function();   // get the local sum
        totalSum += localSum;   // add to this thread's private copy; the reduction combines the copies, avoiding a data race
    }
    // print the total sum and exit
    printf("\nTotal sum: %ld\n", totalSum);
    exit(EXIT_SUCCESS);
}

Code Explanation

Line 6: We define the number of threads we want to use, NUM_THREADS.
Lines 8–24: The function thread_function(), which is executed by multiple threads.
Line 10: We retrieve the number of the current thread executing the function by using the function omp_get_thread_num()
Line 11: We retrieve the total number of threads executing the function by using the function omp_get_num_threads()
Line 13: We get the starting number of the counter. For thread 0, it is 0; for thread 1, it is 10; for thread 2, it is 20, and so on.
Lines 16–19: Beginning at the starting number, we loop over 10 consecutive numbers for the thread and find their sum.
Lines 21–23: When the loop has finished, we print a closing statement and then return the sum that a single thread has calculated.
Line 27: We apply the pragma omp parallel directive to the block that calls thread_function(), so six threads execute it.
Lines 30–31: We retrieve the sum returned for each thread and add it to the total sum variable, totalSum.
Lines 34–35: After all threads are finished, we print the total sum of all six threads and terminate the program.
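
As an alternative sketch, not the approach used in the program above, the same kind of total can be computed with OpenMP's parallel for directive and a reduction clause. Together they divide the loop iterations among the threads and combine the per-thread sums. The function name sum_range is hypothetical.

```c
/* Hypothetical helper: returns first + (first + 1) + ... + last. */
long sum_range(int first, int last) {
    long total = 0;

    /* Each thread gets a share of the iterations and a private copy of
       total; the reduction adds the private copies together at the end. */
    #pragma omp parallel for reduction(+:total)
    for (int i = first; i <= last; i++) {
        total += i;
    }
    return total;
}
```

Calling sum_range(0, 59) returns 1770, the sum of the numbers that the six threads cover in the program above.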
