How to use OpenMP for parallel code execution?

OpenMP is an API for C, C++, and Fortran that lets us run parts of a serial program in parallel on multiple threads by adding compiler directives to the code.

Developers can use OpenMP to specify which structured block of their code should be executed in parallel and how many threads should be used in the parallel execution code block.

Note: OpenMP requires compiler support, so a program that uses it may not build or run in parallel with every compiler. With GCC and Clang, for example, OpenMP is enabled with the -fopenmp flag.
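One way to check whether OpenMP was actually enabled at compile time is the _OPENMP macro, which the OpenMP specification defines when OpenMP compilation is active. The following is a minimal sketch under that assumption; the helper names openmp_version and report_openmp are hypothetical and exist only for illustration.

```c
#include <stdio.h>

/* Hypothetical helper: returns the value of the _OPENMP macro (a yyyymm
   date identifying the supported specification), or 0 when the program
   was compiled without OpenMP enabled. */
long openmp_version(void) {
#ifdef _OPENMP
    return (long)_OPENMP;
#else
    return 0L;
#endif
}

/* Hypothetical helper: prints a one-line report about OpenMP support. */
void report_openmp(void) {
    long version = openmp_version();
    if (version > 0) {
        printf("OpenMP enabled, _OPENMP = %ld\n", version);
    } else {
        printf("OpenMP not enabled; the program runs serially\n");
    }
}
```

Calling report_openmp() from main() once with and once without the compiler's OpenMP flag should show the difference between the two builds.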

How to use it?

OpenMP uses special instructions called pragmas, which are essentially preprocessor directives. The keywords that follow #pragma omp determine which OpenMP construct is used.

The syntax of the pragma omp parallel directive, which makes multiple threads execute a block of code, is shown below:

#pragma omp parallel num_threads(thread_count)
/* structured block */

Below is a breakdown of the above line of code:

  • thread_count: The number of threads requested for the parallel execution.

  • /* structured block */: The block of code that follows the directive. Every thread in the team executes it.

  • num_threads: A clause that sets how many threads are requested for the parallel region.
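The value passed to num_threads is a request, and the runtime may provide fewer threads. As a rough sketch of how to observe this, the hypothetical helper count_team below counts how many threads actually executed the structured block. If the program is compiled without OpenMP enabled, the pragmas are ignored and the function returns 1.

```c
#include <stdio.h>

/* Hypothetical helper: runs a parallel region that requests `requested`
   threads and returns how many threads actually executed the block. */
int count_team(int requested) {
    int executed = 0;  /* shared by all threads in the region */

    #pragma omp parallel num_threads(requested)
    {
        /* The counter is shared, so the increment is made atomic. */
        #pragma omp atomic
        executed++;
    }

    return executed;
}
```

For example, count_team(4) returns a value between 1 and 4, depending on the compiler settings and on how many threads the runtime provides.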

Below is an example of the OpenMP pragma omp parallel directive applied to a print statement.

Note: To run the code below, enter the command ./openmp in the terminal.

#include <stdio.h>
#include <unistd.h>
#include <stdlib.h>
#include <omp.h>

#define NUM_THREADS 6

int main() {
    // pragma directive to run a print statement in parallel
    #pragma omp parallel num_threads(NUM_THREADS)
    {
        printf("Hello World\n");
    }
    // print a done statement before program termination
    printf("All done\n");
    exit(EXIT_SUCCESS);
}

In the output window, Hello World is printed six times because we requested six threads for the parallel region. Each thread executes the code inside the {} curly braces once. All done is printed only once, after the parallel region has ended.

Program flow diagram

Above we can see a flow diagram of the program that was executed. The master thread is the single thread that runs the serial parts of the program. When it reaches a pragma omp parallel directive, it forks a team of the specified number of threads, and each thread executes the code inside the {}. At the end of the block, the threads join and the master thread continues alone.
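The fork and join behavior described above can be sketched as follows. This is an illustrative example rather than part of the original program: the function name fork_join_demo and the constant MAX_THREADS are hypothetical, and the _OPENMP guards are an assumption added so the code still compiles when OpenMP is not enabled.

```c
#include <stdio.h>
#ifdef _OPENMP
#include <omp.h>
#endif

#define MAX_THREADS 4  /* hypothetical team size for this sketch */

/* Hypothetical helper: returns how many threads finished the region. */
int fork_join_demo(void) {
    int done[MAX_THREADS] = {0};  /* one slot per possible thread */

    printf("Before the region: master thread only\n");

    /* Fork: a team of up to MAX_THREADS threads runs the block. */
    #pragma omp parallel num_threads(MAX_THREADS)
    {
        int id = 0;  /* stays 0 when compiled without OpenMP */
#ifdef _OPENMP
        id = omp_get_thread_num();
#endif
        done[id] = 1;  /* each thread marks only its own slot */
    }
    /* Join: the region ends with an implicit barrier, so every
       slot has been written before the code below runs. */

    int finished = 0;
    for (int i = 0; i < MAX_THREADS; i++) {
        finished += done[i];
    }
    printf("After the region: %d thread(s) finished\n", finished);
    return finished;
}
```

Because the code after the region runs only once all threads have joined, the count it reads is complete and no extra synchronization is needed there.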

OpenMP to run a function

Below is a simple C program that uses the pragma omp parallel directive to calculate the sum of the first 59 natural numbers. Each of the six threads sums a block of ten consecutive integers, and together the blocks cover 0 to 59.

To run the code below, open the terminal and enter the command ./loop.

#include <stdio.h>
#include <unistd.h>
#include <stdlib.h>
#include <omp.h>

// define the number of threads
#define NUM_THREADS 6

// function that is to be run using OpenMP
int thread_function(void) {
    // get the current thread and the total number of threads
    int my_number = omp_get_thread_num();
    int totalThreads = omp_get_num_threads();
    // calculate the starting number
    int startAt = my_number * 10;
    int sum = 0;
    // find the sum of the next 10 numbers from the start
    for (int i = startAt; i <= (startAt + 9); i++) {
        // add the number to the thread's local sum variable
        sum += i;
    }
    // print a closing statement at the end of execution
    printf("\n\tBye from %d\n", my_number);
    // return the local sum
    return sum;
}

int main() {
    long totalSum = 0; // variable to store the total sum
    #pragma omp parallel num_threads(NUM_THREADS)
    {
        long localSum = 0; // variable to store the local sum of a thread
        localSum = thread_function(); // get the local sum
        // totalSum is shared by all threads, so the update must be atomic
        #pragma omp atomic
        totalSum += localSum; // add the local sum to the total sum
    }
    // print the total sum and exit
    printf("\nTotal sum: %ld\n", totalSum);
    exit(EXIT_SUCCESS);
}

Code Explanation

  • #define NUM_THREADS 6: We define the number of threads we want to use.

  • thread_function(): The function to be executed by multiple threads.

  • omp_get_thread_num(): We retrieve the number of the current thread executing the function.

  • omp_get_num_threads(): We retrieve the total number of threads in the team executing the function.

  • startAt: We get the starting number of the counter. For thread 0, it is 0; for thread 1, it is 10; for thread 2, it is 20, and so on.

  • The for loop: Starting from that number, we loop over 10 consecutive numbers and add them to the thread's local sum.

  • printf() and return: When the loop has finished, we print a closing statement and then return the sum that the single thread has calculated.

  • #pragma omp parallel num_threads(NUM_THREADS): We apply the pragma omp parallel directive to the block that calls thread_function(), so six threads execute it.

  • localSum and totalSum: Each thread stores the value returned by thread_function() and adds it to the shared total. Because totalSum is shared by all threads, concurrent updates to it need synchronization, such as #pragma omp atomic, to avoid a data race.

  • The final printf() and exit(): After all threads are finished, we print the total sum of all six threads and terminate the program.
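The walkthrough above splits the range by hand and accumulates the per-thread sums in a shared variable. As an alternative sketch, not the approach used in the program above, OpenMP's parallel for construct with a reduction clause can divide the loop iterations among threads and combine the partial sums automatically. The function name sum_range is hypothetical.

```c
#include <stdio.h>

/* Hypothetical helper: sums the integers from first to last, inclusive. */
long sum_range(int first, int last) {
    long total = 0;

    /* parallel for divides the loop iterations among the threads.
       reduction(+:total) gives each thread a private copy of total
       and adds the copies together when the loop ends. */
    #pragma omp parallel for reduction(+:total)
    for (int i = first; i <= last; i++) {
        total += i;
    }

    return total;
}
```

Calling sum_range(0, 59) returns 1770, the same total as the six blocks of ten numbers in the program above. The result is the same when the code is compiled without OpenMP, because the loop then simply runs serially.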
