Matlab, C++ and OpenMP

Continuing with the earlier post about Mex files and coding C++ routines for Matlab, here is a very nice trick.

The problem is the following: you have a multicore computer with plenty of memory, but your problem is so big that you end up coding part of it in C or C++ and linking it into Matlab. This is bad, because inside that compiled code you can no longer profit from Matlab's automatic parallelization (parfor and friends).

Could you achieve something equivalent then? Well, it turns out you can. The idea is to use a parallelizing compiler, that is, one with OpenMP support. This can be GCC, Intel or some other one, but the instructions below are specific to Linux and GCC.

Let's assume that you have a very, very long loop over your spin configuration routine:

for (int k = 0; k < problem_size; ++k) {
   ...
}

The only thing you have to do in the code is instruct the compiler to split this loop among several independent workers:

#pragma omp parallel for
for (int k = 0; k < problem_size; ++k) {
   ...
}
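
To make the idea concrete, here is a minimal sketch of a complete loop of this kind. The function names chain_energy and all_energies, and the toy energy formula, are assumptions made up for illustration; they are not the routines from the earlier post. The point is only that every iteration writes to its own output slot, so the iterations are independent and the pragma is safe to apply.

```cpp
#include <vector>

// Hypothetical stand-in for the per-configuration work: energy of an open
// chain of spins, E = -sum_i s_i * s_{i+1}, where bit i of k encodes spin i
// (bit set -> +1, bit clear -> -1).
double chain_energy(unsigned k, int num_spins) {
    double energy = 0.0;
    for (int i = 0; i + 1 < num_spins; ++i) {
        const int s_i = ((k >> i) & 1u) ? 1 : -1;
        const int s_next = ((k >> (i + 1)) & 1u) ? 1 : -1;
        energy -= s_i * s_next;
    }
    return energy;
}

// Computes the energy of every configuration of num_spins spins.
// Assumes num_spins is small enough that 1 << num_spins fits in an int.
std::vector<double> all_energies(int num_spins) {
    const int problem_size = 1 << num_spins;
    std::vector<double> energies(problem_size);

    // Without -fopenmp the pragma is ignored and the loop runs serially.
    #pragma omp parallel for
    for (int k = 0; k < problem_size; ++k) {
        // Each iteration writes only energies[k], so no locking is needed.
        energies[k] = chain_energy(static_cast<unsigned>(k), num_spins);
    }
    return energies;
}
```

If the compiler is invoked without OpenMP support, the same source still builds and simply runs on one thread, which is handy for checking that the parallel and serial results agree.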

Then, when you compile the code in Matlab, pass the OpenMP flags and libraries:

mex CXXFLAGS="\$CXXFLAGS -march=native -Wall -fopenmp" LDFLAGS="\$LDFLAGS -lgomp" ... name_of_your_file.cc

Not all code can be parallelized, and not all parallelized code gives reasonable performance. In particular, code that uses a lot of memory, such as the spin models I introduced before, has to be rewritten to avoid using the database concurrently, because concurrent access by multiple threads to the same memory can be counterproductive, especially when the data does not fit in the cache.
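
One common way of doing such a rewrite, sketched here under assumptions, is to give each thread its own private buffer and merge the buffers once at the end. The histogram function below is a hypothetical example, not code from the spin models; it assumes non-negative input values and a small number of bins.

```cpp
#include <vector>

// Counts how many values fall into each bin (bin = value % num_bins).
// Assumes all values are non-negative and num_bins > 0.
std::vector<long> histogram(const std::vector<int>& values, int num_bins) {
    std::vector<long> total(num_bins, 0);
    const long n = static_cast<long>(values.size());

    #pragma omp parallel
    {
        // Private to each thread: no other thread touches this buffer.
        std::vector<long> local(num_bins, 0);

        #pragma omp for nowait
        for (long k = 0; k < n; ++k) {
            ++local[values[k] % num_bins];
        }

        // Merge the private counts into the shared result, one thread at a time.
        #pragma omp critical
        for (int b = 0; b < num_bins; ++b) {
            total[b] += local[b];
        }
    }
    return total;
}
```

The shared vector is only touched in the short merge step, so the threads spend most of their time working on memory that nobody else is writing to.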

You can control the number of threads in two different ways: calling omp_set_num_threads() from the OpenMP runtime (libgomp), or setting the portable environment variable OMP_NUM_THREADS. If possible I recommend the latter, which is more flexible and does not involve changing your code.
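
For completeness, the in-code route might look like the sketch below. The helper configure_threads is a made-up name for illustration; omp_set_num_threads() and omp_get_max_threads() are standard OpenMP runtime calls declared in omp.h. The _OPENMP guard is there so that the file still compiles when OpenMP is switched off.

```cpp
#ifdef _OPENMP
#include <omp.h>
#endif

// Hypothetical helper: requests a thread count for later parallel regions
// and returns the maximum number of threads OpenMP will then use.
// A non-positive request leaves the current setting (for example the one
// taken from OMP_NUM_THREADS) untouched.
int configure_threads(int requested) {
#ifdef _OPENMP
    if (requested > 0) {
        omp_set_num_threads(requested);
    }
    return omp_get_max_threads();
#else
    (void)requested;  // OpenMP disabled: everything runs on one thread.
    return 1;
#endif
}
```

Calling configure_threads(0) is a cheap way to check what the environment variable gave you without overriding it.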