How do I use Matlab's Distributed Computing features?

MATLAB Distributed Computing Server (MDCS) is installed for MATLAB versions R2014a, R2015a, R2015b, R2016a, R2017a, R2018a, R2018b, and R2019a.

NOTE: the commands used to submit jobs changed with R2017a. This page describes the commands for R2017a and later; see this guide for earlier versions.

Using MATLAB on the cluster enables you to utilize high performance facilities like:

  • Parallel computing
    • Parallel for-loops
    • Evaluating functions in the background
  • Big data processing
    • Analyzing big data sets in parallel
  • Batch processing
    • Offloading execution of functions to run in the background
  • GPU computing
    • Accelerating your code by running it on a GPU
  • Machine and deep learning

For a complete user guide from MATHWORKS, see
https://se.mathworks.com/help/parallel-computing/index.html?s_tid=CRUX_lftnav

Get started

With MATLAB you can, for example, submit jobs directly to the job queue scheduler without having to use Slurm commands yourself. To do this, log in to the cluster's login node (e.g. rackham.uppmax.uu.se) with SSH (X-forwarding enabled), or, often more conveniently, with ThinLinc (see our ThinLinc guide).

Please use interactive mode to avoid unexpectedly overdrawing your project's core-hour allocation!

Load the MATLAB module, e.g.:

$ module load matlab/R2019a  


and launch the GUI with:

$ matlab &

The trailing & runs MATLAB in the background, so the terminal remains usable.

Begin by running the command "configCluster" in the MATLAB Command Window to choose a cluster configuration. MATLAB will set up the configuration and then print some instructions; follow them. They describe what is needed in your script, or on the command line, to run in parallel on the cluster.
You can also set defaults that are used when you do not specify the settings explicitly: go to HOME > ENVIRONMENT > Parallel > Parallel Preferences.

A simple test case that can be run is the following:

>> configCluster
   [1] rackham
   [2] snowy
Select a cluster [1-2]: 1
>> 
>> c = parcluster('rackham');
>> c.AdditionalProperties.AccountName = 'snic2019-1-234';
>> c.AdditionalProperties.QueueName = 'node';
>> c.AdditionalProperties.WallTime = '00:10:00';
>> c.saveProfile
>> job = c.batch(@parallel_example, 1, {90, 5}, 'pool', 19)
>> job.wait
>> job.fetchOutputs{:}

where parallel_example.m is a file containing the following MATLAB function:

function t = parallel_example(nLoopIters, sleepTime)
  % Time a parfor loop in which each iteration sleeps for sleepTime seconds.
  t0 = tic;
  parfor idx = 1:nLoopIters
    A(idx) = idx;
    pause(sleepTime);
  end
  t = toc(t0);
end

This will schedule a 20-task node job (19 pool workers + 1 task running the batch function itself) on Rackham under the given project (so change 'snic2019-1-234' to your own project name). For the moment, jobs are hard-coded to be node jobs. This means that if you request 21 tasks instead (20 + 1), you will get a two-node job, but only one core will be used on the second node. In that case you should request 40 tasks (39 + 1) instead.
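To fill N whole Rackham nodes (20 cores each), the 'pool' argument should therefore be N*20 - 1. The arithmetic can be sketched in the Command Window like this:

>> nNodes = 2;                          % number of full nodes wanted
>> coresPerNode = 20;                   % Rackham compute nodes have 20 cores
>> poolSize = nNodes*coresPerNode - 1   % 39; one task is reserved for the batch client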

The second argument in the call to c.batch(), 1 in this example, is the number of output arguments expected from the function to be called. A function that returns no output arguments needs a 0 here instead.

The curly brackets {90, 5} in the example contain the input arguments for the function to be called, in this example nLoopIters=90 and sleepTime=5.
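For instance, submitting a function that takes no inputs and returns nothing would look like the following (my_setup_function is a hypothetical placeholder for your own function):

>> job = c.batch(@my_setup_function, 0, {}, 'pool', 19)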

For jobs spanning several nodes (in this case two), modify the call to:

>> configCluster
   [1] rackham
   [2] snowy
Select a cluster [1-2]: 1
>> 
>> c = parcluster('rackham');
>> c.AdditionalProperties.AccountName = 'snic2019-1-234';
>> c.AdditionalProperties.QueueName = 'node';
>> c.AdditionalProperties.WallTime = '00:10:00';
>> c.saveProfile
>> job = c.batch(@parallel_example_hvy, 1, {1000, 1000000}, 'pool', 39)
>> job.wait
>> job.fetchOutputs{:}

where parallel_example_hvy.m is a file containing the following MATLAB function:

function cmdout = parallel_example_hvy(nLoopIters, sleepTime)
  % Return the output of 'module list' and keep the workers busy
  % with a CPU-bound parfor loop.
  [status, cmdout] = system('module list');
  parfor idx = 1:nLoopIters
    A(idx) = idx;
    for foo = 1:nLoopIters*sleepTime
      A(idx) = A(idx) + A(idx);
      A(idx) = A(idx)/3;
    end
  end
end

To see the on-screen output from a job, use job.Tasks.Diary. Return values from the submitted function are fetched with fetchOutputs().
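Putting this together, a previously submitted job can be retrieved and inspected from a later session. A minimal sketch using standard Parallel Computing Toolbox calls (picking the most recent job for illustration):

>> c = parcluster('rackham');
>> jobs = c.Jobs;              % previously submitted jobs for this profile
>> job = jobs(end);            % pick the most recent one
>> job.State                   % e.g. 'queued', 'running' or 'finished'
>> job.Tasks.Diary             % command-window output of each task
>> out = job.fetchOutputs;     % cell array with the function's return values
>> delete(job)                 % remove the job and its data when done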

For more information about MATLAB's distributed computing features, please see MATLAB's HPC Portal.