MATLAB DCS: Difference between revisions

Revision as of 22:23, 24 March 2010

Steps to run Matlab

Cheaha now has a 128 node license for the Distributed Computing Server component of Matlab.

In order to use DCS on Cheaha, you will have to use your own Matlab and Parallel Computing Toolbox license.

Matlab Versions

Use the 'module' command to view a list of available Matlab versions. If the version that you require isn't listed, please open a help desk ticket to request the installation.

The following is an example output of the command and doesn't necessarily represent the currently installed versions:

$ module avail mathworks

------------------------------------ /etc/modulefiles -------------------------------------
mathworks/R2009b
mathworks/R2009a

Simple Matlab Test

The following simple test verifies that the Matlab client on Cheaha can check out a license from your server.

Set up your environment with the command:

$ module load mathworks/R2009b

As a test, you can run Matlab and access your license server with:

$ matlab -c port@license-server -nodesktop -nojvm -r "rand, exit"

For example:

$ module load mathworks/R2009b
$ matlab -c 27000@licserver.uab.edu -nodesktop -nojvm -r "rand, exit"

                        < M A T L A B (R) >
                Copyright 1984-2009 The MathWorks, Inc.
              Version 7.9.0.529 (R2009b) 64-bit (glnxa64)
                          August 12, 2009
 
  To get started, type one of these: helpwin, helpdesk, or demo.
  For product information, visit www.mathworks.com.

ans =

    0.8147

This will start Matlab without a graphical display and without Java support. This is good just to verify things work, but do not run any significant computations on the Cheaha head node!

Matlab computational work must be run on the compute nodes by submitting a job script to the SGE scheduler.

Serial Matlab

Serial Matlab jobs have the following characteristics:

* Consumes one of your client licenses for the duration of the job
* Does not use the distributed licenses available on Cheaha
* Does not require the Parallel Computing Toolbox
* Restricted to a single CPU core (slot)

See the next section for an example using the distributed computing license.

First, create the working directory for the job:

$ mkdir -p $UABGRID_SCRATCH/jobs/matlab/serial01/output
$ cd $UABGRID_SCRATCH/jobs/matlab/serial01

Next, create a job script "matlabtest.qsub", making sure to change:

* YOUR_EMAIL_ADDRESS
* h_rt and s_rt to appropriate hard and soft runtime limits
* h_vmem to the maximum amount of memory that your job will use

#!/bin/bash
#$ -S /bin/bash
#$ -cwd
#
#$ -N serialMatlab
#$ -l h_rt=00:10:00,s_rt=00:08:00,h_vmem=2G
#$ -j y
#
#$ -M YOUR_EMAIL_ADDRESS
#$ -m eas
#
#$ -V

module load mathworks/R2009b

matlab -c port@license-server -nodisplay -nojvm < matlab-script
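
The script above requests a soft runtime limit (s_rt) that is shorter than the hard limit (h_rt). In SGE, exceeding s_rt sends the job a warning signal (SIGUSR1), while exceeding h_rt kills it, so the soft value is only useful when it is the smaller of the two. The following is a minimal sketch of a check that could be run before submitting; the function names hms_to_seconds and check_limits are hypothetical helpers and are not part of SGE or Cheaha.

```shell
# Convert an HH:MM:SS limit to seconds.
# awk treats each field as a decimal number, so "08" is not read as octal.
hms_to_seconds() {
    echo "$1" | awk -F: '{ print $1 * 3600 + $2 * 60 + $3 }'
}

# Succeed only when the soft limit ($1) is strictly shorter than the hard limit ($2).
check_limits() {
    [ "$(hms_to_seconds "$1")" -lt "$(hms_to_seconds "$2")" ]
}

# Values taken from the job script above (s_rt=00:08:00, h_rt=00:10:00)
if check_limits 00:08:00 00:10:00; then
    echo "limits ok"
else
    echo "s_rt must be shorter than h_rt" >&2
fi
```

The same check applies to the s_rt and h_rt values set in the Matlab scripts later on this page.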

Then submit the script to the scheduler with

$ cd $UABGRID_SCRATCH/jobs/matlab/serial01
$ qsub matlabtest.qsub

Check on it with qstat.

$ qstat -u $USER

Distributed Matlab

These instructions provide an example of how to create and submit a distributed Matlab job on Cheaha.

Distributed Matlab jobs use the following licenses:

  • Your own Matlab client license with the Parallel Computing Toolbox
  • The Cheaha Distributed Computing license

The client license will only be needed for as long as it takes Matlab to start the job on the compute nodes (unless you keep the client open, for example using "waitForState(job)" in your Matlab script).

The instructions are a work in progress, so please contact Research Computing support with any questions or corrections.

First, create the working directory for the job

$ mkdir -p $UABGRID_SCRATCH/jobs/matlab/distrib01/output
$ cd $UABGRID_SCRATCH/jobs/matlab/distrib01

Next, create a simple 2-task distributed Matlab script called "distrib.m", making sure to change:

* email to your email address
* s_rt to an appropriate soft wall time limit
* h_rt to the maximum wall time for your job
* mem_free to the maximum memory needed for each task
* outputDirectory to the directory where results should be stored

Don't make any changes to the section labeled "Configure the scheduler".

% Always set these variables
email           = 'YOUR_EMAIL_ADDRESS';
s_rt            = '00:05:00';
h_rt            = '00:07:00';
mem_free        = '1G';
clusterHost     = 'cheaha.uabgrid.uab.edu';
scratch         = getenv('UABGRID_SCRATCH');
outputDirectory  = [scratch, '/jobs/matlab/distrib01/output'];

% Configure the scheduler
sched = findResource('scheduler', 'type', 'generic');
set(sched, 'DataLocation'       , outputDirectory);
set(sched, 'ClusterMatlabRoot', '/share/apps/mathworks/R2009b');
set(sched, 'HasSharedFilesystem',  true              );
set(sched, 'ClusterOsType'      , 'unix'             );
set(sched, 'SubmitFcn', {@sgeSubmitFcn, h_rt, s_rt, mem_free, email});
set(sched, 'DestroyJobFcn', {@sgeDestroyJob});
set(sched, 'GetJobStateFcn', {@sgeGetJobState});
get(sched)
job = createJob(sched);

% start of user specific commands
createTask(job, @rand, 1, {3,3});
createTask(job, @rand, 1, {3,3});

submit(job)

Running the Matlab script will submit 2 single-slot (CPU) SGE jobs, one for each task. The Parallel Computing Toolbox requires the Java VM, so notice that for this job we do not include the "-nojvm" switch!

$ module load mathworks/R2009b
$ matlab -c port@license-server -nodisplay < distrib.m

Check qstat to see that the scheduler now has 2 jobs running, one for each task:

$ qstat -u $USER

job-ID  prior   name       user         state submit/start at     queue                          slots ja-task-ID
-----------------------------------------------------------------------------------------------------------------
 110839 0.50167 Job1.1     jdoe         r     03/10/2010 16:32:37 all.q@compute-0-12.local           1
 110840 0.50083 Job1.2     jdoe         r     03/10/2010 16:32:37 all.q@compute-0-12.local           1

The job output can be found in the "output" directory.
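
Because the Matlab client exits right after submitting, it can be handy to wait in the shell until the tasks have left the queue before looking at the output. The sketch below assumes the qstat layout shown in the listing above (two header lines, then one line per job) and that qstat prints nothing when no jobs are queued. The names count_jobs and wait_for_jobs are hypothetical helpers, and the 30 second polling interval is an arbitrary choice.

```shell
# Count job lines in `qstat -u $USER` output read from stdin.
# The first two lines are the column header and the dashed separator.
count_jobs() {
    awk 'NR > 2 && NF > 0 { n++ } END { print n + 0 }'
}

# Poll the scheduler until the current user has no jobs left.
# Note: this counts all of the user's jobs, not only the Matlab tasks.
wait_for_jobs() {
    while [ "$(qstat -u "$USER" | count_jobs)" -gt 0 ]; do
        sleep 30
    done
}
```

After running distrib.m, calling wait_for_jobs blocks until qstat no longer lists any of your jobs.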

Parallel Matlab

These instructions provide an example of how to create and submit a parallel Matlab job on Cheaha. Parallel Matlab jobs require two separate licenses:

* Your own client license that includes the Parallel Computing Toolbox
* The Cheaha 128 node Distributed Computing license

The client license will only be needed for as long as it takes Matlab to start the job on the compute nodes (unless you keep the client open, for example using "waitForState(pjob)" in your Matlab script).

Check out this Matlab Help Page for a quick overview of using parallel code in your Matlab scripts.

First, create the working directory for the job

$ mkdir -p $UABGRID_SCRATCH/jobs/matlab/parallel01/output
$ cd $UABGRID_SCRATCH/jobs/matlab/parallel01
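
The job directory name is typed twice in the commands above, and again in the Matlab script, so a single mistyped character leaves the output directory missing. One way to keep the paths consistent is to build them from a single name, as in the sketch below. The function make_jobdir is a hypothetical helper, not a Cheaha utility; it stops with an error if UABGRID_SCRATCH is not set.

```shell
# Create $UABGRID_SCRATCH/jobs/matlab/<name>/output and print the job directory.
# ${VAR:?message} aborts with the message when the variable is unset or empty.
make_jobdir() {
    jobdir="${UABGRID_SCRATCH:?UABGRID_SCRATCH is not set}/jobs/matlab/$1"
    mkdir -p "$jobdir/output" && echo "$jobdir"
}
```

For example, cd "$(make_jobdir parallel01)" creates the directories and changes into the job directory in one step.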

Next, create a simple 4-slot parallel Matlab script called "parjob.m", making sure to change:

* email to your email address
* s_rt to an appropriate soft wall time limit
* h_rt to the maximum wall time for your job
* mem_free to the maximum memory needed for each task
* outputDirectory to the directory where results should be stored

Don't make any changes to the section labeled "Configure the scheduler".

% Always set these variables
email           = 'YOUR_EMAIL_ADDRESS';
s_rt            = '00:05:00';
h_rt            = '00:07:00';
mem_free        = '2G';
clusterHost     = 'cheaha.uabgrid.uab.edu';
scratch         = getenv('UABGRID_SCRATCH');
outputDirectory = [scratch, '/jobs/matlab/parallel01/output'];

% Configure the scheduler
sched = findResource('scheduler', 'type', 'generic');
set(sched, 'DataLocation'       , outputDirectory);
set(sched, 'ClusterMatlabRoot', '/share/apps/mathworks/R2009b');
set(sched, 'HasSharedFilesystem',  true              );
set(sched, 'ClusterOsType'      , 'unix'             );
set(sched, 'ParallelSubmitFcn', {@sgeParallelSubmitFcn, h_rt, s_rt, mem_free, email});
set(sched, 'DestroyJobFcn', {@sgeDestroyJob});
set(sched, 'GetJobStateFcn', {@sgeGetJobState});
get(sched)
pjob = createParallelJob(sched);

% start of user specific commands
createTask(pjob, 'rand', 1, {4});
set(pjob, 'MinimumNumberOfWorkers', 4);
set(pjob, 'MaximumNumberOfWorkers', 4);

submit(pjob)

Running the Matlab script will submit 1 SGE job requesting 4 slots (CPU cores). The Parallel Computing Toolbox requires the Java VM, so notice that for this job we do not include the "-nojvm" switch!

$ module load mathworks/R2009b
$ matlab -c port@license-server -nodisplay < parjob.m

Check qstat to see that the scheduler now has 1 job running with 4 slots:

$ qstat -u $USER

job-ID  prior   name       user         state submit/start at     queue                          slots ja-task-ID
-----------------------------------------------------------------------------------------------------------------
 110857 0.00000 Job1       jdoe         r     03/10/2010 17:20:08                                    4

The job output can be found in the "output" directory.
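
To confirm that the parallel job received the number of slots set by MinimumNumberOfWorkers and MaximumNumberOfWorkers, the slots column of the qstat listing can be summed. This is a rough sketch: it assumes slots is the last field on each job line, which holds only when the ja-task-ID column is empty, as in the listing above. The name total_slots is a hypothetical helper.

```shell
# Sum the slots column of `qstat -u $USER` output read from stdin.
# Skips the two header lines and treats the last field of each job line as slots.
total_slots() {
    awk 'NR > 2 && NF > 0 { s += $NF } END { print s + 0 }'
}
```

Running qstat -u $USER | total_slots against the listing above would report 4.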

ParFor Parallel Example

This example will utilize the parfor parallel loop as defined here.