MATLAB DCS

From UABgrid Documentation
Revision as of 11:25, 5 May 2010

Cheaha now has a 128 node license for the Distributed Computing Server component of Matlab.

In order to use DCS on Cheaha, you will have to use your own Matlab and Parallel Computing Toolbox license.

Overview

The following is an overview of the process to run Matlab jobs on cheaha using the Distributed Computing Server:

  • Obtain your own Matlab client license
  • Obtain your own Parallel Computing Toolbox license
  • Install and license both Matlab client and Parallel Computing Toolbox on your Windows / Linux / Mac workstation
  • Configure SSH on your workstation (see the section on 'SSH Keys' below)
  • Download and extract the Matlab submission functions to your Matlab PATH
  • Add Cheaha to the Parallel Computing Toolbox configuration
  • Write, test and debug your parallel code on your local workstation
  • Once the code is ready for production, run it on Cheaha using the Parallel Computing Toolbox
  • After the job completes (you will receive an email), retrieve the results
  • Destroy the job related content on cheaha to clean up

SSH Keys

The Matlab Parallel Toolbox uses SSH authentication via public-private key-pair to connect to the Matlab Distributed Computing Server on the cluster head node.

The process of configuring SSH keys differs depending on your client operating system. Linux and Mac should already have the appropriate SSH client software installed; Windows requires the installation of PuTTY.

Windows

This section documents the steps to install and configure PuTTY on Windows computers. This software provides the utilities that Matlab uses to communicate with the head node.

PuTTY

If PuTTY is installed, skip to the next section "Generate an SSH Key Pair".

If PuTTY isn't already installed on your system, download the tools from the PuTTY Downloads page.

Download and run this file to install PuTTY using the graphical installation tool.

Alternatively, you can download the individual components to a directory of your choosing. This install approach does not require Windows Administrator privileges:

Using the individual component install approach requires that you add the install directory to your PATH. See this Microsoft Support page for details on altering the PATH environment variable.

Generate an SSH Key Pair

Generate a public-private key pair by running the puttygen command.

Start PuTTYgen by either:

  • Clicking Start -> All Programs -> PuTTY -> PuTTYgen
  • Opening Windows Explorer / My Computer, browse to the PuTTY directory and double click 'puttygen'

This will bring up a window to manage your key pair.

Puttygen-window.png

Press the "Generate" button to start the process. You will be asked to move your mouse around in the blank area of that window to help generate a good random number. The progress bar will fill in as you do so, letting you know when the process is complete. Once complete, the PuTTYgen window will display your public key and offer you various options to work with that key.

Puttygen-savekey2.png

  1. Change the "Key Comment" to your Windows host name (run 'hostname' at the command prompt to discover the name of your Windows system)
  2. Set a passphrase for your private key, filling in both the "Key passphrase" and "Confirm passphrase" text boxes with the same passphrase. The passphrase is a local password for this private key. It doesn't have anything to do with any other passwords. It is strictly about protecting the private key that you just generated. Please refer to the UAB IT page for instructions on creating a strong password. You need to remember this passphrase because you will be prompted for it whenever you use this key-pair.
  3. Press the "Save private key" button. By default, the private key is saved to your "My Documents" folder. This is fine, and you can give the file any name you like. Just remember the name and where you saved it, so you can load it in the next steps.
  4. Keep your PuTTY Key Generator window open so you can use it below when you register your public key with the SSH server.
  5. Remember your passphrase.

Create a Session Definition

The PuTTY tools that Matlab Parallel Toolbox leverages are configured by creating "Saved Sessions" in PuTTY.

To create a PuTTY session for cheaha.uabgrid.uab.edu, follow these steps.

Start PuTTY by either:

  • Clicking Start -> All Programs -> PuTTY -> PuTTY
  • Opening Windows Explorer / My Computer, browse to the PuTTY directory and double click 'putty'
  1. This brings up the following dialog that has a collection of configuration categories in the "Category:" window on the left, and a context sensitive set of actions on the right. That is, if you change the category on the left you will change what you see on the right. The default category that you see is the Session, which is the main focus of our work here.
    Putty-session-dialog.png
  2. In the default "Session" category, fill in the "Host Name (or IP address)" text box with "cheaha.uabgrid.uab.edu", and fill in the "Saved Sessions" text box with "cheaha.uabgrid.uab.edu". Press the "Save" button
    Putty-session-dialog-filled-cheaha.png
  3. Select the "Data" sub-category under the "Connection" category from the left-side Category browser. Fill in the "Auto-login username" text box with the username for your account on cheaha.
  4. Select the "Session" category, and press "Save" again. You now have a saved session ready for use.
  5. Press the "Open" button at the bottom of the dialog. This will bring up a login window with a prompt for your password. Provide your cheaha password to log in. Keep this session open, as you will use it in the next section.

Register Your Public Key

In order to use your public-private key-pair to start an SSH session, you need to register your public key with cheaha by adding it to the list of authorized keys for that SSH server.

  1. Use the SSH connection established in the previous step
  2. In the PuTTYgen window text area labeled "Public key for pasting into OpenSSH authorized_keys file:", select the public key by right clicking the key and clicking "Select All". Copy the key by right clicking again and selecting "Copy"
  3. In the SSH session window, enter the command
    vi $HOME/.ssh/authorized_keys
  4. Press the SHIFT and o keys (in other words, a capital letter O), then press your right mouse button over this window. This will paste your public key onto a new line. Press the Escape (Esc) key, then type the key sequence colon-w-q-enter (:wq Enter) to save the file and exit
  5. Fix the file permissions using the command
    chmod u=rw $HOME/.ssh/authorized_keys
  6. End the session. Type the command
    exit

You can close the PuTTY Configuration and PuTTY Key Generator windows now if you haven't already done so.
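
The vi steps above can also be done without an editor. The following is a minimal sketch, assuming a bash shell on cheaha; add_authorized_key is a hypothetical helper name (not part of PuTTY or OpenSSH) and the key text is whatever you copied from the PuTTYgen window.

```shell
#!/bin/bash
# Hypothetical helper: append one public key line to an authorized_keys file,
# skipping the append when an identical line is already present.
add_authorized_key() {
    local key_line="$1"
    local key_file="${2:-$HOME/.ssh/authorized_keys}"
    local key_dir
    key_dir=$(dirname "$key_file")

    # Keep the directory and file private to the owner
    mkdir -p "$key_dir" && chmod 700 "$key_dir" || return 1
    touch "$key_file" && chmod 600 "$key_file" || return 1

    # -x whole-line match, -F fixed string (no regex interpretation)
    if ! grep -xF -- "$key_line" "$key_file" >/dev/null; then
        printf '%s\n' "$key_line" >> "$key_file"
    fi
}
```

Usage would look like add_authorized_key 'PASTE_PUBLIC_KEY_LINE_HERE'. Running it twice with the same key leaves a single entry in the file.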

Load Your Private Key

Loading your private key is the first step you will take prior to using Matlab to submit jobs to cheaha. This step essentially activates your key so that all PuTTY-based tools can use it in their operations.

Start Pageant by either:

  • Clicking Start -> All Programs -> PuTTY -> Pageant
  • Opening Windows Explorer / My Computer, browsing to the PuTTY directory and double clicking 'pageant'

This will place the "Pageant" icon (a computer wearing a tilted hat: Pageant-icon.png) in the notification area (system tray) in the lower-right of your screen.

  1. Click on the icon to bring up the Pageant Key List dialog
    Pageant-dialog.png
  2. Press the "Add Key" button. From the file dialog, select the private key created above (in My Documents by default)
  3. Press "Open" on the file dialog. You will be prompted for the passphrase to load your private key. Enter the passphrase and click OK

The private key should now be loaded and ready for use with all PuTTY-based tools. Click the close button to minimize Pageant.

Warning: While Pageant is running and has your private key loaded, anyone else using your computer can access your account on Cheaha. Please LOCK your workstation when unattended!

You are now ready to begin using the Matlab Parallel Toolbox.

Linux

If you don't already have an RSA key-pair generated for your user account on your workstation, create one by running the following command (choose a good passphrase):

$ ssh-keygen -t rsa

Enter file in which to save the key (~/.ssh/id_rsa): 
Enter passphrase (empty for no passphrase): 
Enter same passphrase again: 

Copy the public key to cheaha (change USERID to your cheaha login id). You will be prompted for your cheaha password.

$ ssh-copy-id -i $HOME/.ssh/id_rsa.pub USERID@cheaha.uabgrid.uab.edu

Start ssh-agent (if one is not already running) and load the key with ssh-add on your Linux workstation

$ eval $(ssh-agent)
$ ssh-add
Enter passphrase for ~/.ssh/id_rsa: 
Identity added: ~/.ssh/id_rsa (~/.ssh/id_rsa)

Now test that you can connect to cheaha without entering a password (again change USERID to your login name)

$ ssh USERID@cheaha.uabgrid.uab.edu uptime

 15:34:10 up 9 days,  5:11, 14 users,  load average: 0.25, 0.34, 0.50
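
The same test can be scripted so that it fails cleanly instead of falling back to a password prompt. This is a minimal sketch, assuming the OpenSSH client; the function names and the 10 second timeout are illustrative choices, not part of the cluster setup.

```shell
#!/bin/bash
# Hypothetical helper: succeed only if key-based login works without a prompt.
# BatchMode=yes makes ssh fail instead of asking for a password or passphrase.
can_login_without_password() {
    local user="$1"
    local host="${2:-cheaha.uabgrid.uab.edu}"
    ssh -o BatchMode=yes -o ConnectTimeout=10 "${user}@${host}" true
}

# Print a short report and return non-zero when the check fails.
report_login_check() {
    if can_login_without_password "$@"; then
        echo "key-based login OK"
    else
        echo "key-based login FAILED: check ssh-agent and authorized_keys"
        return 1
    fi
}
```

For example, report_login_check USERID checks the default host, and report_login_check USERID some.other.host checks a different one.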

Mac

This section will be updated shortly. In the meantime, the steps in the Linux section should be very close, if not the same.

Matlab from Your Desktop

Once SSH has been successfully configured, you are ready to continue setting up the Matlab client.

Matlab Parallel Toolbox enables you to submit your Matlab code to the cluster without leaving the graphical interface.

This section discusses the following steps:

  1. Download and extract the Matlab submission functions
  2. Start the Matlab client
  3. Add the cheaha parallel computing toolbox configuration
  4. Perform the validation tests

Matlab Submit Functions

In order for Parallel Toolbox to work with our cluster, you will have to copy the special submit functions to your Matlab PATH.

Download the appropriate zipped file and extract it to a directory in your Matlab PATH (note that both Linux and Mac use the 'unix' submit functions)

Typical Matlab PATH locations:

  • Windows: My Documents\MATLAB
  • Linux: $HOME/Documents/MATLAB
  • Mac:  ???
  • All Systems: $MATLABROOT/toolbox/local

Once the files have been unzipped / extracted to your Matlab PATH, you can start the Matlab client.

Parallel Computing Toolbox Configuration

In this section we will add the cheaha configuration to the Parallel Computing Toolbox followed by a quick validation test.

Prior to continuing, make sure that you've completed the following:

  • Configured SSH and successfully tested passwordless authentication to Cheaha
  • Loaded your SSH key
    • Windows users - means starting PuTTY Pageant and loading your SSH key
    • Linux users - means running ssh-agent and loading the SSH key with ssh-add
    • Mac users - ???
  • Downloaded and extracted the latest Matlab submission scripts to your Matlab PATH

Start the Matlab client on your workstation. Windows users should have a Matlab icon on the desktop; Linux and Mac users may have to use the command line.

  1. Click the Parallel menu
  2. Click Manage Configurations
  3. Click File -> New -> Generic

You should now have a window titled "Generic Scheduler Configuration Properties"

Make the following changes (please adjust DataLocation to an appropriate directory and substitute R2010a with R2009b or R2009a to match your client version):

  • Configuration name: cheaha
  • Description: cheaha.uabgrid.uab.edu
  • Scheduler Tab
    • Root directory of MATLAB installation for workers (ClusterMatlabRoot): /share/apps/mathworks/R2010a
    • Number of workers available to scheduler (ClusterSize): 8
    • Directory where job data is stored (DataLocation): C:\jobs\matlab
    • Function called when submitting parallel jobs (ParallelSubmitFcn): {@sgeNonSharedParallelSubmitFcn, 'cheaha.uabgrid.uab.edu', '$UABGRID_SCRATCH/jobs/matlab/ptbx'}
    • Function called when submitting distributed jobs (SubmitFcn): {@sgeNonSharedSimpleSubmitFcn, 'cheaha.uabgrid.uab.edu', '$UABGRID_SCRATCH/jobs/matlab/ptbx'}
    • Function called when canceling a job (CancelJobFcn):
    • Function called when canceling a task (CancelTaskFcn):
    • Cluster nodes' OS (ClusterOSType): unix
    • Function called when destroying a job (DestroyJobFcn): @sgeDestroyJob
    • Function called when destroying a task (DestroyTaskFcn):
    • Function called when getting the job state (GetJobStateFcn): @sgeGetJobState
    • Job data location is accessible from both client and cluster nodes (HasSharedFilesystem): False
  • Jobs Tab: Do not change
  • Tasks Tab: Do not change

The configuration should look similar to this screen shot Cheaha-parallel-config.jpg

Matlab from the Command Line

Matlab Versions

Use the 'module' command to view a list of available Matlab versions. If the version that you require isn't listed, please open a help desk ticket to request the installation.

The following is an example output of the command and doesn't necessarily represent the currently installed versions:

$ module avail mathworks

------------------------------------ /etc/modulefiles -------------------------------------
mathworks/R2009b
mathworks/R2009a
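
A script can check for a specific release before trying to load it. The sketch below is an illustration only: has_matlab_module is a hypothetical helper, and it assumes module names appear in the listing exactly as mathworks/RELEASE with no suffix such as "(default)".

```shell
#!/bin/bash
# Hypothetical helper: read a "module avail" listing on stdin and succeed
# if the requested mathworks release (for example R2010a) appears in it.
has_matlab_module() {
    local release="$1"
    # Put every whitespace-separated word on its own line, then look for an
    # exact whole-line match (-x) of the fixed string (-F).
    tr -s ' \t' '\n' | grep -xF -- "mathworks/${release}" >/dev/null
}
```

Many versions of the module command print the listing on stderr, so capture both streams when piping it in:

$ module avail mathworks 2>&1 | has_matlab_module R2010a || echo "R2010a is not installed"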

Simple Matlab Test

A simple test to verify that the Matlab client on Cheaha can check out a license from your server.

Set up your environment with the command:

$ module load mathworks/R2010a

As a test, you can run Matlab and access your license server with:

$ matlab -c port@license-server -nodesktop -nojvm -r 'rand, pause(0), exit'

For example:

$ module load mathworks/R2010a
$ matlab -c 27000@licserver -nodesktop -nojvm -r 'rand, pause(0), exit'

                                     < M A T L A B (R) >
                           Copyright 1984-2010 The MathWorks, Inc.
                        Version 7.10.0.499 (R2010a) 64-bit (glnxa64)
                                      February 5, 2010

 
  To get started, type one of these: helpwin, helpdesk, or demo.
  For product information, visit www.mathworks.com.
 
>> 
ans =

    0.8147

This will start matlab without a graphical display and without Java support. This is good just to verify things work, but do not run any significant computations on the Cheaha head node!
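
A common mistake is to leave the port@license-server placeholder unchanged. The following sketch checks the value before it is passed to matlab -c. It is a hypothetical helper and only covers the numeric-port@host form used on this page; it makes no claim about other values that -c may accept.

```shell
#!/bin/bash
# Hypothetical helper: succeed if the argument looks like PORT@HOST,
# where PORT is all digits and HOST is a plain host name.
is_port_at_host() {
    printf '%s\n' "$1" | grep -E '^[0-9]+@[A-Za-z0-9.-]+$' >/dev/null
}
```

With this, is_port_at_host 27000@licserver succeeds, while the unedited placeholder port@license-server is rejected because "port" is not a number.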

Matlab computational work must be run on the compute nodes by submitting a job script to the SGE scheduler.

Serial Matlab

Serial Matlab jobs have the following characteristics:

  • Consumes one of your client licenses for the duration of the job
  • Does not use the distributed licenses available on cheaha
  • Does not require the parallel computing toolbox
  • Restricted to a single CPU core (slot)

See the next section for an example using the distributed computing license.

Create a job script "matlabtest.qsub", making sure to change:

  • YOUR_EMAIL_ADDRESS
  • h_rt and s_rt to appropriate hard and soft runtime limits
  • h_vmem to the maximum amount of memory that your job will use

First create the working directory for the job:

$ mkdir -p $UABGRID_SCRATCH/jobs/matlab/serial01/output
$ cd $UABGRID_SCRATCH/jobs/matlab/serial01

Then save the following as "matlabtest.qsub" in that directory:

#!/bin/bash
#$ -S /bin/bash
#$ -cwd
#$ -V
#
#$ -N serialMatlab
#$ -l h_rt=00:10:00,s_rt=00:08:00,h_vmem=2G
#$ -j y
#
#$ -M YOUR_EMAIL_ADDRESS
#$ -m eas
#
module load mathworks/R2010a
 
matlab -c port@license-server -nodisplay -nojvm < matlab-script

Then submit the script to the scheduler with

$ cd $UABGRID_SCRATCH/jobs/matlab/serial01
$ qsub matlabtest.qsub

Check on it with qstat.

$ qstat -u $USER
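
The job script can also be generated from a few values rather than edited by hand. This is a minimal sketch, assuming the same SGE directives shown in matlabtest.qsub; make_serial_qsub and its argument order are hypothetical.

```shell
#!/bin/bash
# Hypothetical helper: print a serial Matlab SGE job script to stdout.
# Arguments: email, hard run time, soft run time, memory limit,
#            Matlab release, license (port@license-server), Matlab script file.
make_serial_qsub() {
    local email="$1" h_rt="$2" s_rt="$3" h_vmem="$4"
    local release="$5" license="$6" mscript="$7"
    # \$ keeps a literal dollar sign so the "#$" directive prefix survives
    cat <<EOF
#!/bin/bash
#\$ -S /bin/bash
#\$ -cwd
#\$ -V
#
#\$ -N serialMatlab
#\$ -l h_rt=${h_rt},s_rt=${s_rt},h_vmem=${h_vmem}
#\$ -j y
#
#\$ -M ${email}
#\$ -m eas
#
module load mathworks/${release}

matlab -c ${license} -nodisplay -nojvm < ${mscript}
EOF
}
```

For example:

$ make_serial_qsub YOUR_EMAIL_ADDRESS 00:10:00 00:08:00 2G R2010a port@license-server matlab-script > matlabtest.qsub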

Distributed Matlab

These instructions provide an example of how to create and submit a distributed Matlab job on cheaha.

Distributed Matlab jobs use the following licenses:

  • Your own Matlab client license with the Parallel Computing Toolbox
  • The Cheaha Distributed Computing license

The client license will only be needed for as long as it takes Matlab to start the job on the compute nodes (unless you keep the client open, for example using "waitForState(job)" in your Matlab script).

The instructions are a work in progress, so please contact Research Computing support with any questions or corrections.

First, create the working directory for the job

$ mkdir -p $UABGRID_SCRATCH/jobs/matlab/distrib01/output
$ cd $UABGRID_SCRATCH/jobs/matlab/distrib01

Next, create a simple 2-task distributed Matlab script called "sharedDistrib01.m", making sure to change:

  • matlab_ver to the Matlab release supported by your license
  • email to your email address
  • s_rt to an appropriate soft wall time limit
  • h_rt to the maximum wall time for your job
  • mem_free to the maximum memory needed for each task
  • outputDirectory to the directory where results should be stored

Don't make any changes to the section labeled "Configure the scheduler".

% Always set these variables
matlab_ver      = 'R2010a';    % (Matlab release supported by your license) R2009a R2009b R2010a
email           = 'YOUREMAIL'; % your email address
email_opt       = 'eas';       % qsub email options
s_rt            = '00:05:00';  % soft wall time
h_rt            = '00:07:00';  % hard wall time
mem_free        = '1G';        % Amount of memory needed per task
scratch         = getenv('UABGRID_SCRATCH');
outputDirectory  = [scratch, '/jobs/matlab/distrib01/output'];

% Configure the scheduler
sched = findResource('scheduler', 'type', 'generic');
set(sched, 'DataLocation'       , outputDirectory);
set(sched, 'ClusterMatlabRoot', ['/share/apps/mathworks/', matlab_ver]);
set(sched, 'HasSharedFilesystem',  true);
set(sched, 'ClusterOsType'      , 'unix');
set(sched, 'SubmitFcn', {@sgeSubmitFcn});
set(sched, 'DestroyJobFcn', {@sgeDestroyJob});
set(sched, 'GetJobStateFcn', {@sgeGetJobState});
sge_options = ['-pe matlab 1 -l matlab_dcs=1,mem_free=', mem_free, ',h_rt=', h_rt, ',s_rt=', s_rt, ' -m ', email_opt, ' -M ', email];
get(sched)
job = createJob(sched);

% start of user specific commands
createTask(job, @rand, 1, {3,3});
createTask(job, @rand, 1, {3,3});

submit(job)
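
The s_rt and h_rt values in the script are HH:MM:SS strings, and the examples on this page always set the soft limit below the hard limit. A quick shell check before launching Matlab can confirm that ordering. This is a sketch with hypothetical helper names; it assumes well-formed HH:MM:SS input.

```shell
#!/bin/bash
# Convert an HH:MM:SS wall time string to seconds.
hms_to_seconds() {
    # awk reads the fields as decimal numbers, so "08" is 8, not invalid octal
    printf '%s\n' "$1" | awk -F: '{ print $1 * 3600 + $2 * 60 + $3 }'
}

# Succeed only when the soft limit (first argument) is strictly below
# the hard limit (second argument).
soft_below_hard() {
    [ "$(hms_to_seconds "$1")" -lt "$(hms_to_seconds "$2")" ]
}
```

For example, soft_below_hard 00:05:00 00:07:00 succeeds, while swapping the two arguments fails.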

Running the Matlab script will submit 2 SGE single slot (CPU) jobs, one for each task. The Parallel Computing Toolbox requires Java VM, so notice that for this job we do not include the "-nojvm" switch!

$ module load mathworks/R2010a
$ matlab -c port@license-server -nodisplay < sharedDistrib01.m

Check qstat to see that the scheduler now has 2 jobs running, one for each task

$ qstat -u $USER

job-ID  prior   name       user         state submit/start at     queue                          slots ja-task-ID
-----------------------------------------------------------------------------------------------------------------
 110839 0.50167 Job1.1     jdoe         r     03/10/2010 16:32:37 all.q@compute-0-12.local           1
 110840 0.50083 Job1.2     jdoe         r     03/10/2010 16:32:37 all.q@compute-0-12.local           1

The job output can be found in the "output" directory
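
To check from a script whether both tasks are running, the qstat table can be filtered by its state column. This is a minimal sketch, assuming the column layout shown in the example output above; count_jobs_in_state is a hypothetical helper.

```shell
#!/bin/bash
# Hypothetical helper: read "qstat -u USER" output on stdin and print how many
# jobs are in the given state (for example r for running, qw for queued).
count_jobs_in_state() {
    local state="$1"
    # Job rows start with a numeric job-ID; the state is the fifth column.
    # "n + 0" prints 0 rather than an empty string when nothing matches.
    awk -v want="$state" '$1 ~ /^[0-9]+$/ && $5 == want { n++ } END { print n + 0 }'
}
```

For example:

$ qstat -u $USER | count_jobs_in_state r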

Parallel Matlab

These instructions provide an example of how to create and submit a parallel Matlab job on cheaha. Parallel Matlab jobs require two separate licenses:

  • Your own client license that includes the Parallel Computing Toolbox
  • The Cheaha 128 node Distributed Computing license

The client license will only be needed for as long as it takes Matlab to start the job on the compute nodes (unless you keep the client open, for example using "waitForState(job)" in your Matlab script).

Check out this Matlab Help Page for a quick overview of using parallel code in your Matlab scripts.

First, create the working directory for the job

$ mkdir -p $UABGRID_SCRATCH/jobs/matlab/parallel01/output
$ cd $UABGRID_SCRATCH/jobs/matlab/parallel01
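
Because the mkdir and cd commands must name the same directory, it can help to build the path once. The following is a minimal sketch; make_job_dir is a hypothetical helper and assumes UABGRID_SCRATCH is set in your environment on cheaha.

```shell
#!/bin/bash
# Hypothetical helper: create $UABGRID_SCRATCH/jobs/matlab/NAME/output
# and print the job directory so the caller can cd into it.
make_job_dir() {
    local name="$1"
    if [ -z "${UABGRID_SCRATCH:-}" ]; then
        echo "UABGRID_SCRATCH is not set" >&2
        return 1
    fi
    local job_dir="${UABGRID_SCRATCH}/jobs/matlab/${name}"
    mkdir -p "${job_dir}/output" || return 1
    printf '%s\n' "$job_dir"
}
```

For example:

$ cd "$(make_job_dir parallel01)"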

Next, create a simple 4-slot parallel Matlab script called "parjob.m", making sure to change:

  • email to your email address
  • s_rt to an appropriate soft wall time limit
  • h_rt to the maximum wall time for your job
  • mem_free to the maximum memory needed for each task
  • outputDirectory to the directory where results should be stored

Don't make any changes to the section labeled "Configure the scheduler".

% Always set these variables
email           = 'YOUR_EMAIL_ADDRESS';
s_rt            = '00:05:00';
h_rt            = '00:07:00';
mem_free        = '2G';
clusterHost     = 'cheaha.uabgrid.uab.edu';
scratch         = getenv('UABGRID_SCRATCH');
outputDirectory = [scratch, '/jobs/matlab/parallel01/output'];

% Configure the scheduler
sched = findResource('scheduler', 'type', 'generic');
set(sched, 'DataLocation'       , outputDirectory);
set(sched, 'ClusterMatlabRoot', '/share/apps/mathworks/R2009b');
set(sched, 'HasSharedFilesystem',  true              );
set(sched, 'ClusterOsType'      , 'unix'             );
set(sched, 'ParallelSubmitFcn', {@sgeParallelSubmitFcn, h_rt, s_rt, mem_free, email});
set(sched, 'DestroyJobFcn', {@sgeDestroyJob});
set(sched, 'GetJobStateFcn', {@sgeGetJobState});
get(sched)
pjob = createParallelJob(sched);

% start of user specific commands
createTask(pjob, 'rand', 1, {4});
set(pjob, 'MinimumNumberOfWorkers', 4);
set(pjob, 'MaximumNumberOfWorkers', 4);

submit(pjob)

Running the Matlab script will submit 1 SGE job requesting 4 slots (cpu cores). The Parallel Computing Toolbox requires Java VM, so notice that for this job we do not include the "-nojvm" switch!

$ module load mathworks/R2009b
$ matlab -c port@license-server -nodisplay < parjob.m

Check qstat to see that the scheduler now has 1 job running with 4 slots

$ qstat -u $USER

job-ID  prior   name       user         state submit/start at     queue                          slots ja-task-ID
-----------------------------------------------------------------------------------------------------------------
 110857 0.00000 Job1       jdoe         r     03/10/2010 17:20:08                                    4

The job output can be found in the "output" directory

ParFor Parallel Example

This example will utilize the parfor parallel loop as defined here.
