MATLAB DCS: Difference between revisions

From Cheaha
(Move firewall config to MatLabPool_Config)
(Simplified page content by extracting config for older matlab versions to MatLab_DCS_R2010a_and_Earlier and examples MatLab DCS Examples)
Line 37: Line 37:
In order for Parallel Computing Toolbox to work with our cluster, you will have to copy the special submit functions to your MATLAB PATH.


Please note that starting with MATLAB 2010b, the submit functions have been changed. If you are running a version prior to 2010b, please download the older functions:
These steps document the DCS configuration for MATLAB R2010b and later. For DCS configuration instructions for earlier versions of MATLAB, see the page [[MatLab DCS R2010a and Earlier]].


==== MATLAB Versions R2010b and later ====
Line 47: Line 47:
#* Linux:    $HOME/Documents/MATLAB
#* Mac:      $HOME/Documents/MATLAB
Once the submit function files have been downloaded and unzipped to your MATLAB PATH, start/restart the MATLAB client.
==== MATLAB Versions R2010a and prior ====
# Download the appropriate zipped file for your operating system (note that both Linux and Mac use the 'unix' submit functions)
#* [http://projects.uabgrid.uab.edu/matlab/browser/trunk/distributables/matlab-R2010a-windows-nonshared.zip?format=raw Windows MATLAB R2010a and prior Submit Functions] - '''Updated 05/13/2010'''
#* [http://projects.uabgrid.uab.edu/matlab/browser/trunk/distributables/matlab-R2010a-unix-nonshared.zip?format=raw Linux MATLAB R2010a and prior Submit Functions] - '''Updated 05/13/2010'''
#* [http://projects.uabgrid.uab.edu/matlab/browser/trunk/distributables/matlab-R2010a-unix-nonshared.zip?format=raw Mac MATLAB R2010a and prior Submit Functions] - '''Updated 05/13/2010'''
# Unzip the files to a directory on your MATLAB PATH. Typical MATLAB PATH locations are:
#* All Systems: $MATLABROOT/toolbox/local
#* Windows:    My Documents\MATLAB
#* Linux:  $HOME/Documents/MATLAB
#* Mac: ???


Once the submit function files have been downloaded and unzipped to your MATLAB PATH, start/restart the MATLAB client.
Line 100: Line 87:
[[Image:cheaha-parallel-config-R2010b.jpg]]


==== MATLAB R2010a and Earlier ====
'''Prior to continuing''', make sure that you've completed the following:
# Configured SSH and successfully tested passwordless authentication to Cheaha
# Loaded your SSH key
#* Windows users: start PuTTY Pageant and load your SSH key
#* Linux users: run ssh-agent and load your SSH key
#* Mac users: same as for Linux, run ssh-agent
# Downloaded and extracted the latest MATLAB submission scripts to your MATLAB PATH
Now that the prerequisites are complete:
# Download and save the [http://projects.uabgrid.uab.edu/matlab/browser/trunk/parallel-configs/cheaha-R2010a.mat?format=raw Cheaha cluster configuration file for MATLAB R2010a and prior]. It will be imported into MATLAB in a later step.
# Start the MATLAB client on your workstation. Windows users should have a MATLAB icon on the desktop; Linux users may have to start it from the command line. On Mac, from '''Finder''' click the MATLAB icon located in '''/Applications/'''
# Click the Parallel menu
# Click Manage Configurations
# Click File -> Import
# Browse to the location where you saved the cheaha-R2010a.mat file, select it, and click Open
The Configuration Manager should now list a new entry for cheaha.
[[Image:Matlab-configuration-manager.jpg]]
You may have to change the following settings (double click on cheaha in the Configuration Manager window):
* Root directory of MATLAB installation for workers (ClusterMatlabRoot): '''/share/apps/mathworks/R2010a'''
* Directory where job data is stored (DataLocation): '''C:\jobs\matlab'''
The configuration will look similar to this screen shot
[[Image:cheaha-parallel-config.jpg]]
=== Examples ===
==== Parfor ====
This example uses two files:
* myWave.m - a script with a parfor loop that generates a waveform
* rParforWave.m - the submission script
The job will use 4 slots (MATLAB workers) in total on the cluster: 3 workers plus the master worker process.
<ol><li>Create the myWave.m script containing this code
<pre>
parfor i=1:1024
  A(i) = sin(i*2*pi/1024);
end
</pre>
<li>Next create the rParforWave.m script making sure to change the '''YOUREMAIL''' string to a working email address</li>
<pre>
% Always set these variables
email          = 'YOUREMAIL';
email_opt      = 'eas';      % qsub email options
s_rt            = '00:05:00'; % soft wall time
h_rt            = '00:07:00'; % hard wall time
vf              = '1G';      % Amount of memory needed per task
min_cpu_slots  = 2;          % Min number of cpu slots needed for the job
max_cpu_slots  = 3;          % Max number of cpu slots needed for the job
% Configure the scheduler - Do NOT modify these
sge_options = ['-l vf=', vf, ',h_rt=', h_rt, ',s_rt=', s_rt, ' -m ', email_opt, ' -M ', email];
SGEClusterInfo.setExtraParameter(sge_options);
sched = findResource();
% End of scheduler configuration
% start of user specific commands
job = batch('myWave', 'matlabpool', max_cpu_slots, 'FileDependencies', {'myWave.m'});
% The following commands can be run once the job is submitted to view the results
% >> waitForState(job)
% >> load(job, 'A')
% >> plot(A)
% Once the job is complete, permanently remove its data
% >> destroy(job)
</pre>
<li>Select cheaha as the parallel configuration by clicking '''Parallel -> Select Configuration -> cheaha''' in the main MATLAB window
<li>Run the rParforWave.m code by opening the script in the MATLAB editor and clicking the green run arrow
<li>After several seconds you should see output similar to the following in the MATLAB Command Window
<ul><li>The '''job 294773''' is the job number assigned by the Cheaha scheduler
<li>The '''Job1''' is the job name and number used by MATLAB to reference the job. In most cases, this is the number that you'll use to interact with MATLAB to load the results, clean up the job, etc...
<pre>
Your job 294773 ("Job1") has been submitted
</pre></ul>
<li>Now that the job has been submitted, instruct MATLAB to wait for the job to complete using [http://www.mathworks.com/access/helpdesk/help/toolbox/distcomp/waitforstate.html waitForState]. MATLAB will show as 'Busy' until the job completes, at which time the >> prompt will reappear. You can verify that the job is complete by checking the job.State property.
<pre>
>> waitForState(job)
>> job.State
ans =
finished
</pre>
<li>Now that the job has completed, to view the results first use the [http://www.mathworks.com/access/helpdesk/help/toolbox/distcomp/load.html load] function to load the workspace variable '''A''' from our batch job:
<pre>
>> load(job, 'A')
</pre>
<li>Next, display the [http://www.mathworks.com/access/helpdesk/help/techdoc/ref/plot.html plot]
<pre>
>> plot(A)
</pre>
[[Image:RParforWave-plot.png]]
<li>Once you are done with the job, make sure to [http://www.mathworks.com/access/helpdesk/help/toolbox/distcomp/destroy.html destroy] it to clean up the space used on the cluster
<pre>
>> job.destroy
</pre>
</ol>
In the case of longer running jobs, you probably don't want to tie up your MATLAB client by using [http://www.mathworks.com/access/helpdesk/help/toolbox/distcomp/waitforstate.html waitForState].
Skipping the wait lets you perform other tasks in MATLAB, or exit entirely, while the job runs.
MATLAB provides a function to load a previously submitted job back into the workspace, [http://www.mathworks.com/access/helpdesk/help/toolbox/distcomp/findjob.html findJob].
In the example above, our MATLAB job name was '''Job1''' and from that we can deduce that the MATLAB job number was '''1'''. It can be loaded as follows (make sure the correct Parallel configuration is selected):
<pre>
>> sched = findResource();
>> job = findJob(sched, 'ID', 1);
>> job.State
ans =
finished
</pre>


It is important to clean up after the job using [http://www.mathworks.com/access/helpdesk/help/toolbox/distcomp/destroy.html destroy] to free up hard disk space on both your desktop and the cluster. Any output that is to be saved should be copied to another location on your Desktop prior to running [http://www.mathworks.com/access/helpdesk/help/toolbox/distcomp/destroy.html destroy].
<pre>
>> job = findJob(sched, 'ID', 1);
>> job.destroy
</pre>
{{MATLAB Support}}


[[Category:MATLAB]]

Revision as of 23:40, 6 September 2011

The MATLAB Distributed Computing Server (MATLAB DCS) is a parallel computing extension to MATLAB that spreads processing across a large number of worker nodes so that compute-intensive operations complete faster.

UAB IT Research Computing maintains a 128 worker node license for the Cheaha computing platform. To use DCS on Cheaha, you will need to use a MATLAB instance with the Parallel Computing Toolbox installed.

To leverage the MATLAB worker nodes on Cheaha, the Parallel Computing Toolbox must be configured to communicate with Cheaha by following the steps in this document.

Please see the MATLAB application page for more information and a general overview of MATLAB and its use at UAB.

References

Parallel Computing Toolbox User's Guide - This is the official 655 page MATLAB user guide for the Parallel Computing Toolbox and is recommended reading!

Overview

The following is an overview of the process to run MATLAB jobs on Cheaha using the Distributed Computing Server:

  • Initial Setup
    • Install and license both MATLAB client and Parallel Computing Toolbox on your Windows / Linux / Mac workstation
    • Configure SSH on your workstation (see the section on 'SSH Keys' below)
    • Download and extract the MATLAB submission functions to your MATLAB PATH
    • Add Cheaha to the Parallel Computing Toolbox configuration
  • Running Jobs
    • Write, test and debug your parallel code on your local workstation
    • Once the code is ready for production, run it on Cheaha using the Parallel Computing Toolbox
    • After the job completes (you will receive an email), retrieve the results
    • Destroy the job-related content on Cheaha to clean up

MATLAB from Your Desktop

Once SSH has been successfully configured, you are ready to continue setting up the MATLAB client.

MATLAB Parallel Computing Toolbox enables you to submit your MATLAB code to the cluster without leaving the graphical interface.

This section discusses the following steps:

  1. Download and extract the MATLAB submission functions
  2. Start the MATLAB client
  3. Add the cheaha Parallel Computing Toolbox configuration
  4. Perform the validation tests

MATLAB Submit Functions

In order for Parallel Computing Toolbox to work with our cluster, you will have to copy the special submit functions to your MATLAB PATH.

These steps document the DCS configuration for MATLAB R2010b and later. For DCS configuration instructions for earlier versions of MATLAB, see the page MatLab DCS R2010a and Earlier.

MATLAB Versions R2010b and later

All operating systems (Windows, Linux and Mac) are supported by a single set of submit functions:

  1. Download the MATLAB submit functions for R2010b and later
  2. Unzip the files to a directory on your MATLAB PATH. Typical MATLAB PATH locations are:
    • Windows: My Documents\MATLAB
    • Linux: $HOME/Documents/MATLAB
    • Mac: $HOME/Documents/MATLAB

Once the submit function files have been downloaded and unzipped to your MATLAB PATH, start/restart the MATLAB client.

NOTE: Your MATLAB PATH may be viewed/altered by starting the MATLAB client on your workstation and clicking File -> Set Path
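
The unzip-and-add-to-path step can also be done from the MATLAB command window. The following is only a minimal sketch: the zip file name is a hypothetical placeholder (use the name of the file you actually downloaded), and the target folder assumes the Linux/Mac location listed above.

```matlab
% Hypothetical file name: replace with the archive you downloaded.
zipFile   = 'submit-functions.zip';
% Assumed target: the Linux/Mac location from the list above.
targetDir = fullfile(getenv('HOME'), 'Documents', 'MATLAB');

% Create the target folder if it does not exist yet.
if ~exist(targetDir, 'dir')
    mkdir(targetDir);
end

% Extract the archive; unzip returns the names of the extracted files.
extracted = unzip(zipFile, targetDir);
disp(extracted(:));

% Put the folder on the MATLAB PATH and save the PATH for future sessions.
addpath(targetDir);
savepath;   % may fail if you cannot write to pathdef.m
```

If the archive unpacks into a subfolder, add that subfolder to the path instead, then restart the MATLAB client as described above.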

Parallel Computing Toolbox Configuration

In this section we will add the cheaha configuration to the Parallel Computing Toolbox followed by a quick validation test.

Please follow the instructions for the appropriate version of MATLAB

MATLAB R2010b and Later

Prior to continuing, make sure that you've completed the following:

  1. Configured SSH and successfully tested passwordless authentication to Cheaha
  2. Loaded your SSH key
    • Windows users: start PuTTY Pageant and load your SSH key
    • Linux users: run ssh-agent and load your SSH key
    • Mac users: same as for Linux, run ssh-agent
  3. Downloaded and extracted the latest MATLAB submission scripts to your MATLAB PATH
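
The first two prerequisites can be spot-checked from inside MATLAB before continuing. This is a hedged sketch, not part of the official setup: it assumes an OpenSSH client (typical on Linux and Mac) is on the system path, and both the user id and the host name are placeholders that this page does not specify.

```matlab
% Placeholders: replace with your Cheaha user id and the cluster's host name.
userId = 'YOURUSERID';
host   = 'CHEAHA_HOSTNAME';

% BatchMode=yes makes ssh fail instead of prompting for a password,
% so a zero exit status means key-based login worked.
cmd = sprintf('ssh -o BatchMode=yes %s@%s hostname', userId, host);
[status, output] = system(cmd);

if status == 0
    fprintf('Passwordless SSH works; remote host reports: %s\n', strtrim(output));
else
    fprintf('SSH test failed (exit status %d):\n%s\n', status, output);
end
```

A failure here usually means the key is not loaded in the agent or SSH is not yet configured; resolve that before importing the cluster configuration.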

Now that the prerequisites are complete:

  1. Download and save the Cheaha cluster configuration file. It will be imported into MATLAB in a later step.
  2. Start the MATLAB client on your workstation. Windows users should have a MATLAB icon on the desktop; Linux users may have to start it from the command line. On Mac, from Finder click the MATLAB icon located in /Applications/
  3. Click the Parallel menu
  4. Click Manage Configurations
  5. Click File -> Import
  6. Browse to the location where you saved the cheaha.mat file, select it, and click Open

The Configuration Manager should now list a new entry for cheaha.

Matlab-configuration-manager.jpg

  • Double click on cheaha in the Configuration Manager window to open the configuration window
    • Stretch the window to the right so that you can view all of the text in the SubmitFcn field (makes it easier to modify)
    • ClusterMatlabRoot: If necessary, change the Root directory of MATLAB installation for workers to the correct version, R2010b in this example: /share/apps/mathworks/R2010b
    • DataLocation: Change the local directory where job data is stored to an existing directory on your workstation, for example: C:\jobs\matlab
    • ParallelSubmitFcn: Change "YOURUSERID" to your Cheaha user id
    • SubmitFcn: Change "YOURUSERID" to your Cheaha user id

The configuration will look similar to this screen shot.

Cheaha-parallel-config-R2010b.jpg
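
Because DataLocation must point to an existing directory, it can help to create it and then confirm that the imported configuration loads. The sketch below assumes the example path C:\jobs\matlab from the list above and a configuration named cheaha; it uses the configuration-based findResource syntax of the MATLAB releases this page targets (newer releases replaced configurations with cluster profiles).

```matlab
% Example path from the configuration above; change it to match your DataLocation.
dataLocation = 'C:\jobs\matlab';

% Create the job data directory (including parent folders) if it is missing.
if ~exist(dataLocation, 'dir')
    mkdir(dataLocation);
end

% Look up the scheduler described by the imported 'cheaha' configuration
% and display its properties (ClusterMatlabRoot, DataLocation, SubmitFcn, ...).
sched = findResource('scheduler', 'configuration', 'cheaha');
get(sched)
```

If the displayed SubmitFcn or ParallelSubmitFcn still contains "YOURUSERID", revisit the configuration window and replace it with your Cheaha user id.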


MATLAB Support / Mailing List

As with any application or computer language, learning to use MATLAB to analyze data or to develop or modify MATLAB applications is an individual responsibility. There is ample application documentation available from the Mathworks website, potential outreach to colleagues who also use MATLAB, and options for consultation with Mathworks. Mathworks also hosts on-campus training seminars several times a year and provides many on-line learning tutorials.

Installation support for MATLAB at UAB is provided by your local IT support organization and the Docs wiki.

Mathworks Website

Your first and best option for application-specific questions on MATLAB is to refer to the on-line MATLAB documentation. The Mathworks site also provides a support matrix and an on-line knowledge base.

UAB MATLAB Wiki

The MATLAB page on the Docs wiki is the starting point for installing MATLAB at UAB and, optionally, configuring it to use cluster computing. All users are encouraged to contribute to the MATLAB knowledge in this wiki, especially if you see areas where improvements are needed. Remember, this knowledge base is only as good as the people who contribute to it.

Contributing to the wiki is as easy as clicking the login link on the top-right of the page and signing in with your UAB BlazerID. If you are unsure about making an edit, you can make suggestions for improvement on the page's Discussion tab or discuss the proposed improvement in the MATLAB user group.

UAB MATLAB User Group

At UAB, MATLAB installation support is provided by your local IT support group. Support for application-specific questions is available from peers in your research group. We realize that some people are not as familiar with MATLAB as others. For this reason, we have established a MATLAB user forum (mailing list) where users of MATLAB at UAB can help answer each other's questions.

This is a network of volunteers sharing their knowledge with peers. You are encouraged to reach out to this community for questions on using MATLAB by

Archives of MATLAB user group discussions are available on-line at https://vo.uabgrid.uab.edu/sympa/arc/matlab-user. You may find your question is already answered in these archives.


UAB MATLAB announce mailing list

To receive information about UAB's MATLAB license and announcements please subscribe to the matlab-annc mailing list by