MATLAB workshop 2011

From Cheaha
Revision as of 16:25, 21 September 2011 by Tanthony@uab.edu (talk | contribs)



Presenter - Thomas Anthony

UAB IT Research Computing

tanthony@uab.edu

Introduction

MATLAB (matrix laboratory) is a numerical computing environment and fourth-generation programming language. Developed by MathWorks, MATLAB supports matrix manipulation, plotting of functions and data, implementation of algorithms, creation of user interfaces, and interfacing with programs written in other languages, including C, C++, and Fortran. An additional package, Simulink, adds graphical multi-domain simulation and Model-Based Design for dynamic and embedded systems.


In January 2011, UAB acquired a site license for MATLAB that allows faculty, staff, post-docs, and graduate students to use MATLAB, Simulink, and 42 toolboxes (including the parallel toolbox) for research activities on campus and personal systems. Additionally, MATLAB is available to students on campus computer systems.


Install and Configure MATLAB

Using Mathworks software available under the UAB campus license on your computer involves download and install steps common to all software packages and an authorization step that grants you the rights to use the software under the campus agreement.

Installation

Installation Overview

NOTE: These steps are common to all install scenarios and are detailed in Downloading and Installing MATLAB.

  1. Create an account at the Mathworks site using your campus @uab.edu email address.
  2. Request an activation key.
  3. Associate your Mathworks account with the campus-wide MATLAB license using your activation key.
  4. Download the software from the MathWorks download site and install MATLAB. (Contact a MATLAB-TAH Asset Manager to get download rights.)
  5. Activate the software using the activation scenario that best suits your particular needs.

Installation for Various Activation Scenarios

NOTE: Most on-campus users are encouraged to use the Simplified MATLAB Install option for activation unless there are special circumstances that require the alternative activation scenarios.

  1. Simplified MATLAB Install - This is the recommended install when MATLAB will be used on computers that remain connected to the campus network. It requires the MATLAB software to be installed on your computer and uses a simple 2-line file to activate the software. This option is highly recommended.
  2. MATLAB Designated Computer Install - This option is recommended for mobile computing systems that may not have network access when MATLAB is being used. This install type authorizes an individual computer to run MATLAB, so MATLAB runs regardless of where the computer is located.


Configure MATLAB

Configuration Overview

Configuring the Parallel Computing Toolbox involves three steps documented below:

  1. install MATLAB submit functions on your workstation
  2. configure the "cheaha" parallel computing target to which PCT tasks can be submitted
  3. run the validation tests to confirm a working installation.

This page documents the MATLAB Distributed Computing Server (DCS) configuration for MATLAB R2010b and later. For DCS configuration instructions for previous versions of MATLAB, please see the page MatLab DCS R2010a and Earlier.

Using MATLAB DCS requires you have a cluster account on Cheaha. Please request an account by sending an email to [[1]] and include your campus affiliation and a brief statement of your research interests for using the cluster.

MATLAB Submit Functions

The MATLAB submit functions create a cluster job context for your code and are responsible for transferring your code and the data it analyzes to the cluster for processing.

These submit functions must be installed on your computer and must be accessible to MATLAB via the MATLAB PATH environment. The easiest way to accomplish this is to copy the submit functions to the default user directory created by MATLAB. The location of this directory on each operating system is listed below.

  1. Download the MATLAB submit functions
  2. Unzip the files to a directory included in your MATLAB PATH setting. Recommended locations are:
    • Windows:
      My Documents\MATLAB
    • Linux:
      $HOME/Documents/MATLAB
    • Mac:
      $HOME/Documents/MATLAB

Once the submit function files have been downloaded and unzipped in the above paths, restart MATLAB to ensure they are properly loaded in your environment.
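
After restarting, you can confirm that the folder is visible to MATLAB by looking for it on the search path. The following is a minimal sketch, not part of the workshop materials: isOnPath is a hypothetical helper, and the folder shown assumes the Linux/Mac location listed above (adjust it on Windows).

```matlab
% Hypothetical helper: true when folder d is an entry of the search-path string p.
isOnPath = @(d, p) any(strcmp(d, regexp(p, pathsep, 'split')));

% Assumed location of the unzipped submit functions (Linux/Mac default above).
submitDir = fullfile(getenv('HOME'), 'Documents', 'MATLAB');

if isOnPath(submitDir, path)
    disp('Submit function folder is on the MATLAB path.')
else
    disp('Folder not found on the path; add it with addpath(submitDir).')
end
```

The check compares exact strings, so a trailing slash or a different spelling of the same folder will report "not found" even when the folder is effectively reachable.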


Parallel Computing Toolbox Configuration

The Parallel Computing Toolbox (PCT) enables language extensions in MATLAB that support dividing your application into tasks that can be executed in parallel. By default, all of these tasks will run on your local workstation using the pre-defined "local" PCT configuration.

To run these tasks on the Cheaha compute cluster, a new configuration for the PCT must be defined. In this section we will create the "cheaha" configuration and run a quick validation test to confirm its operation.

Prior to continuing, make sure you:

  • can establish an SSH connection to Cheaha
  • have followed the steps in the previous section

Create the "cheaha" PCT Configuration

  1. Download and save the Cheaha cluster configuration file in a file named "cheaha.mat".
  2. Start MATLAB on your workstation
  3. Click the "Parallel" menu
  4. Click "Manage Configurations"
  5. In the "Configurations Manager" window, click "File -> Import"
  6. Browse to the location where you saved the cheaha.mat file, select it, and click "Open"

The Configuration Manager should now list a new entry named "cheaha" as shown in the following image:

2011 config mngr.png

Personalize the "cheaha" PCT Configuration

  1. Double click on cheaha in the Configuration Manager window to open the configuration editor. (Note: stretch the "Generic Scheduler Configuration Properties" window to the right so that you can view all of the text in the fields making it easier to read and edit correctly.)
  2. Edit the following fields to use your personal data directories
    • ClusterMatlabRoot: Make sure that the root directory of the MATLAB installation for workers matches the exact version of MATLAB you are using on your workstation. In this example, /share/apps/mathworks/R2011a matches a MATLAB R2011a workstation install. Change the "R2011a" to match your workstation MATLAB install.
    • DataLocation: Change the directory path where job data is stored to an existing directory on your workstation. For example, on Windows the directory C:\Users\<USERNAME>\Documents\MATLAB is created by default by MATLAB. Please confirm this directory is valid.
    • ParallelSubmitFcn: Change the text "YOURUSERID" to your login id on Cheaha
    • SubmitFcn: Change the text "YOURUSERID" to your login id on Cheaha
  3. Click 'OK' to save the configuration

The initial configuration will look similar to this screenshot. You will need to edit the fields as described in the preceding steps before you can use the configuration. NOTE: be sure to replace the template user name settings "YOURUSERNAME" with the appropriate settings for your desktop and cluster account.

Cheaha parallel config.png
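
To find the value to enter for ClusterMatlabRoot, you can ask your workstation MATLAB for its release name. This is an illustrative sketch: clusterRoot is a hypothetical helper, and it assumes that other releases are installed on Cheaha under the same /share/apps/mathworks/R<release> pattern as the R2011a example above.

```matlab
% Release of the MATLAB session you are running, for example '2011a'.
localRelease = version('-release');

% Hypothetical helper that builds the worker root for a given release,
% following the /share/apps/mathworks/R2011a pattern shown above.
% Assumption: every release on the cluster uses this directory layout.
clusterRoot = @(rel) ['/share/apps/mathworks/R', rel];

fprintf('ClusterMatlabRoot should be: %s\n', clusterRoot(localRelease));
```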

Validate the "cheaha" PCT Configuration

  1. Select "cheaha" in the Configuration Manager and click 'Start Validation'.
  2. Wait for the validation to complete. This might take a few minutes, and you may be asked for your user credentials on Cheaha. All tests other than 'Matlabpool' should pass on Cheaha, with output as shown:

Validation.png

Validation must pass the first three stages to use MATLAB on Cheaha.

Workshop Demos

  1. Serial job
  2. Offload the serial job to Cheaha
  3. Convert serial job to parallel and run it locally
  4. Offload the parallel job to Cheaha
  5. Distributed Job
  6. Small shell script using MATLAB

Serial Job

localScript.m

tic
close all

% step size and inputs; each value is an upper limit for the prime count
step = 200;                 % renamed from diff, which shadows a built-in function
check = 90000:step:91000;

% preallocate the outputs
n = numel(check);
l = zeros(1, n);
count = zeros(1, n);
ratio = zeros(1, n);

% main working loop
for c = 1:n
    disp(check(c))          % display which value is being processed
    [l(c), count(c), ratio(c)] = primenofun(check(c));  % run function and get back output
end

final_time = toc

primenofun.m

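function [l,count,ratios,q]=primenofun(final)
% PRIMENOFUN Count the prime numbers between 0 and final.
%   l      - the upper limit that was checked (final)
%   count  - total count of prime numbers between 0 and final
%   ratios - ratio of prime numbers to total numbers (count/final)
%   q      - summary row [l, count, ratios]

start = 0;
a = 0;                      % running count of primes

for i = start:final
    if isprime(i)           % check whether each number between start and final is prime
        a = a + 1;          % if it is, count it
    end
end

l = final;
count = a;
ratios = a/final;

% q was declared as an output but never assigned in the original listing,
% which fails when four outputs are requested. Returning the summary row
% (previously printed as "out") is an assumption about the intent.
q = [l, count, ratios];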
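
As a sanity check on the counting loop, the same count can be obtained with vectorized built-ins. This sketch uses a small limit so it runs instantly; the variable names are illustrative and not part of the workshop files.

```matlab
% Loop-based count, mirroring the logic of primenofun for a small limit.
limit = 100;
loopCount = 0;
for i = 0:limit
    if isprime(i)
        loopCount = loopCount + 1;
    end
end

% Vectorized equivalents using built-in functions.
vecCount    = sum(isprime(0:limit));   % isprime accepts a whole array
primesCount = numel(primes(limit));    % primes(n) lists all primes <= n

fprintf('primes up to %d: %d (ratio %.2f)\n', limit, loopCount, loopCount/limit);
```

All three counts agree (25 primes up to 100). The workshop keeps the explicit loop because it gives each call enough work to make the later parallel and cluster timings meaningful.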

Offload Serial Job to Cheaha using Submit Script

serialSubmit.m

%always set these variables
%matlab_ver      = 'R2011b';    % (MATLAB release supported by your license) R2009a R2009b R2010a
email           = 'YOUREMAILID'; % your email address
email_opt       = 'a';       % qsub email options
h_rt            = '1:07:00'; % hard wall time (time required to run this job)
vf              = '2G';   % Amount of memory needed per task



disp('Please wait.. Sending job data to the Cluster.... ')

% Configure the scheduler - Do NOT modify these
%sge_options = ['-l vf=', vf, ',h_rt=', h_rt,' -m ', email_opt, ' -M ', email, ' -q ' , queue ];
sge_options = ['-l vf=', vf, ',h_rt=', h_rt,' -m ', email_opt, ' -M ', email ];
SGEClusterInfo.setExtraParameter(sge_options);
sched = findResource();

% End of scheduler configuration
get(sched)
job1 = createJob(sched);
tic


% Start of user-specific commands. Replace USERmFILE with the m-file to be
% submitted to Cheaha, and USERFUNCTION.m with any other functions or files
% your script depends on.

job1 = batch('USERmFILE', 'FileDependencies', {'USERFUNCTION.m'});

disp('Job submitted..')
datestr(clock)


% The following commands can be run once the job is submitted to view the results
disp('Job sent to the cluster')
disp('USE >> waitForState(job1)      to wait for job to be finished')
disp('USE >> job1.State              to see job state')
disp('USE >> load(job1,variable)     to load variables back in the workspace      OR')
disp('USE >> results = getAllOutputArguments(job1)     to load variables back in the workspace      AND')
disp('USE >> results{:}              to see the results')

% waitForState(job1, 'finished');
% y = getAllOutputArguments(job1)
% datestr(clock)
% save newout.mat
% 
% toc
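
To see what the scheduler actually receives, it helps to print the assembled option string. The sketch below repeats the concatenation from the submit script with sample values; the email address is a placeholder, and the wall-time conversion is only an illustration. In Grid Engine terms, -l requests resources (vf for memory, h_rt for the hard run-time limit), -m a asks for mail if the job is aborted, and -M sets the mail address.

```matlab
% Example values; the email address is a placeholder.
email     = 'user@example.edu';
email_opt = 'a';
h_rt      = '1:07:00';
vf        = '2G';

% Same concatenation as in the submit script.
sge_options = ['-l vf=', vf, ',h_rt=', h_rt, ' -m ', email_opt, ' -M ', email];
disp(sge_options)
% prints: -l vf=2G,h_rt=1:07:00 -m a -M user@example.edu

% Convert the h:mm:ss wall time to seconds to double-check the limit.
parts = sscanf(h_rt, '%d:%d:%d');      % column vector [hours; minutes; seconds]
wallSeconds = [3600 60 1] * parts;     % 1 h 7 min = 4020 s
```

Note that there are no spaces inside the -l value; a stray space after the comma would split the resource request into two separate arguments.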


Convert Serial Job to Parallel and run it locally

  1. Convert the 'for' loop to a 'parfor' loop.
  2. Start a matlabpool using 'matlabpool open local #no_of_workers' (as a general rule, no_of_workers = number of processor cores available).
  3. Run the job.

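tic
close all

% step size and inputs
step = 200;                 % renamed from diff, which shadows a built-in function
check = 90000:step:91000;

% preallocate the output variables; each iteration fills one element
n = numel(check);
l = zeros(1, n);
count = zeros(1, n);
ratio = zeros(1, n);

% main working loop; iterations are independent, so they can run in any order
parfor c = 1:n
    disp(check(c))          % display which value is being processed
    [l(c), count(c), ratio(c)] = primenofun(check(c));  % run function and get back output
end

final_time = toc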
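
The conversion is safe here because no iteration reads a value written by another iteration. A small sketch, with illustrative variable names and small inputs, that checks the parfor result against the serial one:

```matlab
check = 10:10:50;            % small inputs so the sketch runs quickly
n = numel(check);

% Serial reference.
serialCount = zeros(1, n);
for c = 1:n
    serialCount(c) = sum(isprime(0:check(c)));
end

% Parallel version: each iteration writes only its own element of parCount,
% so the iterations are independent of each other.
parCount = zeros(1, n);
parfor c = 1:n
    parCount(c) = sum(isprime(0:check(c)));
end

% Both loops must agree; the counts are 4, 8, 10, 12 and 15.
assert(isequal(serialCount, parCount))
```

A loop that accumulates into a shared running value, or that uses the result of iteration c-1 inside iteration c, would not pass this kind of check without being restructured first.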


Offload the Parallel Job to Cheaha

parallelSubmit.m

%always set these variables
%matlab_ver      = 'R2011b';    % (MATLAB release supported by your license) R2009a R2009b R2010a
email           = 'YOUREMAILID'; % your email address
email_opt       = 'a';       % qsub email options
h_rt            = '1:07:00'; % hard wall time
vf              = '2G';   % Amount of memory needed per task
%queue           = 'sipsey.q' % specify queue
%min_cpu_slots   = 15;          % Min number of cpu slots needed for the job
max_cpu_slots   = 7;          % Max number of cpu slots needed for the job

disp('Please wait.. Sending job data to the Cluster.... ')

% Configure the scheduler - Do NOT modify these
%sge_options = ['-l vf=', vf, ',h_rt=', h_rt,' -m ', email_opt, ' -M ', email, ' -q ' , queue ];
sge_options = ['-l vf=', vf, ',h_rt=', h_rt,' -m ', email_opt, ' -M ', email ];
SGEClusterInfo.setExtraParameter(sge_options);
sched = findResource();
% End of scheduler configuration
get(sched)
job2 = createJob(sched);
tic
% start of user specific commands
job2 = batch('USERmFILE', 'matlabpool', max_cpu_slots, 'FileDependencies', {'USERFUNCTION.m'});

disp('Job submitted..')
datestr(clock)
% The following commands can be run once the job is submitted to view the results
disp ('Job sent to the cluster')
disp('USE >> waitForState(job2)      to wait for job to be finished')
disp('USE >> job2.State              to see job state')
disp('USE >> load(job2,variable)     to load variables back in the workspace      OR')
disp('USE >> results = getAllOutputArguments(job2)     to load variables back in the workspace      AND')
disp('USE >> results{:}              to see the results')
% waitForState(job2, 'finished');
% 
% y = getAllOutputArguments(job2)
% datestr(clock)
% save newout.mat
% 
% toc
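
When choosing max_cpu_slots, keep in mind that the 'matlabpool' option of batch requests that many pool workers in addition to the worker that runs the script itself. The following back-of-the-envelope sketch uses the workshop values; the variable names are illustrative, and the memory total assumes the scheduler charges vf once per slot.

```matlab
check         = 90000:200:91000;  % inputs used by the workshop loop
max_cpu_slots = 7;                % pool size requested in parallelSubmit.m
vf_gb         = 2;                % memory per task in GB (vf = '2G')

nIter         = numel(check);               % number of parfor iterations (6)
busyWorkers   = min(nIter, max_cpu_slots);  % workers that can receive an iteration (6)
totalSlots    = max_cpu_slots + 1;          % pool workers plus the worker running the script (8)
totalMemoryGb = totalSlots * vf_gb;         % assumption: vf is charged once per slot (16)

fprintf('%d iterations, %d busy workers, %d slots, %d GB requested\n', ...
        nIter, busyWorkers, totalSlots, totalMemoryGb);
```

With only six loop iterations, a pool of seven leaves one worker idle, so a smaller pool would finish this particular demo just as quickly.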
 

Distributed Job

distribSubmit.m

%always set these variables
%matlab_ver      = 'R2011b';    % (MATLAB release supported by your license) R2009a R2009b R2010a
email           = 'YOUREMAILID'; % your email address
email_opt       = 'a';       % qsub email options
h_rt            = '1:07:00'; % hard wall time
vf              = '2G';   % Amount of memory needed per task
%queue           = 'sipsey.q' % specify queue


disp('Please wait.. Sending job data to the Cluster.... ')

% Configure the scheduler - Do NOT modify these
%sge_options = ['-l vf=', vf, ',h_rt=', h_rt,' -m ', email_opt, ' -M ', email, ' -q ' , queue ];
sge_options = ['-l vf=', vf, ',h_rt=', h_rt,' -m ', email_opt, ' -M ', email ];
SGEClusterInfo.setExtraParameter(sge_options);
sched = findResource();
% End of scheduler configuration

get(sched)
job3 = createJob(sched);

% start of user specific commands
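step = 200;   % renamed from diff, which shadows a built-in function
for check = 90000:step:91000
    % createTask(job3, @primenofun, 4, {check});
    createTask(job3, @USERFUNCTION, 4, {check});   % 4 = number of output arguments requested per task
end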

submit(job3);
disp('Job submitted..')
datestr(clock)


% waitForState(job3, 'finished');
% 
% y = getAllOutputArguments(job3)
% datestr(clock)
% save newout.mat
% 
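
For a distributed job, getAllOutputArguments returns a cell array with one row per task and one column per requested output. The sketch below imitates that layout locally, without a scheduler, so you can see how to pull one output out of every task. taskFcn is a hypothetical three-output stand-in for the real task function, and the inputs are kept small.

```matlab
% Hypothetical stand-in for the task function: three outputs per input
% (the limit, the number of primes up to it, and their ratio).
taskFcn = @(n) deal(n, sum(isprime(0:n)), sum(isprime(0:n))/n);

check    = 10:10:30;         % small inputs instead of 90000:200:91000
nOutputs = 3;
results  = cell(numel(check), nOutputs);

% One "task" per input; row k collects the outputs of task k.
for k = 1:numel(check)
    [results{k, 1:nOutputs}] = taskFcn(check(k));
end

limits = [results{:, 1}];    % first output of every task
counts = [results{:, 2}];    % second output of every task
```

Because every task is independent, the scheduler is free to run them in any order and on any node; the row index is what ties a result back to its input.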

 

Shell Script Using MATLAB

varpass.m

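clc
clear all;
cmd = 'bash matlabtest';    % shell script listed below

for c = 1:5
    a = sprintf('%d', c)    % convert the loop index to a string argument

    % run the shell script with the argument; out holds whatever the script prints
    [status, out] = system([cmd, ' ', a])
end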

matlabtest

#!/bin/bash
#echo "first parameter is $1"

total=$(echo "scale=4; $1+2+3" |bc)

#echo "Output $1 = $total"
echo "$total" 
#$total

 

To test the script directly from the shell, make it executable and run it with a sample argument:

chmod +x matlabtest

./matlabtest 10
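
The text returned by system is a character string, so it must be converted before it can be used in calculations. The sketch below shows the full round trip; to stay self-contained it uses echo as a stand-in for the matlabtest script (printing the same c+2+3 total), which is an assumption made only for illustration.

```matlab
total = zeros(1, 5);         % numeric results, one per call

for c = 1:5
    % echo stands in for the helper script; it prints c+2+3.
    cmd = sprintf('echo %d', c + 2 + 3);
    [status, out] = system(cmd);

    % A non-zero status means the command failed.
    if status ~= 0
        error('Command failed: %s', cmd);
    end

    % out ends with a newline; trim it and convert the text to a number.
    total(c) = str2double(strtrim(out));
end

disp(total)
```

Checking status before parsing avoids silently storing NaN when the command is missing or fails.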