Slurm: Difference between revisions

From Cheaha
(added express partition)
(Move content for slurm to an obsolete page)
Tag: Replaced
 
(67 intermediate revisions by 8 users not shown)
[http://slurm.schedmd.com/ SLURM] is a queue management system and stands for Simple Linux Utility for Resource Management. SLURM was developed at the Lawrence Livermore National Lab and currently runs some of the largest compute clusters in the world. SLURM is the primary job manager on Cheaha (BigGreen- new hardware) while GridEngine continues to be the job manager on the old hardware.
The Slurm documentation has moved to the new documentation site at https://uabrc.github.io.


SLURM is similar in many ways to GridEngine or most other queue systems. You write a batch script then submit it to the queue manager (scheduler). The queue manager then schedules your job to run on the queue (or '''partition''' in SLURM parlance) that you designate. Below we will provide an outline of how to submit jobs to SLURM, how SLURM decides when to schedule your job and how to monitor progress.
The obsolete content for the original page can be accessed via [[Obsolete: Slum]] for historical reference.
 
 
=== General SLURM Documentation ===
The primary source for documentation on SLURM usage and commands can be found at the [http://slurm.schedmd.com/ SLURM] site. If you Google for SLURM questions, you'll often see the Lawrence Livermore pages as the top hits, but these tend to be outdated.
 
A great way to get details on the SLURM commands is the man pages available from the Cheaha cluster. For example, if you type the following command:
 
<pre>
man sbatch
</pre>
you'll get the manual page for the sbatch command.
 
=== Logging on and Running Jobs from the command line ===
Once you've gone through the [https://docs.uabgrid.uab.edu/wiki/Cheaha_GettingStarted#Access_.28Cluster_Account_Request.29 account setup procedure] and obtained a suitable [https://docs.uabgrid.uab.edu/wiki/Cheaha_GettingStarted#Client_Configuration terminal application], you can log in to the Cheaha system via ssh:
 
  ssh '''blazerid'''@cheaha.rc.uab.edu
 
'''Existing users''' should follow these [https://docs.uabgrid.uab.edu/wiki/SSH_Key_Authentication instructions].
 
Cheaha (new hardware) runs the CentOS 7 version of the Linux operating system, and commands are run under the "bash" shell. There are a number of Linux and [http://www.gnu.org/software/bash/manual/bashref.html bash references], [http://cli.learncodethehardway.org/bash_cheat_sheet.pdf cheat sheets] and [http://www.tldp.org/LDP/Bash-Beginners-Guide/html/ tutorials] available on the web.
 
=== Typical Workflow ===
* Stage data to $USER_SCRATCH (your scratch directory)
* Research how to run your code in "batch" mode. Batch mode typically means the ability to run it from the command line without requiring any interaction from the user.
* Identify the appropriate resources needed to run the job. The following are mandatory resource requests for all jobs on Cheaha
** Number of processor cores required by the job
** Maximum memory (RAM) required per core
** Maximum runtime
* Write a job script specifying queuing system parameters, resource requests and commands to run program
* Submit script to queuing system (sbatch script.job)
* Monitor job (squeue)
* Review the results and resubmit as necessary
* Clean up the scratch directory by moving or deleting the data off of the cluster
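
The middle steps of this workflow (write a job script, submit it, check the queue) can be sketched as a short shell script. This is a minimal sketch, not a site-provided tool: the directory name <code>workflow_demo</code> and the job name are hypothetical, and the script falls back to a temporary directory when <code>$USER_SCRATCH</code> is not set. It only submits the job if <code>sbatch</code> is actually available.

```shell
#!/bin/bash
# Sketch of the workflow: write a job script, submit it, check the queue.
# WORK_DIR and the job name are hypothetical; only USER_SCRATCH, sbatch,
# and squeue come from the text above.
set -eu

WORK_DIR="${USER_SCRATCH:-${TMPDIR:-/tmp}}/workflow_demo"
JOB_SCRIPT="$WORK_DIR/script.job"
mkdir -p "$WORK_DIR"

# The three mandatory requests: cores, memory per core, and runtime.
cat > "$JOB_SCRIPT" <<'EOF'
#!/bin/bash
#SBATCH --job-name=workflow_demo
#SBATCH --ntasks=1
#SBATCH --mem-per-cpu=100
#SBATCH --time=10:00
#SBATCH --partition=short

srun hostname
EOF

if command -v sbatch >/dev/null 2>&1; then
    # --parsable makes sbatch print the job ID
    # (followed by a cluster name on multi-cluster setups).
    job_id=$(sbatch --parsable "$JOB_SCRIPT")
    echo "Submitted job $job_id"
    squeue -j "$job_id"
else
    echo "sbatch not found; wrote $JOB_SCRIPT without submitting"
fi
```

Staging the input data beforehand and cleaning up the scratch directory afterwards are left out because they depend on where your data lives.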
 
=== Batch Job ===
'''TODO: ''' provide an explanation of what makes a batch job and why use that vs an interactive job
 
For additional information on the '''sbatch''' command execute '''man sbatch''' at the command line to view the manual.
 
==== Example Batch Job Script ====
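A job consists of '''resource requests''' and '''tasks'''. The Slurm job scheduler interprets lines beginning with '''#SBATCH''' as Slurm arguments. In this example, the job requests 1 task, 10 minutes of runtime, and 100 MB of memory per CPU in the '''short''' partition, and emails the user if the job fails: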
<pre>#!/bin/bash
#
#SBATCH --job-name=test
#SBATCH --output=res.txt
#SBATCH --ntasks=1
#SBATCH --time=10:00
#SBATCH --mem-per-cpu=100
#SBATCH --partition=short
#SBATCH --mail-type=FAIL
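# Replace BLAZERID with your own ID; shell variables such as $USER are not expanded in #SBATCH lines
#SBATCH --mail-user=BLAZERID@uab.edu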
 
srun hostname
srun sleep 60
</pre>
 
=== Interactive Session ===
The login node (the command-line interface you reach after logging in to Cheaha) is intended for submitting jobs and for light prep work on job scripts. '''Do not run heavy computations on the login node.''' If you have a heavier workload to prepare for a batch job (e.g. compiling code or manipulating data), or your compute application requires interactive control, request a dedicated interactive node for this work.
 
Interactive resources are requested by submitting an "interactive" job to the scheduler. Interactive jobs will provide you a command line on a compute resource that you can use just like you would the command line on the login node. The difference is that the scheduler has dedicated the requested resources to your job and you can run your interactive commands without having to worry about impacting other users on the login node.
 
Interactive jobs that run on the command line are requested with the '''srun''' command.
 
<pre>
srun --ntasks=4 --mem=4096 --time=08:00:00 --partition=medium --job-name=JOB_NAME --pty /bin/bash
</pre>
 
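This command requests 4 tasks (--ntasks, short form -n), which by default means 4 cores, with 4096 MB (4 GB) of memory per node (--mem) for 8 hours (--time, short form -t) in the medium partition. Note that --mem is a per-node limit, not a per-core one.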
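
Slurm's --mem option sets memory per node, while --mem-per-cpu sets memory per allocated core. If each of the 4 cores should have its own 4 GB, the request has to use --mem-per-cpu. The helper below is a hypothetical sketch (<code>build_srun_cmd</code> is not a Slurm or Cheaha command) that prints such a command and reports the resulting total on stderr; it assumes one core per task.

```shell
#!/bin/bash
# build_srun_cmd is a hypothetical helper, not a Slurm or Cheaha command.
# Usage: build_srun_cmd NTASKS MEM_PER_CORE_MB HOURS PARTITION
build_srun_cmd() {
    local ntasks=$1 mem_per_core_mb=$2 hours=$3 partition=$4
    # --mem-per-cpu is counted per allocated core, so the job's total
    # memory is ntasks * mem_per_core_mb (assuming one core per task).
    local total_mb=$((ntasks * mem_per_core_mb))
    echo "# total memory requested: ${total_mb} MB" >&2
    printf 'srun --ntasks=%d --mem-per-cpu=%d --time=%02d:00:00 --partition=%s --pty /bin/bash\n' \
        "$ntasks" "$mem_per_core_mb" "$hours" "$partition"
}

# Print the command for 4 cores with 4 GB each, 8 hours, medium partition.
build_srun_cmd 4 4096 8 medium
```

The function only prints the command so that it can be reviewed before it is run.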
 
More advanced interactive scenarios to support graphical applications are available using [https://docs.uabgrid.uab.edu/wiki/Setting_Up_VNC_Session VNC] or X11 tunneling [http://www.uab.edu/it/software X-Win32 2014 for Windows]
 
Interactive jobs that require a graphical application are requested with the '''sinteractive''' command, run from a '''Terminal''' in your VNC window.
 
<pre>
sinteractive --ntasks=4 --mem=4096 --time=08:00:00 --partition=medium --job-name=JOB_NAME
</pre>
 
== Job Status ==
 
To check your job status, use the following command, replacing BLAZERID with your own ID:
<pre>
squeue -u BLAZERID
</pre>
 
The following fields are displayed when you run '''squeue''':
<pre>
JOBID - ID assigned to your job by the SLURM scheduler
PARTITION - Partition your job runs in; which one fits depends on the time requested (express (max 2 hrs), short (max 12 hrs), medium (max 50 hrs), long (max 150 hrs), sinteractive (0-2 hrs))
NAME - Job name given by the user
USER - User who started the job
ST - State your job is in. The typical states are PENDING (PD), RUNNING (R), SUSPENDED (S), COMPLETING (CG), and COMPLETED (CD)
TIME - Time for which your job has been running
NODES - Number of nodes your job is running on
NODELIST - Node(s) on which the job is running
</pre>
 
For more details on '''squeue''', go [http://slurm.schedmd.com/squeue.html here].
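
The runtime limits listed under PARTITION can be turned into a small lookup that suggests the partition with the tightest limit that still fits a job. This is a sketch under the assumption that the limits quoted on this page (2, 12, 50 and 150 hours) are still current; <code>pick_partition</code> is a hypothetical helper, and <code>sinfo</code> on the cluster shows the limits actually in force.

```shell
#!/bin/bash
# pick_partition is a hypothetical helper; the hour limits are the ones
# quoted on this page and may have changed on the cluster.
# Usage: pick_partition HOURS   (whole hours of requested runtime)
pick_partition() {
    local hours=$1
    if [ "$hours" -le 2 ]; then
        echo express
    elif [ "$hours" -le 12 ]; then
        echo short
    elif [ "$hours" -le 50 ]; then
        echo medium
    elif [ "$hours" -le 150 ]; then
        echo long
    else
        echo "no partition allows ${hours} hours" >&2
        return 1
    fi
}

# Example: a 48-hour job needs at least the medium partition.
echo "--time=48:00:00 --partition=$(pick_partition 48)"
```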
 
== Updates ==
 
=== 20160311 partitions & graphical interactive ===
 
Howdy, the new changes are in place. The primary changes are:
# The scheduling algorithm now allows jobs to share compute nodes (i.e. Slurm allocates CPU cores instead of complete compute nodes).
# We added partitions (in SGE they were called queues) with the following characteristics (these may change over time as we tweak things):
#* express: Priority 2 :: Max Runtime 2 hours
#* short (default partition): Priority 2 :: Max Runtime 12 hours
#* medium: Priority 4 :: Max Runtime 50 hours
#* long: Priority 6 :: Max Runtime 150 hours (6 days 6 hours)
#* interactive: Priority 10 :: Max Runtime 2 hours
# To run a job in a partition other than "short", you need to request it explicitly with the --partition argument (e.g. --time=48:00:00 --partition=medium).
# Graphical interactive jobs now work. You can run an interactive job using the sinteractive command, for example:
  sinteractive --time=00:05:00 --job-name=sinteractiveTest --ntasks=1 --mem=1024
 
More to come and please let us know of any issues or concerns.

Latest revision as of 17:24, 30 August 2022