Data Movement: Difference between revisions

From Cheaha
(Creating a page for Data movement)
 
(→‎Job Script: Adding a job script for data movement using rsync)
===Job Script===
<pre>#!/bin/bash
#
#SBATCH --job-name=test
#SBATCH --output=res.txt
#SBATCH --ntasks=1
#SBATCH --partition=express
#
# Time format = MM:SS, HH:MM:SS, or DD-HH:MM:SS (10:00 below means 10 minutes)
#
#SBATCH --time=10:00
#
# Minimum memory required per allocated CPU, in megabytes.
#
#SBATCH --mem-per-cpu=2048
#SBATCH --mail-type=FAIL
#SBATCH --mail-user=YOUR_EMAIL_ADDRESS
rsync -aP SOURCE_PATH DESTINATION_PATH
</pre>
'''NOTE:''' Adjust the time limit and the corresponding [https://docs.uabgrid.uab.edu/wiki/SLURM#Slurm_Partitions partition] to suit your transfer.

Revision as of 18:02, 14 December 2016

Several tools can help you move data within the HPC cluster, such as mv, cp, and scp. One of the most powerful tools for data movement on Linux is rsync, which the example scripts below use.
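As a minimal sketch of how rsync behaves, the snippet below copies a small directory tree inside a throwaway location. The directory and file names are hypothetical and exist only for this demonstration. The -a option (archive mode) recurses into directories and preserves permissions, timestamps, and symbolic links; -P combines --partial and --progress; -n performs a dry run that reports what would be transferred without copying anything.

```shell
#!/bin/bash
# Demo in a temporary directory; every path here is hypothetical.
workdir=$(mktemp -d)
mkdir -p "$workdir/source/sub" "$workdir/dest"
echo "sample" > "$workdir/source/sub/data.txt"

# Dry run: -n lists what would be transferred, but copies nothing.
rsync -avn "$workdir/source/" "$workdir/dest/"

# Real transfer. The trailing slash on the source means "copy the
# contents of source", not the source directory itself.
rsync -aP "$workdir/source/" "$workdir/dest/"

# Show the result of the copy.
ls -R "$workdir/dest"

# When finished, remove the demo directory with: rm -rf "$workdir"
```

Running the dry run first is a cheap way to confirm that the source and destination paths are the ones you intended before a large transfer starts.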

Procedure


Job Scripts

If you are moving a large amount of data, always use an interactive session or a job script. The transfer then runs on a compute node instead of occupying a login node for a long time.

Interactive session
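One common way to start an interactive session under Slurm is srun with the --pty option. The sketch below is an assumption, not a procedure documented on this page: it reuses the resource settings from the job script on this page, and the check for srun is there only so the snippet exits cleanly when run somewhere other than the cluster.

```shell
#!/bin/bash
# Resource settings borrowed from the job script on this page (adjust as needed).
partition=express
time_limit=10:00

if command -v srun >/dev/null 2>&1; then
    # Request one task and open a shell on a compute node.
    srun --ntasks=1 --partition="$partition" --time="$time_limit" \
         --mem-per-cpu=2048 --pty /bin/bash
else
    echo "srun not found; run this from a cluster login node." >&2
fi
```

Once the shell opens on the compute node, run the transfer there (for example, rsync -aP SOURCE_PATH DESTINATION_PATH) and type exit to release the allocation.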

Job Script

#!/bin/bash
#
#SBATCH --job-name=test
#SBATCH --output=res.txt
#SBATCH --ntasks=1
#SBATCH --partition=express
#
# Time format = MM:SS, HH:MM:SS, or DD-HH:MM:SS (10:00 below means 10 minutes)
#
#SBATCH --time=10:00
#
# Minimum memory required per allocated CPU, in megabytes.
#
#SBATCH --mem-per-cpu=2048
#SBATCH --mail-type=FAIL
#SBATCH --mail-user=YOUR_EMAIL_ADDRESS

rsync -aP SOURCE_PATH DESTINATION_PATH
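
A script like the one above is typically submitted with sbatch and monitored with squeue. The following is a sketch under assumptions: data_movement.sh is a hypothetical filename for the saved script, and the check for sbatch only keeps the snippet from failing when it is run away from the cluster.

```shell
#!/bin/bash
# Hypothetical filename under which the job script above was saved.
script=data_movement.sh

if command -v sbatch >/dev/null 2>&1; then
    sbatch "$script"      # submits the job and prints its job ID
    squeue -u "$USER"     # lists your pending and running jobs
else
    echo "sbatch not found; run this from a cluster login node." >&2
fi
```

Because the script sets --output=res.txt, the rsync progress output is written to res.txt in the directory the job was submitted from.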

NOTE: Adjust the time limit and the corresponding partition in the job script to suit your transfer.