ResearchStorage

Latest revision as of 16:38, 29 April 2014

Research Storage provides versatile containers for your data sets. The containers include dedicated space available on the Cheaha HPC platform.

What do I get

Each active user on Cheaha receives an annual 1-terabyte allocation of Research Storage at no direct cost to the user. This container is dedicated to your storage needs and is attached directly to the cluster.

Additional storage capacity can be purchased; see "How to get more" below.

How to get it

Once you log in to Cheaha, you can access your default 1TB storage container here:

 cd /rstore/user/$USER/default

You can check to see how much storage you have used in your container with the command:

 df -h /rstore/user/$USER/default
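The df command reports usage for the container as a whole. To see which directories account for that usage, something like the following sketch may help. The function name largest_dirs is hypothetical, not an existing Cheaha command, and the default path is the one shown above. Note that the glob skips hidden (dot) entries.

```shell
#!/bin/bash
# Hypothetical helper: list the largest top-level entries in a directory.
# Defaults to the Research Storage path used on this page.
largest_dirs() {
    local dir="${1:-/rstore/user/$USER/default}"
    local count="${2:-10}"
    # du -sk prints each entry's size in kilobytes; sort numerically, largest first
    du -sk "$dir"/* 2>/dev/null | sort -rn | head -n "$count"
}
```

Calling largest_dirs with no arguments would list up to ten entries from your default container; a different directory and count can be passed as the first and second arguments.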

How to use it

For HPC work flows

You can use this storage in any way that you find useful to assist you with your HPC work flows on Cheaha.

You should still follow good high performance compute work flow recommendations and stage data sets in your $USER_SCRATCH during your job runs, especially if those data sets are heavily accessed or modified as part of your job operations.

In general, a good use of this storage is for keeping larger data sets on the cluster longer than the lifetime of your active computations and for stuff that is too big to fit in your home directory.
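The staging pattern described above might look roughly like the sketch below: copy inputs from Research Storage to scratch, run the job there, then copy the results back. The function name stage_run_save and the input/ and results/ directory layout are assumptions made for illustration; adapt them to your own project.

```shell
#!/bin/bash
# Hypothetical sketch: stage inputs to scratch, run there, copy results back.
# The directory layout (input/, results/) is an assumption for illustration.
stage_run_save() {
    local store="$1"    # e.g. /rstore/user/$USER/default/myproject
    local scratch="$2"  # e.g. $USER_SCRATCH/myproject
    shift 2             # remaining arguments are the command to run

    mkdir -p "$scratch/input" "$scratch/results" "$store/results" || return 1
    # Stage: copy the input data set from Research Storage to scratch
    rsync -a "$store/input/" "$scratch/input/" || return 1
    # Run: execute the job command from inside the scratch working directory
    ( cd "$scratch" && "$@" ) || return 1
    # Save: copy results back to Research Storage for longer-term keeping
    rsync -a "$scratch/results/" "$store/results/"
}
```

The heavy reads and writes then happen in scratch, and Research Storage only sees one copy in and one copy out per run.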

For Retaining Scratch Data

One near-term use for your storage container would be to safely preserve important data in your $USER_SCRATCH so that it is not destroyed by the upcoming scratch file system rebuild. These destructive data operations are scheduled during the May 3-10, 2014 cluster service window.

Follow these steps to move a copy of important files from your personal scratch space to your default Research Storage container:

  1. Remove any data from $USER_SCRATCH that you no longer use or want.
  2. Copy your remaining important data to your default Research Storage container using rsync:

  rsync -a --stats $USER_SCRATCH/ /rstore/user/$USER/default/scratch
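If you would like to preview the copy before running it and confirm it afterwards, a wrapper along these lines is one option. This is only a sketch: the function name preserve_scratch is hypothetical, and the verification step uses diff -r, which reads every file in both trees and can be slow for very large data sets.

```shell
#!/bin/bash
# Hypothetical helper: preview, copy, then verify a scratch-to-storage copy.
preserve_scratch() {
    local src="${1:-$USER_SCRATCH}"
    local dst="${2:-/rstore/user/$USER/default/scratch}"

    # Preview: --dry-run lists what would be transferred without copying anything
    rsync -a --dry-run --itemize-changes "$src/" "$dst" || return 1
    # Copy: the same rsync command given in the steps above
    rsync -a --stats "$src/" "$dst" || return 1
    # Verify: diff -r exits non-zero if the two trees differ in any way
    diff -r "$src" "$dst" >/dev/null && echo "copy verified"
}
```

Because the verification compares the two trees in both directions, it will also fail if the destination already holds files that are no longer in scratch.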

How to get more

You can buy any amount of additional storage at a rate of $0.38/Gigabyte/year. That's $395/Terabyte/year. All we need to know is how much storage you want and a billing account number. UAB IT will bill you monthly for the storage you consume. When you are done using the storage, let us know and we will delete the storage and stop billing you for it.
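For a rough estimate of what a given amount would cost, the arithmetic can be sketched as below. The function name storage_cost is hypothetical, and the sketch assumes the monthly bill is simply the annual cost divided by 12, which the billing process may not follow exactly. It uses only the per-gigabyte rate, so results for whole terabytes will not exactly match the quoted per-terabyte figure.

```shell
#!/bin/bash
# Hypothetical helper: estimate cost from the per-gigabyte rate quoted above.
# Assumes the monthly bill is simply the annual cost divided by 12.
storage_cost() {
    local gigabytes="$1"
    awk -v gb="$gigabytes" -v rate=0.38 'BEGIN {
        annual = gb * rate
        printf "annual: $%.2f  monthly: $%.2f\n", annual, annual / 12
    }'
}
```

For example, storage_cost 500 prints an annual estimate of $190.00 and a monthly estimate of $15.83.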

What about backups?

No Backups!

There is no central backup process on the cluster. Each user is responsible for backing up their own data. If you are not managing a backup process for your data on the cluster, then you do not have any backups.

This rule also applies to the new Research Storage containers.

Please understand we do not say this out of malice or a lack of concern for your data. Central backup processes are inherently ignorant and must assume all files are important. This is done at the very real expense of keeping multiple copies of data. In the context of large scale data sets, which are typical in our HPC environment, this would amount to storing hundreds of terabytes of data. It would duplicate the footprint of data for which we are already bursting at the seams to support a single copy.

It is much better for individuals, teams, or labs and their technical support staff to identify critical data and ensure it is backed up.

How to Backup

We understand this process can be difficult, especially if you are your own technical support staff.

To that end, we have a new backup service available that leverages CrashPlan, a popular commercial backup product that will help you easily back up your data on your laptop or in your lab.

Please contact us if you are interested in using CrashPlan to fulfil your responsibilities toward backing up your own data.