ResearchStorage: Difference between revisions

(Got rid of the boring text and replaced it with more relevant text from email)
Research Storage provides versatile containers for your data sets.  The containers include dedicated space available on the Cheaha HPC platform and room for maintaining backups.  Their flexibility allows easy capacity expansion, and they can support the construction of new services and applications in the future.
Research Storage provides versatile containers for your data sets.  The containers include dedicated space available on the Cheaha HPC platform.


The Research Storage service was developed by UAB IT's Research Computing group to support the ever-increasing demand for space in the modern era of data-intensive science.
== How to get it ==


== Availability ==
Once you log in to [[Cheaha]], you can access your default 1TB storage container here:


As part of the initial release, Research Storage is only accessible through the Cheaha HPC platform.  Cluster users can access an automatically provisioned "default" container at the path /rstore/user/$USER/default.
  cd /rstore/user/$USER/default


== Cost ==
You can check to see how much storage you have used in your container with the command:


Research Storage is charged per gigabyte used at a rate of $0.39 per gigabyte per year.  This is an annual cost of about $395 per terabyte.
  df /rstore/user/$USER/default


=== First Terabyte Free ===
== How to use it ==


Researchers who use Cheaha receive an annual allocation of 1TB of storage with their account. This storage container is known as their "default" container and is accessible at the path /rstore/user/$USER/default:
You can use this storage in any way that you find useful to assist you with your HPC work flows on Cheaha.  
cd /rstore/user/$USER/default


Additional containers can be provisioned according to the amount of storage needed.

Revision as of 20:00, 16 April 2014

Research Storage provides versatile containers for your data sets. The containers include dedicated space available on the Cheaha HPC platform.

How to get it

Once you log in to Cheaha, you can access your default 1TB storage container here:

 cd /rstore/user/$USER/default

You can check to see how much storage you have used in your container with the command:

 df /rstore/user/$USER/default
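
For a quicker read of those numbers, a minimal sketch along the following lines may help. It assumes the standard df and du utilities on the cluster accept the -h option (as the GNU versions do). The function name container_usage is hypothetical and not part of any Cheaha tooling.

```shell
#!/bin/bash
# Hypothetical helper (not a Cheaha-provided command): summarize usage
# for a storage container. Defaults to the default container path.
container_usage() {
    local dir="${1:-/rstore/user/$USER/default}"
    if [ ! -d "$dir" ]; then
        echo "No such directory: $dir" >&2
        return 1
    fi
    # -h prints sizes in human-readable units; -P keeps each file system on one line.
    df -hP "$dir"
    # du -sh totals the space taken by the files under the directory.
    du -sh "$dir"
}
```

Called with no argument it reports on the default container; pass another directory to inspect that instead. Note that du walks every file, so it can take a while on a large container.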

How to use it

You can use this storage in any way that you find useful to assist you with your HPC workflows on Cheaha.

However, you should still follow good high-performance computing workflow recommendations and stage data sets in $USER_SCRATCH during your job runs, especially if those data sets are heavily accessed or modified as part of your job operations.
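
As an illustration of that staging pattern, here is a minimal sketch. The helper name copy_tree and the mydata directory are hypothetical placeholders, and the job step itself is left out because it depends on your own workflow.

```shell
#!/bin/bash
# Hypothetical helper: copy the contents of one directory into another,
# creating the destination if needed. cp -a preserves permissions and
# timestamps; the trailing "/." copies hidden files as well.
copy_tree() {
    local src="$1" dest="$2"
    if [ ! -d "$src" ]; then
        echo "Source directory not found: $src" >&2
        return 1
    fi
    mkdir -p "$dest" && cp -a "$src"/. "$dest"/
}

# Example flow (placeholder paths, shown as comments so nothing runs by accident):
#   copy_tree "/rstore/user/$USER/default/mydata" "$USER_SCRATCH/mydata"   # stage in
#   ... run the job against "$USER_SCRATCH/mydata" ...
#   copy_tree "$USER_SCRATCH/mydata/results" "/rstore/user/$USER/default/mydata/results"   # stage out
```

The idea is that the heavy reads and writes happen in scratch, and only the inputs and final results move to and from the container.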

One near-term use is to safely preserve important data in your $USER_SCRATCH so that it is not destroyed by the upcoming scratch file system rebuild during the May 3-10 cluster service window.

Follow these steps:

   Remove any data from $USER_SCRATCH that you no longer use or want
   Copy your remaining important data to your default container:
       rsync -a --stats $USER_SCRATCH/ /rstore/user/$USER/default/scratch
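
If you would like to preview the copy before running it, the steps above can be wrapped roughly as follows. This is only a sketch: the function name preserve_scratch and its preview/copy modes are hypothetical, while --dry-run is a standard rsync option that reports what would be transferred without writing anything.

```shell
#!/bin/bash
# Hypothetical wrapper around the rsync command from the steps above.
# Usage: preserve_scratch preview|copy [SOURCE_DIR] [DEST_DIR]
preserve_scratch() {
    local mode="$1"
    local src="${2:-$USER_SCRATCH}"
    local dest="${3:-/rstore/user/$USER/default/scratch}"
    if [ -z "$src" ] || [ ! -d "$src" ]; then
        echo "Source directory not found: $src" >&2
        return 1
    fi
    case "$mode" in
        # Report what would be copied, without changing the destination.
        preview) rsync -a --stats --dry-run "$src"/ "$dest" ;;
        # Perform the copy; -a preserves permissions, times, and symlinks.
        copy)    rsync -a --stats "$src"/ "$dest" ;;
        *)       echo "usage: preserve_scratch preview|copy [SRC] [DEST]" >&2
                 return 2 ;;
    esac
}
```

Running preserve_scratch preview first and checking the reported file counts, then running preserve_scratch copy, gives you a chance to catch a wrong path before any data moves.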


In general, a good use of this storage is keeping larger data sets on the cluster longer than the lifetime of your active computations and holding data that is too big to fit in your home directory.

How to get more

You can buy any amount of additional storage at a rate of $0.38/gigabyte/year. That's about $395/terabyte/year. All we need to know is how much storage you want, for how long, and an account number. UAB IT will bill you monthly for the storage you consume.
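
To get a rough idea of what a request would cost, the arithmetic can be sketched as below. The function name estimate_storage_cost is hypothetical, and the calculation assumes the cost scales linearly with size and time at the $395/terabyte/year figure quoted above; actual billing may differ.

```shell
#!/bin/bash
# Hypothetical estimator based on the $395/terabyte/year figure quoted above.
# Assumes cost scales linearly with size and duration.
# Usage: estimate_storage_cost TERABYTES [YEARS]
estimate_storage_cost() {
    local terabytes="$1" years="${2:-1}"
    awk -v tb="$terabytes" -v yrs="$years" \
        'BEGIN { rate = 395; printf "%.2f\n", tb * yrs * rate }'
}

# estimate_storage_cost 2      # prints 790.00 (2 TB for one year)
# estimate_storage_cost 0.5 3  # prints 592.50 (0.5 TB for three years)
```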

What about backups?

There is no central backup process on the cluster. This rule includes the new Research Storage containers.

Each user is responsible for backing up their own data!

If you are not managing a backup process for data on the cluster, then you do not have any backups.

Please understand we do not say this out of malice or a lack of concern for your data. Central backup processes are inherently ignorant and must assume all files are important. This is done at the very real expense of keeping multiple copies of data. In the context of the large-scale data sets that are typical of our HPC environment, this would amount to hundreds of terabytes of data, duplicating the footprint of data for which we are already bursting at the seams to support a single copy.

It is much better for individuals, teams, or labs and their technical support staff to identify critical data and ensure it is backed up.

We understand this process can be difficult, especially if you are your own technical support staff.

To that end, we have a new backup service available that leverages CrashPlan, a popular commercial backup product that will help you easily back up the data on your laptop or in your lab.

Please contact us if you are interested in using CrashPlan to fulfill your responsibilities for backing up your data.