UAB HPC Options



Attention: Research Computing Documentation has Moved
https://docs.rc.uab.edu/


Please use the new documentation url https://docs.rc.uab.edu/ for all Research Computing documentation needs.


As a result of this move, we have deprecated use of this wiki for documentation. We are providing read-only access to the content to facilitate migration of bookmarks and to serve as an historical record. All content updates should be made at the new documentation site. The original wiki will not receive further updates.

Thank you,

The Research Computing Team

The demand for HPC-based research resources at UAB has been growing rapidly for several years.

In response to this growth, UAB IT has made significant investments in shared HPC research facilities on campus in order to grow UAB's capacity for computationally intensive research.

There are several ways to acquire HPC cycles for your research:

Classic

With the Classic approach, you acquire a cluster that is operated exclusively by you and housed in your own facility. With this approach you bear all costs associated with the operation of the cluster: acquisition, power, cooling, system administration, user support, and maintenance (including upgrades).

This is the traditional approach and has been widely used in the past. It can be an attractive option for desk-side clusters but can quickly exceed the capabilities of your local infrastructure (power, cooling, and support) as demand grows.

Co-Lo Basic

With the Co-Lo Basic approach, you acquire a cluster that is operated exclusively by you but co-located in a shared computing facility. With this approach you bear most costs associated with the operation of the cluster (as above) except that power, cooling, and other facility costs are incorporated into a co-location subscription fee.

This approach helps control facilities costs by contributing to the operation of an established, shared computing facility.

Co-Lo Plus

With the Co-Lo Plus approach, you acquire a cluster and co-locate it in a shared facility, similar to Co-Lo Basic, but this approach allows you to take advantage of existing system administration and user support infrastructure. Your cluster will have a base software profile provided and maintained by staff system administrators. Software packages specific to your research can be maintained as part of the base profile, but any licensing costs will be covered by you. The co-location subscription fee for this approach includes additional charges for enhanced services.

This approach helps control many of the operating costs associated with a cluster. Your users will also benefit from established documentation for the base software profile and knowledge sharing in the community support forums.

An important cost consideration, however, is that this cluster is still a dedicated resource owned by you. As such, you will still bear operating costs related to the power and cooling used by the cluster even when the system is idle (not running any jobs) and any hardware maintenance costs.

Condo

With the Condo approach, you do not acquire a cluster per se; instead, you acquire dedicated compute slots on an existing cluster. The dedicated slots are part of an x86-64 commodity compute cluster environment. All operating costs are covered by UAB's research computing support infrastructure. You pay a fixed cost per dedicated compute slot. When you run a job, you are given dedicated access to the number of compute slots you purchased. When you are not running a job, the compute slots return to the pool of generally available compute cycles.

This is the most cost-effective approach. It enables you to benefit from shared investments in a common compute fabric while retaining access to dedicated compute cycles. This approach has all the support benefits of Co-Lo Plus, with the addition of access to telephone support.
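
The slot behavior described for the Condo approach can be illustrated with a small toy model. This is only a sketch for building intuition: the class name CondoCluster, its method names, and the slot counts are hypothetical and do not reflect how the cluster scheduler is actually implemented. It also assumes that idle purchased slots simply count toward the general pool, and it ignores how jobs from the general pool are scheduled or displaced.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class CondoCluster:
    """Toy model of a shared cluster with owner-dedicated compute slots.

    Hypothetical illustration only; not the real scheduler.
    """
    total_slots: int
    purchased: Dict[str, int] = field(default_factory=dict)  # owner -> slots bought
    in_use: Dict[str, int] = field(default_factory=dict)     # owner -> slots running jobs

    def buy_slots(self, owner: str, count: int) -> None:
        """Record a purchase of dedicated slots for an owner."""
        already_sold = sum(self.purchased.values())
        if already_sold + count > self.total_slots:
            raise ValueError("not enough unsold slots on the cluster")
        self.purchased[owner] = self.purchased.get(owner, 0) + count

    def start_job(self, owner: str, slots: int) -> None:
        """Claim dedicated slots for a job, up to the number purchased."""
        if self.in_use.get(owner, 0) + slots > self.purchased.get(owner, 0):
            raise ValueError("request exceeds the owner's purchased slots")
        self.in_use[owner] = self.in_use.get(owner, 0) + slots

    def finish_job(self, owner: str, slots: int) -> None:
        """Release slots so they return to the general pool."""
        self.in_use[owner] = max(0, self.in_use.get(owner, 0) - slots)

    def general_pool(self) -> int:
        """Slots not currently claimed by an owner's running jobs."""
        return self.total_slots - sum(self.in_use.values())


cluster = CondoCluster(total_slots=100)
cluster.buy_slots("lab_a", 16)
print(cluster.general_pool())   # idle purchased slots stay in the pool
cluster.start_job("lab_a", 16)
print(cluster.general_pool())   # the 16 dedicated slots are now claimed
cluster.finish_job("lab_a", 16)
print(cluster.general_pool())   # slots return to the pool when the job ends
```

In this model an owner pays for a fixed number of slots but only removes them from the general pool while a job is running, which is the property that makes the Condo approach cost-effective for the whole community.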

Best Effort

With the Best Effort approach, there are no up-front costs. You do not acquire anything but an authorization to use the shared computing infrastructure available to the UAB community. Your computing cycles will come from shared or excess capacity available from a variety of resources either at UAB or at the Alabama Supercomputing Authority.

This can be a very effective approach for infrequent HPC demand or for exploratory computation runs to gauge requirements prior to pursuing one of the other options described above. The downside of this approach is that compute cycles are scarce during peak-demand periods.

We are considering a "Best Effort Plus" option that would enable you to purchase only the compute power you need for specific job runs. Please contact hpcs@uab.edu if you are interested in such an approach.

A La Carte HPC

With the A La Carte HPC approach, we can work with you to build custom HPC solutions for specific research needs that can leverage a combination of the above approaches or leading edge . Please contact hpcs@uab.edu so that we can explore your needs.

Compute Pools

All of the approaches above can leverage UAB IT's expertise in grid computing to build a combined pool of HPC cycles that harnesses any combination of these resources. This client-side approach can significantly simplify the development, maintenance, and operation of your research computing workflows. Combined with the best-effort resources of national computing infrastructure, this solution can enable you to solve problems at your preferred scale with reasonable compute times.

The UAB IT shared resource, Cheaha, has a predefined compute pool (see Cheaha#GridWay) that includes all grid-accessible clusters on campus (Cheaha, Olympus, and Everest). This compute pool can be used by any Cheaha user.