Cheaha:Community Portal: Difference between revisions


Revision as of 05:04, 28 November 2007

HPC Services Plans

Mission

HPC Services is the division within the IT Infrastructure Services organization that focuses on HPC support for research and other HPC activities. HPC Services support includes HPC Cluster Support, Networking & Infrastructure, Middleware, and Academic Research Support. Research support specifically means assisting or collaborating with grant activities that require IT resources. It may also include acquiring and managing high performance computing resources, such as Beowulf clusters and network storage arrays. HPC Services participates in institutional strategic planning and self-study as related to academic IT. HPC Services represents the Office of the Vice President of Information Technology on IT-related academic campus committees and on regional and national technology research organizations and committees, as requested.

Note: The term HPC is used to mean high performance computing, which has many definitions available on the web. At UAB, HPC generally refers to “computational facilities substantially more powerful than current desktops computers (PCs and workstations) …by an order of magnitude or better.” See http://parallel.hpc.unsw.edu.au/rks/docs/hpc-intro/node3.html for more on this usage of HPC.

HPC Project Five Year Plan as of Summer 2006

As a result of discussions between IT, CIS, and ETL to determine the best methods and associated costs to interconnect HPC clusters in campus buildings BEC and CH, a preliminary draft of the scope and five year plan for HPC at UAB was prepared. To ensure growth and stability of IT support for research computing, and to obtain wide support from academic researchers for a workable model, the mission of IT Academic Computing has been revised and merged into a more focused unit within IT Network & Infrastructure Services under the name HPC Services. See Office of VP of IT Organization Chart.

  • Scope: Building upon the existing UAB HPC resources in CIS and ETL, IT and campus researchers are setting a goal to establish a UAB HPC data center, whose operations will be managed by IT Infrastructure and which will include additional machine room space designed for HPC and equipped with a new cluster. The UAB HPC Data Center and HPC resource will be used by researchers throughout UAB, the UA System, and other State of Alabama universities and research entities in conjunction with the Alabama Supercomputer Authority. Oversight of the UAB HPC resources will be provided by a committee made up of UAB Deans, Department Heads, Faculty, and the VPIT. Daily administration of this shared resource will be provided by the Department of Network and Infrastructure Services.
  • Integrate the design, construction, and staffing of an HPC Data Center with overall IT plans.
  • Secure funding for a new xxxxTeraFlop HPC Cluster. For example, HPCS will continue working with campus researchers in submitting proposals.
  • Preliminary Timeline
    • FY2007: Rename Academic Computing to HPCS and merge HPCS with Network and Infrastructure to leverage the HPC-related talents and resources of both organizations.
    • FY2007: Connect existing HPC Clusters to each other and 10Gig backbone.
    • FY2007: Bring up pilot grid identity management system – GridShib (HPCS, Network/Services)
    • FY2007: Enable Grid Meta Scheduling (HPCS, CIS, ETL)
    • FY2007: Establish Grid connectivity with SURA, UAS, and ASA.
    • FY2007: Develop shared HPC resource policies.
    • FY2008: Increase support staff as needed by reassigning legacy Mainframe technical resources
    • FY2008: Develop requirements for expansion or replacement of older HPC clusters. xxxxTeraFlops.
    • FY2008: Using HPC requirements (xxxx TeraFlops) for Data Center Design, begin design of HPC Data Center.
    • FY2009: Secure Funding for new HPC Cluster xxxxTera Flops
    • FY2010: Complete HPC Data Center Infrastructure.
    • FY2010: Secure final funding for expansion or replacement of older HPC clusters.
    • FY2011: Procure and deploy new HPC cluster. xxxxTeraFlops.

HPC Services Goals and Accomplishments for FY2007

Goals for FY2007

  • GOAL 1: UAB Grid Computing Project
    • Bring up a pilot of grid identity management based on GridShib software, which incorporates Shibboleth into the core grid software, Globus;
    • Enable a grid meta-scheduling capability in collaboration with CIS and ETL so that UAB users will see a single interface for submission of HPC jobs running on primary clusters in ETL and CIS;
    • Explore expanding the campus model for HPC to other campuses of UA System and to the Alabama Supercomputing Center.
  • GOAL 2: InCommon / Shibboleth Project
    • Work with Infrastructure and Network Services to coordinate new and expanding campus applications using Shibboleth;
    • Evaluate establishing a second pilot Shibboleth application with other members of InCommon;
    • Establish UAB grid as a UAB application offered to InCommon members; and
    • Evaluate establishing pilot Shibboleth applications as an advanced technology demonstration of capabilities for inter-institutional user authentication and authorization for access to common workspace supporting calendar, document sharing, data sharing, and communication technologies for desktop.
  • GOAL 3: Participation in external IT groups within Alabama, the region, and the US, such as UA System Collaborative Technology activities, Alabama Regional Optical Network, Internet2, SURA grid, EDUCAUSE, Global Grid Forum, and Super-Computing

Accomplishments for FY2007

  • GOAL 1: UAB Grid Computing Project
    • Bring up a pilot of grid identity management based on GridShib software, which incorporates Shibboleth into the core grid software, Globus;
      • IdM equipment ordered and operational - May 9, 2007
      • GridShib installed - May 25, 2007
      • UABgrid Login service operational - June 19, http://uabgrid.uab.edu/login
      • UABgrid VO management service operational - target July 1
      • UABgrid GridShib CA migration operational - target July 17
    • Enable a grid meta-scheduling capability in collaboration with CIS and ETL so that UAB users will see a single interface for submission of HPC jobs running on primary clusters in ETL and CIS;
      • SURA talk and demonstration – The GridWay meta-scheduler and an example research application, DynamicBLAST, were demonstrated at the SURAgrid all-hands meeting in collaboration with CIS
      • UABgrid meta-scheduler operational - target July 17
      • UABgrid Boot Camp being scheduled for mid-August
    • Explore expanding the campus model for HPC to other campuses of UA System and to the Alabama Supercomputing Center.
  • GOAL 2: InCommon / Shibboleth Project
    • Work with Infrastructure and Network Services to coordinate new and expanding campus applications using Shibboleth;
    • Evaluate establishing a second pilot Shibboleth application with other members of InCommon;
    • Establish UAB grid as a UAB application offered to InCommon members; and
    • UABgrid InCommon Application draft has been circulated for review and comment.
    • Evaluate establishing pilot Shibboleth applications as an advanced technology demonstration of capabilities for inter-institutional user authentication and authorization for access to common workspace supporting calendar, document sharing, data sharing, and communication technologies for desktop.
    • This is the research collaboration focus of UABgrid
  • GOAL 3: Participation in external IT groups within Alabama, the region, and the US, such as UA System Collaborative Technology activities, Alabama Regional Optical Network, Internet2, SURA grid, EDUCAUSE, Global Grid Forum, and Super-Computing
    • Meetings attended since Oct 1, 2006: SC06, Internet2 Fall 06, SURAgrid All Hands (March), Internet2 Spring 07
    • SURAgrid Governance: John-Paul Robinson has been elected to serve a one-year term on the inaugural SURAgrid GC
    • SURAgrid working group: John-Paul Robinson is serving on the accounting systems working group
    • CI-Team proposals: David L Shealy was a senior scientist on the large collaborative proposal submitted to NSF by Texas Tech University to present three two-day workshops on grid computing
    • UAB Research Computing plans
    • Developed IT CyberInfrastructure presentation for ASA campus visit on April 3, 2007
    • Circulated IT research computing planning draft to the Office of VP of Research and Economic Development

Research Computing Web Pages

Campus Network

  • On Campus High Speed Network Connectivity: The core of the campus network is a centrally managed backbone composed of ring-protected, enterprise-class 10-Gigabit Ethernet routers supporting IP, IP Multicast, IPX, and AppleTalk protocols. All buildings on campus are connected to one of three communication hubs using optical fiber. Within buildings, Category 5 or higher unshielded twisted pair wiring connects desktops to the network. A Gigabit Ethernet building backbone over multimode optical fiber is used for multifloor buildings. Computer server clusters are connected to the building entrance using Gigabit Ethernet. Each floor contains one or more switches connected to the building backbone using Gigabit Ethernet. Desktops are connected at 10 or 100 megabits/second (gigabit available when needed).
  • UAB is a charter member of Internet2 and hosts one of the nodes of the Gulf Central GigaPoP and the Alabama Research and Education Network (AREN). The UA System (UA, UAB, UAH) shares two OC-3s of an OC-12 link from Birmingham to Atlanta to connect to Southern Cross Roads for I2 connectivity. The Alabama Regional Optical Network (ARON) is a dedicated (dark-fiber) dense wavelength division multiplexed (DWDM) network currently under construction and scheduled for completion in 2007. Owned and operated by the University of Alabama System, under contract agreement with Georgia Tech and the Southern Light Rail (SLR), ARON connects the University of Alabama System’s three research institutions to the National LambdaRail (NLR) and will replace the current Internet2 connections. UAB NLR connectivity is expected by late 2007.
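
As a rough illustration of the shared Internet2 capacity described above, the sketch below computes the line rate of two OC-3s within an OC-12 using the standard SONET rule that OC-n runs at n x 51.84 Mb/s. These are line rates, not usable payload throughput, and the function name is illustrative only.

```python
# SONET optical carrier line rates: OC-n = n x 51.84 Mb/s.
OC1_MBPS = 51.84


def oc_rate_mbps(n: int) -> float:
    """Return the line rate of an OC-n circuit in megabits per second."""
    return n * OC1_MBPS


# Two OC-3s carved out of one OC-12, as in the Birmingham to Atlanta link.
shared_mbps = 2 * oc_rate_mbps(3)
link_mbps = oc_rate_mbps(12)

print(f"Two OC-3s: {shared_mbps:.2f} Mb/s")
print(f"Full OC-12: {link_mbps:.2f} Mb/s")
print(f"Share of link: {shared_mbps / link_mbps:.0%}")
```

Under this rule the two OC-3s amount to half of the OC-12 line rate.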

Research Network

The UAB Research Network is currently a dedicated 10GE optical connection between the Shared HPC Facility and the Computer Science HPC Lab. It will be leveraged for staging grid-based compute jobs and will allow direct connection to high-bandwidth regional networks. This network provides very high speed, secure connectivity between the existing HPC clusters in Engineering and Computer Science, as well as high speed transfer of very large data sets between clusters, without concern about interfering with other traffic on the campus backbone. This dedicated connection also guarantees predictable latency between the clusters.
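
To give a sense of what the dedicated 10GE link means for moving large data sets, the following sketch estimates ideal transfer times. The 1 TB data set size and the `efficiency` parameter are assumptions chosen for illustration; real transfers will be slower because of protocol overhead and disk speed.

```python
def transfer_seconds(size_bytes: float, link_gbps: float, efficiency: float = 1.0) -> float:
    """Ideal time to move size_bytes over a link of link_gbps gigabits/second.

    efficiency is a hypothetical factor (0 to 1] for protocol and disk overhead.
    """
    bits = size_bytes * 8
    return bits / (link_gbps * 1e9 * efficiency)


ONE_TB = 1e12  # assumed data set size: 1 terabyte (decimal)

for name, gbps in [("10GE research network", 10.0), ("Gigabit Ethernet", 1.0)]:
    minutes = transfer_seconds(ONE_TB, gbps) / 60
    print(f"{name}: about {minutes:.1f} minutes for 1 TB")
```

At full line rate the 10GE link moves the assumed 1 TB in roughly a tenth of the time needed over a single Gigabit Ethernet connection.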

Grid Computing

There are two groups on campus working in the area of grid computing. The first group, led by Professor Bangalore (NS&M/CIS), focuses on basic research in grid computing, distributed computing, and web-based computing within the Collaborative Computing Laboratory (CCL). The second group, within UAB IT, has developed an applied grid computing project known as UABgrid over the past 4 years as a result of UAB participation in the NSF/SURA NMI Testbed project (D.L. Shealy, PI). UABgrid is the campus infrastructure for computation and collaboration in the grid environment. During FY07, new functionality will be added: a Shibboleth-based identity management capability, which facilitates external research collaborations, and a meta-scheduler, which allows scheduling of HPC jobs over multiple clusters. Both new capabilities are being demonstrated in collaboration with UAB CCL and SURAgrid and presented at Internet2 and grid conferences during 2007.

  • Middleware Tools: Tools available through UABGrid include: Globus Toolkit, GridShib for Globus, Ganglia, GridWay metascheduler, MyProxy, UABGrid CA, GridSphere portal, Shibboleth, myVoc box management node, and GridFTP.

High Performance Computing

The UAB Shared High Performance Computing Facility provides UAB-wide shared software and hardware infrastructure and support for high performance parallel and distributed computing, numerical tools and information technology-based computing environments, and computational simulation to UAB researchers. The facility is now a joint IT and multi-school initiative that is jointly used, supported, and funded; it was initially jump-started by the School of Engineering in collaboration with the Schools of Medicine and Public Health. The current combined HPC performance of the facility is about 2.2 Teraflops. The facility is equipped with the following:

  • IBM BlueGene L cluster with 2048 processors running at 700 MHz, each with 512 MB of memory. The system has 13 terabytes of storage. This cluster should benchmark at 4.5 to 5 Teraflops.
  • DELL Xeon 64-bit Linux Cluster (CHEAHA), which consists of 128 DELL PE1425 nodes with dual Xeon 3.6GHz processors and either 2GB or 6GB of memory per node. It uses a Gigabit Ethernet inter-node network connection. There are 4 Terabytes of disk storage available to this cluster. This cluster is rated at more than 1.1 Teraflops computing capacity.
  • Verari Opteron 64-bit Linux Cluster (COOSA), which is a 64-node computing cluster consisting of dual AMD Opteron 242 processors with 2GB of memory per node. The nodes are interconnected with a Gigabit Ethernet network.
  • IBM Linux Cluster (CAHABA), a highly scalable Linux cluster solution for high performance and commercial computing workloads. It is built from IBM x335 Series servers with a total of 128 processors (64 nodes, dual Xeon 2.4GHz, 2 to 4GB of memory per node) and a 1 Terabyte storage unit. The nodes are interconnected with a Gigabit network.
  • Supermicro Xeon 32-bit Linux Cluster, which is a 10-node visualization cluster consisting of Supermicro computers with dual Xeon 2.4GHz processors, 2GB of memory per node, and 3 Terabytes of total disk space.
  • DNP Holo Screen Display (60”), a transparent display which allows viewers to look at and see through the screen and makes the image appear suspended in mid-air. It gives an impression of almost-3D depth.
  • Passive Stereoscopic Display System (VisBox), which is a one-wall, fully integrated, projection-based VR system with head-tracking and stereo display. The screen is 10 feet diagonal, which makes it significantly more immersive than other much more expensive systems. The VisBox uses high-end LINUX PCs and bright projectors. The footprint of a VisBox is 8’x8’, and it is a few inches shy of being 8 feet tall, making it close to an 8’x8’x8’ cube. With this system, researchers can visualize their data in a stereoscopic virtual environment. This display system is a passive stereo display system in an all-in-one unit with 2 polarizing LCD projectors and 2 mirrors, precision-mounted in a custom frame. A Linux PC drives this system with a high-end dual-headed graphics card. Users wear lightweight, inexpensive polarized eyeglasses and see a stereoscopic image.
  • Tiled Display Wall System (VisWall) (8'x8' and 3x3 configurations) is capable of a combined screen resolution of 3000x2300 pixels. It provides researchers with a display solution to visualize data or images at an ultra-high resolution. A high-end dual-processor LINUX cluster and nVidia graphics cards are used to drive the graphics applications. This is a scalable solution, which means that the number of tiles can be expanded to m x n to increase the combined resolution as the budget permits. A 10-node dual-processor Linux cluster drives this nine-tile visualization wall. The software synchronizes images at the tile interface. This provides an ultra-resolution visualization capability for very large-scale images and data. A Linux PC console communicating through a high-speed Myrinet network drives the VisWall. Each computer is connected to a projector that contributes 1024x768 screen resolution to the overall projection area.
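
The combined VisWall resolution follows directly from the tile layout. A minimal sketch, assuming the tiles abut with no overlap or blending between projectors (the function name is illustrative):

```python
def wall_resolution(cols: int, rows: int, tile_w: int = 1024, tile_h: int = 768) -> tuple[int, int]:
    """Combined pixel dimensions of a cols x rows wall of identical tiles.

    Assumes tiles abut exactly, with no overlap or edge blending.
    """
    return cols * tile_w, rows * tile_h


# The current 3x3 wall of 1024x768 projectors.
width, height = wall_resolution(3, 3)
print(f"3x3 wall: {width}x{height} pixels ({width * height:,} total)")

# A hypothetical expansion to 4x3 tiles, as the m x n scaling allows.
print("4x3 wall: {}x{} pixels".format(*wall_resolution(4, 3)))
```

For the 3x3 configuration this gives 3072x2304 pixels, consistent with the roughly 3000x2300 combined resolution quoted above.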

Off-campus Resources

  • ASA/AREN
  • Internet2/NLR
  • Alabama RON
  • SURA

Tools and Support