SG Call March 5, 2012
Application Discovery and Deployment Strategies for SURAgrid
Discovering where users can run applications on OSG resources supporting the SURAgrid VO is key to making the grid easier to use. Mapping the availability of applications across resources is a key function of the SURAgrid community.
Gabriele Garzoglio and Marko Slyz from the OSG VO forum will give an overview of software distribution practices of various VOs on OSG to help us understand the options available as we build a solution for the SURAgrid VO.
Steve Johnson (TAMU) and David Mathews-Morgan (UGA) will provide an overview of early work on an approach to advertising site applications.
Please join us for this important community call on March 5th at 2pm Eastern and help shape SURAgrid operations to benefit the science at your campus.
- VO Application Management in OSG - Presented by Gabriele Garzoglio (FNAL) and Marko Slyz (FNAL) of the OSG VO Forum
- Preliminary SURAgrid Application Discovery - Presented by Steve Johnson (TAMU) and David Mathews-Morgan (UGA). An overview of the proposed solution that uses OSG's BDII database.
Conference Bridge: 800-377-8846 Pin: 1442149
Jim Lupo (LSU), Amit (ODU), David Mathews-Morgan (UGA), Steve Johnson (TAMU), Gary Crane (SURA), Art Vandenberg (GSU), Phil Smith (TTU), Jere Perez (TTU), John-Paul Robinson (UAB), Gabriele Garzoglio (FNAL), Marko Slyz (FNAL)
Discussion Summary / Notes from the Community
Gabriele Garzoglio (FNAL) presented an overview of OSG and provided guidance on a variety of solutions in place within OSG for how VOs manage their applications.
The solutions range in complexity depending on the needs of the end user and the VO community.
- A basic solution is for an individual job to simply include the application within the job itself. This works well for smaller apps, less than 50-100MB in size. All the job needs to do is include an app tarball and unpack it as part of the job.
- A VO-level solution could have the application maintainer group for the VO deploy an application set that supports the VO across many OSG resources. This could be done manually by targeting select sites, or it could be an automated solution that runs "app install" jobs across all resources to install applications in the $OSG_APP area of each resource.
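The basic per-job approach can be sketched as a small wrapper that the job runs first. This is only an illustration under stated assumptions: the function name `unpack_and_run`, the scratch directory prefix, and the command-line convention are hypothetical and were not part of the presentation.

```python
import subprocess
import sys
import tarfile
import tempfile


def unpack_and_run(tarball_path, command, workdir=None):
    """Unpack an application tarball shipped with the job, then run a command in it."""
    # Use a fresh scratch directory unless the caller supplies one.
    workdir = workdir or tempfile.mkdtemp(prefix="app_")
    # The tarball is the job's own payload, so it is treated as trusted input.
    with tarfile.open(tarball_path) as archive:
        archive.extractall(workdir)
    # Run the application from the unpacked directory; raise if it exits non-zero.
    result = subprocess.run(
        command, cwd=workdir, check=True, capture_output=True, text=True
    )
    return workdir, result.stdout


# Hypothetical command-line use: wrapper.py app.tar.gz ./bin/myapp --input data.txt
if __name__ == "__main__" and len(sys.argv) > 2:
    _, output = unpack_and_run(sys.argv[1], sys.argv[2:])
    print(output, end="")
```

Because the tarball travels with every job, the transfer cost is paid on each run, which is why this approach suits only the smaller applications mentioned above.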
A new tool that is proving useful for VO-level solutions is the CERN VM Filesystem (CVMFS), which presents a common application namespace across multiple sites.
The application distribution solutions can also integrate with the popular GlideWMS job distribution framework used across OSG. This framework sends probe jobs to compute resources on behalf of the VO and collects available resources into a common pool of compute nodes. It leverages Condor scheduling semantics and can scan the VO's area of a site's $OSG_APP directory to advertise the list of applications available at that site. A job can then be matched to the compute resources that support the application requirements it defines.
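The scan-and-match idea can be illustrated with a short sketch. It is not GlideWMS code: the one-directory-per-application layout under `$OSG_APP/<vo>` and the function names `list_installed_apps` and `matching_sites` are assumptions made for the example. In a real deployment the matching would be expressed through Condor's own mechanisms.

```python
import os


def list_installed_apps(osg_app_dir, vo):
    """Return the names of application directories found under <osg_app_dir>/<vo>."""
    vo_dir = os.path.join(osg_app_dir, vo)
    if not os.path.isdir(vo_dir):
        return set()
    # Assumption: each installed application lives in its own subdirectory.
    return {
        name
        for name in os.listdir(vo_dir)
        if os.path.isdir(os.path.join(vo_dir, name))
    }


def matching_sites(required_apps, site_apps):
    """Return the sites whose advertised applications cover all job requirements."""
    required = set(required_apps)
    return sorted(
        site for site, apps in site_apps.items() if required <= set(apps)
    )
```

A probe job running at each site could call `list_installed_apps` and report the result, and the collected per-site sets would then feed `matching_sites` when a job declares the applications it needs.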
Other points raised:
- Applications should assume only a simple base OS install, most commonly one of the Red Hat EL5 derivatives, Scientific Linux 5 or CentOS 5.
- VOs should consider the level of site support for their VO in choosing an application distribution method. Typically, sites on OSG do not support VOs directly, and it is incumbent on the VO to maintain the applications across resources.
- We should feel welcome to contact email@example.com with any specific questions we have about implementing these solutions.
Steve Johnson (TAMU) presented the initial work on exploring BDII as a repository for information about the applications available at a specific site. In this scenario, sites would register the applications they provide in the BDII database, and application users could then query this data set to determine which resources can support a given application.
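Since BDII is an LDAP-based information service, a user-side query amounts to an LDAP search filter. The notes do not record which schema attribute the prototype used. The sketch below assumes applications are published as GLUE 1.3 software tags in the `GlueHostApplicationSoftwareRunTimeEnvironment` attribute of `GlueSubCluster` entries. The tag `blast-2.2` and both function names are hypothetical.

```python
def escape_ldap_value(value):
    """Escape characters that are special in an LDAP search filter (RFC 4515)."""
    replacements = {
        "\\": r"\5c",
        "*": r"\2a",
        "(": r"\28",
        ")": r"\29",
        "\0": r"\00",
    }
    return "".join(replacements.get(ch, ch) for ch in value)


def software_tag_filter(tag):
    """Build a filter selecting subclusters that advertise the given software tag.

    Assumption: sites publish applications as GLUE 1.3 software tags.
    """
    return (
        "(&(objectClass=GlueSubCluster)"
        "(GlueHostApplicationSoftwareRunTimeEnvironment=%s))"
        % escape_ldap_value(tag)
    )


# Hypothetical tag name, for illustration only.
print(software_tag_filter("blast-2.2"))
```

The resulting string could be passed to any LDAP client, such as `ldapsearch`, pointed at a BDII endpoint. The entries returned would identify the resources advertising that application.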
An interesting possibility with an application database or common application advertising solution would be to offer site-optimized applications, potentially with highly tuned software configurations. An application definition could also be a virtual machine.
Steve solicited participation and encouraged engagement in refining this approach into a comprehensive solution for SURAgrid.