The Community Software Repository from XSEDE: A Resource for the National Research Community

JP Navarro
Argonne National Laboratory
MCS Division, TCS Building 240
Lemont, IL 60439

Craig A. Stewart
Indiana University Pervasive Technology Institute
2709 E. 10th Street
Bloomington, IN 47408

Richard Knepper
Indiana University Pervasive Technology Institute
2709 E. 10th Street
Bloomington, IN 47408

Lee Liming
University of Chicago
5801 South Ellis Avenue
Chicago, IL 60637

David Lifka
Cornell Center for Adv. Computing
2709 E. 10th Street
Bloomington, NY 14853

Maytal Dahan
Texas Advanced Computing Center
University of Texas
Austin, TX 78758

ABSTRACT

The Extreme Science and Engineering Discovery Environment (XSEDE) connects cyberinfrastructure (CI) resources, software, and services. One of XSEDE’s primary goals in supporting US research generally is to “advance the ecosystem” - making use of XSEDE’s leadership position to create software, tools, and services that lead to an effective and efficient national cyberinfrastructure. Software enables this endeavor in two very distinct ways: by enabling the operation of XSEDE as a distributed yet integrated cyberinfrastructure resource, and by providing access to a wide variety of software of value to end-user researchers and students, operators of campus cyberinfrastructure resources, and those considering proposing new cyberinfrastructure resources to the National Science Foundation (NSF). The Community Software Repository (CSR) provides transparency about how XSEDE operates and provides access to software of use and value to the US research community generally. The CSR provides access to use cases that describe needs expressed by the research community, capability delivery plans that describe how XSEDE meets those needs, and the actual software that meets those needs. Software is delivered in a variety of forms and formats. The CSR also includes mechanisms for interaction between XSEDE staff, software developers, and the end-user community to accelerate meeting of community needs and aid software developers in finding audiences for their software. XCI’s long-term goal is that the XSEDE Community Software Repository will be widely used and valuable to the national research community.

CCS CONCEPTS

• Software and its engineering → Software creation and management;

Publication rights licensed to ACM. ACM acknowledges that this contribution was authored or co-authored by an employee, contractor or affiliate of the United States government. As such, the Government retains a nonexclusive, royalty-free right to publish or reproduce this article, or to allow others to do so, for Government purposes only.
PEARC17, July 09-13, 2017, New Orleans, LA, USA
© 2017 Copyright held by the owner/author(s). Publication rights licensed to Association for Computing Machinery.
ACM ISBN 978-1-4503-5272-7/17/07. . . $15.00
https://doi.org/10.1145/3093338.3093373

KEYWORDS

Software requirements, engineering activities, priorities, technical reviews, quality assurance, distribution, publishing, discovery, and collaboration

1 INTRODUCTION

The Extreme Science and Engineering Discovery Environment (XSEDE) [9] is currently the largest single cyberinfrastructure grant award funded by the National Science Foundation. XSEDE, federated service providers (SPs) [12], and software partners integrate software tools that enable users to access and use integrated cyberinfrastructure resources to accelerate open US research. One of XSEDE’s primary goals in supporting US research generally is to “advance the ecosystem” - making use of XSEDE’s leadership position within the US cyberinfrastructure community to create software, tools, and services that lead to an effective and efficient national cyberinfrastructure. One part of XSEDE - the XSEDE Cyberinfrastructure Integration (XCI) team - is particularly focused on creating and disseminating software tools that enable the entire US cyberinfrastructure community to be as interoperable as possible, while providing core services needed to make XSEDE and cyberinfrastructure resources integrated and interoperable. In particular, the mission of the XCI team is to integrate, adapt, and disseminate software tools and related services across the national CI community, building on and improving upon the efforts of XSEDE to enable the creation of an integrated national cyberinfrastructure.

The XCI team activities are a response to much more than goals set by XSEDE, or by the NSF for XSEDE. Our activities respond to two sets of critical community needs: XSEDE operations transparency, and community needs for better software to operate local cyberinfrastructure resources.

A first critical need is for the software and services to be well designed and well documented in a clear, transparent, publicly accessible fashion. XSEDE represents a major investment in cyberinfrastructure by the NSF. XSEDE as it currently exists will come to an end roughly four years after the publication of this paper. It is important for XSEDE to openly document the software on which this organization operates, so that those who wish to respond to NSF solicitations for services that come after XSEDE can propose new services with full knowledge of the internal operations of XSEDE. And when a successor (or successors) to XSEDE is identified, then the sustainability and re-usability of the investment in XSEDE will depend significantly on the documentation and clear organization of the software that XCI has produced and which serves as the foundation for the operations of XSEDE.

A second need has to do with the general state of cyberinfrastructure software in the US. A 2011 report by a taskforce of the NSF Advisory Committee on Cyberinfrastructure [cite] issued a number of findings, including a finding that:

The current state of cyberinfrastructure software and current levels of expert support for use of cyberinfrastructure create barriers in use of the many and varied campus and national cyberinfrastructure facilities. These barriers prevent the US open science and engineering research community from using the existing, open US cyberinfrastructure as effectively and efficiently as possible.

The report also made the following strategic recommendation:

The NSF should fund activities that support the evolution and maturation of cyberinfrastructure through careful analyses of needs (in advance of creating new cyberinfrastructure facilities) and outcomes (during and after the use of cyberinfrastructure facilities). The NSF should establish and fund processes for collecting disciplinary community requirements and planning long-term cyberinfrastructure software road maps to support disciplinary community research objectives.

The activities of the XCI team generally, and the CSR specifically, are one part of XSEDE’s response to this recommendation.

The most important, and publicly accessible, means by which XSEDE and the XCI team respond to the community needs for XSEDE operations transparency and for better cyberinfrastructure software is the Community Software Repository (CSR). The Community Software Repository is a software tool of value to cyberinfrastructure resource operators, whether or not they are affiliated in any way with XSEDE; it provides services for end users of cyberinfrastructure resources; and it provides services for software developers. (In this paper we will not discuss in depth the role of the Community Software Repository in supporting XSEDE itself.)

XCI’s long-term goal is that the XSEDE Community Software Repository will be widely used and valuable to the national research community. Our goals in this paper are to ensure that potential consumers and users of software in the CSR know what is available via the CSR; that software developers recognize the CSR as a potential tool for disseminating their software; and that individuals and institutional groups that may want to seek funding from the National Science Foundation for large-scale Major Research Instrumentation proposals (greater than $1M total budget) [5], or for large-scale HPC acquisitions of cyberinfrastructure resources to be integrated with XSEDE, can discover the software and services that will enable integration. Of these various constituencies, we think that the people who will find the most immediate value in the CSR are end-user researchers and students, administrators of campus cyberinfrastructure resources, and potential proposers to the NSF MRI solicitation.

2 XSEDE CYBERINFRASTRUCTURE INTEROPERABILITY STRATEGY

XSEDE creates very little software of its own. The XCI team’s strategy in particular is to adapt and integrate pre-existing software to meet the needs of the US research community. We use what we believe to be current best practices to carry out this mission, involving:

• Documenting use cases. We use a very simple template to document the needs of end users, CI resource managers, and software developers. We developed this simple template using expertise shared with us by the Software Engineering Institute of Carnegie Mellon University [1].

• Documenting software capability gaps. Given use cases, we find the best software available to meet these use cases and then document the gaps between the current state of the software and the functionality that is needed to have such software used either within XSEDE or as part of the wider US research cyberinfrastructure community [4]. This is done using Capability Delivery Plans (CDPs), which are concise documents that describe what aspects of a Use Case are supported by a particular infrastructure, which aspects are not supported, and an infrastructure project’s plans to fill those gaps. Infrastructures have to periodically prioritize which use case support gaps they are going to work on. The CSR includes the capability to survey stakeholder communities for their input on which use cases should be prioritized to drive engineering effort to address capability gaps.

• Set priorities. To ensure that we focus on the most valuable use cases within our budget and staff constraints, we work with key stakeholders to prioritize our activities that deliver software through the CSR and that deliver services to XSEDE operators. Stakeholders that provide input include the XSEDE User Requirements Evaluation and Prioritization (UREP), the SP Forum, and XSEDE Campus Champions.

• Security and Quality Assurance. We involve stakeholders and experts to review software designs and security and to conduct quality assurance testing.

• Share software. We work to share software, primarily through the Community Software Repository, so that members of the US cyberinfrastructure and scientific communities can easily find and use software that will help them do their research.

• Provide documentation. We work with the community as a whole, software developers, and other XSEDE staff to provide software documentation and training and, through the course of at least the four remaining years of current XSEDE funding, to provide support for software deployment and use.
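
The relationship between use cases, Capability Delivery Plans, and stakeholder prioritization described in the list above can be pictured as a small data model. The following Python sketch is illustrative only: the class names, fields, sample aspects, and vote counts are hypothetical and do not reflect the actual CSR schema or survey mechanism.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Set


@dataclass
class UseCase:
    """A documented community need (identifier and aspects are hypothetical)."""
    use_case_id: str
    aspects: Set[str]


@dataclass
class CapabilityDeliveryPlan:
    """Records which aspects of one use case an infrastructure supports today."""
    use_case: UseCase
    supported: Set[str] = field(default_factory=set)

    def gaps(self) -> Set[str]:
        # A gap is any aspect of the use case that is not yet supported.
        return self.use_case.aspects - self.supported


def prioritize(plans: List[CapabilityDeliveryPlan],
               votes: Dict[str, int]) -> List[CapabilityDeliveryPlan]:
    """Return plans that still have gaps, ordered by stakeholder votes (highest first)."""
    open_plans = [p for p in plans if p.gaps()]
    return sorted(open_plans,
                  key=lambda p: votes.get(p.use_case.use_case_id, 0),
                  reverse=True)


# Hypothetical sample data.
plans = [
    CapabilityDeliveryPlan(UseCase("UC-1", {"authenticate", "transfer", "verify"}),
                           {"authenticate", "transfer"}),
    CapabilityDeliveryPlan(UseCase("UC-2", {"authenticate", "login"}),
                           {"authenticate", "login"}),
    CapabilityDeliveryPlan(UseCase("UC-3", {"provision", "monitor"}),
                           {"monitor"}),
]
ranked = prioritize(plans, {"UC-1": 12, "UC-2": 30, "UC-3": 20})
for plan in ranked:
    print(plan.use_case.use_case_id, sorted(plan.gaps()))
```

In this sketch a fully supported use case (UC-2) drops out of the ranking regardless of its vote count, since there is no gap left to prioritize.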

The CSR enables software sharing and discovery at all life-cycle stages: software that is operational and ready to use, software that first needs to be built before being installed, and software under development or available in beta form. We also strive to enable distribution of software in multiple forms and formats: in a conventional package, as a build recipe, as a relocatable RPM, in a virtual machine (VM) or container, as an operational SaaS, or installed and ready for use on XSEDE-supported cyberinfrastructure resources.

To help the community adopt the CSR web portal and to evolve it to address the community’s challenges we engage face-to-face with software developers, integrators, operators, and consumers through relevant events and meetings, and will provide documentation and training on using the CSR. Our services are constantly evolving as we attempt to understand how to best encourage adoption [e.g. [3] [10]].

3 CONTENTS OF THE COMMUNITY SOFTWARE REPOSITORY

There are many software management repositories and tools in use in the community that satisfy project-specific, institutional, research collaboration, and other needs. It is not our intent to duplicate or replace other existing repositories. The CSR is not a replacement for these community-specific repositories and tools. Instead, the CSR offers a public location to share research CI software information in a consistent format that enables discovery across the broadest possible community. The CSR provides the ability to reference external software repositories and catalogs where more specialized or community-specific software information is available. This will enable users who discover software through the CSR to navigate to the repositories that CSR content came from. All the software that XSEDE integrates for federated SPs, campuses, and users is discoverable through the CSR. Our particular foci in the early implementation of the CSR have been the following:

• Software critical to the operation of XSEDE and to the operators of cyberinfrastructure resources that are interoperable with and allocated via XSEDE;

• Software for end users;

• Software for operators of cyberinfrastructure resources that are NOT in any way integrated with XSEDE.

Note that we use the term cyberinfrastructure resource in this document to refer to a wide variety of resources, including clusters, supercomputers, storage systems, science gateways, and even support services. The information in the CSR is of use to such cyberinfrastructure operators whether they have a relationship with XSEDE or not. Operators of such cyberinfrastructure resources that are integrated with XSEDE are called Service Providers in XSEDE terminology. (These organizations are members of the XSEDE Federation [12]; such membership is how these organizations come into relationship with XSEDE.)

Information about the software enabling these capabilities is provided in the CSR. Using this information, XSEDE Service Providers, campus-based cyberinfrastructure resource providers that may have no affiliation with XSEDE, and other community members can learn about these software tools and deploy them on their own systems. The CSR includes all the software tools that XSEDE has integrated and made available in packaged formats for community members.

Figure 1: A graphical depiction of the categories of software CSR provides for use by the US research community.

All well managed software projects propose, prioritize, prepare, and deliver new software products or enhancements to existing products. The CSR includes information about XSEDE software integration activities and how they map to use cases, packaged software, and operational software. The CSR contains three types of software information: global, packaged, and operational:

• Global Software Information is information that does not change based on packaging format and operational status, such as name, textual description, global categories/tags, and vendor information.

• Packaged Software Information is information about software that requires action to make it operational. Packages may be in VM, container, RPM, tar, build recipe, or other format that can be instantiated. Package information includes repository pointers; installation, build, and provisioning instructions; format; and package support contact information.

• Operational Software Descriptions provide information about “ready-to-use” operational software available by command line (on an SP resource [2]) or through a network interface (SaaS, portal, gateway, etc.). Information includes how to access and use the software and operational support contact information.
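
As a rough illustration of how the three information types listed above relate to one another, the following Python sketch models a single catalog entry. All class names, field names, and sample values are hypothetical; they paraphrase the descriptions above and are not the CSR's actual data schema.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class PackagedInfo:
    """Software that requires action (install, build, provision) to become operational."""
    package_format: str        # e.g. "RPM", "container", "VM", "tar", "build recipe"
    repository_pointer: str    # where the package can be obtained
    instructions: str          # installation, build, or provisioning notes
    support_contact: str


@dataclass
class OperationalInfo:
    """Ready-to-use software reachable by command line or a network interface."""
    interface: str             # e.g. "command line", "SaaS", "portal", "gateway"
    access_instructions: str
    support_contact: str


@dataclass
class SoftwareEntry:
    # Global information: independent of packaging format and operational status.
    name: str
    description: str
    tags: List[str]
    vendor: str
    packages: List[PackagedInfo] = field(default_factory=list)
    deployments: List[OperationalInfo] = field(default_factory=list)

    def ready_to_use(self) -> bool:
        # An entry is ready to use if at least one operational deployment exists.
        return bool(self.deployments)


# Hypothetical sample entry.
entry = SoftwareEntry(
    name="example-transfer-tool",
    description="Moves files between resources.",
    tags=["data", "transfer"],
    vendor="Example Project",
    packages=[PackagedInfo("RPM", "https://repo.example.org/", "yum install ...", "help@example.org")],
)
print(entry.ready_to_use())
entry.deployments.append(OperationalInfo("command line", "log in and run the tool", "help@example.org"))
print(entry.ready_to_use())
```

Keeping global information on the entry itself, with packaged and operational records attached as lists, mirrors the idea that one piece of software may be available in several formats and on several resources at once.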

A schematic representation of the Community Software Repository is shown in Figure 1 and available online [13].

Each entry in the repository includes a description of a use case; a capability delivery plan (once one has been developed); and, for software that has been prioritized and/or delivered, links to the implementation plans or operational components. In general there are also links to documentation and for discussion of particular software needs. This information is available to all community members and can be used to identify new features and services that might be of value to the community. As of the writing of this report, the CSR includes a total of 96 completed use case documents.

XSEDE has multiple goals set for itself and set by the NSF. One of them is to run core XSEDE functions: supporting the service providers funded by the NSF that are required to integrate with XSEDE, allocating those resources, and managing account access and tracking of usage on each of these service providers. The puzzle pieces labeled XSEDE Enabling Functions and XSEDE Account Management lead to information about software that meets these needs and is thus at the core of the operation of XSEDE. Yet XSEDE is explicitly intended by the NSF to be open and extensible. The material in the XSEDE Enabling Functions and XSEDE Account Management groups shown in Figure 1 is, in the end, potentially useful to any organization operating a distributed or federated cyberinfrastructure resource. Over time, we plan to add general versions of the information included in these hierarchies, so that these categories of software might more generally be described as “Distributed CI Enabling Functions” and “Distributed CI Account Management.”

For the moment, software of interest and use to the researcher or student end user is most likely described in the Scientific Computing and Scientific Data elements of the CSR hierarchy. The Cyberinfrastructure Resource Operation part of the hierarchy is of greatest use and interest to people operating CI resources ranging from science gateways to supercomputers.

XSEDE gives community members access to a wide range of software. This software includes compilers, application libraries, development and debugging tools, and other tools that enable users to make effective use of CI. With these tools, for example, users can authenticate and manage security credentials, log in interactively to resources, move data between distributed resources within and outside of XSEDE, execute jobs or provision VMs, and account for allocation usage.

The CSR provides access to all the software tools that XSEDE has integrated and made available in packaged formats for community members. These software tools enable activities by end users, cyberinfrastructure service operators, software operators and integrators, and software developers.

3.1 Enabling End Users

Individual end users - scientists and students - looking at XSEDE face what may appear to be a large and complex organization. The CSR provides a simple way for an individual end user to find software that meets a current (or future) need, download it, and use it. The CSR offers such end users tools to:

• identify software that supports specific user needs;

• find associated software documentation;

• navigate to software repositories to download, install, and use the software;

• access the systems where software has already been installed.

For end users, we offer less by way of tools for discussing needs than we offer to other communities of users of the CSR, because such needs discussions are addressed through large scale surveys of all users of XSEDE, or through short-term micro surveys.

As one very widely used example, the CSR offers multiple ways to download and install the Globus Transfer personal endpoint software. This is software usable on any personal workstation - from laptop to beefy multiprocessor deskside system - and also from within a VM - to easily move files from a local workstation (or VM) to any other cyberinfrastructure resource on which the user has “write” privileges. This is one of the most commonly downloaded tools currently available via the CSR.

3.2 Enabling Cyberinfrastructure Service Operators

For the national community of cyberinfrastructure resource operators, the CSR offers the following:

• Sharing requirements and use cases that are enabled by their software

• Identifying use case support gaps

• Providing XSEDE input on which gaps have the highest priority

• Navigating to repositories of software for operating cyberinfrastructure resources, then downloading, installing, and using that software

• Publishing the availability of packaged software in any form

• Publishing the availability of operational ready-to-use software in any form

• Engaging in open discussion on software requirements, priorities, deployment, and use of software for operating cyberinfrastructure resources

This part of CSR supports anyone from the lone (and often lonely) cluster administrator at a small institution to large groups of people creating and managing science gateways. Three of the critical software tools available via the CSR are described below.

The XSEDE-Compatible Basic Cluster (XCBC) software toolkit enables campus CI resource administrators to build a local cluster from scratch, which is then easily interoperable with XSEDE-supported CI resources. XCBC is very simple in concept: pull the lever, have a cluster built for you complete with an open source resource manager / scheduler and all of the essential tools needed to run a cluster, and have those tools set in place in ways that mimic the basic setup on an XSEDE-supported cluster. The XCBC is based on the OpenHPC project [6], and consists of XSEDE-developed Ansible playbooks and templates designed to ease the work required to build a cluster.

The XSEDE National Integration Toolkit (XNIT). Suppose you already have a cluster that you are happy with, and you want to add to it software tools that will allow users to use open source software like that on XSEDE, or other particular pieces of software that you think are important, but you don’t want to blow up your cluster to add that capability. XNIT is for you. You can add all of the basic software that is in XCBC, as relocatable RPMs (RPM Package Manager packages), via a YUM repo. (YUM stands for Yellowdog Updater, Modified.) The RPMs in XNIT allow you to expand the functionality of your cluster in ways that mimic the setup on an XSEDE cluster. XNIT packages include specific scientific, mathematical, and visualization applications that have been useful on XSEDE systems. Systems administrators may pick and choose what they want to add to their local cluster; updates may be configured to run automatically or manually. Currently the XNIT repository is available for x86-64 systems running CentOS 6 or 7.
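
On CentOS, enabling an additional YUM repository generally means placing a `.repo` file under `/etc/yum.repos.d/`. As a minimal sketch of what that involves, the Python helper below renders such a stanza. The repository id, display name, and `baseurl` are placeholders chosen for illustration (the actual XNIT repository location is not given in this paper and should be taken from the CSR), and the `render_repo_stanza` helper itself is hypothetical.

```python
def render_repo_stanza(repo_id, name, baseurl, gpgkey=None, enabled=True):
    """Build the text of a YUM .repo stanza.

    Signature checking (gpgcheck) is switched on only when a key URL is supplied.
    """
    lines = [
        f"[{repo_id}]",
        f"name={name}",
        f"baseurl={baseurl}",
        f"enabled={1 if enabled else 0}",
        f"gpgcheck={1 if gpgkey else 0}",
    ]
    if gpgkey:
        lines.append(f"gpgkey={gpgkey}")
    return "\n".join(lines) + "\n"


# Placeholder values only; substitute the real repository details.
stanza = render_repo_stanza(
    repo_id="xnit-example",
    name="Example XNIT-style repository",
    baseurl="https://repo.example.org/xnit/centos7/x86_64/",
)
print(stanza)
```

Once a stanza like this is saved as a file in `/etc/yum.repos.d/`, an administrator can pick individual packages with `yum install`, which matches the pick-and-choose model described above.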

Apache Airavata Science Gateway Suite. Apache Airavata™ is a software framework that enables you to compose, manage, execute, and monitor large scale applications and workflows on distributed computing resources such as local clusters, supercomputers, computational grids, and computing clouds. If you would like to create a science gateway, Apache Airavata is an excellent tool and one commonly used by XSEDE-supported Science Gateways.

3.3 Enabling Software Operators and Integrators

For the software operators and integrators community the CSR provides the following:

• Discovering software that enables use cases that they want to support.

• Discovering resource integration, federation, and interoperability options enabled by software [8].

• Discovering the availability of packaged software in any form that they can adopt and deploy.

• Publishing the availability of operational ready-to-use software in any form (which they may have discovered in the CSR from a developer).

• Engaging in open discussion about requirements, priorities, deployment, or use of software.

Software operators and integrators can pick à la carte which of these CSR capabilities they want to leverage. They can consult with the XCI team on how to effectively use the CSR capabilities they choose and how those capabilities could be enhanced to better support their software sharing needs. For example, the initial goal of the XSEDE User Portal (XUP) was to provide the XSEDE community and users with comprehensive software information. Using the CSR API, the XUP team developed an easy-to-use user interface to search for XSEDE software across sites. The initial interface gathers software information automatically from the CSR and allows more static metadata to be added via an administrative interface. Users can search the software catalog and filter by science domain, service provider, or specific system. The XUP team will continue to work alongside the CSR to promote and offer software and service information to the end user.
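
The search-and-filter behavior described for the XUP interface can be illustrated with a few lines of Python. This is a sketch only: the record fields, the sample records, and the `filter_catalog` helper are hypothetical and represent neither the CSR API nor the XUP implementation.

```python
def filter_catalog(records, science_domain=None, service_provider=None, system=None):
    """Return the records that match every filter that was supplied (None means 'any')."""
    wanted = {
        "science_domain": science_domain,
        "service_provider": service_provider,
        "system": system,
    }
    active = {key: value for key, value in wanted.items() if value is not None}
    return [r for r in records if all(r.get(k) == v for k, v in active.items())]


# Hypothetical catalog records.
catalog = [
    {"name": "tool-a", "science_domain": "chemistry", "service_provider": "SP-1", "system": "cluster-x"},
    {"name": "tool-b", "science_domain": "chemistry", "service_provider": "SP-2", "system": "cluster-y"},
    {"name": "tool-c", "science_domain": "genomics", "service_provider": "SP-1", "system": "cluster-x"},
]

hits = filter_catalog(catalog, science_domain="chemistry", service_provider="SP-1")
print([record["name"] for record in hits])
```

Treating each unset filter as "match anything" lets the same function serve a search by domain alone, by provider alone, or by any combination of the three criteria.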

3.4 Enabling Software Developers

For the software developers community the CSR enhances the following:

• Sharing requirements and use cases that are enabled by their software

• Identifying use case support gaps

• Providing XSEDE input on which gaps have the highest priority

• Publishing the availability of packaged software in any form

• Publishing the availability of operational ready-to-use software in any form

• Engaging in open discussion on software requirements, priorities, deployment, and use

Software developers can pick à la carte which of these CSR capabilities they want to leverage. Software developers can consult with the XCI team on how to effectively use the CSR capabilities they choose and how those capabilities could be enhanced to better support their software sharing needs.

Figure 2: The CSR Integration Console.

3.5 Enabling potential proposers who aspire to operate a cyberinfrastructure resource supported and allocated by XSEDE

We would be remiss if we did not discuss resources available to those who wish to propose the creation of a cyberinfrastructure resource funded by the NSF and supported or allocated via XSEDE. There are at least three categories of such resources now:

• Level 1 Service Providers. These are generally the larger resources, operated by service providers under funding and cooperative agreements from the NSF that require such resources to be supported and allocated via XSEDE. As of the writing of this report, such systems include Bridges, Comet, Jetstream, Stampede, and Wrangler.

• Large-scale Major Research Instrumentation (MRI) awards. The NSF MRI funding program [5] includes multiple levels of funding as options for proposers to put forth. The solicitation states “Proposals requesting over $1 million should address the potential impact of the instrument on the research community of interest and at the regional or national level when appropriate. For large multi-user instruments that provide service beyond a single institution, concrete plans for enabling access by external users (including those from non-Ph.D. and/or minority-serving institutions) through physical or virtual access should be presented, and the uniqueness of the requested instrument should also be described.” So far, two institutions have successfully proposed such large-scale MRI projects where the strategy for “enabling access by external users” was to make the resource available with XSEDE support and allocated via existing XSEDE processes.


Figure 3: The CSR Operational Console.

• Other large-scale cyberinfrastructure providers that have integrated allocation processes with XSEDE, such as the National Center for Atmospheric Research (NCAR).

To enhance the ability of cyberinfrastructure resource operators to follow the steps appropriate to integrate into the XSEDE environment in ways appropriate for their own status, and to enable SPs and the XSEDE integration coordinator to track integration status, the CSR includes a Resource Integration Console [11] (see Fig. 2) showing completed and outstanding steps for integrating CI HPC, HTC, visualization, and storage resources into the XSEDE environment. This console includes details about required and optional software that SPs can use or install to complete the integration.

To enhance the ability to track operational status, the CSR includes an Operational Status Console (see Fig. 3). The Operational Status Console shows the declared operational status of a resource, announced outage information, monitoring results, and resource publishing status [7] (including whether a resource has up-to-date software availability information). These views of XSEDE and of the software included in the Community Software Repository, along with other information available via the CSR, should be of assistance to institutions and PIs who wish to request NSF funding for a large-scale (> $1M) MRI award or for a Level 1 Service Provider award (the most recent solicitation is [cite]; more are expected in the future).

3.6 User satisfaction and usage of XCI services generally and the Community Software Repository particularly

XCI user survey and self-assessment efforts are still in an early stage. During reporting year 1 of the second five years of XSEDE (from September 1, 2016 to April 30, 2017) we set a number of goals for XCI and its services. We set a user satisfaction goal of an average user satisfaction, across all XCI services, of at least 4.0 on a 1-5 Likert scale (where 5 = extremely satisfied). During reporting year (RY) 1, our actual average satisfaction was 4.5. In surveys of people who operate a cyberinfrastructure resource allocated by and integrated with XSEDE, our services received an average rating of 4.3. We did not set any specific goals for the Community Software Repository during RY1, since simply getting it created was one of our key goals. Some sense of community usage, however, can be gained from usage of some of the software repositories accessed via the CSR. A total of 21 Capability Delivery Plans were completed in RY1. A total of six new capabilities were created by XCI and are now delivered via the CSR. Taken together, these statistics indicate high levels of activity meeting demands as expressed by our various stakeholders and a high level of satisfaction with the tools made available via the CSR for integration of cyberinfrastructure resources with XSEDE.

We also have statistics regarding uptake, use, and impact of software available via the CSR beyond the organizational boundaries of XSEDE. A total of at least 594 distinct systems use one or more toolkits available through the CSR (“systems” here includes everything from individual laptop computers used by researchers and students to the former SDSC Trestles system now relocated at the University of Arkansas). All cyberinfrastructure resource operators whose resources are allocated via XSEDE processes use the tools recommended for them and available via the CSR. A total of 99 systems subscribe to software updates delivered through the CSR via the XSEDE National Integration Toolkit (XNIT) repository. In aggregate, at least 732 TeraFLOPS worth of campus-based computing clusters are operating with software from the XSEDE-Compatible Basic Toolkit (XCBC) and/or XNIT. These statistics indicate growing impact of the software available via the CSR beyond the organizational boundaries of XSEDE.

4 FUTURE WORK

While the prioritization of activities affects our work plan on an ongoing basis, the following steps have been identified as top priorities for the next year of XSEDE:

• Creation of a global software descriptions repository that includes information about software that is independent of its status or availability, including software description, vendor details, and global tags and categories. This feature will include web forms for entering global software descriptions and may introduce a vetting process so that CSR administrators can review and correct global software description quality issues.

• Implementation of convenient means for Science Gateway Operators to manually enter operational software descriptions using web forms. This capability will enable science gateways and XSEDE itself to advertise user- and developer-facing software and service information.

• Engaging the community to add their software information to the CSR. This will include creating mechanisms to enable community software providers to automatically advertise the software they have placed in XSEDE Community Software Areas (CSAs).

Other activities will be added, depending upon availability of additional funds via the XSEDE budget, funding from sources other than the NSF award to operate XSEDE, and prioritization of future activities that can be done within the existing budget via XSEDE for XCI.

5 CONCLUSIONS

XSEDE has many goals. One of these goals is simply to operate the existing XSEDE resource as a front end for multiple NSF-funded cyberinfrastructure resources. To be most effective and sustainable in the long run, this work must be done in ways that are transparent to others and that allow others to offer their own solutions to unmet community needs. The Community Software Repository is a critical part of XSEDE’s conveyance of information about how XSEDE operates, and a key vehicle for feedback from the research community about what XSEDE operational software priorities should be. XSEDE will come to an end approximately four years after the publication of this report. The information and software accessible via the Community Software Repository will make it possible for future cyberinfrastructure support organizations to build on the foundation that XSEDE has created.

The CSR is a service designed to support one of XSEDE’s primary goals - to “advance the ecosystem.” The CSR is the embodiment and access mechanism for all of the work done by the XSEDE Cyberinfrastructure Integration (XCI) team to integrate, adapt, and disseminate software tools and related services across the national CI community, building on and improving upon the efforts of XSEDE to enable the creation of an integrated national cyberinfrastructure. The CSR in particular provides mechanisms for implementation of best practices in cyberinfrastructure services in multiple regards. As described already, we employ best practices in identifying needs and adapting software to meet those needs. In addition, the software tools that we adapt to fulfill needs identified by research community stakeholders are chosen specifically to be the best of breed for filling those needs.

Some of the software is of utility to (and directly installable by) researcher and student end users. Much of the software and many of the tools provided via the CSR are of interest to people and groups that operate cyberinfrastructure resources. In particular, we have put considerable focus on the needs of those who operate cyberinfrastructure resources beyond the organizational boundaries of XSEDE.

The ACCI Campus Bridging taskforce report discussed in the Introduction talked about the need for better cyberinfrastructure software and discussed specifically the cost of needless diversity in cluster management software, as well as the challenges presented to users, support personnel, and cluster administrators in an environment where many cluster administration tasks are needlessly done by hand rather than by automated mechanisms. Without trying to stamp out diversity and innovation, XNIT and XCBC speak directly to these needs. Both tools make it easier for campus-based system administrators to manage computing clusters with current best of breed software - noting that such software changes over time. By virtue of similarities between the cluster setups in XNIT and XCBC, local support personnel can re-use documentation created for XSEDE and re-purpose it to aid support of local clusters. And from the standpoint of end users, the consistency that is created between local campus clusters and XSEDE-supported resources makes it easier to work simultaneously with local campus resources and federally-supported resources. It is in these regards that the Community Software Repository is a resource that is of interest to the national research community generally. The CSR also offers particularly valuable information for those institutions that do not currently manage a cyberinfrastructure resource allocated by XSEDE, but that would like to obtain federal funding to do so. This is an important part of the transparency regarding XSEDE that the CSR creates.

We welcome community suggestions about what priorities should be for the Community Software Repository. To submit a suggestion, send email to [email protected] with a subject that begins with “Suggestions regarding CSR”. The XSEDE Cyberinfrastructure Integration (XCI) team will be best able to improve and expand the Community Software Repository to serve the national research community as a whole with the benefit of your suggestions!

6 ACKNOWLEDGMENTS AND LICENSE TERMS

This document was developed with support from National Science Foundation (NSF) grant OCI-1053575. This material is based in part upon work supported by the U.S. Department of Energy, Office of Science, under contract DE-AC02-06CH11357, as well as work supported by the Indiana University Pervasive Technology Institute and its funding from the Lilly Endowment, Inc. and Indiana University. Apache Airavata is developed with the support of the Indiana University Pervasive Technology Institute and by National Science Foundation awards ATM-0331480, OCI-0721656, OCI-1032742 and SCI-0503697. Any opinions, findings, and conclusions or recommendations expressed in this material are those of the author(s) and do not necessarily reflect the views of the NSF.

The submitted manuscript was created under the leadership of University of Chicago Argonne, LLC, Operator of Argonne National Laboratory (“Argonne”). Argonne, a U.S. Department of Energy Office of Science laboratory, is operated under Contract No. DE-AC02-06CH11357. The U.S. Government retains for itself, and others acting on its behalf, a paid-up nonexclusive, irrevocable worldwide license in said article to reproduce, prepare derivative works, distribute copies to the public, and perform publicly and display publicly, by or on behalf of the Government. The Department of Energy will provide public access to these results of federally sponsored research in accordance with the DOE Public Access Plan (http://energy.gov/downloads/doe-public-access-plan).

REFERENCES

[1] Carnegie Mellon University. 2017. Systems Engineering Institute of Carnegie Mellon University. (2017). http://www.sei.cmu.edu/

[2] Matthew Hanlon, Warren Smith, and Stephen Mock. 2013. Providing Resource Information to Users of a National Computing Center. In Proceedings of the Conference on Extreme Science and Engineering Discovery Environment: Gateway to Discovery (XSEDE ’13). ACM, New York, NY, USA, 43:1–43:8. https://doi.org/10.1145/2484762.2484826

[3] Katherine A Lawrence and Nancy Wilkins-Diehr. 2012. Roadmaps, Not Blueprints: Paving the Way to Science Gateway Success. In Proceedings of the 1st Conference of the Extreme Science and Engineering Discovery Environment: Bridging from the eXtreme to the Campus and Beyond (XSEDE ’12). ACM, New York, NY, USA, 40:1–40:8. https://doi.org/10.1145/2335755.2335837

[4] R. Malan and D Bredemeyer. 2001. Functional Requirements and Use Cases. Technical Report. http://www.bredemeyer.com/pdf_files/functreq.pdf


[5] National Science Foundation. 2017. Major Research Instrumentation Program. (2017). https://www.nsf.gov/pubs/2015/nsf15504/nsf15504.htm

[6] OpenHPC. 2017. Community Building Blocks for HPC Systems. (2017). https://openhpc.community/

[7] Warren Smith, Sudhakar Pamidighantam, and John-Paul Navarro. 2015. Publishing and Consuming GLUE V2.0 Resource Information in XSEDE. In Proceedings of the 2015 XSEDE Conference: Scientific Advancements Enabled by Enhanced Cyberinfrastructure (XSEDE ’15). ACM, New York, NY, USA, 25:1–25:8. https://doi.org/10.1145/2792745.2792770

[8] Craig A. Stewart, Richard Knepper, J.W. Ferguson, Felix Bachmann, I Foster, Andrew Grimshaw, V Hazlewood, and D Lifka. 2012. What is campus bridging and what is XSEDE doing about it?. In XSEDE12. Chicago, IL. http://hdl.handle.net/2022/14599

[9] John Towns, Tim Cockerill, Maytal Dahan, Ian Foster, Kelly Gaither, Andrew Grimshaw, Victor Hazelwood, Scott Lathrop, David Lifka, Ralph Roskies, J. Ray Scott, and Nancy Wilkins-Diehr. 2014. XSEDE: Accelerating Scientific Discovery. Comput. Sci. Eng. 16, October (2014), 62. https://doi.org/10.1109/MCSE.2014.80

[10] V. Venkatesh, M.G. Morris, F.D. Davis, and G.B Davis. 2003. User Acceptance of Information Technology: Toward a Unified View. MIS Quarterly 27, 3 (2003), 425–478. http://csdl-techreports.googlecode.com/svn/trunk/techreports/2005/05-06/doc/Venkatesh2003.pdf

[11] XSEDE. 2016. XSEDE Service Provider Checklist. Technical Report. http://hdl.handle.net/2142/91024

[12] XSEDE. 2017. XSEDE Federation. (2017). https://www.xsede.org/xsede-federation

[13] XSEDE Community Software Repository. 2017. XSEDE Use Cases. (2017). https://software.xsede.org/xsede-use-cases
