
1 July 2006

Year-3 Annual Report (1 October 2005 – 30 September 2006) and Year-4 Program Plan

NSF Cooperative Agreements ATM03-31574, 31578, 31579, 31480, 31586, 31587, 31591, 31594

Table of Contents

1. Introduction – the Genesis and Mission of LEAD
2. System Capabilities and the LEAD Environments
3. The LEAD Architecture
4. The LEAD Grid
5. Research Agenda and Strategic Timeline
   5.1 Balancing Research and Deployment
   5.2 Strategic Timeline and Integrative Test Beds
   5.3 Research Agenda for Dynamic Adaptation
      5.3.1 Working Definitions and Key Questions and Issues
      5.3.2 Dynamic Adaptation Use Case #1
      5.3.3 Dynamic Adaptation Use Case #2
      5.3.4 Dynamic Adaptation Use Case #3
   5.4 Mapping the Research Agenda
6. Project Administration and Oversight
   6.1 Project Manager and Technical Integration Coordinator
   6.2 Internal and External Assessment of the LEAD Enterprise
   6.3 Collaboration Mechanisms
   6.4 External Advisory Panel
7. Year-3 Accomplishments Compared with Plans
   7.1 LEAD Grid
   7.2 Data Thrust
      7.2.1 Components
      7.2.2 Integration
   7.3 Orchestration Thrust
      7.3.1 Overview
      7.3.2 Services Development
      7.3.3 Orchestration and Monitoring
      7.3.4 Siege – A Thick Client Strategy for Workflows
      7.3.5 Documenting the LEAD Architecture
   7.4 Tools Thrust
      7.4.1 ADaM
      7.4.2 Decoders
      7.4.3 Noesis
      7.4.4 Feature Detection in Streaming Data and Dynamic Workflow Response
      7.4.5 IDV
      7.4.6 IDD, LDM and OPeNDAP/ADDE
      7.4.7 Siege
      7.4.8 THREDDS
      7.4.9 ADAS
      7.4.10 WRF
   7.5 Meteorology Thrust
   7.6 Portal Thrust
      7.6.1 Overview
      7.6.2 Portal Functionality and Experiment Construction
      7.6.3 Accomplishments
8. Community Engagement
9. Inaugural Meeting of the Forum on Geosciences Information Technology (FGIT)
10. Tri-Annual Unidata Users Workshop
   10.1 Introduction and Workshop Goals
   10.2 Application of LEAD at the Workshop
   10.3 Preparing for the Workshop
   10.4 Capturing the User Experience
   10.5 Preliminary User Feedback
   10.6 Preliminary Technical Lessons Learned
11. Education and Outreach
12. Diversity Enhancement
13. Changes in Direction
14. Collaboration with Other Projects
   14.1 Center for Collaborative Adaptive Sensing of the Atmosphere (CASA)
   14.2 TeraGrid
   14.3 VGrADS
   14.4 WRF Developmental Test Bed Center (DTC)
   14.5 Digital Library for Earth System Education (DLESE)
   14.6 NOAA National Severe Storms Laboratory and Storm Prediction Center
15. Broader Impacts
   15.1 Supercomputing 2006 and TeraGrid 06
   15.2 NSF Middleware Initiative
16. References Cited
17. Quantitative Project Output
   17.1 Journal Articles and Conference Proceedings Directly Related to or Sponsored in Whole or in Part by LEAD
   17.2 Book Chapters
   17.3 Related Journal Articles and Conference Proceedings
   17.4 Internet Dissemination
   17.5 Degrees Awarded
   17.6 Presentations Involving LEAD
18. International Activities
   18.1 Foreign Collaborators
   18.2 Visitors
   18.3 International Presentations
19. Project Participants
20. Year-3 Budget Analysis
   20.1 Year-3 Budget Allocations by Institution
   20.2 Expenditures by Institution
      20.2.1 University of Oklahoma
      20.2.2 University of Illinois
      20.2.3 University of Alabama in Huntsville
      20.2.4 UCAR Unidata Program
      20.2.5 Indiana University
      20.2.6 Howard University
      20.2.7 Millersville University
      20.2.8 Colorado State University
      20.2.9 University of North Carolina
   20.3 Expenditures by Project Thrust
      20.3.1 Entire Project
      20.3.2 University of Oklahoma
      20.3.3 University of Illinois
      20.3.4 University of Alabama in Huntsville
      20.3.5 UCAR Unidata Program
      20.3.6 Indiana University
      20.3.7 Howard University
      20.3.8 Millersville University
      20.3.9 Colorado State University
      20.3.10 University of North Carolina
   20.4 Expenditures by Project Function
      20.4.1 Entire Project
      20.4.2 University of Oklahoma
      20.4.3 University of Illinois
      20.4.4 University of Alabama in Huntsville
      20.4.5 UCAR Unidata Program
      20.4.6 Indiana University
      20.4.7 Howard University
      20.4.8 Millersville University
      20.4.9 Colorado State University
      20.4.10 University of North Carolina
   20.5 Expenditures by Personnel Classification
      20.5.1 Entire Project
      20.5.2 University of Oklahoma
      20.5.3 University of Illinois
      20.5.4 University of Alabama in Huntsville
      20.5.5 UCAR Unidata Program
      20.5.6 Indiana University
      20.5.7 Howard University
      20.5.8 Millersville University
      20.5.9 Colorado State University
      20.5.10 University of North Carolina
21. Disclosures, Licenses and Patents
22. Prizes and Awards
23. Images for NSF Public Relations Use
24. Program Plans for Year-4
   24.1 LEAD Grid
   24.2 Data Thrust
   24.3 Orchestration Thrust
   24.4 Tools Thrust
      24.4.1 ADaM
      24.4.2 ADAS and WRF 3DVAR
      24.4.3 Common Data Model
      24.4.4 IDV
      24.4.5 LDM and IDD
      24.4.6 Siege
      24.4.7 THREDDS
      24.4.8 WRF
      24.4.9 Noesis
   24.5 Meteorology Thrust
   24.6 Portal Thrust
   24.7 Community Engagement and Deployment
   24.9 Education and Outreach
   24.10 Diversity Enhancement
25. Year-4 Budget Justifications
   25.1 Budget Allocations Among Participating Institutions
   25.2 University of Oklahoma
   25.3 University of Illinois
   25.4 University of Alabama in Huntsville
   25.5 UCAR Unidata Program
   25.6 Indiana University
   25.7 Howard University
   25.8 Millersville University
   25.9 Colorado State University
Appendix A. Year-2 Site Visit Agenda
Appendix B. Year-2 Site Visit Report, Responses and Updates
Appendix C. Selected Acronyms Used in LEAD
Appendix D. Agenda for the Inaugural Meeting of FGIT


1. Introduction – the Genesis and Mission of LEAD

On 1 October 2003, the National Science Foundation funded a Large Information Technology Research (ITR) grant known as Linked Environments for Atmospheric Discovery (LEAD). A multi-disciplinary effort involving 9 institutions and more than 100 scientists, students and technical staff, LEAD is creating an integrated, scalable framework in which meteorological analysis tools, forecast models, and data repositories can operate as dynamically adaptive, on-demand, grid-enabled systems that a) change configuration rapidly and automatically in response to weather; b) respond to decision-driven inputs from users; c) initiate other processes automatically; and d) steer remote observing technologies to optimize data collection for the problem at hand. Although mesoscale meteorology is the particular domain to which these concepts are being applied, the methodologies and infrastructures being developed are extensible to other domains including medicine, ecology, hydrology, geology, oceanography and biology. LEAD has adopted a service-oriented architecture with two principal objectives:

    • To lower the entry barrier for using, and increase the sophistication of problems that can be addressed by, complex end-to-end weather analysis and forecasting/simulation tools. Existing weather tools such as data ingest, quality control, and analysis/assimilation systems, as well as simulation/forecast models and post-processing environments, are enormously complex even if used individually. They consist of highly sophisticated software developed over long periods of time, contain numerous adjustable parameters and inputs, require one to deal with complex formats across a broad array of data types and sources, and often have limited transportability across computing architectures. When linked together and used with real data, the complexity increases dramatically. Indeed, the control infrastructures that orchestrate interoperability among multiple tools – which notably are available only at a few institutions in highly customized settings – can be as complex as the tools themselves, involving thousands of lines of code and requiring months to understand, apply and modify. Although many universities now run experimental forecasts on a daily basis using public-domain software such as the Weather Research and Forecast (WRF) model, they do so in very simple configurations using mostly local computing facilities and pre-generated analyses to which no new data have been added. LEAD seeks to democratize the availability of advanced weather technologies for research and education, lowering the barrier to entry, empowering application in a grid context, increasing the realism of how technologies are applied, and facilitating rapid understanding, experiment design and execution.

• To improve our understanding of and ability to detect, analyze and predict mesoscale atmospheric phenomena by interacting with weather in a dynamically adaptive manner. Most technologies used to observe the atmosphere, predict its evolution, and compute, transmit and store information about it operate not in a manner that accommodates the dynamic behavior of mesoscale weather, but rather as static, disconnected elements. Radars do not adaptively scan specific regions of storms, numerical models mostly are run on fixed time schedules in fixed configurations, and cyberinfrastructure does not allow meteorological tools to operate on-demand, change their mode in response to weather, or provide the fault tolerance needed for rapid reconfiguration. As a result, today’s weather technology, and its use in research, operations and education, are far from optimal when applied to any particular situation [1]. To address these severe limitations, LEAD is

o Developing capabilities to allow models and other atmospheric tools to respond dynamically to their own output, to observations, and to user inputs so as to operate as effectively as possible in any given situation;

o Developing, in collaboration with the NSF Engineering Research Center for Collaborative Adaptive Sensing of the Atmosphere (CASA), capabilities to allow models and other atmospheric tools to dynamically task adaptive observing systems, with an emphasis on Doppler radars, to provide data when and where needed based upon the application, user or situation at hand;

    o Developing appropriate adaptive capabilities within supporting IT infrastructure.

Achieving these goals requires more than translating existing capabilities (e.g., batch processing of numerical models configured using namelist input files) into new IT frameworks built upon symbolic workflow task graphs, web services and grid computing. Rather, it requires fundamental changes – underpinned by basic research in meteorology and CS/IT – in how experiments are conceived and performed, in the structure of user application tools and middleware, and in methodologies used to observe the atmosphere. The stretch goal for LEAD is to begin ushering in this paradigm change.

2. System Capabilities and the LEAD Environments

LEAD comprises a complex array of services, applications, interfaces, and local and remote computing, networking and storage resources – so-called environments – that can be used in a stand-alone fashion or linked together in workflows to study mesoscale weather; thus the name “Linked Environments for Atmospheric Discovery.” This framework provides users with an almost endless set of capabilities ranging from simply accessing data and perhaps visualizing it to running highly complex and linked data ingest, assimilation and forecast processes in real time and in a manner that adjusts dynamically to inputs as well as outputs. A brief overview of the LEAD service-oriented architecture (SOA) is presented in §3 and additional detail can be found in the year-2 annual report (see also [1]). Figure 2.1 shows the logical structure of the LEAD environments. At the fundamental level of functionality, as shown by the top horizontal gray box, LEAD enables users to accomplish the following:

• Query for and Acquire a wide variety of information including but not limited to observational data (including real time streams) and gridded model output stored on local and remote servers, definitions of and interrelationships among meteorological quantities, the status of an IT resource or workflow, and education modules at a variety of grade levels that are designed specifically for LEAD.

Figure 2.1. The fundamental capabilities (top gray box) and tangible outcomes (bottom gray box) of LEAD are enabled by a rich fabric of tools, functions and middleware services that represent the LEAD research domain.

• Simulate and Predict using numerical atmospheric models, particularly the Weather Research and Forecast (WRF) model system now being developed by a number of organizations. The WRF can be run in a variety of modes ranging from basic (e.g., single vertical profiles of temperature, wind and humidity in a horizontally homogeneous domain) to very complex (full physics, terrain, and inhomogeneous initial conditions in single forecast or ensemble mode). Other models (e.g., ocean) can be included but are not fundamentally part of the LEAD system now being created.

• Assimilate data by combining observations, under imposed dynamical constraints, with background information to create a 3D atmospheric gridded analysis. As noted in the tools description below, LEAD supports the ARPS Data Assimilation System (ADAS) and will incorporate the WRF 3D Variational (3DVAR) Data Assimilation System beginning in year-4 (see §24.4.2).
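To make the assimilation concept concrete, systems of the 3DVAR type blend the background and the observations by minimizing a cost function of the standard textbook form below. The notation is generic and is not specific to any LEAD component; it is included only as an illustration of what combining observations with background information under statistical weighting means quantitatively.

```latex
% Standard 3DVAR cost function (generic textbook form, not LEAD-specific):
%   x   = analysis state           x_b = background (first-guess) state
%   y   = vector of observations   H   = observation operator
%   B,R = background- and observation-error covariance matrices
J(\mathbf{x}) = \tfrac{1}{2}(\mathbf{x}-\mathbf{x}_b)^{\mathsf{T}}\mathbf{B}^{-1}(\mathbf{x}-\mathbf{x}_b)
              + \tfrac{1}{2}\bigl(H(\mathbf{x})-\mathbf{y}\bigr)^{\mathsf{T}}\mathbf{R}^{-1}\bigl(H(\mathbf{x})-\mathbf{y}\bigr)
```

The gridded analysis is the state x that minimizes J, i.e., the statistically weighted compromise between the background field and the observations.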

[Figure 2.1 graphic – LEAD Lexicon and Hierarchy of Capabilities: fundamental capabilities (query, acquire, simulate, assimilate, predict, analyze/mine, visualize); enabling functions (store, move, execute, publish, authorize, edit, configure, monitor, catalog, decode, build, compile, allocate, trigger, steer, manage, define); foundational user tools (ADAS, WRF, IDV, ADaM, myLEAD, the Portal); middleware services (authorization, authentication, notification, monitoring, workflow, security, ESML, ontology, host environment, GPIR, application host, execution description, application description, VO catalog, THREDDS catalog, myLEAD, control, query, stream, transcoder); and outcomes (data sets, model output, gridded analyses, animations, static images, relationships) leading to new knowledge, understanding and ideas.]

• Analyze and Mine observational data and model output to obtain quantitative information about spatio-temporal relationships among fields, processes, and features.

• Visualize and quantitatively evaluate observational data and model output in 1D, 2D and 3D frameworks using batch and interactive tools.

LEAD comprises a large number of tools ranging from simple services to highly sophisticated meteorological, data mining and visualization packages. Within this array we define a sub-set of foundational application or productivity tools that include:

    • LEAD Portal, which serves as the primary though not exclusive user entry point into the LEAD environments;

• ARPS Data Assimilation System (ADAS; [2]), a sophisticated tool for data quality control and assimilation including preparation of model initial conditions;

• myLEAD [3], a flexible personalized data management tool that at its core is a metadata catalog. myLEAD stores metadata associated with data products generated and used in the course of scientific investigations and education activities;

• Weather Research and Forecast model (WRF; [4]), a next-generation limited-area atmospheric prediction and simulation model that runs on single or multiple processors at grid spacings ranging from meters to hundreds of kilometers;

• Algorithm Development and Mining (ADaM; [5]), a powerful suite of tools for mining observational data, assimilated data sets and model output; and

• Integrated Data Viewer (IDV; [6]), a widely used desktop application for visualizing, in an integrated manner, a broad array of multi-dimensional geophysical data.

    The power of LEAD lies not only in the capabilities of its various tools but more importantly in the manner in which they can be linked together to solve a broad array of problems, as shown schematically in Figure 2.2. The tangible outcomes (bottom bar in Figure 2.1) include data sets, model output, gridded analyses, animations, static images, and a wide variety of relationships and other information that leads to new knowledge, understanding and ideas. The fabric in Figure 2.1 that links the top set of requirements with the bottom set of outcomes – namely, the extensive middleware, tool and service capabilities – is the research domain of LEAD and is described throughout the remainder of this report.

3. The LEAD Architecture

As shown in Figure 3.1, the LEAD SOA is realized as five distinct yet highly interconnected layers (detailed descriptions of each service are provided in [1]). The bottom layer represents raw resources consisting of computation as well as application and data resources distributed throughout the LEAD Grid (see §4) and elsewhere. At the next level up are web services that provide access to “raw/basic” capabilities and services for accessing weather data. LEAD is leveraging these resources from other projects and modifying them as appropriate.

A wide variety of configuration and execution services compose the next layer and represent services invoked by LEAD workflows. They are divided into four principal groups, the first being the application and configuration service that manages the deployment and execution of fundamental user applications such as the WRF model, ADAS data assimilation system, and ADaM data mining tools.

Figure 2.2. Conceptual/functional linkages among components of LEAD. Virtually any mesoscale research or educational problem can be mapped onto this figure.

    For each of these, additional services are needed to track deployment and execution environment requirements to enable dynamic staging and execution on any of the available host systems. A closely related service is the application resource broker, which is responsible for matching the appropriate host for execution to each application task based upon time constraints of the execution and other factors. Both of these services are invoked by workflow services, which drive experimental workflow instances. Catalog services control the manner in which a user discovers data for use in experiments via a virtual organization (VO) catalog. Finally a host of data services are used to search for and apply transformations to data products. An ontology service resolves higher-level atmospheric concepts to specific naming schemes used in the various data services, and decoder and interchange services transform data from one form to another. Stream services manage live data streams such as those generated by the NEXRAD Doppler radar network.
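To make the orchestration of these services more tangible, the sketch below shows how a single workflow step might chain an ontology lookup, a catalog query, a decoding step and a broker decision. The class names, host name and return values are hypothetical placeholders, not LEAD's actual service APIs; the sketch only illustrates the pattern of composition described above.

```python
# Illustrative sketch only: these clients are hypothetical stand-ins for the LEAD
# ontology, catalog, decoder and resource-broker services described above, not the
# project's actual APIs.

from dataclasses import dataclass
from datetime import datetime, timedelta

@dataclass
class DataProduct:
    url: str
    fmt: str                # e.g., "NEXRAD2" or "netCDF"
    valid_time: datetime

class OntologyService:
    """Resolve a high-level atmospheric concept to catalog-specific product names."""
    def resolve(self, concept: str) -> list:
        vocabulary = {"radar reflectivity": ["NEXRAD Level II"]}
        return vocabulary.get(concept, [])

class CatalogService:
    """Search a virtual-organization catalog for products matching the given names."""
    def query(self, names: list, since: datetime) -> list:
        # Placeholder result; a real call would search the VO/THREDDS catalogs.
        return [DataProduct("ldm://example/KTLX.latest", "NEXRAD2", since)]

class DecoderService:
    """Transform a product from its native format into one the application accepts."""
    def to_netcdf(self, product: DataProduct) -> DataProduct:
        return DataProduct(product.url + ".nc", "netCDF", product.valid_time)

class ResourceBroker:
    """Match an application task to a host able to meet its time constraint."""
    def select_host(self, task: str, deadline: datetime) -> str:
        return "teragrid.example.org"   # a real broker would weigh queues and deadlines

def stage_forecast_inputs(concept: str, deadline: datetime):
    names = OntologyService().resolve(concept)
    raw = CatalogService().query(names, since=datetime.utcnow() - timedelta(hours=1))
    inputs = [DecoderService().to_netcdf(p) for p in raw]
    host = ResourceBroker().select_host("wrf_forecast", deadline)
    return inputs, host

if __name__ == "__main__":
    files, host = stage_forecast_inputs("radar reflectivity",
                                        datetime.utcnow() + timedelta(hours=2))
    print(host, [f.url for f in files])
```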

[Figure 2.2 graphic: a conceptual workflow running from problem identification and definition of the region of interest and data/tool requirements, through data query, allocation of computing and storage resources, data acquisition, metadata creation/editing, decoding, remapping/conversion/quality control, ADAS objective analysis or WRF 3DVAR, remapping of ADAS output to the WRF coordinate, ensemble initial-condition generation, and the WRF forecast model/data assimilation, feeding ADaM data mining, IDV data visualization and myLEAD storage; monitoring and allocation accompany each step, and decision triggers link the system to the CASA DCAS radars and the user.]


Figure 3.1. The LEAD service-oriented architecture.

[Figure 3.1 graphic: distributed resources (computation, storage, data bases, steerable instruments, specialized applications, observation streams and archives); resource access services (GRAM, GridFTP, SSH, scheduler, LDM, OPeNDAP, generic ingest); configuration and execution, data, workflow and catalog services (workflow engine/factories, workflow monitor, application resource broker/scheduler, application and execution description, host environment, GPIR, VO and THREDDS catalogs, myLEAD, RLS, OGSA-DAI, and the control, query, stream, ontology, decoder/resolver and transcoder/ESML services); crosscutting authorization, authentication, monitoring and notification services; and the user interface (LEAD Portal portlets for visualization, workflow, education, monitoring and control, plus desktop applications such as the IDV, a WRF configuration GUI and a geo-reference GUI).]

Several services are used within all layers of the SOA and are referred to as crosscutting services, indicated in the left column of Figure 3.1. One such service is the notification service, which lies at the heart of both static and dynamic workflow orchestration. Each service is able to publish notifications and any service or client can subscribe to receive them. Another critical component is the monitoring service, which provides, among other things, mechanisms to ensure that desired tasks are completed by the specified deadline – an especially important issue in weather research and education.

A vital crosscutting service that ties multiple components together is the user metadata catalog known as myLEAD. As an experiment runs, it generates data that are stored on the LEAD Grid or elsewhere (e.g., TeraGrid) and cataloged to the user’s myLEAD catalog. Notification messages generated during the course of workflow execution also are written to metadata and stored on behalf of a user. A user accesses metadata about the products used during or generated by an investigation through a set of metadata catalog-specific user interfaces built into the LEAD Portal. Note that users can edit metadata, and that LEAD has developed a specific schema based upon existing standards. Through these interfaces the user can browse holdings, search for products based on rich meteorological search criteria, publish products to broader groups or to the public, snapshot an experiment for archiving, or upload text or notes to augment the experiment holdings. Authentication and authorization are handled by specialized services based upon grid standards.

Finally, at the top level of the architecture in Figure 3.1 is the user interface, which consists of the LEAD Portal and a collection of “service-aware” desktop tools. The portal is a container for user interfaces, called portlets, which provide access to individual services.
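The publish/subscribe pattern at the heart of the notification service, and the way myLEAD and the monitoring service consume the same event stream, can be sketched as follows. This is a minimal, in-process illustration with hypothetical class and topic names; the real LEAD services are distributed web services, and this is not their API.

```python
# Minimal in-process sketch of the publish/subscribe pattern described above.
# The broker, topic names and catalog are simplified, hypothetical stand-ins.

from collections import defaultdict
from datetime import datetime

class NotificationBroker:
    """Routes published events to every subscriber of the matching topic."""
    def __init__(self):
        self._subscribers = defaultdict(list)

    def subscribe(self, topic, callback):
        self._subscribers[topic].append(callback)

    def publish(self, topic, message):
        for callback in self._subscribers[topic]:
            callback(topic, message)

class MyLeadCatalog:
    """Stands in for the user metadata catalog: records workflow events as metadata."""
    def __init__(self):
        self.records = []

    def on_event(self, topic, message):
        self.records.append({"topic": topic,
                             "time": datetime.utcnow().isoformat(),
                             "detail": message})

class DeadlineMonitor:
    """Stands in for the monitoring service: flags tasks that miss their deadline."""
    def on_event(self, topic, message):
        if message.get("status") == "late":
            print(f"WARNING: {message['task']} missed its deadline")

broker = NotificationBroker()
catalog = MyLeadCatalog()
broker.subscribe("workflow.task", catalog.on_event)
broker.subscribe("workflow.task", DeadlineMonitor().on_event)

# A workflow service would publish events like these as tasks start and finish.
broker.publish("workflow.task", {"task": "adas_analysis", "status": "finished"})
broker.publish("workflow.task", {"task": "wrf_forecast", "status": "late"})
print(len(catalog.records), "events cataloged")
```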


When a user logs into the portal, his or her grid authentication and authorization credentials are loaded automatically. Each portlet can use these certificates to access individual services on behalf of the user, thus allowing users to command the portal to serve as their proxy for composing and executing workflows on back-end resources. Alternatively, users may access services by means of desktop tools. For example, the Integrated Data Viewer (IDV) can access and visualize data, and provide domain sub-setting capability, using a variety of sources including OPeNDAP servers. Similarly, the workflow composer tool can be used to design a workflow on the desktop that can be uploaded to the user’s myLEAD space for later execution, as at the Unidata Users Workshop (see §10).

4. The LEAD Grid

The LEAD Grid is a series of distributed computing systems located at six of the nine participating institutions (Oklahoma, Unidata, Illinois, Indiana, Alabama in Huntsville, and University of North Carolina – Figure 4.1). It represents a “clean room environment” for experimentation in which version control (e.g., of Globus) can be strictly managed. The resources at each site range from single-CPU Linux systems with a few terabytes of local storage to large cluster-based systems and mass storage facilities capable of serving many petabytes of data. The LEAD Grid is built upon a foundation consisting of two systems: the Globus Grid infrastructure framework and the Unidata Local Data Manager (LDM).

    Figure 4.1. The LEAD grid.

In the first two years of the project, most of the development and testing were performed on the LEAD Grid. During the past year we have migrated applications to the TeraGrid and use the LEAD Grid mostly to host data sets of relevance to mesoscale meteorology as well as the LEAD Portal. Each grid node runs the Unidata LDM/IDD software to ingest and make available several months’ worth of data including those from NEXRAD radars, upper-air balloons, satellites, surface networks, commercial aircraft, wind profilers, and other systems. Further, the Grid sites host static data sets such as terrain and vegetation information.

5. Research Agenda and Strategic Timeline

5.1. Balancing Research and Deployment

From project inception, the LEAD team has faced the challenge of striking an appropriate balance between software developed to explore fundamental concepts and the instantiation of this software as stable, persistent capabilities in well-engineered systems deployed for use by the broader community. Although resource constraints prevent full attention from being given to both research and deployment, LEAD has created conceptual and practical frameworks for achieving what is believed to be an appropriate balance.

As shown in Figure 5.1.1, LEAD research begins with fundamental ideas (oval in the lower center of the diagram) that lead to basic research and an overarching system architecture. The resulting software components, most of which are developed piecemeal, then are integrated to provide increasingly greater functionality as part of so-called integrative test beds (ITBs, described in §5.2). A vitally important part of LEAD, ITBs are experimental frameworks for instantiating LEAD system components in a coordinated manner to evaluate fundamental concepts in an end-to-end fashion using selected end user testers. The former (coordinated, end-to-end instantiation) ensures an integrated systems orientation to testing, while the latter (selected testers) allows LEAD to obtain very specific user feedback on system design, capabilities, performance, etc. The integrative test beds spawn new ideas, from which new capabilities are developed and tested in a cyclic manner (Figure 5.1.1).

Note that most of the end user testers have been identified and educated about LEAD by the Education and Outreach Thrust. They include teachers in grades 6-12, students in grades 6 through graduate school, researchers in meteorology and computer science, non-research faculty, attendees at workshops (e.g., WRF Community Workshop, Unidata Users Workshop), and other special groups (e.g., participants in the NOAA/NCAR Developmental Test Bed Center). Efforts now are being extended to expose LEAD to other communities (e.g., oceanography; see §24.7), though judiciously owing to limited resources.

The conceptual (green) and practical (yellow) activities in Figure 5.1.1 are fundamentally part of LEAD during its 5-year tenure as an NSF ITR grant. The formal deployment of LEAD as an integrated system – something one might term “operational capability” in a 24/7 environment where user expectations are high and reliability is essential – resides in the red part of the diagram and for the most part is beyond the scope of the 5-year research grant. However, recognizing the importance of exposing LEAD to an audience larger and broader than our limited number of end user testers, Unidata will support somewhat scaled-down versions of ITB capabilities in a persistent manner (see §5.2) for its community that involves thousands of users (principally at the undergraduate and graduate level and including numerous faculty). Initial emphasis will be placed on data acquisition which, according to a recent study [7] prepared specifically for LEAD, is the need most frequently articulated by users.


Figure 5.1.1. The LEAD research process and role of integrative test beds as an instantiation of end-to-end systems concepts evaluated by end user testers. Most of the LEAD activity as a 5-year ITR grant resides to the left of the dashed line, though some limited deployment of persistent services to the broader community will be orchestrated by Unidata. Concept from D. McLaughlin, NSF Engineering Research Center for Collaborative Adaptive Sensing of the Atmosphere (CASA).

5.2. Strategic Timeline and Integrative Test Beds

Figure 5.2.1 shows the LEAD strategic timeline, which is founded upon three ITBs that sequentially build upon one another. Each is designed around a specific set of capabilities and each has a principal goal that addresses key research and education issues. Note that the starting and ending times of the ITBs, apart from the starting time of ITB1 (see below), intentionally are very “soft,” as implied by the gradient shading in the figure. As each ITB proceeds, early capabilities within it will become more stable and will be transitioned to the Unidata deployment, as noted above, while other capabilities are added. The goal of the first integrative test bed, ITB1, is to expose end user testers to a service-oriented architecture containing the principal elements of LEAD with an eventual view toward democratizing complex experimentation with meteorological tools and creating an architecture suitable for studying adaptive capabilities. The specific features associated with ITB1 are:

• Multiple static workflows that can be selected from a repository, compiled and run (a minimal task-graph sketch follows this list)
• Access to multiple data sets (e.g., METAR, rawinsonde, NCEP grids, ACARS) including streams (NEXRAD Level II, CASA)
• The ability to search for and manage data and products (includes ontology & dictionary)
• Data sub-setting within the query system and IDV


    • Creation of ADAS and/or WRF 3DVAR analyses over user-configured domains at specified grid spacings

    • Creation of a single WRF forecast over user-configured domains at user-specified grid spacings based upon ADAS, 3DVAR or NAM analyses

• Visualization of observations and/or grids using IDV
• Automatic meta data generation, cataloging and storage
• Pre-scheduling of runs using resource brokering on the TeraGrid with real time monitoring
• Feature extraction of basic parameters from gridded model output and streaming NEXRAD Level II radar data
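As a minimal illustration of the static workflows mentioned above, the sketch below represents a forecast workflow as a task graph (tasks plus their dependencies) and "compiles" it into an execution order. The task names and the graph itself are illustrative assumptions, not the identifiers used by the LEAD workflow engine.

```python
# Sketch of a static workflow "task graph": tasks plus dependencies, compiled into
# an execution order. Task names are illustrative, not LEAD's actual identifiers.

from graphlib import TopologicalSorter   # Python 3.9+

FORECAST_WORKFLOW = {
    "ingest_observations": [],
    "adas_analysis":       ["ingest_observations"],
    "wrf_forecast":        ["adas_analysis"],
    "extract_features":    ["wrf_forecast"],
    "catalog_to_mylead":   ["adas_analysis", "wrf_forecast", "extract_features"],
}

def compile_workflow(graph):
    """Return an ordering of tasks that respects every dependency in the graph."""
    return list(TopologicalSorter(graph).static_order())

if __name__ == "__main__":
    for step in compile_workflow(FORECAST_WORKFLOW):
        print("run:", step)
```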

Figure 5.2.1. The LEAD strategic timeline built around three integrative test beds. See the text for further details.

    The meteorological and computer science research and education issues to be addressed by ITB1 include but are not limited to

• Assessing the impact of data types, grid spacing, and other parameters on analyses and forecasts
• Evaluating user reaction and adaptation to the service-oriented architecture
• Evaluating the value of a service-oriented architecture in the conduct of research and education
• Assessing the value of LEAD in providing access to data sets including streams
• Quantifying system response time in preparation for dynamically adaptive systems
• Evaluating a desktop client-based system, Siege, compared to BPEL
• Extracting features from observations, forecasts and analyses to prepare for dynamic adaptivity
• Assessing the ability of the TeraGrid to accommodate dozens of LEAD users simultaneously
• Evaluating the ability to monitor a variety of LEAD system components, including cyberinfrastructure (CI), and make the information known via notification services
• Understanding scheduling constraints on the TeraGrid and using them as a means for developing on-demand capability with some prior notification
• Testing LEAD on a wide variety of grid systems

ITB1 formally began during the week of 10-14 July 2006 in Boulder, Colorado with the set of capabilities put before 82 participants at the Unidata Users Workshop titled “Expanding the Use of Models as Educational Tools in the Atmospheric & Related Sciences” (see §10). Capabilities included pre-scheduled resources on the TeraGrid for launching WRF forecasts initialized from NCEP and ADAS analyses based upon user-specified domain location and grid spacing, basic cataloging of input and output, use of workflow task graphs, and real time monitoring and meta data generation. The specific work associated with ITB1 is described throughout the remainder of this document and “tire track” plots are being developed to more clearly portray the temporal sequencing of detailed research efforts and the instantiation of their outcomes into ITB1 and subsequent integrative test beds.

Simultaneous with ITB1 is so-called “look-ahead” research (Figure 5.2.1) that will be implemented in ITB2 (see §5.3). This research emphasizes dynamic adaptation (workflows, models, cyberinfrastructure), streaming static data, and workflows that users can compose from a variety of services (this capability exists now to some extent but has not been made available in general ways within the LEAD Portal). ITB2 will serve to test dynamically adaptive capabilities and provide on-demand scheduling on the TeraGrid, the creation of assimilated data sets using ADAS or WRF 3DVAR, WRF forecasts based upon these assimilated data sets including ensembles, more sophisticated monitoring that includes elements of resource prediction, and more capable meta data generation and automated cataloging. ITB2 also will bring in additional service capabilities focused on analysis and mining, as described below.

The look-ahead research parallel with ITB2 (Figure 5.2.1) moves LEAD fully into the dynamically adaptive realm with reconfigurable workflows, dynamic resource allocation on the TeraGrid, and steering of the CASA radars via feature detection in models and observations. The third test bed, ITB3, reflects this research and focuses on dynamic workflows linked to the dynamically adaptive CASA radars.

5.3. Research Agenda for Dynamic Adaptation

As noted in §1, the second goal of LEAD, in addition to democratization, is dynamic adaptivity to weather of meteorological tools, cyberinfrastructure and observing systems, particularly Doppler radars via CASA. This is the transformative research challenge upon which LEAD was founded and to meet it, we have created a research agenda centered around three use case scenarios, as recommended by the year-2 site visit team and consistent with our earlier use of canonical research problems (see Appendix A of the year-2 annual report).

5.3.1. Working Definitions and Key Questions & Issues

    Before describing our research agenda in dynamic adaptation it is important to establish working definitions of key terms as used by LEAD:

• On-Demand – The capability to perform an action immediately, with or without prior planning or notification;

    • Real-Time – The transmission or receipt of information about an event nearly simultaneously with its occurrence, or the processing of data immediately upon receipt or request;

    • Dynamically-Adaptive – The ability of a system as a whole, or any of its components, to respond manually or automatically, in a coordinated fashion, to internal and external influences in a manner that optimizes overall system performance;

    • Streaming (from D. Luckham) – A real-time, continuous, ordered (implicitly by arrival time or explicitly by timestamp) sequence of items.

    Additionally, it is important to recognize that adaptation can take many forms but in all cases the objective of adaptive systems is to improve upon their static counterparts in some manner, ideally one that formally optimizes or at least quantitatively improves upon certain aspect(s) of performance. Systems or components may adapt in time, space or modality and the adaptation can be automated, manual, objective, heuristic, etc. Further, adaptation can occur in a variety of locations within the system (i.e., within the LEAD environments), at multiple levels and in highly connected, nonlinear ways. Given this complexity, it is useful to frame the associated research agenda by key issues and questions in adaptivity that implicitly include concepts of streaming and on-demand functionality. The list below is not intended to be exhaustive but rather representative of the most compelling issues relevant to and being addressed by LEAD:

• When is adaptation useful?
• Can the cost and benefit of adaptation be quantified?
• What types of adaptation are possible?
• What types of adaptation are most effective and how can they be chosen and combined?
• How is adaptivity triggered/controlled?
• What elements of the system can or should adapt (application, workflow, cyberinfrastructure, observing systems, combinations of these)?
• How can one deal with loss of resources or less than ideal availability to achieve the required adaptation?
• What metrics can be used to measure the effectiveness of adaptation?
• What negative consequences exist to adaptation?
• Can “optimal” adaptivity be defined?
• What are the time scales of adaptivity and what controls them?
• Do adaptivity and on-demand functionality need to be pre-scheduled to any extent?
• What triggers the decision to adapt and how is the decision communicated across the system?
• How can applications most effectively be maintained in “stand-by” mode, ready to be invoked by an adaptive trigger?
• What does quality of service mean in an adaptive system?

    5.3.2. Dynamic Adaptation Use Case #1

    With that preface, Dynamic Adaptation Use Case #1, shown in Figure 5.3.2.1, involves operating on streaming NEXRAD Level II observations, and/or ADAS analyses, with a persistent “listener” or agent (e.g., via components of the ADaM data mining engine) that identifies prescribed features within radar reflectivity or radial velocity. Upon locating a region in which the criteria have been met, a single grid WRF forecast is automatically launched in an on-demand fashion on the TeraGrid using capabilities now largely resident within ITB1. In this case the workflow is the system component that responds to the event trigger by launching a WRF job while all other elements of the system remain static.
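A hedged sketch of this trigger logic follows. The threshold, data structure and launch function are illustrative assumptions rather than the actual ADaM mining or LEAD orchestration code; the point is simply that a persistent listener watches a stream and fires an on-demand forecast workflow once a prescribed criterion is met.

```python
# Hedged sketch of the Use Case #1 trigger: a persistent listener watches a stream
# of reflectivity observations and launches an on-demand WRF workflow once a
# prescribed criterion is met. Thresholds, the data source and launch_wrf_workflow
# are illustrative placeholders, not LEAD's actual mining or orchestration code.

from dataclasses import dataclass
from typing import Iterable

@dataclass
class RadarScan:
    radar_id: str
    lat: float
    lon: float
    max_reflectivity_dbz: float   # pre-computed by the mining step

REFLECTIVITY_THRESHOLD_DBZ = 45.0   # assumed trigger criterion

def launch_wrf_workflow(center_lat: float, center_lon: float) -> None:
    # Stand-in for submitting the single-grid WRF workflow to the TeraGrid.
    print(f"Launching WRF forecast centered at ({center_lat:.2f}, {center_lon:.2f})")

def listen(scans: Iterable[RadarScan]) -> None:
    """Persistent agent: fire at most one forecast per detected event region."""
    triggered = set()
    for scan in scans:
        if (scan.max_reflectivity_dbz >= REFLECTIVITY_THRESHOLD_DBZ
                and scan.radar_id not in triggered):
            triggered.add(scan.radar_id)
            launch_wrf_workflow(scan.lat, scan.lon)

if __name__ == "__main__":
    demo = [RadarScan("KTLX", 35.33, -97.28, 38.0),
            RadarScan("KTLX", 35.33, -97.28, 52.5)]   # second scan exceeds threshold
    listen(demo)
```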

Figure 5.3.2.1. Dynamic Adaptation Use Case #1: A single-grid WRF forecast is triggered when a persistent data mining agent detects high reflectivity in streaming NEXRAD Level II data and/or an ADAS analysis.

Although seemingly simple, this use case is rich with complexity and involves, among other things, basic research in trigger mechanisms, the optimal choice of WRF configurable parameters (e.g., domain size, grid spacing, forecast start time, trigger criteria), strategies for dealing with delays in data and cyberinfrastructure acquisition, on-demand resource allocation, real time monitoring, and the intrinsic value of dynamic weather forecasts in comparison to their static counterparts, e.g., those run on fixed schedules in fixed configurations.

With regard to the last point, one of the potentially negative consequences to adaptation in the context of operational meteorology is that forecasters may not easily be able to learn the behavior of a given model in an adaptive framework because, in principle, every forecast could be run using a different configuration, i.e., tied dynamically to the event of the day or even hour. Such tradeoffs between dynamic and static systems are a fundamental component of LEAD research and these and other issues will be evaluated in a variety of ways, including in an operational forecast context as part of the ongoing NOAA Storm Prediction Center Spring Program (§14.6). During 2007 (§24.7) and subsequently, LEAD is expected to be the major provider of daily experimental WRF forecasts and will evaluate forecaster response to both dynamic and static capabilities.

We envision exploring a limited number of extensions to Dynamic Adaptation Use Case #1 to evaluate scalability and other system behavior, e.g., via mining multiple radars simultaneously (a capability that now exists) or mining the NSSL radar mosaic; streaming CASA or NEXRAD radar data directly into memory rather than using files; dealing with data outages, missing WRF input files or incorrectly specified WRF parameters; modifying the model configuration if only some percentage of the requested cyber resources are available; providing likelihood estimates that cyber resources will be needed as a means for soft on-demand capability (e.g., modifying the probability throughout time based upon the continuous detection of radar features) – a capability that will need to be included in the TeraGrid, perhaps using the NCSA MOAB scheduler (§24.4.6).

5.3.3. Dynamic Adaptation Use Case #2

Dynamic Adaptation Use Case #2, shown in Figure 5.3.3.1, extends Use Case #1 by adding a data mining component for detecting features in WRF forecast output and using this information, or that obtained from persistent mining of NEXRAD radar data or ADAS/3DVAR analyses, to refine an existing WRF grid over specified regions, e.g., intense storms, low-level boundaries. Multiple options exist for accomplishing the refinement, including launching an entirely new WRF forecast (workflow adapts) at finer spacing or creating a nested grid within WRF itself (application adapts). The choice among options, not all of which are listed here, represents a fundamental research issue in dynamically adaptive systems and will be explored with this use case.

As for Use Case #1, this seemingly simple example contains significant research questions regarding optimality, dealing with potentially competing strategies for assessing the trigger condition and effectuating adaptation (here, grid refinement), and continuing the “parent” WRF run to provide boundary conditions for the next domain versus running a one-way nest. The latter, for example, may depend upon the availability of cyber resources. Extensions to this scenario include the use of multiple nests, mechanisms for choosing multiple small nested domains versus a single large fine-grid nest (again possibly a function of available machine partitions), adding ensembles based upon a specified condition, and using precursors of weather (e.g., tornado watches) as a trigger to launch a coarse-grid background forecast in preparation for finer-grid nests (a capability that is planned as an extension to Use Case #2).

Figure 5.3.3.1. Dynamic Adaptation Use Case #2: An extension of Use Case #1 in which WRF output is mined to determine when, where and how to emplace one or more finer-spacing grids.
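The refinement decision at the center of this use case can be sketched as below. The resource check and the two candidate actions (launching a new fine-grid run versus adding a nest to the running model) are hypothetical placeholders; a real implementation would weigh many more factors, as discussed above.

```python
# Sketch of the Use Case #2 decision: refine either by launching a new fine-grid
# forecast (the workflow adapts) or by adding a nest inside the running model
# (the application adapts). The resource check and both actions are placeholders.

from dataclasses import dataclass

@dataclass
class DetectedFeature:
    lat: float
    lon: float
    kind: str          # e.g., "intense storm" or "low-level boundary"

def free_processors_available() -> int:
    return 64          # stand-in for a query to the resource broker/monitoring service

def launch_new_fine_grid_run(feature: DetectedFeature) -> str:
    return f"new fine-grid WRF run over ({feature.lat:.2f}, {feature.lon:.2f})"

def add_nest_to_running_wrf(feature: DetectedFeature) -> str:
    return f"nested grid added inside parent run near ({feature.lat:.2f}, {feature.lon:.2f})"

def refine(feature: DetectedFeature, min_procs_for_new_run: int = 128) -> str:
    """Pick a refinement strategy; here the only criterion is processor availability."""
    if free_processors_available() >= min_procs_for_new_run:
        return launch_new_fine_grid_run(feature)
    return add_nest_to_running_wrf(feature)

if __name__ == "__main__":
    print(refine(DetectedFeature(35.5, -97.5, "intense storm")))
```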

5.3.4. Dynamic Adaptation Use Case #3

Finally, Dynamic Adaptation Use Case #3, shown in Figure 5.3.4.1, departs from the previous two cases in that it does not involve WRF or ADAS but rather the detection of features within streaming CASA data produced by the 4-radar network now in place in Oklahoma and use of this information to re-task the radars to meet a specified objective (e.g., the identification of a tornado, heavy rain region, convergence line). This capability is planned to exist within CASA using algorithms and models built into the CASA meteorological command and control (MC&C) software. However, interfacing the external LEAD system to CASA is a principal research challenge. Use Case #3 sets the stage for combining its features with those from the other two scenarios to arrive at a “closed loop” capability, shown in Figure 5.3.4.2, in which streaming data, workflows, tools and cyberinfrastructure mutually interact with mesoscale weather.

5.4. Mapping the Research Agenda

A project as complex as LEAD requires careful coordination and planning. Specific research tasks embedded in the three use cases just described have been mapped into detailed timelines across thrusts and organizations (see year-2 annual report), with the PI of each institution, and the cross-institutional thrust leaders, responsible for ensuring timely completion. To ensure coordination, LEAD employs requirements traceability matrices [7] and uses them as a road map for weekly AccessGrid integration and other meetings (see §6).


Figure 5.3.4.1. Dynamic Adaptation Use Case #3: Streaming data from the CASA test bed in Oklahoma are mined and the results used to re-task the radars.

Figure 5.3.4.2. The closed-loop dynamically adaptive LEAD system, in which streaming data, workflows, tools and cyberinfrastructure mutually interact with mesoscale weather.


    6. Project Administration and Oversight

    6.1. Project Manager and Technical Integration Coordinator

The governance of LEAD was described in the year-2 annual report and is not further discussed here apart from changes. Starting in year-4, Dr. Daniel Weber, Senior Research Scientist at CAPS, will assume the role of Project Manager and Technical Integration Coordinator (Figure 6.1.1), the former of which was held by Ms. Terri Leyton and the latter by Dr. Robert Wilhelmson. Midway through year-2, Ms. Leyton and her husband relocated to North Carolina and the LEAD team agreed that, for the sake of continuity, she would continue as Project Coordinator at 0.5 FTE until the end of year-3. Dr. Wilhelmson has been consumed by administrative issues at NCSA and will be required to spend considerable time in the next several months on other activities, though he will continue to participate in LEAD. Consequently, combining both his and Ms. Leyton’s previous functions into a single position at a particularly appropriate time will yield important benefits for LEAD.

Dr. Weber is in the process of visiting every LEAD institution to gain deeper insight into thrust and institutional activities and, with his background in meteorology and computing, is ideally suited among available personnel to assume this broad, integrative responsibility. The LEAD web site, which Ms. Leyton has maintained, is being combined with the Portal, thereby reducing the overhead associated with the maintenance of two independent resources. Administrative support for scheduling site visits, EAP meetings and other related logistics will be assumed by the University of Oklahoma with assistance from other LEAD institutions.

    Figure 6.1.1. LEAD organization chart starting in year-4 (1 October 2006).

[Figure 6.1.1 organization chart: National Science Foundation; PI and Project Director (K. Droegemeier); Executive Committee (PIs); External Advisory Panel; Partner Organizations; Technical/Integration and Project Coordinator (D. Weber); and the participating institutions and roles listed below.]

• University of Oklahoma (K. Droegemeier, PI; M. Xue, K. Brewster, D. Weber, Co-PIs): Meteorological Research, Education
• University of Alabama in Huntsville (S. Graves, PI; R. Ramachandran, J. Rushing, Co-PIs): Data Mining, Interchange Technologies, Semantics
• UCAR/Unidata (M. Ramamurthy, PI; A. Wilson, Co-PI): Data Streaming and Distributed Storage
• Indiana University (D. Gannon, PI; Beth Plale, Co-PI): Data, Workflow, Orchestration, Services
• University of Illinois/NCSA + UNC (R. Wilhelmson, PI; D. Reed, PI): Monitoring and Data Management
• Millersville University (R. Clark, PI; S. Yalda, Co-PI): Education and Outreach
• Howard University (E. Joseph, PI): Meteorological Research, Education and Outreach
• Colorado State University (Chandra, PI): Instrument Steering, Dynamic Updating


6.2. Internal and External Assessments of the LEAD Enterprise

In preparation for the spring all-hands meeting held in St. Louis, Missouri on 4-5 May 2006, LEAD conducted an internal SWOT (strengths, weaknesses, opportunities and threats) analysis in which responses were submitted anonymously and compiled by Ms. Leyton. Simultaneously, Dr. Katherine Lawrence and colleagues from the University of Michigan completed a year-long study of the LEAD R&D Team [8]. Both compilations were discussed at the all-hands meeting and a number of actions have since been taken to improve collaboration and decision making, clarify strategic direction, delineate roles and responsibilities, and better define the demarcation between research and deployment. The latter outcome was described in §5.1 and, in conjunction with other meetings held at NCSA and during TeraGrid ’06 (Table 6.3.1), LEAD has been considerably strengthened.

6.3. Collaboration Mechanisms

    As noted in the first two annual reports, LEAD relies heavily upon electronic tools, especially the Access Grid (AG) and LEAD web site (http://lead.ou.edu), to facilitate remote collaboration as a supplement to planned and ad hoc face-to-face meetings. As shown in Figure 6.3.1, two face-to-face all-hands meetings are scheduled each year, along with an External Advisory Panel meeting. [No external advisory panel meeting was held in 2005 owing to the year-2 site visit.] A summary of the year-2 site visit report, LEAD’s responses, and an update of actions taken since that visit are provided in Appendix B. The fall 2006 all-hands meeting will be held in September or October, probably at the University of Oklahoma and potentially involving participants in CASA.

    Figure 6.3.1. LEAD calendar.

During the first two years of LEAD, more than a dozen AG sessions were held monthly among the thrust groups and integration team, along with a bi-weekly PI conference call. Once per month, a 2-hour all-hands AG session also was held. Notes and action items from each of these sessions were prepared by T. Leyton and placed on the private portion of the LEAD web site. Materials from face-to-face meetings, including presentations and notes, likewise were posted. Although effective, the large number of meetings became a detriment to productivity, particularly as software development began in earnest. To compensate, the number of AG sessions was reduced while the weekly (Thursday) session was retained, along with the weekly (Friday) PI conference call. The effectiveness of this reduction continued to be debated and the present schedule (Figure 6.3.2), which appears to be optimal, involves two 1-hour AG sessions per week (Monday and Thursday), the first rotating among the research thrusts and the second devoted to integration. Thrust leaders are responsible for creating the agenda and running the Monday meeting while the Thursday meeting now is facilitated by Dr. Weber, Technical Integration Coordinator and Project Manager.

Figure 6.3.2. LEAD weekly meeting schedule.

    Owing to the limitations of electronic collaboration, LEAD conducts several ad hoc face-to-face meetings each year, some of which are scheduled during national conferences to reduce travel and facilitate attendance as well as cross-disciplinary scholarship. The face-to-face meetings held during or leading up to year-3 are summarized in Table 6.3.1.

Table 6.3.1. Face-to-Face Planning Meetings and Reviews Held During or Leading Up to Year-3.

Purpose | Location | Date
LEAD Year-2 Site Visit | Urbana-Champaign, IL | July 20-22, 2005
Special PI Meeting | Bloomington, IN | August 22, 2005
Special Data Subsystem Interoperability Meeting | Bloomington, IN | August 23, 2005
Special Education/Portal Meeting | Bloomington, IN | September 19-21, 2005
National Forum on Geosciences Information Technology | Washington, D.C. | October 5-6, 2005
Fall All-Hands Meeting | Chicago, IL | October 20-21, 2005
LEAD/VGrADS Research Discussion | Chapel Hill, NC | November 4, 2005
Unidata/NCSA Meeting for Initial Design of Siege Client | Atlanta, GA | Jan. 29 – Feb. 2, 2006
Special Metadata and Vocabulary Meeting | Huntsville, AL | April 10-11, 2006
LEAD/VGrADS Integration Workshop | Chapel Hill, NC | April 18-21, 2006
Spring All-Hands Meeting | St. Louis, MO | May 4-5, 2006
Special Siege Workflow System Meeting | Urbana-Champaign, IL | June 8-9, 2006
Special Research Discussion | Indianapolis, IN | June 12, 2006
Special Portal Workflow System Meeting | Indianapolis, IN | June 13, 2006
Special Strategic Discussion | Indianapolis, IN | June 13, 2006
Special Monitoring and Workflow Orchestration | Bloomington, IN | June 15-16, 2006
Unidata Workshop | Boulder, CO | July 10-14, 2006



    6.4 External Advisory Panel

    The LEAD External Advisory Panel (EAP) met late in year-1 but not in year-2, owing to the year-2 site visit. The panel as currently constituted, with 11 members, is shown in Table 6.4.1. Its membership is significantly unbalanced with regard to the type of institution represented: two members are from academia, eight from Federal laboratories or FFRDCs, and one from industry. Gender balance also is skewed, with nine men and two women on the panel.

    Table 6.4.1. Current composition of the LEAD External Advisory Panel.

    Member and Affiliation | Primary Expertise
    Ian Foster (ANL) | Computer Science
    John Horel (University of Utah) | Meteorology
    Joe Klemp (NCAR) | Meteorology
    Mary Marlino (NCAR) | Education/Digital Libraries
    Mike Wright (NCAR) | Education/Digital Libraries
    Dave Fulker (UCAR Unidata, retired) | Computer Science/Data
    Peter Cornillon (University of Rhode Island) | Oceanography/Data
    Tony Hey (Microsoft) | Computer Science
    Bill Johnston (LBL) | Computer Science/Data
    Roberta Johnson (UCAR) | Education
    Sandy MacDonald (NOAA Labs) | Meteorology

    We are in the process of revising the EAP; as of this writing, the new 15-member panel consists of seven persons from academia, six from Federal laboratories/FFRDCs, and two from industry. To this group we have added a teacher (the Chief e-Learning Officer of Chicago Public Schools), a graduate student, a second member from industry, and another university faculty member. The disciplinary balance is roughly the same, while diversity is much improved: two of the nine men, and one of the six women, are ethnic minorities. As recommended by the EAP in 2004, we also have augmented the membership to include an additional individual in the area of data and databases. We expect the EAP membership to be finalized by the end of July 2006, at which time the list will be posted to the LEAD web site.

    7. Year-3 Accomplishments Compared with Plans

    Shown below, in no particular order, are the principal goals for year-3, taken directly from the year-2 annual report. We describe in subsequent sections our progress toward meeting these goals, and we note that the "stretch goal" was not achieved because the SPC and NSSL suspended their joint spring program in 2006 owing to their move to the National Weather Center building.

    • Continue to refine system functional requirements, architecture plans, and R&D priorities
    • Move to type-specific data ingest on the LEAD Grid (e.g., Oklahoma ingests only NEXRAD radar data) and expand the number and scope of available data sets
    • Complete the conversion of ADAS, WRF, and ADaM to services and expose more functionality
    • Develop fault tolerance in key services such as ADAS, ADaM, and WRF
    • Integrate ontology, catalog, myLEAD, broker, naming, and repository services for interoperability
    • Develop a tool for writing and reading metadata schema
    • Automate the generation of input files for key applications such as WRF and ADAS
    • Refine the workflow system, including service component interaction and monitoring
    • Develop a first-generation dynamic workflow capability
    • Continue to refine all tools and enhance interoperability
    • Improve the usability of the portal by implementing terminology that is more intuitive for various classes of users and by removing unnecessary jargon
    • Improve the human-computer interface aspects of the portal by working with all end users, especially educators and students
    • Be able to execute all key elements of canonical problems 1-3 across all test beds
    • Make selected capabilities available to user-partner sites, especially the DTC and education partners, and conduct evaluations
    • Integrate FSL's WRF configuration application, if available, into LEAD
    • Execute the University of Michigan's community engagement plan
    • Complete the study of hazardous weather feature detection using radar-based algorithms applied to raw data versus assimilated data sets
    • Continue work on ensemble Kalman filtering applied to streaming observations
    • Complete the study to assess the value of dynamic adaptation in numerical analysis and prediction (CASA student)
    • Ingest streaming data from the 4-radar CASA NetRad test bed in Oklahoma and produce associated metadata
    • Conduct initial testing of dynamic instrument steering using the 4-radar CASA NetRad test bed in Oklahoma
    • Develop new education proposals in collaboration with DLESE and other groups
    • Present results in high-quality refereed journals and at major conferences, including those involving disciplines outside the core LEAD competencies

    As noted in previous annual reports, research and development within LEAD are organized around five parallel research thrusts (Figure 7.1), along with two cross-cutting components that ensure the thrusts are tightly and continuously integrated. Intentional overlap exists among the thrusts, and each thrust is co-led by PIs from two different institutions (Figure 6.1.1). We present below our results for year-3 and, in §24, our plans for year-4.

    Stretch Goal: On the TeraGrid, use LEAD to generate daily WRF analyses and forecasts as part of the NOAA Storm Prediction Center and National Severe Storms Laboratory 2006 Spring Program.


    Figure 7.1. Organization of LEAD research and development. [The figure shows the five research thrusts (Data: cloud, semantics, streaming, management; Orchestration: monitoring, allocation, workflow; Tools: cataloging, aggregation, mining, visualization; Meteorology: models, algorithms, assimilation; Portal), the Education and Outreach (E&O) component, and the LEAD Grid, bound together by Technical Integration, End-User Integration, and Systems Integration.]

    7.1 LEAD Grid

    The LEAD Grid group was established to configure, install, and operate the LEAD Grid (§4 and Figure 4.1) as a stable research and development environment. The principal goals in year-3 were:

    • Ensure that an optimal testbed environment is available as needed to support all LEAD goals and functionality
    • Refine system requirements and specifications as the project evolves
    • Continue to develop and refine procedures and policies
    • Develop a set of lessons learned for the broader community and disseminate them via web and journal publications
    • Work with the GRIDS Center and TeraGrid personnel on strategies for deploying selected elements of the LEAD technology on other grids

    In year-3, the LEAD Grid group worked to ensure that an optimal testbed environment was available as needed to support all LEAD goals and functionality, and that the Grid environment was moving toward the longer-term goal of providing a useful, easily deployable, and maintainable architecture. During year-3, the LEAD Grid was used not only for research and development but also for a number of live workshops and demonstrations, including the initiation of ITB1 during the July 2006 Unidata Users Workshop (§10). The work of the Grid group has required careful coordination with each of the LEAD thrusts to ensure that hardware, software, and data requirements are understood and that components are correctly installed and available in a timely fashion. It also has required substantial coordination with outside partners, such as the GRIDS Center development team, other NMI developers, the TeraGrid, and the broader community of users, to support longer-term LEAD goals.


    The LEAD Grid team continued to refine system requirements and specifications and to develop systems administration documentation. Many issues related to security have been addressed, including authentication, authorization, and site-specific policies, requirements, and implementations. The team helped develop, and then implemented, a security architecture in close coordination with the other thrust groups and TeraGrid personnel; a strategy for dealing with firewalls also has been implemented. Installation and configuration documentation and test plans continue to be developed for each system component as it becomes available and are posted on the LEAD internal web site.

    Tiger teams were established as necessary to address various issues including those related to security and code repository policies. The LEAD code repository, based on CVS, now includes installation notes and configuration information in addition to component code and README files. The CVS repository and Bugzilla bug tracking system, both of which were established in year-1, helped facilitate integration and testing in year-3. Throughout year-3, the LEAD Grid supported all activities in the software development lifecycle, dealt with numerous issues concerning system functionality, software, processes, and documentation capabilities, and successfully hosted prototype demonstrations and workshops.

    Starting in August 2005, the TeraGrid began funding LEAD scientists M. Christie and S. Marru, both at Indiana University, to work on the NMI portal, from which the LEAD portal has been developed. The Portal is being hosted on the LEAD Grid at Indiana. Because LEAD is developing a complex cyberinfrastructure for research, a great many lessons are being learned. They have been documented by the LEAD Grid team and disseminated via the web and presentations at workshops and conferences. This information has been especially valuable for the GRIDS Center and TeraGrid personnel. Other specific accomplishments in year-3 include:

    • Refined LEAD Grid system requirements: functional, hardware, operating system, middleware, data, and tools
    • Ensured that each node on the LEAD Grid had the capability to ingest datasets via LDM; installations were verified through the use of decoder output and visualization
    • Expanded the Unidata testbed to accommodate more than 120 days of data for most datasets, with plans to provide 6 months of data on rotating disk in year-4
    • Continued to develop the LEAD security model with an emphasis on interoperability
    • Refined and updated software and hardware configuration documentation
    • Investigated the use of environment variables to ease integration of system components in the heterogeneous LEAD Grid environment (see the sketch after this list)
    • Developed installation procedures, configuration guides, and test procedures for software components
    • Developed and refined system management policies and procedures
    • Participated in planning and supported cross-site software checkout and testing for Integrative Testbed activities
    • Supported planning and cross-site testing for the Unidata Users Workshop (§10)
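    As an illustration of the environment-variable approach noted in the list above, the following minimal sketch resolves a few site-specific settings at run time rather than hard-coding them into each component. The variable names and default values are assumptions for illustration, not actual LEAD Grid conventions.

```python
import os

# Hypothetical site-specific settings, resolved from environment variables so
# the same component can run unchanged on any LEAD Grid node.
DEFAULTS = {
    "LEAD_DATA_DIR": "/data/lead",        # where LDM-ingested products land (assumed path)
    "LEAD_SCRATCH_DIR": "/scratch/lead",  # per-site scratch space (assumed path)
    "LEAD_SERVICE_HOST": "localhost",     # host running persistent services (assumed)
}

def site_config() -> dict:
    """Return the effective configuration, preferring environment variables over defaults."""
    return {key: os.environ.get(key, default) for key, default in DEFAULTS.items()}

if __name__ == "__main__":
    for key, value in site_config().items():
        print(f"{key} = {value}")
```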


    7.2 Data Thrust

    The data system within LEAD consists of services that provide discovery, access, and search support for data objects during the course of atmospheric sciences research and education. It has adopted a common XML schema for communication between services regarding data objects, and it also provides storage services for select data products, most notably those generated during the course of experiments conducted within LEAD. The meteorology community has benefited from a relatively long history of access to a large number of observational and model-generated data products, resulting in the early establishment of community-supported data dissemination, access, and visualization tools. These tools, most notably the Internet Data Distribution (IDD) system, THREDDS, and the IDV – all developed by Unidata – serve a broad community of users. The LEAD data system has built upon this strong foundation with research in four areas of extended functionality:

    • Information discovery – the Noesis interactive exploration tool, the Glossary, and the IDV visualization tool enable users to explore data, definitions, and terms interactively through the LEAD Science Gateway.

    • Automated cataloging of private data – data products generated by a workflow are described automatically, through automated metadata generation at the point of creation, and are stored in a private repository for the user, close to where the computations take place. The metadata, which include domain-science attributes, are stored separately in a database to accommodate complex search criteria. Separating metadata from data-product storage is well suited to the large-volume data products characteristic of mesoscale meteorology research and education.

    • Approximate search over heterogeneous data collections – a common vocabulary, built-in search features in the metadata catalogs, and a domain-specific ontology collectively support approximate search, that is, search capable of handling various dimensions of vagueness. Back-end support for multiple heterogeneous collections extends search capability beyond a single catalog protocol.

    • Smooth integration path for future services and data collections brought into the LEAD data subsystem – this path is built into the design by means of a service-oriented architecture (SOA), a common metadata format, and "crosswalks" that map other metadata schemas to the LEAD Metadata Schema, so the data subsystem can easily be extended to include new services and data collections.

    7.2.1 Components

    As the end of year-3 approaches, the data system rests on a solid foundation. All components of the architecture have been developed, although they are in varying stages of maturity. Specifically, the system (Figure 7.2.1) comprises a set of services that can be roughly categorized into storage repositories along the bottom, metadata catalogs and access services across the middle, and a portal layer across the top. The lower and middle layers, as a general rule, execute as persistent services on one or more machines in the LEAD Grid. The portal and portlets along the top execute part of their functionality at a portal server residing on the LEAD Grid and part in a layer that is downloaded to the client's desktop or laptop machine.


    Storage services along the bottom of Figure 7.2.1 include the Globus Toolkit GT4 Data Replication Service/Replica Location Service (DRS/RLS), the Unidata THREDDS Data Server (TDS), and the THREDDS/OPeNDAP catalog suite. The latter hosts select community data products produced by the Unidata Internet Data Distribution (IDD) system: data products are stored to an OPeNDAP server and cataloged by a THREDDS XML page. In addition to providing data sub-setting and aggregation via OPeNDAP, the TDS is integrated with the IDV, thus providing full data access and visualization capabilities for data stored there.
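    Because the TDS exposes datasets through OPeNDAP, a client can request only a subset of a remote dataset. The sketch below assumes the netCDF4-python library (built with DAP support) and uses a placeholder catalog URL and variable name rather than an actual LEAD or IDD endpoint.

```python
# A minimal sketch of OPeNDAP-based access to a dataset published by a THREDDS
# Data Server. The URL and variable name are placeholders, not real LEAD endpoints.
from netCDF4 import Dataset  # netCDF4-python opens OPeNDAP URLs directly

OPENDAP_URL = "http://example.edu/thredds/dodsC/model/NAM_CONUS/latest.nc"  # hypothetical

ds = Dataset(OPENDAP_URL)               # open remotely; no full download required
temp = ds.variables["Temperature"]      # hypothetical variable name
subset = temp[0, 0, 100:120, 200:240]   # only this slab is requested from the server
print(subset.shape, subset.mean())
ds.close()
```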

    Figure 7.2.1. The LEAD data system architecture.

    The LEAD Data Thrust determined in year-2 that metadata descriptions of data products would be encoded using a single general, extensible XML schema, now known as the LEAD Metadata Schema (LMS). Once the details of the LMS were finalized, many initiatives were undertaken to support it. In one, we developed a crosswalk (shown in the lower right portion of Figure 7.2.1) that maps from the THREDDS XML schema to the LMS.
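    To make the crosswalk idea concrete, the sketch below maps dataset entries from a THREDDS catalog fragment into minimal metadata records. The output field names are simplified assumptions rather than the actual LMS elements; the real crosswalk handles the full THREDDS and LEAD Metadata Schema definitions.

```python
# Conceptual sketch of a THREDDS-to-LEAD-metadata "crosswalk": read a THREDDS
# catalog entry and emit a minimal metadata record. The output keys below are
# simplified assumptions, not the actual LEAD Metadata Schema.
import xml.etree.ElementTree as ET

THREDDS_NS = {"t": "http://www.unidata.ucar.edu/namespaces/thredds/InvCatalog/v1.0"}

def crosswalk(catalog_xml: str, base_url: str) -> list:
    """Map <dataset> entries of a THREDDS catalog to minimal metadata records."""
    root = ET.fromstring(catalog_xml)
    records = []
    for ds in root.iterfind(".//t:dataset[@urlPath]", THREDDS_NS):
        records.append({
            "title": ds.get("name"),
            "identifier": ds.get("ID", ds.get("urlPath")),
            "accessURL": base_url + ds.get("urlPath"),  # file-level access metadata
        })
    return records

# Tiny hand-written catalog fragment, for illustration only.
SAMPLE = """<catalog xmlns="http://www.unidata.ucar.edu/namespaces/thredds/InvCatalog/v1.0">
  <dataset name="NAM CONUS 2006-07-10 00Z" ID="nam/20060710_00" urlPath="nam/20060710_00.grib2"/>
</catalog>"""

print(crosswalk(SAMPLE, "http://example.edu/thredds/dodsC/"))
```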


    In year-3, the crosswalk was upgraded so that it could generate a complete and valid set of minimum LEAD metadata (MLM). The crosswalk also was enhanced to create file-level metadata, including URLs for data access. These capabilities allow IDD data to be queried, accessed, and imported into the LEAD infrastructure.

    The DRS/RLS storage repository stores data products that are generated during the course of an experiment. DRS/RLS was selected for two principal reasons: first, it is built on the Globus Toolkit GT4 service stack, so as not to introduce incompatibilities, and second, DRS has strong support for authorization and authentication. Ensuring the privacy and security of a user's data products is a primary requirement of the personal workspace. Unfortunately, DRS introduces a protocol conflict because it uses the Reliable File Transfer (RFT) protocol for file movement. RFT is optimized for terabyte-scale transfers where interruption is likely; however, it has known backward-compatibility problems that make it less suited to the LEAD Grid. LEAD data transfers tend to be high in volume but modest in individual file size, and for these, GridFTP is sufficient.

    The THREDDS Data Repository (TDR), which can be considered a storage extension to the TDS (Figure 7.2.1), was conceived as a longer-term, web-based storage solution, and its development was initiated in year-3. The TDR provides data storage as well as support for metadata generation and maintenance; the combination of the TDR and TDS provides a complete data archival and access system. The TDR currently generates THREDDS metadata, and work is underway to enhance those metadata and integrate them with existing THREDDS catalogs so that data stored in the TDR can be served by the TDS. The current prototype of the TDR has both servlet (graphical) and command-line interfaces.

    At the middle layer of the data system (Figure 7.2.1) reside metadata catalogs, along with discovery, search, and access tools. This layer is primarily responsible for information discovery, automated cataloging, and approximate search. In year-3, the myLEAD metadata catalog and myLEAD agent service migrated to full support for the LMS. A performance evaluation of the myLEAD server was conducted, and the results will be reported at the highly selective 7th IEEE/ACM International Conference on Grid Computing (Grid'06), which has an 18% acceptance rate.

    7.2.2 Integration

    In the year-2 annual report, we noted that year-3 activities would be dominated by integration, and as anticipated, a large part of the data thrust resources were so directed. The myLEAD Personal Catalog was fully integrated with the Experiment Builder so that the latter can use the user's personal space for storage and retrieval of experiment elements. Challenges included propagating support for the LMS throughout the Experiment Builder and the GBPEL Workflow Engine. Automated metadata generation poses a particular challenge because data products are generated on the fly; sufficient metadata therefore must be generated automatically during workflow execution. This major year-3 accomplishment was demonstrated at the July 2006 Unidata Users Workshop (§10).
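    As a rough illustration of cataloging "at the point of creation," the sketch below shows the kind of helper a workflow step might call immediately after writing an output file. The attribute names and the register_with_catalog() stub are hypothetical stand-ins; in LEAD this role is played by the myLEAD agent service using the full LEAD Metadata Schema.

```python
# Illustrative sketch of metadata capture at the point of creation: a helper a
# workflow step could call after writing an output file. Attribute names and the
# register_with_catalog() stub are hypothetical, not LEAD's actual interfaces.
import hashlib
import os
from datetime import datetime, timezone

def describe_product(path: str, experiment_id: str, **domain_attrs) -> dict:
    """Build a minimal metadata record for a freshly written data product."""
    with open(path, "rb") as f:
        checksum = hashlib.md5(f.read()).hexdigest()
    return {
        "experiment": experiment_id,
        "file": os.path.abspath(path),
        "sizeBytes": os.path.getsize(path),
        "created": datetime.now(timezone.utc).isoformat(),
        "checksum": checksum,
        **domain_attrs,  # e.g., model run time, grid spacing, variables supplied by the workflow
    }

def register_with_catalog(record: dict) -> None:
    """Stand-in for registering the record with a personal metadata catalog."""
    print("registering:", record)

# A workflow step would call these right after producing its output, e.g.:
# register_with_catalog(describe_product("wrfout_d01.nc", "exp-42", forecastHours=6))
```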


    Full metadata generation remains elusive, and research is ongoing.

    The Resource Catalog is a web-service index into community data products that provides a limited XQuery interface for querying them. In year-3, the Resource Catalog underwent a major revision, from supporting collections of data objects to supporting the data objects themselves; that is, it expanded from holding collection-level metadata to holding both collection-level and file-level metadata. This step was necessary because the Experiment Builder must be able to query and retrieve individual files, and those files need to be described by the LMS. The associated scalability issues are now being evaluated.

    The Ontology Inference Service (OIS) and the Query Service implement the query-approximation and information-discovery capabilities in LEAD. In large projects such as LEAD, where heterogeneous data collections are pervasive, search capabilities that rely upon a specific, known set of keywords are inadequate. Ontologies provide an important piece of the solution because they contain higher-level abstract concepts and support the mapping of those concepts to different controlled-vocabulary terms. In year-3 we refined the ontology that maps atmospheric science concepts to the Climate and Forecast (CF) controlled vocabulary, because variables in the datasets are stored in the resource catalogs as CF terms. Application ontologies describe concepts that depend upon both a domain and a task, and typically they involve specializations of both. The CF ontology uses the Semantic Web for Earth and Environmental Terminology (SWEET) as its domain ontology. SWEET is designed for Earth system science (http://sweet.jpl.nasa.gov/ontology/) and comprises a collection of ontologies covering thousands of concepts related to the Earth system. The CF ontology thus represents a mapping of CF terms to broader concepts defined in SWEET.

    In year-3, the OIS was further refined, and we created a SOAP-based web service interface to an inference engine built upon the Apache Axis SOAP engine. The inference engine used at the back end is Pellet, an OWL DL reasoner based on tableaux algorithms. The reasoner is pre-loaded with the CF ontology and provides T-Box and A-Box querying capabilities over it: T-Box queries cover specializations, generalizations, and equivalence of a concept, while A-Box queries search for all satisfying instances of a concept and query for property fillers of an instance. Every search request to the OIS is translated into one or more queries for the reasoner. The OIS interacts with the reasoner through the DIG interface, a standard for accessing description-logic reasoning over HTTP, and the query results are returned to the OIS through this interface. The OIS has been designed to allow loosely coupled integration, using standard web service protocols, with other systems such as the Query Service in the LEAD data system.

    Figure 7.2.2 shows the interfaces for, and thus the interaction among, the Portal, Query Service, Ontology Inference Service, and LEAD Resource Catalog. A user can search for data based upon spatial and temporal range, product type, or data category. User selections are communicated to the Query Service, which in turn invokes the OIS. The OIS uses the CF ontology to find specializations of the data category and related synonyms. All of the inferred CF terms are then returned to the Query Service, which queries the Resource Catalog to locate the appropriate data collections; the results are presented to the user (Figure 7.2.3).


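    The query-approximation flow just described can be illustrated with a toy example: a user's data category is expanded, through an ontology, into the CF terms under which products are actually cataloged, and the expanded term set is then used to search the catalog. The dictionaries and function names below are illustrative stand-ins; the real OIS reasons over the CF and SWEET ontologies through Pellet and the DIG interface.

```python
# Toy illustration of query approximation: a user's data category is expanded
# into controlled-vocabulary (CF) terms before the catalog is searched. The
# dictionaries are stand-ins for the ontology and the Resource Catalog.
ONTOLOGY = {
    # concept -> narrower concepts / CF synonyms (illustrative only)
    "precipitation": ["rainfall_rate", "precipitation_amount"],
    "temperature": ["air_temperature", "surface_temperature"],
}

CATALOG = {  # hypothetical resource-catalog index: CF term -> data collections
    "air_temperature": ["NAM analysis", "METAR surface observations"],
    "precipitation_amount": ["Stage IV precipitation analysis"],
}

def expand(concept: str) -> list:
    """OIS stand-in: return the concept plus its inferred specializations and synonyms."""
    return [concept] + ONTOLOGY.get(concept, [])

def query(concept: str) -> list:
    """Query Service stand-in: expand the concept, then search the catalog."""
    hits = []
    for term in expand(concept):
        hits.extend(CATALOG.get(term, []))
    return hits

print(query("temperature"))  # finds collections cataloged under inferred CF terms
```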

    Figure 7.2.2. The LEAD data search interface.

    Figure 7.2.3. Sample results from a LEAD data query.


    7.3 Orchestration Thrust

    Orchestration, as defined in LEAD (see the year-2 annual report), involves the organization, scheduling, and sequencing of the tasks involved in conducting an experiment. The collection of tasks and their dependencies in an experiment is called a workflow. The process of orchestration (Figure 7.3.1) involves defining, or composing, the workflow as well as executing it, the latter of which is referred to as workflow enactment. The Orchestration Thrust is responsible for creating these capabilities. In LEAD, individual tasks in a workflow include extracting data from live streams or archives, mining these data to identify interesting events, transforming data from one form to another, preprocessing data, performing simulations, mining simulation output, rendering simulation output into animations or still images, and using simulation output to re-direct instruments such as radars (Figure 5.3.4.2). These tasks initially were performed on the LEAD Grid with its set of services, portal servers, data sources, and other capabilities; in year-3, however, we migrated the computational and data-intensive elements of our workflows to the TeraGrid (see §10). The LEAD Grid now hosts the persistent services (e.g., the Portal) that manage the more dynamic activities conducted on the TeraGrid.
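    As a minimal illustration of a workflow as a task graph, the sketch below declares a few tasks with their dependencies and enacts them in dependency order. The task names are illustrative, and the tiny scheduler is only a stand-in for the GBPEL and Siege workflow engines, which invoke real services rather than printing.

```python
# Minimal illustration of "workflow as a task graph": tasks plus dependencies,
# enacted in dependency order. Task names are illustrative; the scheduler is a
# toy stand-in for the GBPEL/Siege workflow engines.
from graphlib import TopologicalSorter  # Python 3.9+

WORKFLOW = {  # task -> set of tasks it depends on
    "acquire_nam_analysis": set(),
    "preprocess_inputs":    {"acquire_nam_analysis"},
    "wrf_forecast":         {"preprocess_inputs"},
    "render_output":        {"wrf_forecast"},
}

def enact(workflow: dict) -> None:
    """Run each task once all of its upstream dependencies have completed."""
    for task in TopologicalSorter(workflow).static_order():
        print(f"running {task} ...")  # a real engine would invoke a web service here

enact(WORKFLOW)
```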

    The Orchestration Thrust has achieved two primary goals in year-3. Specifically,

    1. Two workflow systems (GBPEL and Siege) have been hardened within the broader LEAD infrastructure to the point where they can execute static workflows reliably. This capability was made available to more than 80 users at the Unidata Users Workshop in July, 2006 (§10).

    2. We have begun adapting the LEAD workflow systems to the fully dynamic, data-driven workflows that are at the heart of the LEAD research mission (§5.3).

    In the subsections that follow we describe each of these accomplishments in detail.

    Figure 7.3.1. Sample LEAD workflow task graph depicting a WRF forecast initialized from a NAM analysis within a region selected by the user. This particular workflow was the core capability made available to attendees at the Unidata Users Workshop in July 2006.


    7.3.1 Overview

    In the year-2 annual report we proposed several detailed tasks for the Orchestration Thrust in year-3. It is important to note that LEAD is experimenting with two complementary approaches to workflow system design. The first is based upon the concept of service orchestration using BPEL, the web-service workflow language standard; this workflow is driven from the LEAD portal. The other approach is based upon hybrid orchestration of ensembles of workflows, with workflow nodes composed of services and direct Unix task management, using a desktop client called Siege. Siege builds on a back-end ensemble broker service that was developed as part of the MEAD (Modeling Environments for Atmospheric Discovery) project and has since been refined. Both systems were evaluated at the July 2006 Unidata Users Workshop (§10) and are described further below.

    Goal #1. Most meteorological applications supported by LEAD have complex, large, and interrelated sets of input parameters that typically (e.g., in WRF and ADAS) are encoded as Fortran namelists. Some services may have several hundred parameters, but most users may wish to modify only one or two dozen for a given experiment, and depending upon the sophistication of the user, different subsets of parameters may need to be modified. Our research challenge is to automate the generation of service code that transforms the input parameter document into the appropriate input file required by the application (a brief sketch of this transformation appears after the Goal #2 list below). By working closely with the Meteorology Thrust, we established, for each principal application, a set of common parameters that need to be user-modifiable. The portal and workflow systems now can automatically convert those parameter choices into the initialization documents needed by each application, and these documents are automatically propagated to each of the services that manage the associated codes. More work is needed to make this system sufficiently flexible to accommodate new applications introduced by users (e.g., an ocean model, which we propose to bring into LEAD in year-4; see §24.7) and to help users avoid choosing incompatible parameters. This latter point is extremely challenging because thousands of combinations are possible; thus, our efforts are directed toward some of the most important parameters and associated incompatibilities (e.g., relating model time step to grid spacing for linear stability).

    Goal #2. LEAD workflow instances may be long-running, or they may remain idle for long periods of time until activated by a particular trigger (e.g., a storm developing within a radar stream that is being mined by a persistent agent). Depending upon the situation, the workflow can respond to different scenarios. If so desired, the workflow can be programmed to wait for months to respond to a specific notification, or series of notifications, signaling the occurrence of a particular weather event or other specified trigger; if the workflow "knows what to do," it can respond accordingly. Our research challenges include:

    • Integrating BPEL into the web services model used by grid standards. This has been accomplished, aided by the fact that the majority of grid standards now are compatible with our BPEL approach.
    • Adapting BPEL to work effectively in an environment based upon WS-Eventing and services that require secure access and authorization. This also has been accomplished, through our Event Notification Service, which uses the WS-Eventing standard and interoperates completely with the Globus GT4 WS-Notification standard. All of our services support authorization through our authorization services, as described below.
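    To illustrate the transformation described under Goal #1, the sketch below renders a small set of user-modifiable choices into a Fortran namelist fragment of the kind WRF and ADAS read. The group and parameter names are illustrative rather than LEAD's actual user-modifiable set, and the real system generates these files from a service-side parameter document.

```python
# Sketch of Goal #1's parameter transformation: a handful of user-modifiable
# choices rendered as Fortran namelist text. Group and variable names are
# illustrative, not LEAD's actual exposed parameter set.
def to_namelist(groups: dict) -> str:
    """Render {group: {parameter: value}} as Fortran namelist text."""
    def fmt(value):
        if isinstance(value, bool):
            return ".true." if value else ".false."
        if isinstance(value, str):
            return f"'{value}'"
        return str(value)

    lines = []
    for group, params in groups.items():
        lines.append(f"&{group}")
        for name, value in params.items():
            lines.append(f" {name} = {fmt(value)},")
        lines.append("/")
    return "\n".join(lines)

USER_CHOICES = {  # e.g., collected from the portal or the Siege client
    "domains": {"dx": 3000.0, "dy": 3000.0, "time_step": 18},  # time step consistent with 3-km grid spacing
    "physics": {"mp_physics": 8, "cu_physics": 0},
}

print(to_namelist(USER_CHOICES))
```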

    Goal #3. In addition to responding to dynamic changes in the weather, LEAD workflows also must respond to the dynamic behavior of computational and grid resources in order to meet the requirement of "faster than real time" prediction. For example, as a workflow progresses it must allocate new resources for ensemble runs if conditions warrant. This requires an adaptive scheduler and the ability to monitor the execution of each workflow component so that appropriate adjustments in resources can be made. During year-3 we focused on extending the monitoring infrastructure so as to correlate and analyze monitoring output at the application, workflow, and resource levels. To allocate and broker new resources, we are collaborating with the VGrADS project, as described in §14.3.

    Goal #4. We are conducting research to support the launching of large numbers of ensemble simulations or forecasts, each comprising large numbers of workflows, onto production resources such as those at the NSF supercomputing centers and the TeraGrid. NCSA has developed an SOA to manage such workflows, and its user interface, known as Siege (see also §7.3.4), is the control panel by which the user launches and monitors complex workflow scenarios on grid resources. Siege interacts with the Troll ensemble broker and execution service stack, which is informed by the Vizier information services; together they provide many of the details necessary to resolve the execution of complex scientific applications on remote services (see Figure 7.3.4.1). The Troll ensemble broker handles a high-level workflow above the grid resources while workflows local to a particular cluster ar