Post on 19-Dec-2015
EPOS e-Infrastructure
Keith G Jeffery, Natural Environment Research Council
keith.jeffery@stfc.ac.uk (with Jean-Pierre Vilotte and Alberto Michelini)
Structure of Presentation
• Who?
• EPOS Rationale and approach
• e-Infrastructure Basics
• Related Projects (Torild van Eck)
• Proposed Approach
• Conclusion
e-Infrastructure Basics
• GRIDs
• Clouds
• Web 2.0
• SOA (Service-Oriented Architecture)
• Research process
  – Fourth paradigm (Data Intensive Scientific Discovery)
• Virtualisation
• Autonomicity
• Security, Privacy, Trust
• Performance
• Development
• Maintenance
• Internet
  – 1.5 billion fixed connections
  – Estimated 4 billion mobile connections
• Digital Storage
  – Estimated 280 billion Gigabytes (280 exabytes – 280*10**18)
• Expect all to grow ~ 1 order of magnitude in 4 years (and accelerating)
• Users:
  – Asia 550 million 14% penetration
  – Europe 350 million 50% penetration
  – USA 250 million 70% penetration
• Scalability
• Trust & security & privacy
• Manageability
• Accessibility
• Usability
• Representativity
Last 20 years: CPU 10**16, Storage 10**18, Networks 10**4
CONTEXT
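As a rough check of the storage and growth figures above, the sketch below converts 280 billion gigabytes to exabytes and works out the yearly growth factor implied by one order of magnitude in four years. It assumes decimal units (1 GB = 10**9 bytes, 1 EB = 10**18 bytes) and steady compound growth, so it ignores the "accelerating" caveat; the function names are illustrative only.

```python
GB = 10**9   # bytes per gigabyte (decimal units, an assumption)
EB = 10**18  # bytes per exabyte


def gigabytes_to_exabytes(gigabytes):
    """Convert a size in gigabytes to exabytes."""
    return gigabytes * GB / EB


def annual_growth_factor(total_factor=10.0, years=4):
    """Yearly multiplier that compounds to total_factor over the given years."""
    return total_factor ** (1.0 / years)


def projected_exabytes(start_eb, years_ahead, total_factor=10.0, years=4):
    """Project storage forward, assuming steady compound growth."""
    return start_eb * annual_growth_factor(total_factor, years) ** years_ahead


storage_eb = gigabytes_to_exabytes(280 * 10**9)
print(storage_eb)                          # 280.0
print(round(annual_growth_factor(), 2))    # 1.78
```

Under these assumptions, a tenfold increase over four years corresponds to growth of a little under 80% per year.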
The GRIDs Architecture: Layering
• Knowledge Layer
• Information Layer
• Computation / Data Layer
[Figure arrows: "Data to Knowledge" and "Control"]
Cloud Computing: The Intention
• Low cost of entry for customers
• Device and location independence
• Capacity at reasonable cost (performance, space)
• Cloud Operator manages resource sharing, balancing different peak loads
• Scalable as demand rises from user
• Security due to data centralisation and software centralisation
• Sustainable and environmentally friendly – concentrated power
It is a service, and the user does not know or care from where, by whom, and how it is provided as long as the SLA (service level agreement) is satisfied.
Web 2.0
• Features:
  – creativity, communications, secure information sharing, collaboration and functionality
• Examples:
  – Social networking, video-sharing, wikis, blogs, folksonomies
  – Crowdsourcing to gather information / knowledge wisdom?
If you don’t know what Web 2.0 is, your kids do!
Bringing it Together: e-, i-, k-infrastructure
[Figure: servers and detectors; layers labelled e-, i-, k-; further labels "Physical" and "Information Systems"]
k- Deduction & induction – human or machine
Middleware – and as SOKUs (Service-Oriented Knowledge Utilities)
[Figure layers: e-, i-, k-]
• Lower middleware (hides physical heterogeneity)
• Upper middleware (hides syntactic heterogeneity)
• K- lower middleware (presents declared semantics)
• K- upper middleware (resolves semantic heterogeneity)
Research Process: 4th Paradigm
• Observational Science: Observations, Contextual metadata, Pre-processing, Digital preservation, Availability, Analysis, Visualisation
• Experimental Science: Hypothesis, Experimentation, Observations, Contextual metadata, Pre-processing, Digital preservation, Availability, Analysis, Visualisation
• Modelling Science: Hypothesis, Characterisation, Simulation/modelling, Observations, Contextual metadata, Pre-processing, Digital preservation, Availability, Analysis, Visualisation
DATA-INTENSIVE SCIENCE
(Concept from Jim Gray 1944-2007)
Related Projects
EPOS e-infrastructure has to fit in with
a) ESFRI Roadmap projects in Environmental Cluster (ENVRI)
b) ESFRI roadmap projects in other clusters
   a) Physical sciences (STM)
   b) Astronomy & Astrophysics
   c) Economic/social science
   d) Arts and humanities
   e) PRACE (supercomputing)
   f) EGI/NGIs (Data and Computing Grid)
c) European INFRA projects (VERCE, EUDAT…)
d) National e-infrastructures for e-Research
   a) Especially geoscience
e) Other international projects (North America, Japan, Pacific Rim, South America…)
EPOS IT-relevant EC projects + proposals (summary)
EPOS (ESFRI roadmap)
• NERA – Seismology & Seismic Engineering – ETHZ + ORFEUS/KNMI (D. Giardini; T. van Eck)
• EPOS PP – Solid Earth ESFRI project – INGV (Massimo Cocco)
• SHARE – Hazard – ETHZ (D. Giardini)
• GEM – Hazard
• QUEST (Training network) – Computational Seismology – LMU (H. Igel)
Project proposals 2010 (INFRASTR. 2011-1 Call 8/9)
• VERCE – Earthquake & Seismology – CNRS-IPGP (J-P Vilotte), UEDIN, ORFEUS/KNMI, EMSC, INGV, LMU, Univ Liverpool, BADW-LRZ, CINECA, Fraunhofer/SCAI – INFRA-2011-1.2.1
• EUDAT – Data Infrastructure – CSC Finland (Kimmo Koski), EPOS (GFZ, INGV), LifeWatch…, CINECA, UEDIN… – INFRA-2011-1.2.2
• ENVRI – Environment Research Infrastructure – LifeWatch (Wouter Los), EPOS (ORFEUS/KNMI); LifeWatch, EPOS, EMSO, EISCAT, ICOS, STFC, UEDIN… – INFRA-2011-2.3.3
[Figure labels: "EC projects starting 2010"; "Under negotiation" (three times)]
e-Infrastructure Requirement
• Data collection, calibration, validation
• Data cataloguing and indexing
• Data preservation and curation
• Information processing – retrieval, analysis, visualisation
• Hypothesis processing – simulation, modelling, analysis, visualisation
• Hypothesis generation – data mining
• Knowledge processing – integration of ICT with human processing – theory processing, user interface, scholarly communication (open access)
• External interoperation – physical and medical sciences, economic and social sciences, arts and humanities
• Dissemination – outreach (website plus)
• Education and training
• Management and Coordination
Key e-Infrastructure Principles
• Mobile code: ability to move code to data because data large and costly to transport
• Virtualisation: user neither knows nor cares where computing done or where data located as long as QoS/SLA met
• Autonomicity: (self-*) because human management of ICT too expensive / slow
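The mobile-code principle above can be illustrated with a toy sketch. It is not an EPOS design: the `DataNode` class, the site names and the sample values are all hypothetical, and the "network" is simulated in a single process. It compares how many values cross the wire when a small function is sent to each data site with how many cross when the raw data is shipped to the user.

```python
class DataNode:
    """Hypothetical data site holding a large local dataset."""

    def __init__(self, name, samples):
        self.name = name
        self._samples = samples  # the bulk data stays on the node

    def run(self, func):
        """Execute shipped code next to the data and return only its result."""
        return func(self._samples)

    def export(self):
        """Alternative: ship a full copy of the raw data to the caller."""
        return list(self._samples)


def peak_amplitude(samples):
    """The 'mobile code': a small function that reduces a dataset to one number."""
    return max(abs(s) for s in samples)


nodes = [
    DataNode("site-a", [0.1, -2.5, 0.7] * 1000),
    DataNode("site-b", [1.2, -0.3, 3.1] * 1000),
]

# Moving code to data: one small function goes out, one number per node comes back.
peaks = {node.name: node.run(peak_amplitude) for node in nodes}
values_moved_by_code = len(peaks)

# Moving data to code: every sample has to be transported first.
values_moved_by_data = sum(len(node.export()) for node in nodes)

print(peaks)
print(values_moved_by_code, values_moved_by_data)
```

The gap between the two counts grows with the size of the datasets, which is the reason the principle matters for large and costly-to-transport data.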
Key e-Infrastructure Challenges
• Interoperation
  – Access to heterogeneous distributed data sources
  – Schema integration – syntactic and semantic
• Security/privacy/trust
  – Identification – authentication – authorisation – accounting
• Performance
  – Towards exascale processing (simulation/modelling)
  – Towards exabyte data streams (1.0*10**18)
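The identification – authentication – authorisation – accounting chain listed above can be sketched as four steps applied to every request. This is a minimal illustration under stated assumptions: the in-memory `USERS` registry, the `read:waveforms` right and the demo salt and password are invented for the example, and a real infrastructure would rely on federated identity services rather than a local dictionary.

```python
import hashlib
import hmac


def _hash(password, salt):
    """Derive a password hash with PBKDF2 (standard library)."""
    return hashlib.pbkdf2_hmac("sha256", password.encode(), salt, 10_000)


# Hypothetical registry: user id -> salt, password hash, permitted actions.
USERS = {
    "alice": {
        "salt": b"demo-salt",
        "hash": _hash("s3cret", b"demo-salt"),
        "rights": {"read:waveforms"},
    }
}

ACCOUNTING_LOG = []  # every attempt is recorded, whether it succeeds or not


def access(user_id, password, action):
    """Return True only if the user is known, authenticated and authorised."""
    user = USERS.get(user_id)                                   # identification
    if user is None:
        outcome = "unknown user"
    elif not hmac.compare_digest(_hash(password, user["salt"]), user["hash"]):
        outcome = "authentication failed"                       # authentication
    elif action not in user["rights"]:
        outcome = "not authorised"                              # authorisation
    else:
        outcome = "granted"
    ACCOUNTING_LOG.append((user_id, action, outcome))           # accounting
    return outcome == "granted"
```

A call such as `access("alice", "s3cret", "read:waveforms")` is granted, while a wrong password, an unlisted action or an unknown user is refused; in every case one entry is appended to `ACCOUNTING_LOG`.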
Steps to achieve EPOS e-Infrastructure (1)
• Define / Agree requirements of end-user (document dynamically)
  – Including expected future requirements
• Survey available data/information sources (document dynamically)
  – Detector systems
  – Repositories / databases / file systems
  – Data, documents, metadata, contextual data
  – Conditions of use – QoS, SLA (link to governance)
• Define schema mappings, convertors for interoperation (document dynamically)
  – Canonical interoperation standard?
• Note CERIF (Common European Research Information Format)
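One reason to ask about a canonical interoperation standard is the number of convertors required. The sketch below assumes two hypothetical source formats ("a" and "b") and a made-up canonical record with `station`, `latitude` and `longitude` fields; it does not reproduce CERIF or any real EPOS schema. Each source supplies one convertor into the canonical form and one out of it, and any source-to-source conversion goes through the canonical record.

```python
def from_source_a(rec):
    """Source A (hypothetical) uses short keys."""
    return {"station": rec["sta"], "latitude": rec["lat"], "longitude": rec["lon"]}


def to_source_a(canon):
    return {"sta": canon["station"], "lat": canon["latitude"], "lon": canon["longitude"]}


def from_source_b(rec):
    """Source B (hypothetical) stores the position as a 'lat,lon' string."""
    lat, lon = (float(part) for part in rec["position"].split(","))
    return {"station": rec["code"], "latitude": lat, "longitude": lon}


def to_source_b(canon):
    return {"code": canon["station"],
            "position": f"{canon['latitude']},{canon['longitude']}"}


# One (to_canonical, from_canonical) pair per source format.
CONVERTERS = {
    "a": (from_source_a, to_source_a),
    "b": (from_source_b, to_source_b),
}


def convert(record, source, target):
    """Convert between any two registered formats via the canonical record."""
    to_canonical, _ = CONVERTERS[source]
    _, from_canonical = CONVERTERS[target]
    return from_canonical(to_canonical(record))


def converters_needed(n_formats):
    """Directed pairwise convertors versus convertors through a canonical form."""
    return {"pairwise": n_formats * (n_formats - 1), "canonical": 2 * n_formats}


print(convert({"sta": "ABC", "lat": 45.5, "lon": 9.25}, "a", "b"))
print(converters_needed(10))  # {'pairwise': 90, 'canonical': 20}
```

With ten sources, direct pairwise mapping needs 90 convertors while the canonical route needs 20, and adding an eleventh source adds only two more.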
Steps to achieve EPOS e-Infrastructure (2)
• Survey available computing and computation resources (document dynamically)
  – Detector systems
  – Data servers
  – HPC
  – Conditions of use – QoS, SLA (link to governance)
• Define access and utilisation of ICT (document dynamically)
  – User identification, authentication, authorisation, accounting (security, privacy)
  – Available services
  – Conditions of use – QoS, SLA (link to governance)
• Design first-cut ICT architecture (document dynamically)
  – GEANT network
  – GRIDs (EGI) middleware
  – Web services software
  – Web portal(s) user interface
Conclusion (take-home messages)
• EPOS is a HUGE CHALLENGE
• EPOS requires LEADING EDGE ICT to support LEADING EDGE GEOSCIENCE
• EPOS e-Infrastructure is the ‘GLUE’
• EPOS is going to be FUN!
• EPOS is open to collaboration
Prof Keith G Jeffery CEng, CITP, FGS, FBCS, HFICS
keith.jeffery@stfc.ac.uk