© 2011 IBM Corporation 1 (ENSUREing we can) Ride the Wave (on a Cloud) Presenter: Michael Factor, Ph.D. IBM Research – Haifa [email protected]

© 2011 IBM Corporation1

(ENSUREing we can) Ride the Wave (on a Cloud)

Presenter: Michael Factor, Ph.D.

IBM Research – Haifa

[email protected]

© 2011 IBM Corporation2

Riding the Wave – Some Challenges

The amount of data
– ESA estimates a 20 PB archive in 2020
– The Large Hadron Collider will produce roughly 15 PB annually
– North American health care research data is estimated to surpass 136 PB in 2015; EHRs and imaging data will add 9.7 EB

How will we access this data?
How will we preserve this data?
How will we ensure its integrity, provenance, context, etc.?
How will we ensure data privacy?
How will we pay for this?

© 2011 IBM Corporation3

What is a cloud and why is it interesting?

“Cloud computing is a model for enabling convenient, on-demand network access to a shared pool of configurable computing resources (e.g., networks, servers, storage, applications, and services) that can be rapidly provisioned and released with minimal management effort or service provider interaction.”

– US National Institute of Standards and Technology, Information Technology Laboratory

Key features:
• On-demand
• Shared
• Automated
• Network access

Benefits:
• Cost savings
  – Economies of scale, utilization improvement and standardization
• Speed and agility
• Pay-as-you-go for usage

[Chart: investment per GB vs. quantity of information]

© 2011 IBM Corporation4

Source: “Cloud will Transform Business as We Know It: The Secret’s in the Source”, HfS Research and the London School of Economics, December 2010

How Much of a Concern are the Following Business Risks Posed by Cloud Business Services to your Business Function, Compared to Your Existing Risks for Non-Cloud Business Services?

Security, privacy, lack of control in data placement, lock-in and compliance are key concerns with cloud

© 2011 IBM Corporation5

Five cloud delivery models

[Figure: five delivery models arranged from private to public]

Private
1. Private Cloud: enterprise data center
2. Managed Private Cloud: enterprise data center, IBM operated
3. Hosted Private Cloud: IBM owned and operated

Community
4. Shared Cloud Services (Enterprise A, Enterprise B, Enterprise C): IBM owned and operated

Public
5. Public Cloud Services (User A, User B, User C, User D, User E)

Attributes listed in the figure, from the private end to the public end:
• Private network
• Dedicated assets
• Fixed price
• Fixed price or pay as you go
• Controlled sharing
• Fixed price or pay as you go
• VPN access or public internet
• Elastic scaling through sharing
• Pay as you go
• Public internet

© 2011 IBM Corporation6

Enabling kNowledge Sustainability, Usability and Recovery for Economic value

3 USE CASES
• Healthcare
• Clinical Studies
• Financial Services

4 INNOVATIONS
• EVALUATE Cost and Value
• AUTOMATE Preservation Lifecycle
• PROTECT Content-aware data protection
• SCALE using ICT innovations

A 3-year IP project started Feb 2011
www.ensure-fp7.eu

© 2011 IBM Corporation7

ENSURE Healthcare Use Case

Images and diagnosis

Configure system

Transformation and fixity

Lifecycle management

Data protection

Search and extraction

Audit

© 2011 IBM Corporation8

A Configurator optimizes a Preservation Solution, taking into account cost/risk, economic performance and quality

Cost/risk Engine

Economic Performance Engine

Quality Engine

Preservation Plan Optimizer

Configurator

• Runs before initial deployment and re-executes periodically
• Reacts to changes in the environment and requirements
• Enables deploying a solution that optimizes cost/benefit
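As a rough illustration of how a Configurator might combine the three engines, the sketch below scores candidate preservation plans with a weighted sum and picks the best one. The class `CandidatePlan`, the functions `plan_score` and `choose_plan`, the plan names, the scores and the weights are all hypothetical. The slides do not describe how ENSURE's Preservation Plan Optimizer actually works.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CandidatePlan:
    """A hypothetical preservation plan with scores from the three engines."""
    name: str
    cost_risk: float       # assumed output of a cost/risk engine, lower is better
    economic_perf: float   # assumed output of an economic performance engine, higher is better
    quality: float         # assumed output of a quality engine, higher is better


def plan_score(plan, weights):
    """Weighted sum in which cost/risk counts against the plan."""
    return (weights["economic_perf"] * plan.economic_perf
            + weights["quality"] * plan.quality
            - weights["cost_risk"] * plan.cost_risk)


def choose_plan(plans, weights):
    """Return the highest-scoring plan; re-run whenever inputs change."""
    if not plans:
        raise ValueError("no candidate plans to choose from")
    return max(plans, key=lambda p: plan_score(p, weights))


# Made-up candidates and weights, for illustration only.
plans = [
    CandidatePlan("single-cloud", cost_risk=0.6, economic_perf=0.8, quality=0.5),
    CandidatePlan("multi-cloud-replicated", cost_risk=0.4, economic_perf=0.6, quality=0.9),
]
weights = {"cost_risk": 1.0, "economic_perf": 1.0, "quality": 1.0}

initial_choice = choose_plan(plans, weights)

# A later run with changed requirements (economic performance now dominates).
changed_weights = {"cost_risk": 1.0, "economic_perf": 3.0, "quality": 0.5}
later_choice = choose_plan(plans, changed_weights)

print(initial_choice.name, later_choice.name)
```

Re-running `choose_plan` with new weights or new engine scores mirrors the idea of periodic re-execution: the selected plan can change when the requirements or the environment change.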

© 2011 IBM Corporation9

Benefits and issues in using a cloud model for digital preservation

The Benefits of Clouds:
• Scalable in number of objects and size of data
• Pay-as-you-go
• Sharing across geographic domains – available anywhere

Cloud Technology Requirements
– Multi-cloud support (export, replication)
– Programmatic visibility (SLAs, events)
– Computation near data
– Integration with lifecycle management
– Virtual Appliance (VA) management

Cloud Security and Integrity Requirements
– Support for object provenance, certification, auditing, …
– Advanced integrity services
– Trust over time
– Changes over time
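One basic building block behind the integrity requirements above is a fixity check: record a digest when an object is stored and recompute it later to detect corruption. The sketch below is a minimal illustration using Python's standard `hashlib`. The functions `fixity_record` and `verify_fixity` and the record layout are assumptions made for this example, not part of ENSURE. The algorithm name is stored next to the digest because the hash function itself may need to be replaced over decades.

```python
import hashlib


def fixity_record(data: bytes, algorithm: str = "sha256") -> dict:
    """Compute a digest and remember which algorithm produced it."""
    digest = hashlib.new(algorithm, data).hexdigest()
    return {"algorithm": algorithm, "digest": digest}


def verify_fixity(data: bytes, record: dict) -> bool:
    """Recompute the digest with the recorded algorithm and compare."""
    current = hashlib.new(record["algorithm"], data).hexdigest()
    return current == record["digest"]


original = b"archived image bytes"          # placeholder content
record = fixity_record(original)

intact = verify_fixity(original, record)
corrupted = verify_fixity(b"archived image bytez", record)
print(intact, corrupted)
```

A real preservation service would also need to protect the fixity records themselves, for example by storing them separately from the data.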


© 2011 IBM Corporation10

Preservation DataStores in the Cloud, building on work from CASPAR, maps OAIS to a cloud

• Map the OAIS AIP and the links among AIPs to the cloud data model
• Manage objects' inter-relationships and referential integrity
• Offload computation to the cloud
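To make the mapping idea concrete, the sketch below stores each AIP as one generic cloud object and keeps the links among AIPs in the object's metadata, then checks referential integrity by looking for links whose targets are missing. `CloudObject`, `aip_to_object`, `dangling_links`, the `aip-links` metadata key and the AIP identifiers are hypothetical names for illustration. The slides do not specify the actual data model.

```python
from dataclasses import dataclass, field


@dataclass
class CloudObject:
    """A generic cloud storage object: opaque bytes plus key/value metadata."""
    key: str
    data: bytes
    metadata: dict = field(default_factory=dict)


def aip_to_object(aip_id, content, links):
    """Store one AIP as one object; links to other AIPs go into metadata."""
    return CloudObject(
        key=aip_id,
        data=content,
        metadata={"aip-links": ",".join(sorted(links))},
    )


def dangling_links(store):
    """Return {aip_id: [missing targets]} for links that point outside the store."""
    missing = {}
    for key, obj in store.items():
        raw = obj.metadata.get("aip-links", "")
        targets = [t for t in raw.split(",") if t]
        absent = [t for t in targets if t not in store]
        if absent:
            missing[key] = absent
    return missing


# A toy in-memory "cloud" holding a scan that links to its report.
store = {}
for obj in (
    aip_to_object("aip-scan-001", b"image bytes", ["aip-report-001"]),
    aip_to_object("aip-report-001", b"diagnosis text", []),
):
    store[obj.key] = obj

before_delete = dangling_links(store)

# Removing the report breaks the scan's link; the check reports it.
del store["aip-report-001"]
after_delete = dangling_links(store)
print(before_delete, after_delete)
```

In this sketch the integrity check runs on the client. Offloading computation to the cloud would mean running a check like `dangling_links` next to the stored objects instead.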

© 2011 IBM Corporation11

ENSURE’s Content Aware Data Protection component will address issues of IP and Privacy in the context of information which must be accessed over decades.

Allow only authorized users to access data even if user identities change

Reflect changes in what constitutes personally identifiable information or intellectual property to ensure compliance over time

Ensure that de-identification technologies used are appropriate at the time of access

Use existing, standard mechanisms that decouple security management from policy and extend with changes over time

DICOM tag (0010,1040) [Patient’s Address]

Administrative Data

Policy: Researchers must not be able to access administrative data
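The example policy above can be sketched as a lookup from DICOM tag to data category, plus a table of which roles are denied each category. Everything here is a hypothetical illustration: the category names, the `TAG_CATEGORY` and `DENY` tables, the role names and the function `visible_elements` are assumptions, and the record is a plain dictionary rather than a real DICOM file. Only the tag (0010,1040) and the researcher policy come from the slide.

```python
# Hypothetical classification of DICOM tags into data categories.
TAG_CATEGORY = {
    (0x0010, 0x1040): "administrative",  # Patient's Address (from the slide)
    (0x0008, 0x0060): "clinical",        # Modality (category assumed here)
}

# Hypothetical policy table: category -> roles that must not see it.
DENY = {"administrative": {"researcher"}}


def visible_elements(elements, role, tag_category=TAG_CATEGORY, deny=DENY):
    """Return only the elements the role may see.

    Tags without a known category are withheld (fail closed).
    """
    allowed = {}
    for tag, value in elements.items():
        category = tag_category.get(tag)
        if category is None:
            continue  # unclassified tag: do not release
        if role in deny.get(category, set()):
            continue  # policy denies this role access to the category
        allowed[tag] = value
    return allowed


# Placeholder record with made-up values.
record = {
    (0x0010, 0x1040): "1 Example Street",
    (0x0008, 0x0060): "MR",
}

researcher_view = visible_elements(record, "researcher")
clinician_view = visible_elements(record, "clinician")
print(researcher_view)
print(clinician_view)
```

Keeping the classification and policy tables as data, separate from the enforcement function, is one way to reflect changes over time: if a tag is later reclassified as personally identifiable, only the table changes.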

© 2011 IBM Corporation12

Riding the Wave – A Recap on the Challenges

The amount of data
How will we preserve this data?
How will we ensure its integrity, provenance, context, etc.?
How will we ensure data privacy?
How will we pay for this?

ENSURE and other EU FP7 projects are addressing these challenges
– Cloud delivery models hold much promise for meeting these objectives, but much work remains to be done

© 2011 IBM Corporation14

ENSURE Architecture

[Figure: ENSURE architecture; labels recovered from the diagram]

Configuration Layer
• Administrator
• Requirements
• Configurator
  – Preservation Plan Optimizer
  – Cost/Risk Evaluation
  – Economic Performance Evaluation
  – Quality Evaluation
• Preservation Plan
• Configuration Selection

Runtime
• User
• System
• Information Preparation
• Ingest
• Access
• ENSURE Preservation Runtime
  – Preservation Digital Asset Lifecycle Mgmt
  – Content-Aware Long-Term Data Protection
  – Preservation Runtime Infrastructure
  – Preservation-aware Storage Services
• Storage Service 1, Storage Service 2, Storage Service N

© 2011 IBM Corporation15

VISION Cloud: Virtualized Storage Services Foundation for the Future Internet

Goal
– Architect and implement an infrastructure for the reliable and effective delivery of data-intensive storage services, facilitating the convergence of ICT, media and telecommunications

Innovations
– Raise Abstraction Level of Storage: objects with user- and system-defined metadata
– Computational Storage: technology for specifying/executing computations close to storage
– Content-Centric Storage: facilitate access to data by content and its relationships
– Advanced Capabilities for Cloud-based Storage: support delivery of data-intensive services securely, at the desired QoS, at competitive costs
– Data Mobility and Federation: enable comprehensive data migration and interoperability across remote locations

Facts:
– A 3-year project, started Oct 2010
– www.visioncloud.eu