Sustainability Issues in Digital Preservation

Krishna Kant

George Mason University

National Science Foundation

Digital Preservation 2013

Alexandria, VA, July 24, 2013

Can Preservation be Sustainable?

● Significant environmental footprint
  ● Storage media life-cycle (mfg, distribution, …)
  ● Housing & access management of media
  ● Processing & IO infrastructure (Data Centers)

What do we want to preserve?

● Not data, but information
  ● Ideally, knowledge

K. Kant, Sustainability of Digital Preservation

Sustainability Issues

● Make knowledge derivation more sustainable
  ● Minimize environmental impact of data centers
● Retain only essential data
  ● Remove duplicate, inessential data
● Storage vs. reprocessing
  ● Has sustainability tradeoffs

Data Center Impact

● Power consumption rising
● Most of it wasted:
  ● Power distribution
  ● Cooling
  ● Idle machines
● But impact is more than power…
  ● Materials, water, manufacturing, …
● Sustainability perspective
  ● Energy doesn’t matter, its carbon footprint does

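[Figure: data center power distribution chain feeding the IT load. Voltage labels: 115 kV, 13.2 kV, 480 V, 208 V. Loss labels: 0.3% loss (99.7% efficient), 0.5% loss (99.5% efficient), 1.0% loss (99.0% efficient), 6% loss (94% efficient), ~1% loss in switchgear and conductors. Other labels: UPS; 2.5 MW generator, ~180 gallons/hour.]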

● 9-10% distribution loss at power source
● Lots of earth’s resources used (metals, rare earths, …)
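As a rough check on the loss figures labeled in the power distribution diagram, the sketch below compounds them as if they were stages in series. The series arrangement, the reading of "~1%" as exactly 1%, and all names in the code are illustrative assumptions, not taken from the slides.

```python
# Loss fractions as labeled on the power distribution diagram.
# Assumption: the stages sit in series, so their efficiencies multiply.
stage_losses = [0.003, 0.005, 0.010, 0.06, 0.01]  # last entry: "~1%" read as 1%


def end_to_end_efficiency(losses):
    """Return the fraction of input power that survives all stages."""
    efficiency = 1.0
    for loss in losses:
        efficiency *= 1.0 - loss
    return efficiency


efficiency = end_to_end_efficiency(stage_losses)
total_loss = 1.0 - efficiency
print(f"delivered: {efficiency:.1%}, lost in distribution: {total_loss:.1%}")
```

Under these assumptions about 8.6% of the input power is lost before it reaches the IT load, which is of the same order as the 9-10% figure quoted on the slide.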

Renewable Energy Powered IT?

● Limit energy draw from grid
  ● Less infrastructure & losses, but variable supply
  ● Impact on performance, QoS, SLA, …
● Challenges
  ● Variability at multiple time scales
  ● Reliability issues

Need better power adaptability

Cooling Infrastructure

High Temperature Operation

● Chiller-less data centers
  ● Less energy/materials, but space inefficient
● High temperature operation of comm./computing equipment
  ● Smaller T_outlet - T_inlet
  ● Deal with occasionally hitting temp. limits

Need smarter thermal adaptability

Overdesign

● Huge UPS, generators, dist. frames, power supplies, fans, …
● Engineered for worst case
  ● Huge waste of power, materials, …
● Power supply & VRs
  ● Low utilizations ➔ Low efficiency

Need Better Power Infra. adaptability

Energy Adaptive Computing

● Overdesign ➔ Rightsizing + smart adaptation
  ● Adaptation to energy/power/thermal/cooling limitations
● Dynamic adaptation of infrastructure & workloads
  ● Need coordination across compute, network & storage

Data Growth

● Exponential growth in both generation & retention
● Data vs. useful data
● More data ➔ More junk (less information)
  ● Duplication, keep just in case, too lazy to purge, …
● But, can we define useful?

Sustainability Impact of Data

● Increased power consumption: 10-40% of total power
● Insatiable drive demand ➔ Life cycle impact
● Cumulative impact because of little deletion
● Not sustainable!

Data Reduction Opportunities

[Figure: several administrative domains (home, data center, department, …), each containing objects (snapshot, file/DB, …).]

Data Reduction

● Within an object
  ● Compression, compressive sampling, delta encoding, remove bundled VM
● Within and across administrative domain
  ● Deduplication across objects & storage nodes
  ● Cloud based storage/access/deduplication
  ● Only a few copies (collectively) across nodes
● Tradeoffs
  ● Storage vs. data movement vs. processing
  ● Fidelity vs. cost (reduced representations)
    ● Similar, filtered, derived, …
  ● Cross domain access/privacy/security issues
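The idea of deduplication across objects and storage nodes can be illustrated with a minimal content-hash sketch. This is one common approach, not necessarily the one the speaker had in mind; the function name `deduplicate`, the object names, and the choice of SHA-256 over whole objects are assumptions made for illustration.

```python
import hashlib


def deduplicate(objects):
    """Store each distinct content once, keyed by its SHA-256 digest.

    `objects` maps an object name to its bytes.
    Returns (store, index): store maps digest -> bytes,
    index maps object name -> digest.
    """
    store = {}
    index = {}
    for name, content in objects.items():
        digest = hashlib.sha256(content).hexdigest()
        store.setdefault(digest, content)  # keep only the first copy seen
        index[name] = digest
    return store, index


# Hypothetical objects spread over two administrative domains.
objects = {
    "home/report.pdf": b"quarterly report",
    "dept/report-copy.pdf": b"quarterly report",  # same content, different name
    "dept/notes.txt": b"meeting notes",
}
store, index = deduplicate(objects)
logical = sum(len(content) for content in objects.values())
physical = sum(len(content) for content in store.values())
print(f"{len(objects)} objects, {len(store)} stored copies, "
      f"{logical} -> {physical} bytes")
```

Whole-object hashing only catches exact duplicates. Catching the "similar, filtered, derived" cases listed under tradeoffs would need chunk-level or similarity-based techniques, at the cost of more processing.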

Role of Content Creators

● Best practices
  ● Send link instead of content
  ● Don’t create local copy
  ● Purge obsolete, defective, unneeded data

● Data is valuable, meta-data is precious!
  ● Designed, not an afterthought!
  ● Strong association with data
  ● Must reflect data quality
  ● Preservation more crucial than for data

Thank you!

Other Sustainability Aspects

● Physical degradation of media
  ● Will require keeping media healthy for exponentially increasing data ➔ Unsustainable
● Media obsolescence
● HW & SW obsolescence
● Increasing amount of materials to manufacture storage media
● Unable to power the media
● Loss of meta-data or inadequate meta-data
  ● Have the data but don’t know how to use it
