SAN DIEGO SUPERCOMPUTER CENTER at the UNIVERSITY OF CALIFORNIA, SAN DIEGO Disk and Tape Storage Cost Models Richard Moore & David Minor San Diego Supercomputer Center (SDSC) University of California San Diego Presented to: Designing Storage Architectures Meeting September 17-18, 2007

SAN DIEGO SUPERCOMPUTER CENTER

at the UNIVERSITY OF CALIFORNIA, SAN DIEGO

Disk and Tape Storage Cost Models

Richard Moore & David Minor

San Diego Supercomputer Center (SDSC), University of California San Diego

Presented to: Designing Storage Architectures Meeting

September 17-18, 2007

Objectives & Outline

• Realistic cost estimates and projections are critical for storage users/providers
  • While much info is available on vendor hardware solutions …
  • Little info on integrated costs from storage provider perspective
• Estimate costs for at-scale provider to ‘store bits’
• Outline
  • Caveats
  • SDSC’s Storage Infrastructure
  • ‘Bit Storage’ Cost Estimates
    • Tape Archival Storage
    • Disk Storage
  • Projections – with scale of storage facility and into the future
  • Conclusions

Caveats on Cost Estimates

• Sustainable storage
  • Annual cost w/ media/technology refresh & data migration
  • Not write-once and put on a shelf
• Based on SDSC experience only
  • Include UCSD’s indirect costs – will vary by institution
  • Other providers may have different cost structure
• Based on SATA disk and enterprise-class tape systems
• Cannot be specific about vendor costs or burdening, but relative fractions are reasonable
• This is a snapshot as of Jan 2007 - will decline w/ time
• Paper focuses only on single-copy ‘bit storage’ costs

‘Bit storage’ is only a fraction of the cost to ‘preserve data’

A Three-Stage Model for a Digital Preservation Environment

Ingest, Store, Use

‘Bit Storage’
• Capacity
  • Online (disk)
  • Archival (tape)
• Single-copy reliability
• Media/technology advances
• Data migration

• Replication
  • Geographically distributed
  • System diversity
• Verification & recovery
• Synchronization
  • ‘Master’ version
  • Propagating to replicas
• Audit trails
• Mitigation of termination risk

SDSC’s Storage Infrastructure

SDSC’s archive shows exponential growth w/ a consistent doubling period of ~15 months
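A doubling period converts to an annual growth factor of 2^(12/months). The following is a minimal Python sketch of that arithmetic, assuming the ~15-month doubling period quoted above holds steadily; the function names are illustrative and do not come from the presentation.

```python
def annual_growth_factor(doubling_months: float) -> float:
    """Multiplier applied to the archive volume over 12 months."""
    return 2.0 ** (12.0 / doubling_months)


def projected_volume(current: float, years: float,
                     doubling_months: float = 15.0) -> float:
    """Volume after `years`, in the same unit as `current`.

    Assumes the doubling period stays constant (an extrapolation).
    """
    return current * 2.0 ** (12.0 * years / doubling_months)


factor = annual_growth_factor(15.0)
print(f"annual growth factor: {factor:.2f}")
print(f"volume multiple after 5 years: {projected_volume(1.0, 5.0):.0f}x")
```

Under this assumption the archive grows by roughly 74% per year and by a factor of 16 over five years (four doublings).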

Cost Elements of Bit Storage Estimates

• SDSC’s Cost Estimates Include:
  • Annualized capital costs of the media (including disk controllers)
  • Other annualized capital costs
    • Disk: File system servers, SAN gear
    • Archive: Silos, tape drives, disk cache, file system servers
  • Hardware maintenance and software licenses (annual)
  • Facilities costs – space, utilities (annual)
  • Labor to maintain & administer systems, migrate data (annual)
    • Disk: 3 FTEs to administer disk storage & SAN
    • Archive: 3 FTEs to administer archival systems
• Annual costs normalized by:
  • Total SATA disk deployed (~1.8 PB SATA)
  • Current volume of data stored on tape (~5 PB)
• Sustainable rate - $/TB/year
  • Assumed to be long-term storage w/ migration costs
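The normalization described above is a division of total annual cost by capacity. The sketch below runs it in both directions. It assumes decimal units (1 PB = 1000 TB) and uses the approximate rates from this presentation's Conclusions slide (~$1500/TB/yr disk, ~$500/TB/yr tape) to back out implied annual totals; those totals are a back-calculation for illustration, not figures reported by SDSC.

```python
TB_PER_PB = 1000  # assumption: decimal units


def rate_per_tb_year(annual_cost_usd: float, capacity_tb: float) -> float:
    """Sustainable rate in $/TB/yr: total annual cost over capacity."""
    return annual_cost_usd / capacity_tb


def implied_annual_cost(rate_usd_per_tb_year: float, capacity_pb: float) -> float:
    """Inverse of the normalization: total annual cost implied by a rate."""
    return rate_usd_per_tb_year * capacity_pb * TB_PER_PB


# Capacities from the slide; rates from the Conclusions slide (both approximate).
disk_total = implied_annual_cost(1500, 1.8)  # ~1.8 PB SATA disk deployed
tape_total = implied_annual_cost(500, 5.0)   # ~5 PB of data stored on tape

print(f"implied disk total: ${disk_total:,.0f}/yr")
print(f"implied tape total: ${tape_total:,.0f}/yr")
```

Note that the two denominators differ in kind: disk is normalized by capacity deployed, tape by data actually stored.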

Disk and tape storage cost elements

• Media cost is not the dominant cost (36%/20%)
• Additional capital infrastructure is required (15%/33%)
• Media + other capital is ~half the total cost (51%/53%)
• Labor costs are a significant cost (23%/20%)
• Facilities costs modest (11%/5%)
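The percentage pairs above can be cross-checked with a few lines of Python. This sketch assumes the first figure in each pair is disk and the second is tape (following the slide title's order), and it treats whatever remains after the four quoted elements as the share left for hardware maintenance and software licenses; that attribution is an inference, not a number stated on the slide.

```python
# Shares of total annual cost, in percent, as quoted on the slide.
# Assumption: first figure of each pair is disk, second is tape.
SHARES = {
    "disk": {"media": 36, "other_capital": 15, "labor": 23, "facilities": 11},
    "tape": {"media": 20, "other_capital": 33, "labor": 20, "facilities": 5},
}


def capital_share(kind: str) -> int:
    """Media plus other capital, in percent of total cost."""
    s = SHARES[kind]
    return s["media"] + s["other_capital"]


def residual_share(kind: str) -> int:
    """Percent not covered by the four quoted elements."""
    return 100 - sum(SHARES[kind].values())


for kind in SHARES:
    print(f"{kind}: capital {capital_share(kind)}%, "
          f"unlisted remainder {residual_share(kind)}%")
```

The capital sums reproduce the quoted 51%/53%, and the remainder comes to 15% for disk and 22% for tape.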

Disk/Tape Storage Cost Comparison: Relative Cost Elements

How do costs scale with the size of the storage infrastructure?

• Economies of scale are significant as one moves up to “at-scale” installations ($/TB/yr decreases)
  • Vendor negotiations on media, other capital, maintenance
  • Fully utilizing servers, infrastructure and personnel
• Once infrastructure is “at-scale”, economies of scale slow down and the cost ($/TB/yr) levels off with installation size
  • Media, supporting capital, maintenance, facilities costs
  • Perhaps some weak economies of scale in these factors
  • Some “linear” costs occur in large quantum steps – e.g. hiring additional administrator, larger servers to handle load
• A portion of the cost elements (software licenses) are fixed with installation size => decreasing $/TB/yr for these elements
• So with “at-scale” installations, net $/TB/yr will level off and then slowly decline
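The qualitative argument above can be illustrated with a toy model that has a fixed cost, a per-TB linear cost, and a staffing cost that rises in steps. Every dollar figure and threshold below is hypothetical, chosen only to show the shape of the curve; none of them are SDSC numbers.

```python
import math


def cost_per_tb_year(capacity_tb: float,
                     fixed_usd: float = 200_000.0,       # hypothetical, e.g. licenses
                     linear_usd_per_tb: float = 300.0,   # hypothetical media/capital/facilities
                     admin_usd: float = 150_000.0,       # hypothetical cost per administrator
                     tb_per_admin: float = 1_000.0) -> float:  # hypothetical step size
    """Toy $/TB/yr model: fixed + linear + step-wise staffing costs."""
    admins = math.ceil(capacity_tb / tb_per_admin)
    total = fixed_usd + linear_usd_per_tb * capacity_tb + admins * admin_usd
    return total / capacity_tb


for tb in (100, 500, 1000, 2000, 5000):
    print(f"{tb:>5} TB: ${cost_per_tb_year(tb):,.0f}/TB/yr")
```

With these made-up inputs the rate drops steeply at small sizes and flattens at large ones, while crossing a staffing threshold (for example going from 1000 TB to 1001 TB) produces a temporary jump, which is the "quantum step" effect described in the bullets.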

How will costs change in the future?

• If annual costs decline exponentially with a halving time of t, the cost to store data in perpetuity is finite (1.44 * t * Current cost/yr)
• Expect that exponential declines in media costs and other IT equipment will continue for a while
• Cost ($/TB/yr) will decline, but how much?
• Critical issue is which cost elements will scale with the declining media costs and which will not?
  • Most costs scale w/ media, but labor & facility costs may not scale well
• Cost elements that do not scale well w/ media will dominate future costs, even at the ‘bit storage’ level
  • And we expect that for the broader ‘storage’ costs beyond bit storage, e.g. file management, labor costs will dominate!
• New technologies
  • MAID for “disk archive”: capital cost comparable to disk, but lower operations costs (utilities, floor space) and extended useful lifetime
  • Disruptive storage technologies on horizon
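The factor 1.44 in the perpetuity formula above is 1/ln 2: integrating an annual cost C0 * 2^(-s/t) over all future time s gives C0 * t / ln 2. The Python sketch below checks this numerically and also shows how a cost element that does not decline comes to dominate. The example inputs are assumptions for illustration: the 3-year halving time is the tape media figure quoted elsewhere in this presentation, applied here to the whole rate, and the 34% non-scaling share is labor plus facilities for disk, assuming the first figure of each percentage pair on the cost-elements slide is disk.

```python
import math


def perpetuity_cost(current_cost_per_year: float, halving_years: float) -> float:
    """Closed form: integral of C0 * 2**(-s/t) ds from 0 to infinity."""
    return current_cost_per_year * halving_years / math.log(2)


def cumulative_cost(current_cost_per_year: float, halving_years: float,
                    horizon_years: float, steps_per_year: int = 1000) -> float:
    """Midpoint-rule numerical integral of the same curve up to a horizon."""
    dt = 1.0 / steps_per_year
    n = int(horizon_years * steps_per_year)
    return sum(
        current_cost_per_year * 2.0 ** (-((i + 0.5) * dt) / halving_years) * dt
        for i in range(n)
    )


def nonscaling_share(fixed_share: float, halving_years: float, years: float) -> float:
    """Fraction of annual cost from elements that stay flat while the rest halves."""
    scaling = (1.0 - fixed_share) * 2.0 ** (-years / halving_years)
    return fixed_share / (fixed_share + scaling)


print(f"closed form: ${perpetuity_cost(500, 3):,.0f}/TB")
print(f"100-year sum: ${cumulative_cost(500, 3, 100):,.0f}/TB")
for yrs in (0, 3, 6, 9):
    print(f"year {yrs}: non-scaling share {nonscaling_share(0.34, 3, yrs):.0%}")
```

The closed form applies only to the portion of cost that keeps halving; any element that stays flat contributes an unbounded total over an infinite horizon, which is one way to read the slide's warning about labor and facilities.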

What about trends in the relative cost of disk/tape storage?

• Historical trends in media costs
  • Actual purchases over SDSC’s 20-year history indicate tape media cost/TB declines exponentially with halving time ~3 years
  • Apples-apples comparisons harder for disk, but halving time is shorter
  • If these trends continue, expect costs to converge within a few years
• Even as costs converge, there may be good reasons to maintain a few large-scale centralized tape archives
  • Notion that there’s less risk to a tape cartridge than spinning disk
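If both costs decline exponentially, the time to convergence follows from setting D0 * 2^(-t/Td) equal to T0 * 2^(-t/Tt), which gives t = log2(D0/T0) / (1/Td - 1/Tt). The sketch below implements that formula. The slide gives only the tape halving time (~3 years) and says the disk halving time is shorter, so the disk value used in the example is hypothetical, as is the choice to apply media halving times to a ratio of integrated costs.

```python
import math


def crossover_years(cost_ratio: float, disk_halving_years: float,
                    tape_halving_years: float = 3.0) -> float:
    """Years until disk and tape $/TB are equal under exponential decline.

    cost_ratio is today's disk cost divided by today's tape cost.
    Requires disk to be declining faster than tape.
    """
    if cost_ratio <= 0:
        raise ValueError("cost_ratio must be positive")
    if disk_halving_years >= tape_halving_years:
        raise ValueError("disk must halve faster than tape for costs to converge")
    rate_gap = 1.0 / disk_halving_years - 1.0 / tape_halving_years
    return math.log2(cost_ratio) / rate_gap


# Hypothetical inputs: a 3:1 starting ratio and an 18-month disk halving time.
years = crossover_years(cost_ratio=3.0, disk_halving_years=1.5)
print(f"costs converge after about {years:.1f} years")
```

The result is sensitive to the disk halving time: the closer it is to the tape value, the longer convergence takes.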

Comparison with Commercial Services

• Many commercial companies are offering web-accessible storage services
• One example - Amazon S3 (aws.amazon.com/s3)
  • Cost structure (~April 2007) - $1800/TB/yr storage + upload $100/TB + download $130-180/TB + put/get/list transaction fees
  • # of copies and media not specified, but speculate 2+ disk copies
  • Don’t know the capital/business model
  • No Guarantees - From AWS License Agreement
    “Amazon and its affiliates are not responsible for any unauthorized access to, alteration of, or the deletion, destruction, damage, loss or failure to store any Content or other data which you submit in connection with your account.”

SDSC cost estimates are “in the ballpark” w/ commercial services
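The S3 price components quoted above can be combined into a rough cost range for a given usage pattern. This Python sketch uses only the ~April 2007 figures from the slide and leaves out the put/get/list transaction fees, which the slide does not quantify; the function and constant names are illustrative.

```python
# Figures as quoted on the slide (~April 2007), in USD.
S3_STORAGE_PER_TB_YEAR = 1800
S3_UPLOAD_PER_TB = 100
S3_DOWNLOAD_PER_TB = (130, 180)  # low and high ends of the quoted range


def s3_cost_range(tb_stored, years, tb_uploaded, tb_downloaded):
    """(low, high) total cost, excluding put/get/list transaction fees."""
    base = S3_STORAGE_PER_TB_YEAR * tb_stored * years + S3_UPLOAD_PER_TB * tb_uploaded
    low, high = S3_DOWNLOAD_PER_TB
    return base + low * tb_downloaded, base + high * tb_downloaded


# One TB uploaded once, kept for a year, and downloaded once.
low, high = s3_cost_range(tb_stored=1, years=1, tb_uploaded=1, tb_downloaded=1)
print(f"S3, first year: ${low}-${high} per TB")
```

For that pattern the range is $2030 to $2080 per TB in the first year, against SDSC's single-copy estimates of ~$1500/TB/yr for disk and ~$500/TB/yr for tape given in this presentation.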

Conclusions

• Initial caveat … Bit storage costs are only a fraction of the total cost for ‘digital preservation’
  • Ingest and use phases not addressed
  • Only a portion of storage phase costs included
• SDSC’s sustainable single-copy ‘bit storage’ costs:
  • ~$500/TB/yr for tape storage
  • ~$1500/TB/yr for disk storage
• Media costs are ~30% of the integrated ‘bit storage’ costs and total capital is ~50% of costs for both tape and disk

• Costs ($/TB/yr) decrease, then flatten out and eventually slowly decline w/ scale of installation

• Costs will decline with time, but critical issue is which elements do not scale w/ media/technology advances

• Disk/tape integrated costs are converging