Storage Systems CSE 598D, Spring 2007 Lecture 1: Introduction and Overview January 25, 2007.



How this course will work

• Class meetings twice a week
  – Tue, Thu: 5.30 - 6.45 pm, 223B
• Lectures by me in most classes
• Some student presentations
  – To be determined as the course progresses
• Everyone should participate in discussions
  – Part of your grade for participation!
  – Scribe notes to record lectures and discussions
• 2-3 assignments
  – May involve some simple system building: details to be decided
  – Some written homeworks
• Online resources
  – Class URL via my Web page
  – Slides/scribe notes/assignments on Angel
  – Please make sure your Angel email works

How this course will work

• Background expected
  – Operating systems (411-level)
    • Basic knowledge of file systems, I/O subsystem, DMA, device drivers, …
  – Distributed systems
    • Consistency semantics, replication, caching, synchronization, …
  – Algorithms and data structures (undergraduate-level)
    • Analysis of algorithms, basic data structures
• Will cover background material whenever needed
  – Your feedback important in deciding what to cover

How this course will work

• No textbook
  – I will use some chapters from a set of books
    • If needed, photocopies of these will be made available to you
• Syllabus consists of material presented in class
  – Most of it based on research papers made available on the course page
    • Not up yet but will be soon
    • Additional reading material: for background or to delve deeper
• What you need to do
  – Read assigned papers BEFORE each class
  – During the class
    • Ask questions, express your opinions, argue!
• Goal: Learn about storage systems
  – Also learn
    • How to read a research paper?
    • How to write a good systems paper?
  – What separates good (systems) research from bad?

How this course will work

• Grading
  – Scribe notes: 10%
    • Detailed notes that one can go back to and find everything that was presented and discussed in the class
      – And that you can use for revision before the exam!
  – Participation in class: 10%
  – Mid-term exam: 20%
  – Presentation: 10%
  – Assignments (2-3): 20%
  – Survey or Project: 30%

How this course will work

• Survey
  – A 10-15 page comprehensive exploration/synthesis of an area related to storage systems at the end of the semester
• Project
  – Groups of up to 2 students
  – Identify a problem and motivate the need to solve it
    • Convince the reader where existing research falls short
  – Develop and evaluate your solution
  – Present it in a paper-style write-up at the end of the semester
• Today
  – Some background/history on storage systems
  – Overview of course content
    • A superset of topics we will study

Introduction

Why Applications Need Storage

• Memory is
  – Volatile: Durability is needed
  – Not enough: High Capacity is needed
  – Not easy to share/move: Portability is needed
  – Expensive
• Non-volatile, cheap, long-lasting, reliable, abundant storage is needed for numerous applications
  – Personal/individual applications
  – Scientific applications
  – Enterprise applications
  – Internet scale applications
  – Emerging sensor networks, highly distributed systems such as some P2P systems

Personal Applications

• Email, Contacts, Schedules, …
• Financial data, personal files, …
• Media files
• Gaming

Scientific Applications

• Manipulate large data sets: either explicitly (files) or implicitly (VM).

Sanger Institute Sequencing facility to add 100 TB each yr.

CERN Particle Collider

NASA EOSDIS

Enterprise Applications

• File and Email servers
• OLTP
• OLAP
• Other Database applications
• SAP
• Financial workloads
• …

Internet Scale Applications

Data Grids

Sensor Networks

IBM 305 RAMAC - 1956
Random Access Method of Accounting and Control

• 5 MB capacity, 50 disks each 24” in diameter, 2000 bits/sq-inch density
• First computer with a magnetic hard disk
  – Replaced the “magnetic drum”
• Could store roughly 2000 pages of text!
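
As a rough sanity check on the RAMAC figures, the sketch below divides the quoted capacity by the quoted page count and disk count. It assumes that "5 MB" means 5,000,000 characters; the variable names are illustrative only.

```python
# Back-of-the-envelope check of the RAMAC figures quoted on the slide.
# Assumption: "5 MB" is taken to mean 5,000,000 characters.
CAPACITY_CHARS = 5_000_000
PAGES = 2000   # "roughly 2000 pages of text"
DISKS = 50     # "50 disks each 24 inches in diameter"

chars_per_page = CAPACITY_CHARS // PAGES
chars_per_disk = CAPACITY_CHARS // DISKS

print(chars_per_page)  # 2500 characters per page
print(chars_per_disk)  # 100000 characters per disk
```

Under that assumption a "page" works out to about 2,500 characters, which is in the range of a typed page, so the two slide figures are consistent with each other.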

Seagate Savvio 10K.1 - 2004

• 10K RPM, 73.4 GBytes
• Can read and write the complete works of Shakespeare 15 times each second!
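
To turn the Shakespeare comparison into a transfer rate, the sketch below multiplies an assumed text size by the quoted 15 copies per second. The 5 MB size for the complete works as plain text is an assumption, not a figure from the slide, and the function names are hypothetical.

```python
# Assumption (not from the slide): the complete works of Shakespeare
# occupy roughly 5 MB as plain text.
SHAKESPEARE_MB = 5.0
COPIES_PER_SECOND = 15   # from the slide
CAPACITY_GB = 73.4       # from the slide

def implied_throughput_mb_s(size_mb, copies_per_s):
    """Transfer rate implied by moving `copies_per_s` copies each second."""
    return size_mb * copies_per_s

def full_scan_seconds(capacity_gb, throughput_mb_s):
    """Time to stream the whole disk once at the given rate (1 GB = 1000 MB)."""
    return capacity_gb * 1000 / throughput_mb_s

rate = implied_throughput_mb_s(SHAKESPEARE_MB, COPIES_PER_SECOND)
print(rate)                                        # 75.0 MB/s under the assumed size
print(full_scan_seconds(CAPACITY_GB, rate) / 60)   # minutes for one full pass
```

Under the assumed text size, the claim corresponds to a rate on the order of 75 MB/s, and one sequential pass over the 73.4 GB disk would take a little over 16 minutes.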

Seagate Savvio 15K - 2007

• 15K RPM, 73.4 GBytes
• World’s fastest disk?

Storage Devices/Hardware

RAID Arrays

Storage Area Networks

Tape Archives

Overview of Course Content

Overview of Course

• What goes on inside a disk?
  – Hardware
  – Modeling the disk
  – Performance optimizations
    • Disk scheduling
    • Rearranging data blocks
• How do you improve bandwidth to/from disks?
  – RAID arrays
  – Reduce data transferred from disks (Active Disks)
  – Storage Area Networks to allow concurrent transfers to/from several hosts
    • Shared Storage Model
• How can software take advantage of these enhancements?
  – Review of the OS I/O subsystem
  – How sys-admins manage storage
  – File Systems for NAS/SAN
  – Caching and Pre-fetching
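
One way RAID arrays raise bandwidth is by striping consecutive blocks across disks so that a large request is served by several disks at once. The sketch below is a minimal illustration of plain striping without parity (RAID-0 style), not a description of any particular product. The function names and the `stripe_unit_blocks` parameter are hypothetical.

```python
def locate_block(logical_block, num_disks, stripe_unit_blocks=1):
    """Map a logical block to (disk index, block offset on that disk).

    Plain striping, no parity: stripe units are dealt out to the disks
    round-robin, `stripe_unit_blocks` blocks at a time.
    """
    stripe_unit = logical_block // stripe_unit_blocks
    offset_in_unit = logical_block % stripe_unit_blocks
    disk = stripe_unit % num_disks
    unit_on_disk = stripe_unit // num_disks
    return disk, unit_on_disk * stripe_unit_blocks + offset_in_unit

def disks_touched(first_block, count, num_disks, stripe_unit_blocks=1):
    """Disks that take part in a request for `count` consecutive blocks."""
    return sorted({
        locate_block(b, num_disks, stripe_unit_blocks)[0]
        for b in range(first_block, first_block + count)
    })

print(locate_block(5, num_disks=4))      # (1, 1)
print(disks_touched(0, 8, num_disks=4))  # [0, 1, 2, 3]
```

With four disks and a one-block stripe unit, an eight-block sequential read touches all four disks, so in the ideal case each disk transfers only a quarter of the data. A larger stripe unit keeps small requests on a single disk instead.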

• Theory of storage
  – Which problems are hard?
  – Important data structures

• With shared storage, and a very complicated storage system, how do we manage this hierarchy?
  – Storage Provisioning
  – QoS Control/Virtualization
  – Security
  – Case studies of enterprise storage systems (e.g., EMC, Veritas)
• Requirements are becoming more stringent: we need to guarantee availability and store data for a long time (archival storage). How do we achieve this?
  – Dependability/Availability issues
  – Disaster management
  – Data lifetime
• Power and thermal management of storage systems

• Storage in highly distributed systems
  – Storage in P2P systems
  – Sensor storage
  – Grid-like infrastructure based storage: e.g., OceanStore
• Storage in search, information retrieval
  – Google File System
• Are disks going to be the norm in the future?
  – Future of magnetic storage
  – MEMS
  – Flash storage
    • Windows Vista for laptops

• Part of the material will be from these books
  – “Storage Networks Explained” (Wiley) by Troppens, Erkens, and Muller
  – “The Holy Grail of Storage Management” by Toigo
  – “Storage Area Network Essentials” (Wiley) by Barker and Massiglia

Next time

• Hard disk
• Certain aspects of I/O subsystem
  – Spanning hardware and OS

I/O System View

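[Figure: block diagram of the I/O path. Labels: CPU with iL1, dL1, and L2 caches; memory bus (e.g. PC133) to main memory; I/O bus (e.g. PCI) to the disk controller; disk interface (e.g. SCSI).]

• Software stack: application, file system, buffer manager, device driver
• Controller (ASIC): device firmware, cache, DMA engine
• Disk: platters, actuator, motors, electronics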