Storage Systems CSE 598D, Spring 2007 Lecture 1: Introduction and Overview January 25, 2007.
How this course will work
• Class meetings twice a week
  – Tue, Thu: 5:30 - 6:45 pm, 223B
• Lectures by me in most classes
• Some student presentations
  – To be determined as the course progresses
• Everyone should participate in discussions
  – Part of your grade is for participation!
  – Scribe notes to record lectures and discussions
• 2-3 assignments
  – May involve some simple system building: details to be decided
  – Some written homework
• Online resources
  – Class URL via my Web page
  – Slides/scribe notes/assignments on Angel
  – Please make sure your Angel email works
How this course will work
• Background expected
  – Operating systems (411-level)
    • Basic knowledge of file systems, I/O subsystem, DMA, device drivers, …
  – Distributed systems
    • Consistency semantics, replication, caching, synchronization, …
  – Algorithms and data structures (undergraduate-level)
    • Analysis of algorithms, basic data structures
• Will cover background material whenever needed
  – Your feedback is important in deciding what to cover
How this course will work
• No textbook
  – I will use some chapters from a set of books
    • If needed, photocopies of these will be made available to you
• Syllabus consists of material presented in class
  – Most of it based on research papers made available on the course page
    • Not up yet but will be soon
    • Additional reading material: for background or to delve deeper
• What you need to do
  – Read assigned papers BEFORE each class
  – During the class
    • Ask questions, express your opinions, argue!
• Goal: Learn about storage systems
  – Also learn
    • How to read a research paper?
    • How to write a good systems paper?
  – What separates good (systems) research from bad?
How this course will work
• Grading
  – Scribe notes: 10%
    • Detailed notes that one can go back to and find everything that was presented and discussed in the class
      – And that you can use for revision before the exam!
  – Participation in class: 10%
  – Mid-term exam: 20%
  – Presentation: 10%
  – Assignments (2-3): 20%
  – Survey or Project: 30%
How this course will work
• Survey
  – A 10-15 page comprehensive exploration/synthesis of an area related to storage systems at the end of the semester
• Project
  – Groups of up to 2 students
  – Identify a problem and motivate the need to solve it
    • Show convincingly where existing research falls short
  – Develop and evaluate your solution
  – Present it in a paper-style write-up at the end of the semester
• Today
  – Some background/history on storage systems
  – Overview of course content
    • A superset of topics we will study
Introduction
Why Applications Need Storage
• Memory is
  – Volatile: Durability is needed
  – Not enough: High capacity is needed
  – Not easy to share/move: Portability is needed
  – Expensive
• Non-volatile, cheap, long-lasting, reliable, abundant storage is needed for numerous applications
  – Personal/individual applications
  – Scientific applications
  – Enterprise applications
  – Internet-scale applications
  – Emerging sensor networks, highly distributed systems such as some P2P systems
Personal Applications
• Email, Contacts, Schedules, …
• Financial data, personal files, …
• Media files
• Gaming
Scientific Applications
• Manipulate large data sets: either explicitly (files) or implicitly (virtual memory)
Sanger Institute sequencing facility to add 100 TB each year
CERN Particle Collider
NASA EOSDIS
Enterprise Applications
• File and Email servers
• OLTP
• OLAP
• Other Database applications
• SAP
• Financial workloads
• …
Internet Scale Applications
Data Grids
Sensor Networks
IBM 305 RAMAC - 1956
Random Access Method of Accounting and Control
• 5 MB capacity, 50 disks each 24” in diameter, 2000 bits/sq-inch density
• First computer with a magnetic hard disk
  – Replaced the “magnetic drum”
• Could store roughly 2000 pages of text!
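The RAMAC figures above can be cross-checked with a little arithmetic. The sketch below is only illustrative: it assumes decimal megabytes, and for the areal estimate it assumes both sides of every platter are recorded and that the entire disc area is usable. None of these assumptions come from the slide, so the second number is an upper bound rather than a real capacity.

```python
import math

# Figures quoted on the slide.
CAPACITY_BYTES = 5 * 10**6        # 5 MB (assumption: decimal megabytes)
PAGES_OF_TEXT = 2000
NUM_DISKS = 50
DIAMETER_INCHES = 24
DENSITY_BITS_PER_SQ_INCH = 2000


def bytes_per_page(capacity_bytes, pages):
    """Average number of bytes available for each page of text."""
    return capacity_bytes / pages


def areal_upper_bound_bytes(num_disks, diameter_inches, density, sides=2):
    """Capacity if every square inch of every surface held data.

    Assumption: both sides of each platter are recorded (sides=2) and the
    whole disc area is usable, so this overestimates the real capacity.
    """
    surface_area = math.pi * (diameter_inches / 2) ** 2
    total_bits = num_disks * sides * surface_area * density
    return total_bits / 8


print(bytes_per_page(CAPACITY_BYTES, PAGES_OF_TEXT))  # 2500.0
bound = areal_upper_bound_bytes(NUM_DISKS, DIAMETER_INCHES, DENSITY_BITS_PER_SQ_INCH)
print(round(bound / 10**6, 1))  # 11.3
```

About 2,500 bytes per page is a plausible size for a page of plain text, and the areal bound of roughly 11 MB sits above the quoted 5 MB, which is what one would expect if only part of each surface stores data.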
Seagate Savvio 10K.1 - 2004
• 10K RPM, 73.4 GBytes
• Can read and write the complete works of Shakespeare 15 times each second!
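The Shakespeare comparison is really a throughput claim in disguise. As a rough sketch, assume the complete works take about 5 MB as plain text (an assumption for illustration, not a figure from the slide); the implied transfer rate and the time to stream the whole 73.4 GB drive then follow directly.

```python
# Assumption (not from the slide): complete works of Shakespeare ~ 5 MB of plain text.
SHAKESPEARE_BYTES = 5 * 10**6
COPIES_PER_SECOND = 15            # from the slide
DRIVE_BYTES = 73.4e9              # 73.4 GBytes, from the slide (decimal GB assumed)


def implied_throughput_mb_per_s(size_bytes, copies_per_second):
    """Sustained transfer rate implied by moving `size_bytes` N times a second."""
    return size_bytes * copies_per_second / 10**6


def full_drive_scan_seconds(drive_bytes, throughput_mb_per_s):
    """Time to stream the entire drive once at the given rate."""
    return drive_bytes / (throughput_mb_per_s * 10**6)


rate = implied_throughput_mb_per_s(SHAKESPEARE_BYTES, COPIES_PER_SECOND)
print(rate)  # 75.0
print(round(full_drive_scan_seconds(DRIVE_BYTES, rate) / 60, 1))  # minutes
```

Under that assumption the claim corresponds to roughly 75 MB/s, and reading the full drive sequentially would take on the order of a quarter of an hour.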
Seagate Savvio 15K - 2007
• 15K RPM, 73.4 GBytes
• World’s fastest disk?
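One reason spindle speed matters: on average, a request waits about half a rotation for the target sector to arrive under the head. The sketch below compares the 10K and 15K RPM drives on that basis. It is a simplified model that ignores seek time, transfer time, and caching, and the function names are made up for illustration.

```python
def full_rotation_ms(rpm):
    """Time for one full platter revolution, in milliseconds."""
    return 60_000 / rpm


def avg_rotational_latency_ms(rpm):
    """Expected wait for a random sector: half a revolution on average."""
    return full_rotation_ms(rpm) / 2


for rpm in (10_000, 15_000):
    print(rpm, full_rotation_ms(rpm), avg_rotational_latency_ms(rpm))
# 10000 6.0 3.0
# 15000 4.0 2.0
```

Going from 10K to 15K RPM cuts the average rotational latency from 3 ms to 2 ms, even though the capacity on the two slides is the same.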
Storage Devices/Hardware
RAID Arrays
Storage Area Networks
Tape Archives
Overview of Course Content
Overview of Course
• What goes on inside a disk?
  – Hardware
  – Modeling the disk
  – Performance optimizations
    • Disk scheduling
    • Rearranging data blocks
• How do you improve bandwidth to/from disks?
  – RAID arrays
  – Reduce data transferred from disks (Active Disks)
  – Storage Area Networks to allow concurrent transfers to/from several hosts
    • Shared Storage Model
• How can software take advantage of these enhancements?
  – Review of the OS I/O subsystem
  – How sys-admins manage storage
  – File Systems for NAS/SAN
  – Caching and Pre-fetching
• Theory of storage
  – Which problems are hard?
  – Important data structures
• With shared storage, and a very complicated storage system, how do we manage this hierarchy?
  – Storage Provisioning
  – QoS Control/Virtualization
  – Security
  – Case studies of enterprise storage systems (e.g., EMC, Veritas)
• Requirements are becoming more stringent: we need to guarantee availability and store data for a long time (archival storage). How do we achieve this?
  – Dependability/Availability issues
  – Disaster management
  – Data lifetime
• Power and thermal management of storage systems
• Storage in highly distributed systems
  – Storage in P2P systems
  – Sensor storage
  – Grid-like infrastructure-based storage, e.g., OceanStore
• Storage in search, information retrieval
  – Google File System
• Are disks going to be the norm in the future?
  – Future of magnetic storage
  – MEMS
  – Flash storage
    • Windows Vista for laptops
• Part of the material will be from these books
  – “Storage Networks Explained” (Wiley) by Troppens, Erkens, and Muller
  – “The Holy Grail of Storage Management” by Toigo
  – “Storage Area Network Essentials” (Wiley) by Barker and Massiglia
Next time
• Hard disk
• Certain aspects of I/O subsystem
  – Spanning hardware and OS
I/O System View
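[Figure: block diagram of the I/O path. Hardware: CPU with iL1, dL1, and L2 caches; memory bus (e.g. PC133) to main memory; I/O bus (e.g. PCI) to the disk controller; link to the disk (e.g. SCSI). Software stack: application, file system, buffer manager, device driver. Controller (ASIC). Device: firmware, cache, DMA engine; platters, actuator, motors, electronics.]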